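Reading a Word document from C# requires the COM component Microsoft Word 12.0 Object Library.
The component version depends on the installed Office version. My machine has Office 2007,
so the version is 12.0.
The classes and methods this component provides are used to read the Word document.
First, add the COM reference to the project (in Visual Studio: Add Reference, then the COM tab).
Then the document can be accessed through Microsoft.Office.Interop.Word.Application: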
using System;
using Word = Microsoft.Office.Interop.Word;

namespace Read_doc
{
    class Program
    {
        static void Main(string[] args)
        {
            //Word.ApplicationClass doc = new Microsoft.Office.Interop.Word.ApplicationClass();
            try
            {
                // Start Word through COM interop.
                Word.Application app = new Word.Application();
                Word.Document doc = null;

                // Placeholder passed for every optional argument of Documents.Open.
                object unknow = Type.Missing;
                app.Visible = true;

                string str = @"E:\1.doc";
                object file = str;

                // The Word 12.0 interop signature of Open takes 16 ref parameters.
                doc = app.Documents.Open(ref file,
                    ref unknow, ref unknow, ref unknow, ref unknow, ref unknow,
                    ref unknow, ref unknow, ref unknow, ref unknow, ref unknow,
                    ref unknow, ref unknow, ref unknow, ref unknow, ref unknow);

                // Word collections are 1-based: Paragraphs[1] is the first paragraph.
                string temp = doc.Paragraphs[1].Range.Text.Trim();
                Console.WriteLine(temp);
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }
        }
    }
}
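doc.Paragraphs[i] reads the document one paragraph at a time (the index starts at 1).
To read a single sentence or the whole document, use doc.Sentences[i] or doc.Content instead.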
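
To make the three access styles concrete, here is a minimal sketch that prints every paragraph, the first sentence, and the full text. It assumes the same sample file E:\1.doc and the Word 12.0 interop assembly. The class name ReadAll, opening the file read-only, and the cleanup in the finally block are my own additions for illustration, not part of the program above.

```csharp
using System;
using Word = Microsoft.Office.Interop.Word;

namespace Read_doc
{
    class ReadAll // hypothetical class name, for illustration only
    {
        static void Main(string[] args)
        {
            object missing = Type.Missing;
            object file = @"E:\1.doc";   // same sample path as above
            object readOnly = true;      // assumption: we only read, never modify
            Word.Application app = null;
            Word.Document doc = null;
            try
            {
                app = new Word.Application();

                // 16 ref parameters; the third one is ReadOnly.
                doc = app.Documents.Open(ref file, ref missing, ref readOnly,
                    ref missing, ref missing, ref missing, ref missing,
                    ref missing, ref missing, ref missing, ref missing,
                    ref missing, ref missing, ref missing, ref missing,
                    ref missing);

                // Paragraph by paragraph (collections are 1-based).
                for (int i = 1; i <= doc.Paragraphs.Count; i++)
                {
                    Console.WriteLine(doc.Paragraphs[i].Range.Text.Trim());
                }

                // A single sentence, if the document has any.
                if (doc.Sentences.Count > 0)
                {
                    Console.WriteLine(doc.Sentences[1].Text.Trim());
                }

                // The whole document as one string.
                Console.WriteLine(doc.Content.Text);
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }
            finally
            {
                // Close the document and quit Word so no WINWORD.EXE process
                // is left behind. The casts avoid the name clash between the
                // Close/Quit methods and the events of the same name.
                if (doc != null)
                {
                    ((Word._Document)doc).Close(ref missing, ref missing, ref missing);
                }
                if (app != null)
                {
                    ((Word._Application)app).Quit(ref missing, ref missing, ref missing);
                }
            }
        }
    }
}
```

Running this needs Word installed on the machine. If Word should stay open for inspection, as with app.Visible = true in the first program, the finally block can be dropped.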