
Any recommendations for/against PDF to Word converters?

I need to convert an existing PDF (LaTeX source) document into MS Word format.

A simple Google search (http://www.google.com/#hl=en&q=convert+pdf+to+word) brings up a number of options, both Web-based and installable. Having never done this before, I have no sense of which are good, bad, dangerous, crappy, whatever.

I'd very much appreciate input from any experts who have done this before (Jake?), especially Macophiles.


Actually, how well PDF to Word conversion works depends a lot on how the PDF was created. If it was created from images, the conversion may fail with many converters; in that case, you can only turn to an OCR program to extract the text from the PDF. That's my experience with this issue, anyway.
These days I normally use this PDF to Word converter. The conversion quality is relatively good, and a large part of the PDF's content is preserved in the output Word files. It isn't free, but the price is low, so just try it for free first!

Thank you very much! I'll give it a try, and maybe add it to my list of tools.

Thanks! At $129, it's a bit pricey for me. I think the basic problem I'm having is that Word doesn't have any idea how to deal with math equations (e.g., the particle decays and names in my CV). So the converters just don't have anything into which to translate the PDF.

Have you tried converting it to RTF or HTML and then to .doc?

Hmmm...that's an interesting option. The HTML generator I have (LaTeX2HTML) generates little GIFs or PNGs for all of the "equation" stuff. That would end up in a hypothetical Word document as images, rather than actual fonts. It may end up with me hand-copying the thing into Word :-(

Hmm, how critical is formatting? Must it be the same 10-point fonts, same margins, etc., or is it slightly flexible?

I suppose it depends on what type of PDF it is.
If it's a scanned image you'd need OCR or similar; if it's a conversion from Word, HTML, or Excel it'd be different. What have you got?


My original file is LaTeX, which I processed into DVI and thence used dvipdf to get the PDF version (and dvips for the native PostScript version).
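
For anyone who wants to reproduce that toolchain, here is a minimal sketch of the three steps in Python. The file name `resume.tex` and the helper names `build_commands` and `run_pipeline` are assumptions for illustration only, and it presumes `latex`, `dvipdf`, and `dvips` are installed and on the PATH.

```python
import subprocess
from pathlib import Path


def build_commands(tex_file):
    """Return the argv lists for LaTeX -> DVI, DVI -> PDF, and DVI -> PostScript."""
    name = Path(tex_file).name                  # e.g. "resume.tex" (hypothetical)
    dvi = str(Path(name).with_suffix(".dvi"))
    pdf = str(Path(name).with_suffix(".pdf"))
    ps = str(Path(name).with_suffix(".ps"))
    return [
        ["latex", name],                        # LaTeX source -> DVI
        ["dvipdf", dvi, pdf],                   # DVI -> PDF
        ["dvips", "-o", ps, dvi],               # DVI -> PostScript
    ]


def run_pipeline(tex_file):
    """Run each step in the source file's directory, stopping at the first failure."""
    workdir = Path(tex_file).resolve().parent
    for cmd in build_commands(tex_file):
        subprocess.run(cmd, cwd=workdir, check=True)
```

Calling `run_pipeline("resume.tex")` would leave `resume.dvi`, `resume.pdf`, and `resume.ps` next to the source. A document with cross-references may need the `latex` step run more than once.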

:-) Thanks. Yup, that first link showed up on my Google search as well. It (like the downloadable utility GoodHart found) can't handle the special characters and equations from LaTeX. This is the main reason we don't use Word for scientific work. I'm not sure what that second link is. It looks to me like an intro to LaTeX (with which I am already quite familiar :-).

It's a LaTeX PDF; I sought one out to test the first link. L

Also bear in mind that it isn't possible to convert to and from PDF with 100% identical results... you can get awfully close, though, and if you've got all the content you can generally fix any goopy formatting.

However, I recommend importing the PDF into the program, and not converting it with some online converter. That's how I've done it, anywho. :-)

If you're willing to boot into Ubuntu, see this: http://embraceubuntu.com/2007/04/10/convertimport-from-pdf-and-keep-the-formatting/

(Sorry not to be more helpful; generally I'm converting the other way and I use WordPerfect rather than MS Word...)

I'm on Mac OS X normally, and our servers at SLAC are RedHat Linux.

I've never converted a PDF to a Word file. I have set up a Word file to resemble a PDF before, using a combination of inserted JPEG pictures and copied text. I've sampled many of the Googled converters for Word to PDF, as well as Publisher to PDF, and although they go the wrong way for what you're looking for, the results were never accurate enough for proper use. I think one can assume the same in the other direction. Either way, I think your easiest/safest option is to make a less 'good' version from scratch by copying and pasting.

This is straight text (see my reply to Lemonie, above). At the risk of self-promotional spamming, the actual file is http://www.slac.stanford.edu/~kelsey/CV/resume.pdf .

Reproducing it in Word means I have to learn all of Word's crappy layout stuff, which LaTeX just does in a natural and elegant way...

Question: does it have to be MS Word, or does it merely need to be editable (in a popular/convenient format)?

It has to be MS Word. The job-search Web site dice.com requires that resumes be uploaded as MS Word .doc files, not as PDF.