Converting a PDF to Excel

Posted by: visuvius

Converting a PDF to Excel - 29/01/2003 17:34


Does anyone know how to do this without having to buy Adobe Acrobat 5.0? There is a rather large price sheet in PDF format here at work that I need converted to Excel. The only method I've seen is the one here.

We have Acrobat 4.0 here, but that page says you need 5.0.
Posted by: AndrewT

Re: Converting a PDF to Excel - 29/01/2003 17:52

I don't know how to do this but I do have full Acrobat 5.0 installed at work. I'll be there in about 8 hours from now and I'm happy to try and convert it for you using those instructions if you're still stuck.
Posted by: muzza

Re: Converting a PDF to Excel - 30/01/2003 03:31

If you have some OCR software you could print it, scan it and format it. It would take a while though.
Could you request the info as a spreadsheet from the supplier?
Posted by: visuvius

Re: Converting a PDF to Excel - 30/01/2003 09:28

We have requested it as a spreadsheet and they've been pretty unhelpful on that front. I hope this isn't a testament to their future service.

I hadn't thought of the scan and formatting angle. I suppose that would be a little better than manually entering the info.
Posted by: tfabris

Re: Converting a PDF to Excel - 30/01/2003 11:49

I hadn't thought of the scan and formatting angle.

With the right software tricks, you could probably get the PDF to print to a bitmap image file instead of a printer, then get the OCR software to read that file. That would skip the analog, real-world step and give you a cleaner image and more accurate OCR.

One other question...

It's not as simple as selecting all the text in the PDF document, hitting "Edit/Copy", pasting it into a text file, and importing that text file into Excel (with the proper formatting instructions)... is it?
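
For what it's worth, the "paste into a text file, then import" route could be scripted roughly like this. It's only a sketch, and it assumes the copied text keeps its columns separated by tabs or by runs of two or more spaces, which a given PDF may or may not do. The name `text_to_csv` and the sample lines are made up for illustration; they are not from the actual price sheet.

```python
import csv
import io
import re

# Assumption: a column gap is a tab (with optional spaces around it)
# or a run of two or more spaces. Single spaces stay inside a field.
COLUMN_GAP = re.compile(r"\s*\t\s*| {2,}")

def text_to_csv(pasted: str) -> str:
    """Turn text copied out of a PDF into CSV text that Excel can open."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in pasted.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between table rows
        writer.writerow(COLUMN_GAP.split(line))
    return out.getvalue()

# Hypothetical sample data.
sample = "Part No   Description      Price\nA-100     Widget, small    4.95\n"
print(text_to_csv(sample))
```

Saving the result with a .csv extension should let Excel open it directly, without going through the text import wizard.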
Posted by: jasonc

Re: Converting a PDF to Excel - 30/01/2003 13:32

In reply to:

It's not as simple as selecting all the text in the PDF document, hitting "Edit/Copy", pasting it into a text file, and importing that text file into Excel (with the proper formatting instructions)... is it?

It depends on the Document Security settings chosen by whoever created the document, but it is technically possible.
Posted by: lectric

Re: Converting a PDF to Excel - 30/01/2003 13:33

It should be that simple.... You will still likely have to do some cleaning up after, but it is a much better way.
Posted by: lectric

Re: Converting a PDF to Excel - 30/01/2003 13:34

Oh, and I'd paste into WordPad first, to set up the delimiters better.
Posted by: tfabris

Re: Converting a PDF to Excel - 30/01/2003 13:42

Screw the intermediary program.

I just typed some random columns of data into a text editor, hit Edit/Copy, and then hit Edit/Paste in Excel and it dropped in perfectly.

So if the text in the PDF document can be selected and copied to the clipboard, just paste directly into Excel.
Posted by: lectric

Re: Converting a PDF to Excel - 30/01/2003 13:45

That works assuming it's standard length data. If some of the fields have extra spaces or tabs, it could be a problem. Then again, you're going to have to fix it regardless, so I guess the middle step isn't really necessary.
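
One way to find the rows that need fixing, instead of eyeballing the whole sheet, is to flag any row whose field count differs from the most common one. This is a rough sketch; `find_ragged_rows` and the sample rows are invented for illustration, and it assumes the data has already been split into fields.

```python
from collections import Counter

def find_ragged_rows(rows):
    """Return (line_number, row) pairs whose field count is unusual.

    "Unusual" means different from the most common field count,
    which is assumed to be the correct number of columns.
    """
    if not rows:
        return []
    expected = Counter(len(r) for r in rows).most_common(1)[0][0]
    return [(i, r) for i, r in enumerate(rows, start=1) if len(r) != expected]

# Hypothetical rows; the second has an empty field from a stray extra tab.
rows = [
    ["A-100", "Widget", "4.95"],
    ["A-101", "Gadget", "", "7.50"],
    ["A-102", "Gizmo", "12.00"],
]
for line_no, row in find_ragged_rows(rows):
    print(line_no, row)
```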
Posted by: AndrewT

Re: Converting a PDF to Excel - 30/01/2003 14:55

It's not as simple as selecting all the text in the PDF document, hitting "Edit/Copy", pasting it into a text file, and importing that text file into Excel (with the proper formatting instructions)... is it?

You should be able to do pretty much what you said. visuvius even linked to some instructions specially geared towards exporting Acrobat tables to a .txt file but it's not that simple in this instance.

I spent 10 minutes today trying with his file and there's something weird going on:
- Acrobat's 'edit/copy' errors with something like "cannot copy selection to clipboard"
- The special table select then save-as .txt file method just produces a file full of non-printing characters
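
A quick way to confirm that an exported file really is mostly junk, rather than text in an odd layout, is to measure how much of it falls outside printable ASCII. This is a minimal sketch with a made-up function name; note that it would also count legitimate non-ASCII text (accented characters, for example) as non-printing, so it only suits plain ASCII price data.

```python
import string

# Byte values for printable ASCII plus common whitespace (tab, newline, etc.).
PRINTABLE = set(string.printable.encode("ascii"))

def nonprinting_ratio(data: bytes) -> float:
    """Fraction of bytes that are not printable ASCII or whitespace."""
    if not data:
        return 0.0
    bad = sum(1 for b in data if b not in PRINTABLE)
    return bad / len(data)

# In-memory samples; for a real file, read it with open(path, "rb").read().
print(nonprinting_ratio(b"Price\t4.95\n"))
print(nonprinting_ratio(b"\x00\x01ab"))
```

A ratio near zero suggests usable text; a high ratio suggests the export did not produce readable characters.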

Illustrator 10 warns of a missing font in the .pdf and makes a mess of displaying it, which I think is a major clue. I suspect this file was generated by a non-Adobe application and there's a compatibility problem with its file format.

I'll be having another play with it @ work tomorrow
Posted by: wfaulk

Re: Converting a PDF to Excel - 30/01/2003 15:02

    Acrobat's 'edit/copy' errors with something like "cannot copy selection to clipboard"
IIRC, that's because it's specifically denied. Check File->Document Security.
    I suspect this file was generated by a non-Adobe application
Check File->Document Properties->Summary.
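
If you'd rather check from a script than from the menu, here is a crude sketch. Encrypted PDFs carry an /Encrypt entry in the file trailer, so searching the raw bytes for that name gives a rough hint. It is only a heuristic (the name could in principle appear elsewhere in the file), it does not say which permissions are denied, and `looks_encrypted` is a made-up name. The byte strings below are synthetic stand-ins, not real PDFs.

```python
def looks_encrypted(pdf_bytes: bytes) -> bool:
    """Heuristic: True if the raw PDF bytes mention an /Encrypt entry."""
    return b"/Encrypt" in pdf_bytes

# Synthetic trailer fragments for illustration only.
# For a real file: looks_encrypted(open(path, "rb").read())
secured = b"trailer\n<< /Root 1 0 R /Encrypt 5 0 R >>\n%%EOF"
plain = b"trailer\n<< /Root 1 0 R >>\n%%EOF"
print(looks_encrypted(secured), looks_encrypted(plain))
```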