Acrobat Help

Posted by: Dignan

Acrobat Help - 17/12/2002 08:59

A common issue with Acrobat: file size.

I've been scanning in documents (B&W, letter sized). I then import them into Acrobat. These files will go on my company's web site. Unfortunately I can't get a proper size/quality ratio. I have example files from another company that they emailed to us. For example, I have a very nice quality, crisp B&W, 10 page pdf from them that is only 500KB. That's great, and I'd like to achieve that. However, the best I can do is a poor quality 6 page document that's about 400KB. So basically, I'm doing about 1/4 of the result I want. Unfortunately, I don't know Acrobat well enough.

I've been using Irfanview to batch convert the images with countless different settings. I've shrunk the images to a variety of sizes, I've tried changing them from 2 to 256 colors, and I've tried saving them as BMPs, JPEGs, and TIFFs. Surprisingly, the BMPs worked best, probably because that's what they were originally, but the JPEGs were the largest by an extremely wide margin.

What am I doing wrong here?
Posted by: tfabris

Re: Acrobat Help - 17/12/2002 09:04

That 500k small PDF file... is it a scanned bitmap image? Or just plain text?

Bitmap images take more storage space than typed text. Did you ever consider OCR? Or typing?
Posted by: Dignan

Re: Acrobat Help - 17/12/2002 09:12

Oops, I kept trying to attach them until I realized they're too big.

I'm not sure about that. Would there be something in the file information that would tell me if it was or not?

*edit*
Here they are. The one called "cuna.pdf" is the good one, and mine is the crappy one.
Posted by: robricc

Re: Acrobat Help - 17/12/2002 09:20

Are you capturing the pages?
Tools -> Paper Capture -> Capture Pages...
Posted by: Dignan

Re: Acrobat Help - 17/12/2002 09:27

Hmm, interesting. No, I wasn't (I'm downloading the plugin at the moment, it wasn't installed). I was opening the first image of the series, then telling it to insert additional pages (for some reason, when you try to open multiple images, they each open as separate pdf files). This is in Acrobat 5. In 4, there was an image option under the File>Import menu, but that has been removed for some reason.
Posted by: robricc

Re: Acrobat Help - 17/12/2002 09:29

Capturing the pages is what converts the text on the scanned page from a graphic to text using OCR. This should save you a bunch of space.
Posted by: matthew_k

Re: Acrobat Help - 17/12/2002 09:37

What you really want to do is get a copy of the files, and print them out to Adobe Acrobat. The file quality/size will be much better, and you'll be able to highlight text and search text. I'm not sure how good the OCR option rob mentioned is; that might be a happy medium.

Matthew
Posted by: Dignan

Re: Acrobat Help - 17/12/2002 10:47

Wow, that paper capture thing did not do well. My document ended up looking like a ransom note.

How do I print to Acrobat? I can't find any options in my other programs.
Posted by: wfaulk

Re: Acrobat Help - 17/12/2002 10:58

Get a MacOS X machine or purchase Acrobat. I think that there are a few other less costly options, but I don't remember them right now.

Of course, you could write your docs in TeX and use its (now) built-in PDF output option.
Posted by: Dignan

Re: Acrobat Help - 17/12/2002 11:02

Despite your love of Macs, I don't think the file that I'm basing mine against was created on one, so it must be possible. I'm going to try to contact them and find out how they did it.
Posted by: Ezekiel

Re: Acrobat Help - 17/12/2002 11:07

When you install Acrobat (at least in v.4 which I have) it will install two printers - PDF Writer and Acrobat Distiller. You choose them from inside the application and configure them like any other printers. PDF Writer is more straightforward. If it gives you problems, try Distiller. Distiller can also act independently and convert PostScript (.ps) to .pdf, although I've never used it this way.

Once you've got your .pdf of the graphic you can insert it as a whole page into another .pdf document.
HTH.

-Zeke
Posted by: matthew_k

Re: Acrobat Help - 17/12/2002 11:14

Perhaps I wasn't quite clear enough... Just to clarify, you need to print the text document to Acrobat, not the scanned pages. The scanning is what is throwing you off. PDFs of scanned pages are better than nothing, but they're nothing compared to a good PDF, as you're finding out.

Matthew
Posted by: Dignan

Re: Acrobat Help - 17/12/2002 11:17

Oh of course. I would definitely convert the original documents if I had them, but this is about 1000 pages in binders we're talking about. Not all of it will be PDFed, but a couple hundred will (in separate files, of course).
Posted by: wfaulk

Re: Acrobat Help - 17/12/2002 11:30

Note that Acrobat and Acrobat Reader are two different products. As Zeke said, Acrobat (which is not free) can produce PDFs via virtual printer drivers.

However, the reason that you're having problems, to reiterate what everyone else has said and to try to be extra-clear, is that you are simply embedding pictures of documents within a PDF file. The reason that PDFs are so small is that they generally only include the text and just enough ``mark-up'' to make them look right. The format has no way to significantly reduce the pictures you're embedding, though.
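
To put rough numbers on the text-versus-picture gap, here is a back-of-the-envelope sketch in Python. The 200 dpi resolution and the 3,000 characters per typed page are assumptions picked for illustration, not figures from anyone's actual files, and the sizes are uncompressed.

```python
# Rough uncompressed sizes for one US Letter page (8.5 x 11 inches).
# DPI and the character count are illustrative assumptions.
DPI = 200
WIDTH_IN, HEIGHT_IN = 8.5, 11.0

def raw_scan_bytes(dpi: int, bits_per_pixel: int) -> int:
    """Uncompressed size in bytes of a letter-size scan at the given depth."""
    width_px = round(WIDTH_IN * dpi)
    height_px = round(HEIGHT_IN * dpi)
    return width_px * height_px * bits_per_pixel // 8

mono = raw_scan_bytes(DPI, 1)   # 1-bit black and white
grey = raw_scan_bytes(DPI, 8)   # 256 levels of grey
text = 3000                     # assumed characters on a typed page, 1 byte each

print(f"1-bit scan: {mono / 1024:.0f} KB")
print(f"8-bit scan: {grey / 1024:.0f} KB")
print(f"plain text: {text / 1024:.0f} KB")
```

Even before any compression, the greyscale scan is eight times the size of the 1-bit scan, and both dwarf the same page stored as characters.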

Edit: Oops. Forgot to replace my xxx placeholder with Zeke (as I forgot who I was ``quoting'').
Posted by: Dignan

Re: Acrobat Help - 17/12/2002 11:49

Okay, I'm going to try to read up on what's going on a little more. I understood what people were trying to tell me, but I was trying to get across the fact that I was creating files the same way that the other organization created theirs. That is what is confusing me, the fact that I'm creating the same type of file yet can't get the same results.

*edit*
I just got off the phone with the person who created the good file. From what I gathered, he didn't do anything special. None of this text file stuff or anything. ALL he was doing was opening Acrobat, scanning in the files directly into the program, then saving them. He specifically said he was not saving it in any sort of text form (so that the text is searchable or editable). I'm now back where I started. My dad may have to settle for large file sizes.

P.S. His only answer was that they had a better scanner. That doesn't seem like a satisfactory explanation to me.
Posted by: robricc

Re: Acrobat Help - 17/12/2002 11:54

If the text does not have to be searchable or editable, then why use PDF? For your usage it seems like the wrong format. Perhaps a low-color GIF or, dare I say, JPEG would be a better choice.

If it's just going to be read off the web (not downloaded) this may be a good choice. See what I did with our print catalogs here.
Posted by: Dignan

Re: Acrobat Help - 17/12/2002 12:15

This is true. I think the desire is for a "Document". Putting up a bunch of gifs wouldn't be appealing to my boss.
Posted by: David

Re: Acrobat Help - 17/12/2002 12:23

The reason your scans are larger might be because of the type of image and the cleanliness of the scan. Are your documents more 'busy' than the others you're comparing to?

Make sure that the background of the image is plain white. If there are variances of grey, then use the levels or brightness/contrast controls in the scanner software to remove them. The more contrast, the better the compression will be.
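
For anyone wondering what a levels control does to the pixels, here is a minimal sketch. The function name `apply_levels`, the black and white points, and the sample scanline are all made up for illustration; real scanner software will differ in the details.

```python
def apply_levels(pixels, black_point=0, white_point=255):
    """Stretch 8-bit grey values so black_point maps to 0 and white_point to 255.

    Anything darker than black_point is clipped to 0 and anything lighter
    than white_point is clipped to 255.
    """
    span = white_point - black_point
    if span <= 0:
        raise ValueError("white_point must be greater than black_point")
    out = []
    for p in pixels:
        scaled = (p - black_point) * 255 // span
        out.append(max(0, min(255, scaled)))
    return out

# Hypothetical scanline: dark text (20-40) on paper that scanned as light grey (200-230).
scanline = [215, 208, 222, 30, 25, 38, 201, 229, 212, 22, 219]
cleaned = apply_levels(scanline, black_point=40, white_point=200)
print(cleaned)
```

With those settings every background pixel lands on pure white and every text pixel on pure black, so the uneven grey in the paper disappears before the image is compressed.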

Scan at around 200dpi if the document is intended to be printed out again, and scan in bitmap (i.e. 2-colour) mode: greyscale will reduce readability and increase file size. Save the file in a non-lossy format; TIFF is best (non-print oriented formats such as BMP don't maintain size/dpi settings).

When you save the PDF, you should get options for the compression type and dpi downsampling. Set it to JPEG at a suitable quality. Turn off downsample, as your scan is already at an optimum resolution.

*edit* Just realised that you can't use JPEG for bitmaps. Select ZIP or CCITT in the monochrome bitmap images section.
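
A quick way to see why a clean background matters for the ZIP option: ZIP in this context is Flate compression, the same algorithm Python's zlib module uses. The sketch below builds two synthetic 1-bit "pages" of identical dimensions, one clean and one with random speckle standing in for a grey background, then compresses both. The page generator, its dimensions, and the 5% speckle rate are invented for illustration. CCITT is not in the standard library, so it is not shown.

```python
import random
import zlib

WIDTH, HEIGHT = 400, 300  # small synthetic page; width is a multiple of 8

def make_page(speckle_rate, seed=1):
    """Return rows of 0/1 pixels (1 = black): bars of 'text' plus random speckle."""
    rng = random.Random(seed)
    rows = []
    for y in range(HEIGHT):
        in_text_line = (y % 20) < 6  # crude stand-in for lines of text
        row = []
        for x in range(WIDTH):
            black = in_text_line and 40 <= x < 360
            if rng.random() < speckle_rate:
                black = not black  # background noise flips the pixel
            row.append(1 if black else 0)
        rows.append(row)
    return rows

def pack_1bit(rows):
    """Pack 8 pixels per byte, most significant bit first."""
    out = bytearray()
    for row in rows:
        for i in range(0, len(row), 8):
            byte = 0
            for bit in row[i:i + 8]:
                byte = (byte << 1) | bit
            out.append(byte)
    return bytes(out)

clean = pack_1bit(make_page(speckle_rate=0.0))
noisy = pack_1bit(make_page(speckle_rate=0.05))

clean_z = zlib.compress(clean, 9)
noisy_z = zlib.compress(noisy, 9)

print("raw bytes:       ", len(clean), len(noisy))
print("compressed bytes:", len(clean_z), len(noisy_z))
```

The two pages are the same size uncompressed, but the speckled one compresses far worse, because random noise gives the compressor no repeated patterns to exploit.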
Posted by: Dignan

Re: Acrobat Help - 17/12/2002 12:35

Those sound reasonable. I was worried about the level of gray. My files do appear greyer than theirs. I suppose he wasn't too far off, but it wasn't really the quality of the scanner, so much as the correct levels.

*edit*
Bingo. Gray area. I looked at the exposure levels in the scanner software. I ran two tests on an area of the scan, one with the "highlights" bar all the way to the right, which maxed out the gray levels, and one just below the peak of color. Both TIFFs were the same size, but when made into PDFs, the first one was 290KB, whereas the second was only 30. Thanks Peter.

*edit edit*
Sorry if I've been sounding impatient, but I'm grappling with the thought that I might have to re-scan 130 awkwardly-bound pages. That is, if Irfanview can't make the necessary adjustments.