How can I reduce the file size of a scanned PDF file?
The Question: I have a 72.9MB PDF file that I need to shrink to under 500KB.
The file was a JPEG image that I had scanned and then converted to PDF.
Solutions Sample:
== This solution helped 27 people ==
For rewriting scanned PDFs:
#!/bin/sh
# Rewrite the PDF given as the first argument to out.pdf, downsampling images to 72 dpi.
gs -q -dNOPAUSE -dBATCH -dSAFER -sDEVICE=pdfwrite \
   -dCompatibilityLevel=1.3 -dPDFSETTINGS=/screen \
   -dEmbedAllFonts=true -dSubsetFonts=true \
   -dColorImageDownsampleType=/Bicubic -dColorImageResolution=72 \
   -dGrayImageDownsampleType=/Bicubic -dGrayImageResolution=72 \
   -dMonoImageDownsampleType=/Bicubic -dMonoImageResolution=72 \
   -sOutputFile=out.pdf "$1"
You could customise it a bit to make it more reusable but if you only have one
pdf, you could just replace $1 with your pdf filename and bung it in a
terminal.
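One way to make it more reusable is to wrap the same Ghostscript call in a small function. This is only a sketch, written in Python rather than shell: the names build_gs_command and shrink and the dpi and preset parameters are made up for illustration, while the flags themselves are the ones from the script above.

```python
import subprocess


def build_gs_command(src, dst, dpi=72, preset="/screen"):
    """Return the Ghostscript argument list that rewrites src into dst."""
    cmd = ["gs", "-q", "-dNOPAUSE", "-dBATCH", "-dSAFER",
           "-sDEVICE=pdfwrite", "-dCompatibilityLevel=1.3",
           "-dPDFSETTINGS=" + preset,
           "-dEmbedAllFonts=true", "-dSubsetFonts=true"]
    # Same downsampling type and resolution for colour, grey and mono images.
    for kind in ("Color", "Gray", "Mono"):
        cmd.append("-d%sImageDownsampleType=/Bicubic" % kind)
        cmd.append("-d%sImageResolution=%d" % (kind, dpi))
    cmd += ["-sOutputFile=" + dst, src]
    return cmd


def shrink(src, dst, dpi=72):
    """Run Ghostscript (assumed to be installed as gs); raises if it fails."""
    subprocess.check_call(build_gs_command(src, dst, dpi=dpi))
```

Calling shrink("scan.pdf", "out.pdf", dpi=100) (file names are placeholders) would then trade a little size for more legible text.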
== This solution helped 18 people ==
I usually use ps2pdf to do this (easier syntax), something like this:
ps2pdf -dPDFSETTINGS=/ebook BiggerPdf SmallerPDF
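I use the following Python script to reduce the size of all the PDF files in a
directory on a production server (8.04), so it should work.
#!/usr/bin/python
import os
import subprocess
# Write a reduced copy of every PDF in the current directory to reduc/
if not os.path.isdir('reduc'):
    os.mkdir('reduc')
for fich in os.listdir('.'):
    if fich.lower().endswith('.pdf'):
        subprocess.call(['ps2pdf', '-dPDFSETTINGS=/ebook', fich, os.path.join('reduc', fich)])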
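Since the question has a hard size limit, it may help to try the presets in turn and stop at the first one that fits. A rough sketch, assuming ps2pdf is on the PATH; the helper names, the preset order and reading "under 500KB" as 500 * 1024 bytes are all assumptions, not something from the answers.

```python
import os
import subprocess

LIMIT_BYTES = 500 * 1024  # assumed reading of "under 500KB"
PRESETS = ["/ebook", "/screen"]  # higher quality first


def fits(path, limit=LIMIT_BYTES):
    """True if the file at path is smaller than limit bytes."""
    return os.path.getsize(path) < limit


def shrink_until_fits(src, dst, run=subprocess.call):
    """Try each preset in turn; return the first that gets dst under the limit."""
    for preset in PRESETS:
        run(["ps2pdf", "-dPDFSETTINGS=" + preset, src, dst])
        if os.path.exists(dst) and fits(dst):
            return preset
    return None  # no preset was small enough
```

If this returns None, even /screen was not enough, and a lower resolution or a different format is probably needed.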
== This solution helped 1 person ==
If converting to djvu would also be ok and if no colors are involved, you could
try the following:
Convert the pdf to jpg files using pdfimages -j
If you get pbm files instead, you should do the intermediate step:
for FILENAME in *.pbm; do convert "$FILENAME" "${FILENAME%.*}.jpg"; done
The convert command is from the imagemagick package.
Then use scantailor to make tifs out of them.
In a last step you go to scantailor's out directory (where the tifs are located)
and apply djvubind to that directory.
This should reduce the file size drastically without much quality loss in the
text. If you want finer control over the OCR backend, you may try
djvubind --no-ocr and use ocrodjvu to add the OCR layer afterwards.
If you have colors in your document, things get a bit more complicated. Instead
of djvubind you could use didjvu, and in scantailor
you have to switch to mixed mode and sometimes select color images manually.
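The intermediate pbm-to-jpg step can also be done from Python if you would rather not rely on a shell loop. A minimal sketch, assuming ImageMagick's convert is installed; the function names are made up for illustration.

```python
import glob
import os
import subprocess


def jpg_name(path):
    """Swap the final extension for .jpg, like ${FILENAME%.*}.jpg in the shell."""
    return os.path.splitext(path)[0] + ".jpg"


def convert_pbm_files(directory="."):
    """Run ImageMagick's convert on every .pbm file found in directory."""
    for src in sorted(glob.glob(os.path.join(directory, "*.pbm"))):
        subprocess.check_call(["convert", src, jpg_name(src)])
```

Passing the file names as a list avoids trouble with spaces in names.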
== This solution helped 136 people ==
aking1012 is right. With more information about possible embedded images,
hyperlinks etc., it would be much easier to answer this question!
Here are a couple of script and command-line solutions. Use as you see fit.
pdftk/
Content (except music & images) licensed under cc by-sa 3.0 | With thanks to users v2r, user525719, user179584, tamimym, student, someonr, Serge B., scoobydoo, oxidworks, Oli, muru, mlitty, Michael D, mbroshi, Marius4674, MadMike, Kalpit, Javier Rivera, don.joey, Bob, ape, Android Dev, and the Stack Exchange Network | Trademarks are property of their respective owners.