Showing posts with label pdf. Show all posts

Sunday, 6 January 2019

How to compress scanned PDF?

Scanned PDFs are often badly bloated. A good way to compress them is the following Ghostscript script, which takes the input file as its only argument and writes the result to out.pdf:

#!/bin/sh

gs -q -dNOPAUSE -dBATCH -dSAFER \
    -sDEVICE=pdfwrite \
    -dCompatibilityLevel=1.3 \
    -dPDFSETTINGS=/screen \
    -dEmbedAllFonts=true \
    -dSubsetFonts=true \
    -dColorImageDownsampleType=/Bicubic \
    -dColorImageResolution=120 \
    -dGrayImageDownsampleType=/Bicubic \
    -dGrayImageResolution=72 \
    -dMonoImageDownsampleType=/Bicubic \
    -dMonoImageResolution=120 \
    -sOutputFile=out.pdf \
    "$1"


The compression rate can be changed by adjusting the resolution values: lower values give smaller files but coarser images. I found that the settings above give good compression without sacrificing text quality.
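
To make that tweaking easier, here is a minimal sketch that wraps the same Ghostscript call in a function and takes the resolution as an argument. The function names gs_resolution_flags and compress_pdf are my own, and using one dpi value for colour, gray and mono images is a simplification (the script above uses 72 for gray and 120 for the others).

```shell
#!/bin/sh
# Sketch only: the same gs options as above, with a single dpi value
# applied to all three image types (an assumption, not the original setup).

# Print the three resolution flags for a given dpi (digits only).
gs_resolution_flags() {
    case $1 in
        ''|*[!0-9]*) echo "dpi must be a whole number: $1" >&2; return 1 ;;
    esac
    printf '%s %s %s\n' \
        "-dColorImageResolution=$1" \
        "-dGrayImageResolution=$1" \
        "-dMonoImageResolution=$1"
}

# compress_pdf INPUT OUTPUT [DPI]   (DPI defaults to 120)
compress_pdf() {
    flags=$(gs_resolution_flags "${3:-120}") || return 1
    # $flags is left unquoted on purpose so it splits into three arguments.
    gs -q -dNOPAUSE -dBATCH -dSAFER \
        -sDEVICE=pdfwrite \
        -dCompatibilityLevel=1.3 \
        -dPDFSETTINGS=/screen \
        -dEmbedAllFonts=true \
        -dSubsetFonts=true \
        -dColorImageDownsampleType=/Bicubic \
        -dGrayImageDownsampleType=/Bicubic \
        -dMonoImageDownsampleType=/Bicubic \
        $flags \
        -sOutputFile="$2" \
        "$1"
}
```

For example, compress_pdf scan.pdf small.pdf 100 (file names are placeholders) would write a 100 dpi version, which you can then compare against the original with ls -l.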


Thursday, 16 July 2015

Join PDF files and replace strings using pdftk

To join all PDF files in the current directory into combined.pdf:

pdftk *.pdf cat output combined.pdf
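
One thing to watch with the glob: on a second run, combined.pdf itself matches *.pdf and is passed as an input. A small sketch that names the inputs explicitly and refuses to read the file it is about to write could look like this. The function name join_pdfs and the PDFTK override variable are hypothetical, and the check is a plain name comparison, so ./combined.pdf and combined.pdf count as different.

```shell
#!/bin/sh
# Sketch only: join_pdfs OUTPUT INPUT...
# PDFTK is a hypothetical override for the pdftk binary (handy for dry runs).
join_pdfs() {
    if [ "$#" -lt 2 ]; then
        echo "usage: join_pdfs OUTPUT INPUT..." >&2
        return 2
    fi
    out=$1
    shift
    for f in "$@"; do
        # Refuse to use the output file as one of the inputs.
        if [ "$f" = "$out" ]; then
            echo "output file is also listed as an input: $f" >&2
            return 1
        fi
    done
    # Inputs are joined in the order given on the command line.
    "${PDFTK:-pdftk}" "$@" cat output "$out"
}
```

Usage would be something like join_pdfs combined.pdf part1.pdf part2.pdf, where the input names are placeholders.
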
 ===========

You can try to modify the text content of a PDF as follows:
  1. Uncompress the text streams of the PDF:
    pdftk file.pdf output uncompressed.pdf uncompress
  2. Use sed to replace the text with another string:
    sed -e "s/ORIGINALSTRING/NEWSTRING/g" <uncompressed.pdf >modified.pdf
  3. If this attempt was successful, re-compress the PDF with pdftk:
    pdftk modified.pdf output recompressed.pdf compress
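
The three steps above can be sketched as one function. This is only an illustration: the names replace_pdf_text, escape_pattern and escape_replacement are my own, the escaping helpers and LC_ALL=C are additions that are not in the original commands, and the whole thing only helps when the search string appears literally in the uncompressed streams, which is not the case for every PDF.

```shell
#!/bin/sh
# Sketch only: uncompress, replace with sed, recompress.

# Escape a literal string so sed treats it as plain text in the pattern.
escape_pattern() {
    printf '%s\n' "$1" | sed 's,[][\\/.*^$],\\&,g'
}

# Escape a literal string for the replacement side of s/…/…/.
escape_replacement() {
    printf '%s\n' "$1" | sed 's,[\\/&],\\&,g'
}

# replace_pdf_text INPUT OUTPUT OLDSTRING NEWSTRING
replace_pdf_text() {
    src=$1
    dst=$2
    old=$(escape_pattern "$3")
    new=$(escape_replacement "$4")
    tmpdir=$(mktemp -d) || return 1

    # 1. uncompress, 2. replace, 3. recompress; stop at the first failure.
    # LC_ALL=C makes sed treat the file as raw bytes (an added precaution).
    pdftk "$src" output "$tmpdir/uncompressed.pdf" uncompress &&
    LC_ALL=C sed -e "s/$old/$new/g" \
        <"$tmpdir/uncompressed.pdf" >"$tmpdir/modified.pdf" &&
    pdftk "$tmpdir/modified.pdf" output "$dst" compress
    status=$?

    rm -rf "$tmpdir"
    return "$status"
}
```

A call would look like replace_pdf_text file.pdf recompressed.pdf ORIGINALSTRING NEWSTRING. The escaping matters mostly when the strings contain characters such as / or ., which would otherwise be read as sed syntax.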


Sources:
https://www.pdflabs.com/docs/pdftk-cli-examples/
http://stackoverflow.com/a/9872494/4151875