MyThinkPond

On Java, Python, Groovy, Grails, Spring, Node.js, Linux, Arduino, ARM, Embedded Devices & Web

Extract all images in PDF file in a directory (batch extract images)

Posted by Venkatt Guhesan on November 28, 2010

Extracting the images from a single PDF is easy enough, but when you have a whole directory of PDF files you need a way to process them all in one pass.

Prerequisites:

1. Install Cygwin or a Linux environment with Perl support.

2. Install ImageMagick (it provides the convert command).

3. Install Ghostscript (ImageMagick relies on it to read PDF files).
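
Before running anything, it can help to confirm that the tools are actually reachable. The following is a minimal sketch, not part of the original script: find_tool is a hypothetical helper, and it assumes the executables are named convert and gs, which is typical on Linux and Cygwin but may differ on your system.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the first executable named $tool found
# in the PATH directories, or nothing if it is not there.
sub find_tool {
    my ($tool) = @_;
    for my $dir (File::Spec->path) {
        # Also try a .exe suffix, which Cygwin/Windows binaries may carry.
        for my $ext ('', '.exe') {
            my $candidate = File::Spec->catfile($dir, $tool . $ext);
            return $candidate if -f $candidate && -x _;
        }
    }
    return;
}

# Assumed tool names: convert (ImageMagick) and gs (Ghostscript).
for my $tool (qw(convert gs)) {
    my $found = find_tool($tool);
    print defined $found ? "$tool: $found\n" : "$tool: not found in PATH\n";
}
```

If either tool is reported as not found, fix the installation or your PATH first.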

Afterward run the following script:


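#!/usr/bin/perl
use strict;
use warnings;

# Directory containing the PDF files, passed as the first argument.
my $directory = $ARGV[0] or die "Usage: $0 path_to_pdf_files\n";
opendir(my $dh, $directory) or die "Cannot open $directory: $!";
while (my $file = readdir($dh))
{
    # Only handle names that end in .pdf (case-insensitive).
    next unless $file =~ m/\.pdf$/i;
    # report.pdf becomes report_%01d.jpg; convert fills in %01d with the page index.
    (my $newfile = $file) =~ s/\.pdf$/_%01d.jpg/i;
    print "Processing $file ; newfilename: $newfile...\n";
    # List-form system() avoids shell quoting problems with spaces in file names.
    # convert renders each page at 150 DPI and writes one JPEG per page.
    system('convert', '-density', '150', "$directory/$file", "$directory/$newfile") == 0
        or warn "convert failed for $file\n";
}
closedir($dh);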

How to invoke:
scriptname path_to_pdf_files
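
To make the naming scheme concrete, here is a small sketch of which file names to expect for a given PDF. output_pattern and expected_outputs are hypothetical helpers written only for illustration. They assume ImageMagick's default numbering, which starts at 0, and a PDF name that contains no other % characters.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: derive the convert output pattern from a PDF name,
# using the same substitution as the script above.
sub output_pattern {
    my ($pdf) = @_;
    (my $pattern = $pdf) =~ s/\.pdf$/_%01d.jpg/i;
    return $pattern;
}

# Hypothetical helper: list the names convert would write for a PDF with
# $pages pages, assuming zero-based page numbering.
sub expected_outputs {
    my ($pdf, $pages) = @_;
    my $pattern = output_pattern($pdf);
    return map { sprintf($pattern, $_) } 0 .. $pages - 1;
}

# A three-page report.pdf yields report_0.jpg, report_1.jpg and report_2.jpg.
print "$_\n" for expected_outputs('report.pdf', 3);
```

So a directory holding report.pdf with three pages should end up with three JPEG files next to it, one per page.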

Cheers.
