Instructables

DIY High-Speed Book Scanner from Trash and Cheap Cameras

Featured, Contest Winner

Step 78: Download Page Builder

Aaron Clarke wrote the software to process the output of this book scanning system. It reads in all the images, allows you to set a crop, corrects for irregular lighting, and outputs PDF.

Currently, this is alpha software. It makes a number of assumptions, and it requires a powerful machine to work. You will be best off with at least 2 gigabytes of RAM and up to ten gigabytes of free hard disk space. At some point, this will change, but likely not very soon.

While it is very easy to tell the software what to do, it takes a while to process so much image data. Page Builder may take more than 3 hours to process a 300-page book. Currently, we have to split a book into a couple of smaller PDFs because of the way we generate PDFs from Matlab. If anyone has Matlab code for good PDF printing, please contact us.

Download Page Builder for XP here.
Download Page Builder for Vista here.
Mac users will need a copy of Matlab, as we can't get a standalone version to work. The XP and Vista copies both include the source script, which was developed on a Mac and works fine there.

UPDATE 2009/04/20: If you are having page order issues, please try this version, which also includes some imaging enhancements.

Only the XP version has been extensively tested.

Page Builder is Free Software. The sources are available. We are graduate students and have extremely limited time to support this software. So little time that we actually have no time. It is our hope that other people will help shoulder some of the development costs of this software.

 
you 15 years ago
Unfortunately, the software requires MATLAB, which is not free.
Consider the following alternative, until we get a free solution:
1) Use XnView to crop, resize, and adjust brightness/contrast as a batch job. Download XnView from http://www.xnview.com/en/screenshots.html
I found XnView while searching for a free solution, and I really liked its simplicity and speed. After the initial learning curve, I was able to create a batch process under the Tools menu. At times, I found it even more convenient than a Photoshop automation script (I'm not a graphics guru).
2) Rename the files for the left and the right camera, and merge them into one folder.
I first used a spreadsheet to help rename the files; later, I created a script for this process. See my instructions here: http://www.mind2b.com/component/content/article/9-info/8-renaming-or-renumbering-camera-or-image-files.
3) I used Adobe Acrobat to create my PDF.
Perhaps someone else can suggest a good free alternative.
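The renaming and merging in step 2 might look roughly like the Python sketch below. It is not the script linked above. The function name `interleave_pages` and the `page_0001` naming scheme are made up for illustration, and it assumes both cameras took the same number of shots and that the left-camera image comes first in each pair (swap the two arguments if your setup is the other way round).

```python
from pathlib import Path
import shutil


def interleave_pages(left_files, right_files, out_dir, start_page=1):
    """Copy left/right camera images into out_dir in reading order.

    Assumes each list is already sorted in shooting order and that
    left_files[i] and right_files[i] were taken on the same page turn.
    """
    left_files = list(left_files)
    right_files = list(right_files)
    if len(left_files) != len(right_files):
        raise ValueError("both cameras must have the same number of shots")

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    page = start_page
    for left, right in zip(left_files, right_files):
        for src in (Path(left), Path(right)):
            # Zero-padded names keep the files in page order when sorted by name.
            dest = out_dir / f"page_{page:04d}{src.suffix.lower()}"
            shutil.copy2(src, dest)  # copy, so the originals stay untouched
            copied.append(dest)
            page += 1
    return copied
```

With the two cameras' images in hypothetical folders `left` and `right`, a call such as `interleave_pages(sorted(Path("left").glob("*.JPG")), sorted(Path("right").glob("*.JPG")), "merged")` would fill `merged` with `page_0001.jpg`, `page_0002.jpg`, and so on.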

rjwarpath you 15 years ago
A good free PDF printer program is Primo-PDF.
spamsickle 5 years ago
I'm using an open-source product called ImageMagick to do the conversions. I've written a Perl script which accepts information about my scans (the names and ranges of the left- and right-page scans, the offsets and sizes of the portions of the scan I want to save, the page number to start with for output, etc.) and generates the script to run ImageMagick. The generated script looks something like this:

convert.exe PICT2283.JPG -crop 2850x1760+200+120 -rotate 270 1.pdf
convert.exe CIMG0001.JPG -crop 2700x1850+200+180 -rotate 90 2.pdf
convert.exe PICT2284.JPG -crop 2850x1760+200+120 -rotate 270 3.pdf
convert.exe CIMG0002.JPG -crop 2700x1850+200+180 -rotate 90 4.pdf
convert.exe PICT2285.JPG -crop 2850x1760+200+120 -rotate 270 5.pdf
convert.exe CIMG0003.JPG -crop 2700x1850+200+180 -rotate 90 6.pdf
convert.exe PICT2286.JPG -crop 2850x1760+200+120 -rotate 270 7.pdf
convert.exe CIMG0004.JPG -crop 2700x1850+200+180 -rotate 90 8.pdf
convert.exe PICT2287.JPG -crop 2850x1760+200+120 -rotate 270 9.pdf
convert.exe CIMG0005.JPG -crop 2700x1850+200+180 -rotate 90 10.pdf

The names of the scans for the left and right pages are different in my setup because I'm using two different models of camera. While it COULD be done with two similar cameras, you'd either have to guarantee that the left range and the right range didn't overlap, or keep them in separate directories until they were converted.

I get the area to crop by loading a couple of images into Photoshop, but any image software that will tell you where your cursor is (in pixels) and the dimensions of your selection could be used.

ImageMagick crops, rotates, and converts the JPG to PDF in less than 2 seconds per image. I can convert 100 images in about 3 minutes.

Once I have all the images converted to PDFs, I use another free tool called PDFTK to stitch them together into a book. Once again, to spare myself typing, I have a Perl script to generate the command line for me. It works for me.
I can convert a 1500-page book in less than an hour (once the scanning is done, and the images are loaded on my computer), and (after I get my numbers from Photoshop and generate the script) it runs unattended. I've found ImageMagick and PDFTK are handy tools to have, and (as you might be able to tell from my bare-bones bookscanner) I'm a fan of using what I already have.
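The Perl generators described in this comment are not shown, so here is a rough Python sketch of the same idea. It builds one ImageMagick command per scan and a single PDFTK command that joins the per-page PDFs. The function names, the `digits` parameter, and the `book.pdf` output name are assumptions made for this example; the file prefixes, crop geometries, and rotations are copied from the generated script quoted in the comment.

```python
def camera_commands(prefix, first_shot, count, crop, rotate, first_page, digits=4):
    """Build one ImageMagick command line per shot from a single camera.

    One camera's shots land on every other output page, so the page
    number advances by 2 per shot.
    """
    commands = []
    for i in range(count):
        # Camera file names are assumed to be prefix + zero-padded counter.
        source = f"{prefix}{first_shot + i:0{digits}d}.JPG"
        page = first_page + 2 * i
        commands.append(
            f"convert.exe {source} -crop {crop} -rotate {rotate} {page}.pdf"
        )
    return commands


def interleave(first, second):
    """Alternate two equal-length lists: first[0], second[0], first[1], ..."""
    merged = []
    for a, b in zip(first, second):
        merged.extend([a, b])
    return merged


def pdftk_command(page_count, output="book.pdf", first_page=1):
    """Build a single pdftk call that concatenates the page PDFs in order."""
    last_page = first_page + page_count
    inputs = " ".join(f"{n}.pdf" for n in range(first_page, last_page))
    return f"pdftk {inputs} cat output {output}"


if __name__ == "__main__":
    # Values taken from the example commands quoted in the comment.
    cam_a = camera_commands("PICT", 2283, 5, "2850x1760+200+120", 270, first_page=1)
    cam_b = camera_commands("CIMG", 1, 5, "2700x1850+200+180", 90, first_page=2)
    for line in interleave(cam_a, cam_b):
        print(line)
    print(pdftk_command(10))
```

Run as a script, this prints the ten `convert.exe` lines quoted above followed by one `pdftk ... cat output book.pdf` line. The output can be saved to a batch file and run unattended. For a very long book, the single PDFTK command may be too long for the shell, in which case joining the pages in batches is one option.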