Thursday, October 23, 2008

Second Thoughts on First Impressions

It was one month and one day ago that I mused about my first impressions of the Fujitsu ScanSnap S300 document scanner. To say that it has exceeded my expectations would be a gross understatement.

The above print screen from the ScanSnap manager shows that 8,335 pages have been scanned!

In just one month.

Only one paper jam. Only one sheet of paper got "scrunched up" a bit because of it. No real damage done. Impressive, indeed.

2.1 gigabytes of data generated in 1,363 files of varying sizes, ranging from 1 to 200 pages per file.
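For anyone curious about the averages behind those totals, here is a minimal Python sketch of the arithmetic. It assumes the 2.1 gigabytes and the 1,363 files cover the same 8,335 pages shown on the counter, and that a gigabyte is 1,024 megabytes; the variable names are only for illustration.

```python
# Totals reported above; the assumption is that they all describe the same scans.
TOTAL_PAGES = 8335
TOTAL_FILES = 1363
TOTAL_GB = 2.1

# Simple averages (1 GB = 1024 MB, 1 MB = 1024 KB).
pages_per_file = TOTAL_PAGES / TOTAL_FILES
mb_per_file = TOTAL_GB * 1024 / TOTAL_FILES
kb_per_page = TOTAL_GB * 1024 * 1024 / TOTAL_PAGES

print(f"Average pages per file: {pages_per_file:.1f}")
print(f"Average file size:      {mb_per_file:.2f} MB")
print(f"Average size per page:  {kb_per_page:.0f} KB")
```

These are only averages. The individual files range from 1 to 200 pages, so any one file can be far from them.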

Confession time. I've been scanning the "easy" stuff first! The stuff that is already somewhat organized. Also a five-volume set of books on the Berlin family - soft-cover, spiral bound. Easy enough to remove the binding without causing damage. They aren't listed in the online catalog of the Allen County Public Library so the books will be donated to them. Maybe someone else can get some use out of them. And several other books that will be passed on to other researchers. I certainly don't want to get into any trouble over copyright issues, so to put that issue to rest: the scanned copies are strictly for my own use. They won't be given to or shared with anyone else, except perhaps for a few pages.

The scanning is amazingly quick. I'm using the laptop computer to do the scanning since it is newer, faster, and has USB 2.0 capability. The output is quite acceptable but, to state the obvious, the quality of the output is dependent upon the quality of the pages being scanned. Just think of the adage "Garbage In - Garbage Out" which is true with most things! On average, it takes about half as long to generate the searchable PDF file as it does to scan the documents (i.e., an hour of scanning will take about half an hour to make the files searchable). If the pages are double-sided text it will take twice as long to make them searchable.
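That rule of thumb can be written as a small Python sketch. It is only an estimate based on my own averages, and it assumes "twice as long" means double the single-sided conversion time; the function name is made up for illustration.

```python
def estimate_ocr_minutes(scan_minutes, double_sided=False):
    """Rough time to make scanned files searchable, in minutes.

    Assumption: conversion takes about half the scanning time for
    single-sided pages, and double that for double-sided text.
    """
    if scan_minutes < 0:
        raise ValueError("scan_minutes cannot be negative")
    single_sided_estimate = scan_minutes * 0.5
    if double_sided:
        return single_sided_estimate * 2
    return single_sided_estimate


# An hour of scanning single-sided pages, then double-sided pages.
print(estimate_ocr_minutes(60))
print(estimate_ocr_minutes(60, double_sided=True))
```

Actual times will vary with the computer and with the quality of the pages being scanned.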

The software that comes with the ScanSnap creates the searchable PDF files without the need to have the full version of Adobe Acrobat. If you have pages with very small type or with the older style fonts (think old newspapers) the software doesn't do that great a job with the OCR, but I've been impressed with how well it does overall. The software also allows you to add or remove pages from a PDF file (if it was created by the software) so you can scan in small batches and then combine those into one file. I'm not sure how much data the scanner will store in memory, so if I'm scanning a large number of pages I usually do a maximum of about 30 pages to one file. After all the pages for that particular document are scanned, I merge them into one file, then delete the small files. It's really not as complicated as it sounds.
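To show how the batching works out for a large document, here is a minimal Python sketch that only plans the batches. The merging itself is done in the ScanSnap software, not by this code. The function name and the 200-page example are hypothetical, and the limit of 30 pages is just my own habit, not a limit of the scanner.

```python
def plan_batches(total_pages, batch_size=30):
    """Split a document into scan batches.

    Returns a list of (first_page, last_page) pairs, 1-based and inclusive.
    """
    if total_pages < 0 or batch_size < 1:
        raise ValueError("total_pages must be >= 0 and batch_size >= 1")
    batches = []
    first = 1
    while first <= total_pages:
        # The last batch may be shorter than batch_size.
        last = min(first + batch_size - 1, total_pages)
        batches.append((first, last))
        first = last + 1
    return batches


# A 200-page document scanned about 30 pages at a time.
for number, (first, last) in enumerate(plan_batches(200), start=1):
    print(f"Batch {number}: pages {first}-{last}")
```

Each batch becomes one small file, and the small files are then merged in order into a single PDF for the whole document.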

To make some space for the scanned stuff, I had to spend a couple of days cleaning up the hard drive of my desktop computer. Basically I created an Archive folder on my laptop and on the external drives used for backups and moved a lot of older files to that folder. They are still accessible to me, just not on the desktop computer.

Well, I still have lots of files left to scan, though there are now some "blank spots" on the shelves! Progress is being made... and, believe it or not, it really isn't that tedious. One advantage to having the files digitized is that, in most cases, the information is easier to read. Another is that if the PDF files are made searchable, the text can be copied and pasted into a Word document or into my genealogy database, saving on some typing.

If you are thinking about digitizing your genea-documents, I highly recommend this little scanner. It is a bit pricey for a scanner, but in my opinion, well worth the money! It won't replace your flatbed scanner for scanning photographs since that isn't what it is meant to do. But it does what it was designed to do - scan documents - and does it very well.


Janet Iles said...

Congratulations on the scanning that you have accomplished during the last month.

Msteri said...

Wow! I scanned about 1000 pieces in a month and was proud of that. Sounds like your scanner is a winner!