Practical Computer Advice
from Martin Kadansky

Volume 8 Issue 12
December 2014
Going Paperless, Carefully Part 2: Use Document Scanning to Reduce Your Mountains of Paper!

Are you drowning in paper files? Do you wish you could reduce your enormous piles of paper down to a manageable size? Are you running out of room for your paper files? One source I've read estimates that after the first look, we never look at 80% of our paper records again.

While I can't offer an easy or complete solution, I can share my experience and advice on one particular technique that can help: Using a document scanner to turn paper documents into electronic files.

I've got two five-drawer metal file cabinets where I store a combination of personal and business records. A few years ago I noticed that some of those drawers were getting so full that it took a bit of effort to pull out a file, and even more effort to squeeze it back in. I found myself thinking that I would probably need to buy a third file cabinet relatively soon. Then I found myself resolving that I would not let that happen!

Back in April 2013 I wrote a newsletter about a technique I had started using a few months earlier: Instead of printing new receipts and other information on paper, I had started "printing" them into PDF files and saving them in my computer instead. This method has helped me significantly slow the growth of my paper files, but it has done nothing to reduce the amount of paper I had already accumulated, nor does it do anything about the important papers that I still create and that arrive in the mail.

I recently decided to tackle the next obvious steps:
  • Weed out the papers I really don't need to keep, and
  • Start scanning the remaining important papers into my computer (and then discarding or shredding them) whenever possible, starting with the papers that are easiest to scan and easiest to replace if something goes wrong, and then slowly branching out from there to other specific categories of papers.
My regular printer/scanner/copier was not the right tool for this project

I have an HP OfficeJet J6480, which is a combination printer/scanner/copier/fax. It's a good printer (which can even print double-sided!), and it copies and faxes well. However, when I have occasionally used it to scan originals into my computer, I've noticed that:
  • Scanning pages one at a time from the flatbed always gives me high-quality results, but it's labor-intensive and can only scan one side of each page.
  • Scanning pages loaded into the automatic document feeder (ADF) often gives me distorted results - some lines of text might be squished, others on the same page might be stretched. I imagine that the 8-year-old rollers may have dusty or brittle areas, making the pages "slip" a little as they move past the scanner.
  • The ADF also can only scan one side of each page, so scanning a two-sided original is awkward and impractical: I would first have to scan all the odd-numbered pages into one file, and then scan all the even-numbered pages into another. Or, I would have to make copies of all the even-numbered pages, assemble a single-sided original, and then scan that in.
Since scanning quality matters to me, and many of my original paper documents are double-sided, I quickly concluded that my printer/scanner/copier was not the right tool for this project.

A good document scanner to the rescue
After some research, I bought a Fujitsu ScanSnap iX500 Color Image Scanner, a document scanner that many users have rated highly. At $420 it's not cheap, but I decided it was worth it to me since I also value my time, and this is a serious, long-term project. So far, I have found that it works surprisingly well.

This newsletter will combine my general advice about scanning original paper documents into your computer with specific advice about using this particular scanner.

The goal
A scanner, along with good OCR (optical character recognition) software, can be used to attempt to turn a paper document into a computer document that you can then continue to edit as if you had typed it into your computer in the first place. In my experience that hardly ever turns out well, and it is not the actual goal here.

The goal of using a document scanner is to turn well-chosen portions of a large paper archive into a useful electronic archive of documents in your computer in a time- and space-efficient manner, and then to reduce the pile of paper by discarding or shredding most or all of those scanned paper documents. A natural result of this is that you become comfortable enough with scanning to keep using it appropriately as new paper documents arrive, which prevents or minimizes any new growth in that pile of paper. However, don't expect that using a document scanner will completely eliminate paper from your life.

A quick look at the iX500 document scanner
This scanner resembles a fax machine:
  • After you open it up and extend the input and output trays, its paper path is mostly straight, resembling a relaxed "L" shape.
  • Unlike a regular scanner, it has only an automatic document feeder (ADF). It does not have a flatbed.
  • After installing the software and connecting the scanner to your computer, you load your pages "face down, head first" into the input tray. When you press the scan button, it starts the software in your computer, pulls each page through the scanner from the bottom of the pile, sends the scanned data into the computer, and ejects each page into the output tray as it goes, preserving the page order.
  • It's fast! It pulls about 25 pages per minute. You'll spend more time waiting for the software to process your scans (and then naming, saving, and reviewing your files on your computer) than you will waiting for the scanner to run through your paper.
  • Unlike regular scanners or multifunction printers (like my HP OfficeJet J6480), which can only scan one side of each page, the iX500 has dual scanners, so it can scan both sides of each page at once, saving an enormous amount of time and effort when you have double-sided originals to scan! And if you're scanning a single-sided original in double-sided mode, the software is pretty good at eliminating the blank "back" pages.
  • You'll have a number of choices regarding what it does with the scanned data. What I've found most useful is to have it make PDF files containing two elements: images (pictures) of the paper documents I've scanned, plus an underlying layer of text (metadata) mechanically generated from its good (but not perfect) OCR software that tries to recognize the text on each page. Note that to get this really useful layer of text, you must turn on the "Convert to Searchable PDF" setting in advance.
The basics of document scanning
Here's an overview of what it's like to tackle document scanning:

1. Weed out any original paper documents you no longer need and don't intend to scan. Discard or shred them, or at least put them in a box for later recycling/shredding, and not back into the file cabinet!

2. Set up the scanner, install the software, download and install any updates.

3. Adjust the scanner software settings (see section below for details) before you do any significant amount of scanning.

4. For each original paper document you want to scan:
  • Remove any staples or paper clips.
  • Try to smooth out any creases, or use the clear plastic Carrier Sheet for small, fragile, or oddly-shaped pages.
  • Load all the pages of one original document into the scanner. Don't load multiple documents unless you want them all to end up in the same PDF file. The iX500 input tray can hold about 50 pages of paper.
  • Start the scan by pressing the iX500's big blue (unlabeled) scan button.
  • The scanner will pull your pages through very quickly.
  • After processing the image (the raw scan data) and the OCR (the layer of text that it can recognize on each page), the software has finished all of the work to create your new PDF file. It then stores it in a temporary folder, and then waits for you to decide what to do with it in the next few steps. Since the software has no idea what you're scanning, it gives this new PDF file a temporary name based on today's date and time (e.g., "2014-12-30-14-27-32.pdf").
  • On the screen you'll see a list of choices - Scan to Folder, Scan to Email, and many more. I click "Scan to Folder" because it's the one I've found most useful. That's the process I'll describe here.
  • The next scanner window (titled "Scan to Folder") shows you the temporary name and contents of the PDF, but you can't make any changes to the file (e.g., delete a page, reorder pages, rotate a page, etc.).
  • I recommend that you then change that temporary name to a more reasonable and descriptive name, e.g., "2007-01-15 Smith client letter.pdf".
  • You should also change the destination folder for the PDF to the place where the PDF belongs, ideally in the context of a reasonably-organized scheme, e.g., "Smith project folder."
  • Click Save, which renames the PDF to the name you entered and moves it to the destination folder you chose.
  • Although it's not required, I suggest that you then open that destination folder, then open the PDF, and then confirm that it has scanned properly and looks ok. If not, try scanning again. Depending on the PDF software you're using, you might also make other changes as appropriate. You'll also see how well the scanner has captured a good (but imperfect) image of your document.
  • Assuming the paper document is one you can discard, go ahead and recycle or shred it, or at least put it in a box for later recycling/shredding and not back into the file cabinet!
  • As you scan and discard your original documents, consider also writing a note to yourself and putting it into your paper files to replace the originals you've just scanned, reminding you (or whoever else might be looking) that this particular set of paper documents (receipts, bank statements, client letters, etc.) was scanned in and now resides inside your computer, along with some notes describing the location, e.g., "Bank statements 2007 to 2014: Scanned into computer XYZ, in folder 'Citizens' under 'Paperless documents' on Desktop."
This may sound time-consuming and laborious, but I predict that after you've scanned in even a few originals, you'll start to get the hang of these steps and they'll go by rather quickly.

Also, given that this process will have you creating valuable, new PDF files, this makes backing up your computer more important than ever!
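
For readers comfortable with a little scripting: the renaming step above is easy to forget, which leaves files with temporary date-and-time names sitting in your folders. Here is a minimal sketch in Python of one way to spot them. It assumes that every temporary name follows the same pattern as the one example above ("2014-12-30-14-27-32.pdf"). The function names are my own invention for illustration, not part of the scanner's software.

```python
import re

# Assumed pattern, based only on the example "2014-12-30-14-27-32.pdf":
# year-month-day-hour-minute-second followed by ".pdf".
TEMP_NAME = re.compile(r"\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2}\.pdf", re.IGNORECASE)


def still_has_temporary_name(file_name):
    """Return True if file_name looks like a scan that was never renamed."""
    return TEMP_NAME.fullmatch(file_name) is not None


def unrenamed_scans(file_names):
    """Return only the names that still look like temporary scan names."""
    return [name for name in file_names if still_has_temporary_name(name)]
```

You could feed this the names in any folder (for example, from Python's os.listdir) to get a short list of scans that still need a descriptive name.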

Getting started
I recommend starting with your easiest-to-scan, lowest-risk, easiest-to-replace paper originals. For me, scanning in old cell phone bills and payroll reports (and then setting them aside to shred soon) helped me learn to use the scanner. It also eliminated an inch of paper, turning that tight and difficult drawer into one that now has plenty of elbowroom!

If you hire someone else to scan documents for you, be careful to work with an experienced professional, and emphasize with them the importance of monitoring the quality of the resulting scanned documents. A colleague told me about a doctor's office that had someone scan in all their paper medical records, which they then disposed of, only to discover later that many of the scanned pages were difficult to read or cropped.

Important iX500 software settings
So far, these are the Settings in the ScanSnap Manager program that I have found useful to change from their default values:
  • "Save" Tab, Image saving folder: This is the folder where the scanner initially saves the scanned file (with a temporary name based on today's date). I recommend you either look at this so you know which folder it will use, or change it to a folder that might better serve your needs. I changed this to a subfolder inside my "paperless" folder, which made this process easier for me to observe and manage.
  • "File option" tab, Convert to Searchable PDF: Turn this checkbox on to activate the OCR (optical character recognition) function, which looks at the pixels on each page, tries to recognize the text that those pixels represent, and then adds that text as an underlying layer (metadata) beneath the raw scanned image in your PDF files. This makes it more likely that you'll find your scanned documents when you later search by keyword. The OCR doesn't always work, but when it does, these PDFs are much easier to search for.
  • "File option" tab, Target pages: Change this from the default of "First Page" to "All pages." This maximizes the "Searchability" of your scanned PDFs by performing OCR on all of the pages, not just the first one.
  • "Compression" tab, Compression rate: Change this from the default of "3" to the maximum "5." This minimizes the disk space that each PDF file consumes.
These last 3 settings make the software spend a little more time processing the scanned image, but in return you'll get files that will probably take up less disk space and contain more keywords you can search for later.

Also, be sure to click the "Apply" button to save your changes to the Settings!

Choose a standard file format, not a proprietary one
The iX500's software only directly makes PDF (the default choice) or JPG files, and its post-processing software can also do OCR into Word, Excel, PowerPoint, etc. Other scanner software can produce files in standard formats like TIFF, PNG, BMP, and GIF. However, if you're using scanner software that makes "proprietary" or manufacturer-specific non-standard files, i.e., files that only that particular software can open, I recommend avoiding that option at all costs. Here's why: Imagine it's 5 years from now and that scanner and its software don't work on your new computer, which means that you can no longer open all those files you've scanned, making them unusable. Don't make a choice now that you'll regret later.

Choose descriptive file names to make them easier to find later
So far I've found it useful to name my scanned (and "print-to-PDF") files using the date plus a few descriptive words, e.g., "2012.04.30 Citizens Bank statement.pdf". This is the date of the information in the document, not the date I scanned it in. Note that I consistently enter the date portion as "yyyy.mm.dd" (using leading zeroes like "04" for the month and day) so multiple files in the same folder will sort properly in date order. Dashes ("2012-04-30") would also work instead of periods. If you use Microsoft Windows you can't use slashes (/) in your file names; you can use them if you're on Macintosh, but for technical reasons I suggest that you don't.
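
If you'd like to see why the "yyyy.mm.dd" format sorts properly, here is a small sketch in Python. It is only an illustration of the naming convention described above; the function name scan_file_name is hypothetical, and rejecting slashes is my own simplification of the advice about file names.

```python
from datetime import date


def scan_file_name(document_date, description):
    """Build a name like '2012.04.30 Citizens Bank statement.pdf'."""
    # Slashes are not allowed in Windows file names, so refuse them here.
    if "/" in description or "\\" in description:
        raise ValueError("avoid slashes in file names")
    # strftime pads the month and day with leading zeroes ("04", not "4").
    return f"{document_date.strftime('%Y.%m.%d')} {description}.pdf"


names = [
    scan_file_name(date(2012, 11, 2), "Citizens Bank statement"),
    scan_file_name(date(2012, 4, 30), "Citizens Bank statement"),
    scan_file_name(date(2011, 12, 31), "Citizens Bank statement"),
]

# A plain alphabetical sort is also a date sort, because the year comes
# first and every month and day has exactly two digits.
for name in sorted(names):
    print(name)
```

This prints the 2011.12.31 statement first, then 2012.04.30, then 2012.11.02. Without the leading zeroes, "2012.4.30" would sort after "2012.11.02", which is exactly the problem the convention avoids.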

Also, give a little thought to the descriptive keywords you use in the file name. Ask yourself: "What words am I likely to be searching for in the future to find this file?" Naming the scanned receipt "2014.11.11 Amazon.com - Fujitsu ScanSnap iX500.pdf" is accurate and expedient, but 2 years from now are you going to remember that you bought it from Amazon and not Staples? That its name was "ScanSnap" and not "SnapScan"? Think about what you might be searching for, and (in this case) at least add the keyword "scanner" to the name. In this situation, a little redundancy is a good thing.

Put your scanned files into reasonably organized folders
Here's how I have decided to organize my scanned (and "print-to-PDF") files for now:
  • I created a "Paperless receipts & statements" folder on my Desktop to encompass this project as a whole.
  • Inside that I have two subfolders: Business and Personal.
  • Inside each of those two subfolders I have subfolders by year, 2007 to 2014 so far.
  • Inside each of those "year" subfolders I have subfolders by category. For Business I use "cell phone bills," "retirement contribution receipts," "payroll," and "other." For Personal I use "bank statements," "credit card statements," "investments," and "other." This means that under each of the "year" subfolders I have separate sets of category folders with those same names.
I didn't come up with this scheme at the start; it evolved over time. I may eventually move these files out of this "paperless" folder into my "regular" documents, but for now it's been helpful to have them together as a special project.

You might decide to use a completely different scheme; the simplest way to start is to create a new folder to put these files into. You should expect that in the course of this project you'll probably be scanning in hundreds (if not thousands) of files, so give a little thought now to the folders you'll want in order to organize them into a reasonable scheme that fits your needs.
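
If you'd rather not create dozens of folders by hand, a short script can build a scheme like mine in one pass. The following is a sketch in Python, not a finished tool: the function name create_paperless_folders is hypothetical, the category names are simply the ones I listed above, and you would change them to suit your own scheme.

```python
from pathlib import Path

# Category names copied from the scheme described above; edit to taste.
CATEGORIES = {
    "Business": ["cell phone bills", "retirement contribution receipts",
                 "payroll", "other"],
    "Personal": ["bank statements", "credit card statements",
                 "investments", "other"],
}


def create_paperless_folders(root, first_year, last_year):
    """Create root/<Business|Personal>/<year>/<category> folders.

    Folders that already exist are left alone, so it is safe to run
    this again when a new year starts.
    """
    created = []
    for area, categories in CATEGORIES.items():
        for year in range(first_year, last_year + 1):
            for category in categories:
                folder = Path(root) / area / str(year) / category
                folder.mkdir(parents=True, exist_ok=True)
                created.append(folder)
    return created
```

For example, calling create_paperless_folders(Path.home() / "Desktop" / "Paperless receipts & statements", 2007, 2014) would build 64 category folders (2 areas, 8 years, 4 categories each), assuming your Desktop folder lives in the usual place.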

Limitations and imperfections of the iX500
  • Double-sided scanning will sometimes scan in a mostly (but not completely) blank "back" page that you might consider unnecessary, e.g., caused by a page with a logo at the top but no other text on the page, or some ink that has bled through from the other side, or smudges, etc.
  • Even though it may look obvious to you, the OCR process may not recognize the text on a given page.
  • Color pages sometimes scan in gray. Simply scanning again can fix this.
  • Some items don't scan well, including light grey text or pictures, notes written in pencil, etc., so they may be faint in the resulting PDF, or completely missing. There are some settings that might bring out those faint features, but it's tricky.
The implications of discarding or shredding your paper originals
You've probably had these paper records for years, even decades. They're familiar and they're very low-tech. You can still access them if your computer isn't working or the power is out. On the other hand, the files you've scanned in are the opposite: They're new to you, you cannot access them without your computer, and you can't hold them in your hand.

After scanning them into your computer, permanently discarding or shredding your original paper records is a big step. While some items can be replaced later (bank statements, receipts from certain vendors like amazon.com, etc.), in many cases the originals cannot be replaced, ever, so there is no going back. This means that you should decide to do this very carefully. This also makes the scanned copies in your computer very valuable and important, which in turn makes regularly backing up your computer even more important than ever!

For certain paper documents, you might decide to scan them in and keep the originals, which effectively creates an electronic backup of the paper versions. For example, for legal reasons a scan of an original signed document (contract, tax return, notarized document, business certificate, etc.) might not be considered as good as the original. There's nothing wrong with keeping both, as long as it's a clear choice you're making.

In the course of scanning, you'll probably find yourself waiting for the software to process the just-scanned document into a PDF. Here are some ways you can make good use of that time:
  • After the current paper original has finished passing through the scanner completely, you can load the next original into the input tray even if the software in your computer hasn't finished processing the current one yet.
  • Remove the staples from the next few originals you'll be scanning, and stack them in a "to be scanned" pile. I suggest alternating their orientation on your desk (landscape then portrait then landscape, etc.) to keep them separate.
Additional tips and suggestions
  • Retail receipts can often fade over just a few months because of the thermal paper they're printed on, so scanning the more important ones into your computer as soon as you can (or simply photocopying them if storage space is not an issue) will preserve them.
  • After an original is scanned, put it face down in a "done" pile, far enough away from the "to be scanned" pile so you don't mix them together by accident.
Specific to the iX500:
  • This scanner needs some room around it, both in front and back as well as above, so figure on dedicating a good portion of a table or shelf to it.
  • If you set up wireless scanning, the first time the software looks for the scanner on your network it will ask you for the scanner's connection password. This is not your network's Wifi password. By default it's the last 4 digits of the scanner's serial number.
  • Even though you can configure the scanner to send its data to your computer via your wireless router (and thus unplug the scanner's USB cable from the scanner and your computer, reducing your cable clutter a little), the scanner will still have a power cord. It also needs to be within arm's length of your computer, because you'll be loading the originals in the scanner's input tray, taking them out of the output tray, and operating the scanner software on your computer.
  • If you set up wireless scanning, you won't be using the scanner's USB cable. Do not discard it. Keep it handy (or even tape it to the scanner), because you'll need it again, especially if you have problems with your Wifi or move the scanner to a different computer.
  • If you have a long or fragile or complicated original document, you may not be able to load all of its pages into the scanner at once. Rather than scanning it into separate PDF files, go to the Settings and turn on the "Continue scanning after last page" option on the "Scanning" tab. Now, as it finishes each loaded pile of paper, the software will ask you if it should "Continue Scanning" or if the original is "Finished." This lets you load additional pages that will be added to the scan of the same original and they'll end up in the same PDF file. Be sure to turn this option off again when you return to scanning simpler originals.
Thanks to Jim Connell
I want to include a special thank-you to my colleague Jim Connell of Custom Software (http://www.custom-software.biz), a Microsoft Access database consultant and President of the Society of Professional Consultants (http://www.spconsultants.org), for his helpful advice and insights about document scanning.

Where to go from here
How to contact me:
email: martin@kadansky.com
phone: (617) 484-6657
web: http://www.kadansky.com

On a regular basis I write about real issues faced by typical computer users. To subscribe to this newsletter, please send an email to martin@kadansky.com and I'll add you to the list, or visit http://www.kadansky.com/newsletter

Did you miss a previous issue? You can find it in my newsletter archive: http://www.kadansky.com/newsletter

Your privacy is important to me. I do not share my newsletter mailing list with anyone else, nor do I rent it out.

Copyright (C) 2014 Kadansky Consulting, Inc. All rights reserved.

I love helping people learn how to use their computers better! Like a "computer driving instructor," I work 1-on-1 with small business owners and individuals to help them find a more productive and successful relationship with their computers and other high-tech gadgets.

