OCR

Andrew Thoreson febaen at gmail.com
Fri Apr 4 17:26:51 CDT 2008


On Fri, 2008-04-04 at 14:23 -0500, Billy Crook wrote:
> Meant to send this out last night, but apparently it got stuck in drafts...
> 
> OCR will never be perfect.  And because of that, you will *never know*
> for sure, where it failed.  Once something becomes paper, all it is,
> is an image.  I have never heard of OCR being a format of its own.
> It's usually used to 'convert' an image into text that is stored as
> text, or into text that is put into tags and stored with the image.
> 
> I have been storing all my tax and other documents electronically
> since 2004.  I currently store scanned documents in PDF format.  I
> would prefer a multipage image format like TIFF, but haven't found a
> good program to do that.  PDF is massively more popular.
> 
> If I can get an electronic copy from the sender I keep that and ditch
> the paper.  Most banks and financial institutions now offer some form
> of electronic document delivery because it saves them money.  This is
> usually PDF; sometimes HTML.  I believe the fewer format
> transformations I do on it, the better, so I will save it in whatever
> format I can get it in.  If for ANY reason you think you need to print
> something out just to scan it in, don't.  Use CupsPDF or PDF-Print, or
> something like it.  It shows up as a printer in CUPS, and when you
> print to it, it saves a PDF of what you "printed".
> 
> If I have to scan paper, I currently use a program called gscan2pdf.
> It runs the scanner and can save a multipage PDF file.  Before you
> save, you have the chance to re-arrange the page order, which is handy
> if your ADF (automatic document feeder) skips a page, or jams.  You
> can also rotate pages.  My scanner is attached to the network, so if
> you remind me the day before, I can load it up, and demo the program
> at the LUG meeting.
> 
> On Wed, Apr 2, 2008 at 9:22 PM, bewkard <bewkard at gmail.com> wrote:
> > I have finally had it with paperwork.  This last tax season did me in.
> >
> > I've talked to a couple people about using OCR to store documents digitally.
> > I know that a few people on the list do this as well.  I was wondering if
> > anyone could give me some tips about what works and what doesn't work.  Is
> > it better to OCR things?  Is it better to scan and save a PDF or some other
> > portable document?
> >
> > Again, TIA
> >
> > Tim
> >
> > _______________________________________________
> >  Kclug mailing list
> >  Kclug at kclug.org
> >  http://kclug.org/mailman/listinfo/kclug
> >
> >

Actually, I do recall reading of someone who created a program that
would back up (and later retrieve) files from paper. Of course, you
couldn't store anything very large with it, but text isn't very large.
Might that be a good solution here, or does it have to be
human-readable?
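
To make the paper-backup idea a bit more concrete, here is a rough sketch
in Python. It is not the program I was thinking of, just an illustration
under my own assumptions: the line layout (index, base32 payload, CRC-32),
the 20-byte chunk size, and the function names encode_for_paper and
decode_from_paper are all made up for this example. Each printed line
carries its own checksum, so a bad OCR read shows up as a mismatch on a
specific line rather than silently corrupting the file.

```python
import base64
import zlib

# Raw bytes per printed line. Hypothetical choice: 20 bytes encode to
# exactly 32 base32 characters, so full lines need no '=' padding.
LINE_BYTES = 20


def encode_for_paper(data: bytes) -> list[str]:
    """Turn bytes into printable lines: '<index> <base32 payload> <crc32>'."""
    lines = []
    for offset in range(0, len(data), LINE_BYTES):
        chunk = data[offset:offset + LINE_BYTES]
        payload = base64.b32encode(chunk).decode("ascii")
        crc = zlib.crc32(chunk) & 0xFFFFFFFF
        lines.append(f"{offset // LINE_BYTES:04d} {payload} {crc:08X}")
    return lines


def decode_from_paper(lines: list[str]) -> bytes:
    """Rebuild the bytes, rejecting missing, reordered, or misread lines."""
    out = bytearray()
    for expected_index, line in enumerate(lines):
        index, payload, crc = line.split()
        if int(index) != expected_index:
            raise ValueError(f"line {expected_index}: missing or out of order")
        chunk = base64.b32decode(payload)
        if zlib.crc32(chunk) & 0xFFFFFFFF != int(crc, 16):
            raise ValueError(f"line {expected_index}: checksum mismatch")
        out += chunk
    return bytes(out)


if __name__ == "__main__":
    printed = encode_for_paper(b"Account summary, tax year 2007\n")
    print("\n".join(printed))
    assert decode_from_paper(printed) == b"Account summary, tax year 2007\n"
```

Base32 avoids lowercase letters and the digits 0 and 1, which are the
characters OCR most often confuses, but the output is clearly not
human-readable, so this only helps if that is acceptable.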


