Forms Processing
Ø A form is a formatted document containing blank fields that users can fill in with data.
Ø With paper forms, someone usually has to transfer the data from the paper to a computer database, where the
results can then be analyzed statistically.
Ø Some OCR systems
can do this automatically, but they're generally limited to forms containing
just check boxes. They can't
handle handwritten text.
Ø Electronic forms solve this problem by entirely skipping the paper stage.
Ø Instead, the form appears on the user's display screen and the user fills it in by selecting
options with a pointing device or
typing in text from the computer keyboard.
Ø The data is then sent directly to a
forms processing application, which enters the information into a database.
Ø Electronic forms are especially common on the World Wide Web because HTML
has built-in elements for displaying form controls such as text fields and check
boxes.
Ø Typically, the data entered into a
Web-based form is processed by a server-side program, such as a CGI program.
Ø There is hardly a business in the world that
does not use forms to some extent.
Ø Since they are useful for collecting and
recording data from a large number of individuals, forms are often used for
invoices, purchase orders, insurance claims, medical records, tax statements,
credit card applications, and many others.
Ø However, processing forms requires data entry,
which can be costly to a business in terms of both time and money.
Ø Manual data entry is also prone to error,
which is why many businesses seek an automated solution.
Ø Automated forms processing software relies on
OCR/ICR (optical/intelligent character recognition) technology, and can make
forms processing much more efficient and economical.
Ø Automated forms processing technology is
relatively new, and has not been perfected, but there are ways of optimizing
its performance.
Ø When a document or form is initially scanned
into a computer, it is typically stored as an image file such as a TIFF or JPEG, or
as an image-only PDF.
Ø When these documents are viewed on the screen, the user can read
them just as easily as if the paper were physically right in front of
them.
Ø However, to the computer, these image files
are just meaningless pictures; the computer cannot 'read' the text from them
like a human.
Ø This is where OCR software comes in. OCR
software analyzes the light and dark areas of an image in order
to locate text, and then identifies each character it finds.
Ø OCR is extremely accurate when used with
clear, high-resolution PDFs. However, it is not good at recognizing
handwriting.
Ø ICR, intelligent character recognition, is
designed for handwriting recognition, and employs various handwriting-specific
recognition algorithms.
Ø Unfortunately, ICR technology has not advanced
to the point where the user can just run an ICR program on any document and
receive machine-editable text.
Ø Luckily, there are ways to help ICR software
do its job well, and to effectively use it for forms processing.
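The electronic-form flow described earlier, in which submitted data goes straight to a forms processing application that enters it into a database, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names `name` and `subscribe`, the `responses` table, and the `store_submission` helper are all made up for the example, and a real Web form handler would also need validation and error handling.

```python
import sqlite3
from urllib.parse import parse_qs


def store_submission(conn, body):
    """Parse a URL-encoded form body and insert it into the responses table."""
    fields = parse_qs(body, keep_blank_values=True)
    # parse_qs returns a list of values per field; take the first one.
    name = fields.get("name", [""])[0]
    # A check box appears in the submitted body only when it is ticked.
    subscribed = 1 if "subscribe" in fields else 0
    conn.execute(
        "INSERT INTO responses (name, subscribed) VALUES (?, ?)",
        (name, subscribed),
    )
    conn.commit()


# In-memory database standing in for the real one (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE responses (name TEXT, subscribed INTEGER)")

# Two made-up submissions, as a browser would encode them.
store_submission(conn, "name=Ada+Lovelace&subscribe=on")
store_submission(conn, "name=Alan+Turing")

rows = conn.execute(
    "SELECT name, subscribed FROM responses ORDER BY rowid"
).fetchall()
print(rows)  # [('Ada Lovelace', 1), ('Alan Turing', 0)]
```

Because the user typed the data directly, no recognition step is involved: the application only has to decode the fields and store them.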
ICR and Forms Processing
· One way that forms processing software solutions help ICR do its job effectively is
to optimize document images for ICR processing.
· This often means despeckling and deskewing images in order to make the writing as
clear as possible.
· Automated forms processing solutions are also generally custom-tailored to their
applications in order to provide as much helpful information as possible to the
ICR engine.
· For example, if it is known beforehand that a given field of a form will contain
only numbers, that information can be passed on to the ICR engine, greatly reducing the
possibility of error.
· These specifications are called validation rules, and are crucial to developing an
accurate forms processing solution, since ICR is a new and developing
technology.
· Generally, ICR-based solutions also include a manual validation step to confirm the
software's results.
· ICR-based forms processing solutions can be very useful, provided they are properly
designed and implemented for the task they are meant to accomplish.
· Always be wary of stock ICR software, as it is likely to be highly inaccurate.
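To make the idea of validation rules and the manual validation step more concrete, here is a minimal Python sketch. It does not use any real ICR library: it assumes a hypothetical engine that reports, for each character position, several candidate characters with a confidence between 0 and 1. The `FIELD_RULES` table, the `read_field` function, the 0.80 review threshold, and the sample data are all invented for illustration.

```python
import string

# Hypothetical validation rules: the characters each form field may contain.
FIELD_RULES = {
    "zip_code": set(string.digits),
    "last_name": set(string.ascii_uppercase),
}

# Assumed cut-off: below this confidence, a person should check the field.
REVIEW_THRESHOLD = 0.80


def read_field(field, candidates_per_char):
    """Pick the most likely allowed character at each position.

    candidates_per_char holds one dict per character position, mapping each
    candidate character to the (hypothetical) engine's confidence.
    Returns (text, needs_review).
    """
    allowed = FIELD_RULES[field]
    chars = []
    needs_review = False
    for candidates in candidates_per_char:
        # Apply the validation rule: discard candidates the field cannot hold.
        valid = {c: p for c, p in candidates.items() if c in allowed}
        if not valid:
            chars.append("?")  # nothing plausible; leave it to the reviewer
            needs_review = True
            continue
        best = max(valid, key=valid.get)
        chars.append(best)
        if valid[best] < REVIEW_THRESHOLD:
            needs_review = True
    return "".join(chars), needs_review


# Made-up engine output for a handwritten ZIP code.
zip_candidates = [
    {"l": 0.92, "1": 0.90},  # letter "l" scores higher, but is not a digit
    {"O": 0.88, "0": 0.85},  # same for letter "O" versus zero
    {"2": 0.97},
    {"1": 0.95, "I": 0.60},
    {"3": 0.91, "8": 0.30},
]
print(read_field("zip_code", zip_candidates))          # ('10213', False)

# A smudged digit: the rule still applies, but the result goes to a reviewer.
print(read_field("zip_code", [{"7": 0.55, "1": 0.40}]))  # ('7', True)
```

Without the digits-only rule, the first two characters would have been read as the letters "l" and "O". The `needs_review` flag stands in for the manual validation step: only fields the software is unsure about need a person's attention.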
OCR and Forms Processing
ü Forms processing solutions requiring only OCR
are generally much simpler to implement.
ü OCR technology has made huge advances in
recent years, and most OCR software is over 90% accurate.
ü Of course, OCR only works on machine-generated
fonts, and can't be used for handwriting.
ü Using good quality, clean scans of forms
ensures that OCR software can do its job properly. Of course, there is still a
degree of customization needed in order to develop an OCR solution specific to
a given task.
ü CVISION Technologies' Trapeze is a custom form
and document automation solution.
ü CVISION's representatives will work with you
to create the most effective solution for your needs.