Optical character recognition (OCR) reads text from a physical source, such as a paper invoice or a passport, and converts it into a digital, machine-readable format.
The OCR software industry at large aims to offer time-saving tools geared toward businesses that process a large number of documents, such as invoices. Ongoing advancements result in products that are increasingly user-friendly and efficient. Amazon recently released a product called Textract that seeks to further improve on existing OCR technology.
How Does Textract Work?
Textract automatically extracts words and structured information from documents. Because Textract relies on Amazon’s machine learning models, it recognizes and pulls text from materials without requiring users to train it or to set parameters that tell it how to treat each document. The models were trained on millions of records before Textract’s release, which means they can reportedly handle almost any kind of document.
Then, after Textract finishes assessing the document and taking text from it, the program delivers a confidence score. It gives users some guidance regarding how they should use the extracted information. For example, a higher confidence score means the extraction is likely more accurate than something associated with a lower one.
People can also apply custom settings for certain types of documents that require exceptional accuracy. For example, if a company uses Textract for tax documents or quarterly reports, it could create a setting that automatically flags any content of that type with a confidence score below 95%.
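The flagging rule described above can be sketched in a few lines of Python. This is a minimal illustration, not Amazon's implementation. It assumes the extraction result is a dictionary with a "Blocks" list whose entries carry "BlockType", "Text" and "Confidence" (on a 0 to 100 scale) fields. The function name, the sample data and the 95 default are hypothetical.

```python
from typing import Any, Dict, List

# Hypothetical sample shaped like an extraction result: a list of blocks,
# each with a type, the extracted text and a 0-100 confidence score.
SAMPLE_RESPONSE: Dict[str, Any] = {
    "Blocks": [
        {"BlockType": "PAGE"},
        {"BlockType": "LINE", "Text": "Total due: 1,250.00", "Confidence": 99.2},
        {"BlockType": "LINE", "Text": "Tax ID: 12-3456789", "Confidence": 88.4},
        {"BlockType": "WORD", "Text": "Tax", "Confidence": 80.0},
    ]
}


def flag_low_confidence(
    response: Dict[str, Any], threshold: float = 95.0
) -> List[Dict[str, Any]]:
    """Return the LINE blocks whose confidence is below the threshold."""
    flagged = []
    for block in response.get("Blocks", []):
        # Only whole lines are reviewed here; pages and single words are skipped.
        if block.get("BlockType") != "LINE":
            continue
        confidence = block.get("Confidence", 0.0)
        if confidence < threshold:
            flagged.append({"text": block.get("Text", ""), "confidence": confidence})
    return flagged


for item in flag_low_confidence(SAMPLE_RESPONSE):
    print(f"Needs review ({item['confidence']:.1f}%): {item['text']}")
```

A stricter or looser threshold can be passed per document type, for example a higher one for tax documents than for routine correspondence.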
Moreover, each piece of extracted text has a bounding box surrounding it, and people can drill down further than the overall confidence score and see the rating for an individual section. Those insights could help users determine whether Textract handles some kinds of text especially well and struggles with others.
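To show how a bounding box might be used in practice, the sketch below converts one into pixel coordinates so a reviewer could highlight the matching region on a scanned page. It assumes each box is expressed as "Left", "Top", "Width" and "Height" ratios of the page size, which is how Textract's documentation describes block geometry. The function name and the page dimensions are hypothetical.

```python
from typing import Dict, Tuple


def bounding_box_to_pixels(
    box: Dict[str, float], page_width: int, page_height: int
) -> Tuple[int, int, int, int]:
    """Convert a ratio-based box into (left, top, right, bottom) pixels."""
    left = round(box["Left"] * page_width)
    top = round(box["Top"] * page_height)
    right = round((box["Left"] + box["Width"]) * page_width)
    bottom = round((box["Top"] + box["Height"]) * page_height)
    return left, top, right, bottom


# Hypothetical box covering the middle band of a 1000 x 800 pixel scan.
example_box = {"Left": 0.25, "Top": 0.5, "Width": 0.5, "Height": 0.125}
print(bounding_box_to_pixels(example_box, page_width=1000, page_height=800))
```

Pairing these coordinates with each block's confidence score makes it possible to see where on the page the low-scoring text sits.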
Helping Companies Process Documents Faster
It’s easy to see how Textract could speed things up for companies that process substantial quantities of documents during their usual workflows. OCR technology, in general, is arguably even better when it includes things like built-in document management features.
For example, some OCR tools on the market have dozens of text-search options, enabling people to perform phonetic searches for words that sound the same but have different spellings. They can also do wildcard searches and use a query like “apple*” to find matches like “apples,” “apple cider” and “apple crumble cake” in one search.
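As a rough illustration of the wildcard idea, the sketch below uses Python's standard fnmatch module to run an "apple*" query against a handful of extracted phrases. It does not reflect how any particular OCR product implements search. The function name and the phrase list are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import List


def wildcard_search(phrases: List[str], pattern: str) -> List[str]:
    """Return phrases matching a shell-style wildcard, ignoring case."""
    lowered_pattern = pattern.lower()
    # fnmatchcase matches the whole string, so "apple*" only matches
    # phrases that start with "apple".
    return [p for p in phrases if fnmatchcase(p.lower(), lowered_pattern)]


extracted = ["apples", "apple cider", "apple crumble cake", "pineapple", "Apple"]
print(wildcard_search(extracted, "apple*"))
```

Because the pattern has no leading wildcard, "pineapple" is not returned. A query of "*apple*" would include it.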
Amazon does not mention such capabilities for Textract yet, but they could arrive in the future, considering the tool is so new. The company briefly discusses the ability to create smart search indexes, but says doing so requires using another of its products, Amazon Elasticsearch Service.
So, for now, Amazon focuses on how Textract can work with documents with very little human input. That means people can devote more of their workdays to other tasks instead of changing or verifying the settings on a document-processing tool. People can also set up automated workflows that connect Textract with other Amazon Web Services (AWS) tools.
However, people may not want to set their hopes too high before they start using Textract. A journalist used it and found it made mistakes more often than expected. Although Textract represents ongoing progress in the OCR industry, it is not perfect, like most other products.
People interested in using Textract need to create AWS accounts first. It’s also useful to know that Textract availability currently extends to the following AWS regions: northern Virginia, Ohio, Oregon and Ireland. Amazon also provides some suggested use cases in a blog post about Textract, so people can read through those to determine if it fits their business needs.
If people decide to give it a try, they can get started with the free tier. There are two application programming interfaces (APIs) associated with Textract. First, there’s the Detect Document Text API that uses OCR to extract text from a document. Then, the Analyze Document API takes data from tables and forms.
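The two APIs can be called from Python through the AWS SDK (boto3), whose Textract client exposes detect_document_text and analyze_document. The sketch below is a minimal illustration under a few assumptions: AWS credentials are already configured, the document is a single-page image passed as bytes, and the helper function names are hypothetical. The client is passed in as an argument so the helpers can be tried with a stand-in object before any real call is made.

```python
from typing import Any, Dict, List


def extract_lines(client: Any, image_bytes: bytes) -> List[str]:
    """Call the Detect Document Text API and return the text of each line."""
    response = client.detect_document_text(Document={"Bytes": image_bytes})
    return [
        block["Text"]
        for block in response["Blocks"]
        if block["BlockType"] == "LINE"
    ]


def count_form_and_table_blocks(client: Any, image_bytes: bytes) -> Dict[str, int]:
    """Call the Analyze Document API and count the returned blocks by type."""
    response = client.analyze_document(
        Document={"Bytes": image_bytes},
        FeatureTypes=["TABLES", "FORMS"],  # request both table and form data
    )
    counts: Dict[str, int] = {}
    for block in response["Blocks"]:
        counts[block["BlockType"]] = counts.get(block["BlockType"], 0) + 1
    return counts


# Example usage (requires boto3 and AWS credentials; the file name is hypothetical):
#
#   import boto3
#   client = boto3.client("textract", region_name="us-east-1")
#   with open("invoice.png", "rb") as handle:
#       print(extract_lines(client, handle.read()))
```

Keeping the two calls in separate helpers mirrors the split between plain text extraction and form or table analysis, which are also metered separately in the free tier.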
The free tier allows new AWS customers to analyze up to 1,000 pages per month using the Detect Document Text API and up to 100 pages per month using the Analyze Document API. Those allowances apply for the first three months. Beyond the free tier, Amazon publishes a pricing breakdown for Textract.
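The free-tier figures above lend themselves to a quick calculation of how many pages a trial covers and how many would fall outside it. The sketch below uses only the allowances quoted in this article. The function names are hypothetical, and no prices are assumed.

```python
# Free-tier allowances quoted above (pages per month, first three months).
DETECT_TEXT_FREE_PAGES = 1000
ANALYZE_DOCUMENT_FREE_PAGES = 100
FREE_TIER_MONTHS = 3


def total_free_pages(monthly_allowance: int, months: int = FREE_TIER_MONTHS) -> int:
    """Pages covered by the free tier over the whole trial period."""
    return monthly_allowance * months


def pages_beyond_free_tier(pages_processed: int, monthly_allowance: int) -> int:
    """Pages in a single month that exceed the free allowance."""
    return max(0, pages_processed - monthly_allowance)


print(total_free_pages(DETECT_TEXT_FREE_PAGES))       # Detect Document Text
print(total_free_pages(ANALYZE_DOCUMENT_FREE_PAGES))  # Analyze Document
print(pages_beyond_free_tier(1500, DETECT_TEXT_FREE_PAGES))
```

A company processing 1,500 pages of plain text in a month would therefore have 500 pages billed at the regular rate, even during the trial period.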
Worth Experimenting With for Better Document Efficiency
Since Textract is new, more reviews of it will become available with time. Even so, it may be worth trying on a trial basis to see whether it speeds up document processing.