Real World Comparison: Using Textract Versus RekognitionDetectImageText

Welcome to our review of the recognition capabilities of two Amazon Web Services AI / ML technologies – Amazon Textract and Amazon Rekognition’s Detect Image Text function. Amazon Web Services supplies world-class AI / ML services, but sometimes it can be difficult to determine which service is best for your use case.

In this post, we compare Textract and Rekognition to understand the best use cases for each. As Amazon explains it, Textract is a “fully managed machine learning service” dedicated to extracting text and other data from documents and images, while Rekognition is a comprehensive object recognition service that can also recognize text in images through its DetectText operation, which we refer to throughout as RekognitionDetectImageText.

We’ll be comparing real-world results from the two services to identify where each performs best and to bring out any significant limitations that could affect the success of your implementations. In all test cases we compared the LINE-level results (whole lines of text rather than individual words) returned by both RekognitionDetectImageText and Textract.
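For readers who want to reproduce this kind of comparison, here is a minimal sketch of pulling the LINE-level text out of each service's response. It assumes the Python SDK (boto3), which this post does not say was used; the helper names `rekognition_lines`, `textract_lines` and `detect_lines` are hypothetical. The response field names (`TextDetections`, `Type`, `DetectedText` for Rekognition; `Blocks`, `BlockType`, `Text` for Textract) follow the documented response shapes.

```python
def rekognition_lines(response):
    """Return the text of every LINE-level detection in a Rekognition DetectText response."""
    return [
        d["DetectedText"]
        for d in response.get("TextDetections", [])
        if d.get("Type") == "LINE"
    ]


def textract_lines(response):
    """Return the text of every LINE block in a Textract DetectDocumentText response."""
    return [
        b["Text"]
        for b in response.get("Blocks", [])
        if b.get("BlockType") == "LINE"
    ]


def detect_lines(image_bytes):
    """Send the same image bytes to both services and return their LINE results.

    Assumption: AWS credentials and a region are already configured.
    boto3 is imported here so the two helpers above work without it.
    """
    import boto3

    rekognition = boto3.client("rekognition")
    textract = boto3.client("textract")
    return {
        "rekognition": rekognition_lines(
            rekognition.detect_text(Image={"Bytes": image_bytes})
        ),
        "textract": textract_lines(
            textract.detect_document_text(Document={"Bytes": image_bytes})
        ),
    }
```

Calling `detect_lines(open("cover.jpg", "rb").read())` (with a hypothetical file name) would return one list of lines per service, which is the form in which the results are shown for each example in this post.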

Example 1. Movie Collection Cover

This sample is an image from a legacy asset (a movie collection). Although the overlaid title is crisp and clear, the shadow and blur around the letters are close enough in tone that they could cause recognition problems. The rest of the image is of dated quality and, to add to the challenge, some of the visible text is heavily stylized.

Text Recognition Results

RekognitionDetectImageText:
OEZ (from the “Smokey” part of the header)
and (from the header)
THE (from the header)
CB (from the small CB radio)
And it (from the “Bandit” in the header)
THRIS RIRL (from “TRANS AM”)
BANONE (from the license plate)

Textract:
The (from the header)
RII (from the “AM” above the license plate in “TRANS AM”)
BAN ONE (from the license plate)

As you can see, despite the poor quality of the image, RekognitionDetectImageText successfully detected entire words that were presented in a clear font. Stylized text without clear color contrast with its background presented much more of a problem, including the overlaid title text, which one would expect to be more easily read.

In this test, although the results included many inaccuracies, Rekognition clearly did better: it recognized more words, and it correctly read the letters embossed on a small object in the image (“CB”).

Example 2. Comic Book Cover

This sample is a piece of artwork using many fonts of different weights on different color backgrounds (including a reverse-color logo). The main text has bright colors, outlined edges and colored shadows, but is overlaid with other imagery. This kind of sample should challenge any text recognition technology.

Text Recognition Results

RekognitionDetectImageText:
MARVEL LEGACY HOME OF THE BRAUE PART 1
695 MARK WAIO
MATTHEW CHRIS SAMNEE WILSON
CALRAL (CAPTAIN)
AAMICA (AMERICA)
SAWWEE’17 (bottom right corner)
MIN (bottom right corner)

Textract:
LEGACY HOME OF THE BRAVE
PART
1
695
MARK WAID
CHRIS SAMNEE
MATTHEW WILSON

Both services recognized many words, with Textract winning the name recognition prize. However, Textract was unable to recognize the stylized text samples, whereas RekognitionDetectImageText did pick those up as text. But Rekognition’s results were very inaccurate, and even the more recognizable text was jumbled, whereas Textract listed them perfectly.

We have to say that neither service was fully up to the challenge of this very difficult test, but Textract is ahead for producing much more usable output.
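One way to put a rough number on “usable output” is to compare each expected string with the closest line a service returned. The sketch below is not how the results in this post were judged; it is an illustration using `difflib` from Python's standard library, and `best_match` is a hypothetical helper name.

```python
from difflib import SequenceMatcher


def best_match(expected, detected_lines):
    """Return (line, score) for the detected line most similar to `expected`.

    The score is difflib's similarity ratio between 0.0 and 1.0, computed
    case-insensitively. Returns (None, 0.0) when nothing was detected.
    """
    scored = [
        (line, SequenceMatcher(None, expected.casefold(), line.casefold()).ratio())
        for line in detected_lines
    ]
    return max(scored, key=lambda pair: pair[1], default=(None, 0.0))


# Lines taken from the comic book cover results above.
rekognition = ["695 MARK WAIO", "MATTHEW CHRIS SAMNEE WILSON", "CALRAL"]
textract = ["695", "MARK WAID", "CHRIS SAMNEE", "MATTHEW WILSON"]

for name, lines in (("Rekognition", rekognition), ("Textract", textract)):
    line, score = best_match("MARK WAID", lines)
    print(f"{name}: closest line {line!r}, similarity {score:.2f}")
```

An exact line such as Textract's “MARK WAID” scores 1.0, while a jumbled or misread line scores lower, so the same check can be repeated for every string you expect to find in an image.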

Example 3. Street Signs

Imagery of road signs is a very applicable use case for real-world AI and machine learning technology. While the image is high-quality, the conditions under which it was taken create some slight fuzziness overall, and the lettered surfaces are all angled with respect to the camera. (*Note that there are two sets of signs, one foreground and one background on the bottom right.)

Text Recognition Results

RekognitionDetectImageText:
Grennan Rd (from stop sign facing us)
Brace Rd (from stop sign facing us)
STOP
BRACE RD (from small stop sign in lower right)
GRENNAN RD (from small stop sign in lower right)

Textract:
Rd (from Grennan Rd stop sign facing us)
Grennan (from stop sign facing us)
Brace Rd (from stop sign facing us)
STOP
BRACE RD (from small stop sign in lower right)
GRENNAN RD (from small stop sign in lower right)

Here both services do very well, but RekognitionDetectImageText delivered perfect accuracy, while Textract missed the name of one of the streets entirely. Interestingly, although all the signs’ letters are uppercase, both services returned the street names from the foreground signs in title case, but output uppercase for the much smaller signs in the background.
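Because the two services can return the same words in different letter case, it helps to normalize case and whitespace before comparing their outputs. The following is a small sketch of that idea, not part of the original test procedure. The helper names `normalize` and `compare_lines` are hypothetical, and the sample lists assume that the first five result lines listed for this example came from Rekognition and the remaining six from Textract.

```python
def normalize(line):
    """Collapse runs of whitespace and ignore letter case."""
    return " ".join(line.split()).casefold()


def compare_lines(first, second):
    """Group normalized lines by which of the two result lists contain them."""
    a = {normalize(line) for line in first}
    b = {normalize(line) for line in second}
    return {
        "both": sorted(a & b),
        "only_first": sorted(a - b),
        "only_second": sorted(b - a),
    }


# Street sign results, under the attribution assumed above.
rekognition = ["Grennan Rd", "Brace Rd", "STOP", "BRACE RD", "GRENNAN RD"]
textract = ["Rd", "Grennan", "Brace Rd", "STOP", "BRACE RD", "GRENNAN RD"]

result = compare_lines(rekognition, textract)
print(result)
```

After normalization the foreground and background versions of each street name collapse into one entry, so any remaining differences between the two services are easy to spot.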

Example 4. Posing with License Plate

This is a very difficult sample for recognition, as the license plate is rotated to an odd angle and also bent into a slight curve.


Text Recognition Results

RekognitionDetectImageText:
NO (from the top of the license plate)
OUTATIME

Textract:
*Textract found nothing on this one.

Here RekognitionDetectImageText, as it is based on a pure object recognition technology, is able to correctly read the large text despite its skewed perspective, and even pick up a couple of letters from the much smaller and blurrier text. Textract, however, came up completely blank on this one.

Example 5. Angled Ad Copy

This sample represents a very common use case for online imagery. The text is angled and while the outlines are sharp, there’s a lined 3D effect applied to the text which blurs the demarcation between text and background.

Text Recognition Results

RekognitionDetectImageText:
AI
THE BEST WAY TO
SLANT OR SKEW TEXT
IN ILLUSTRATOR

Textract:
*Textract couldn’t find any text in this at all.

Again RekognitionDetectImageText shows its strength in recognizing objects in imagery, returning 100% complete and accurate results. Textract unfortunately did not recognize any text in this image.

Example 6. Level Ad Copy

This sample uses the same stylized text as the Angled Ad Copy sample but aligns it on the level this time.

Text Recognition Results

RekognitionDetectImageText:
THE BEST WAY TO
SLANT OR SKEW TEXT
IN ILLUSTRATOR

Textract:
*Textract still couldn’t find anything here.

We have unexpected results from this one. While RekognitionDetectImageText again perfectly recognized the text, Textract was again unable to recognize any part of it. This is unusual as Textract has been able to recognize level, stylized text in other examples. There is sufficient contrast in the color scheme to differentiate the text. Perhaps the lined 3D effect caused issues, but the cause of Textract’s poor performance on these last two samples is currently unknown.

Conclusions

Textract has a proven track record with scanned documents and similar imagery, which is natural because that is the environment it was designed for. RekognitionDetectImageText and Textract performed comparably well on most samples, with Textract doing slightly better on imagery similar to scanned documents. But Rekognition is the clear winner at recognizing stylized text on colored backgrounds, from comic book covers to pictures of real-world objects with lettering.

Overall we can say that RekognitionDetectImageText has the wider range of applicability and better peak performance in specialty use cases, while Textract is a more reliable producer in its area of competence.

Thank you for taking this journey with us! We hope you found the information useful in your decision-making about text recognition technologies.

The Nomad Team
