Tuesdays with Tom: Optical Character Recognition Engine

Learn about the Optical Character Recognition engine, one of the four components that make up an IDP pipeline, with IDP expert Tom Wilger.

May 17, 2021

I recently wrote a post describing how our team at BP3 builds Intelligent Document Processing (IDP) pipelines in the cloud and the four major components that make up those pipelines.

The first component of an IDP pipeline is the Optical Character Recognition (OCR) engine. This component reads document images and extracts the characters and the physical position of each word on the page. OCR is not a new idea. The concept can be traced back to the early 1900s, when Emanuel Goldberg invented a machine that read characters from photographic images and converted them into telegraph code. Today, through the use of neural networks and deep learning, both open-source and commercial OCR engines can extract everything from printed characters to check-boxes to handwritten text from business documents, passports and even street signs.
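To make "characters plus the physical position of each word" concrete, here is a minimal sketch in Python. The `Word` schema, the `group_into_lines` helper and the sample values are all assumptions made up for illustration; each OCR engine has its own output format, and this is not the format of any particular one.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One recognized word plus its bounding box in pixels (hypothetical schema)."""
    text: str
    left: int
    top: int
    width: int
    height: int


def group_into_lines(words, tolerance=10):
    """Group words whose top edges lie within `tolerance` pixels into text lines."""
    lines = []
    for word in sorted(words, key=lambda w: (w.top, w.left)):
        # Stay on the current line if this word is level with the line's first word.
        if lines and abs(lines[-1][0].top - word.top) <= tolerance:
            lines[-1].append(word)
        else:
            lines.append([word])
    # Within each line, read left to right.
    return [
        " ".join(w.text for w in sorted(line, key=lambda w: w.left))
        for line in lines
    ]


# Made-up sample output, deliberately listed out of reading order.
sample = [
    Word("Total:", 40, 122, 60, 18),
    Word("Invoice", 40, 20, 80, 18),
    Word("$125.00", 110, 120, 70, 18),
    Word("#1042", 130, 22, 55, 18),
]

page_lines = group_into_lines(sample)
print(page_lines)
```

Because every word carries its position, the reading order can be rebuilt even when the engine returns words in an arbitrary sequence. Real pages with columns, rotated scans or tables usually need more than a single row tolerance.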

With BP3’s Sherpa document processing service, we use various OCR engines within our IDP pipelines, including Tesseract, Azure Computer Vision and AWS Textract. This fully managed machine learning OCR and data extraction service provides state-of-the-art capabilities to recognize and extract printed text, check-boxes and even handwriting from unstructured text, tables and printed forms in a variety of languages.

The OCR engine provides the foundational data from which document information is extracted. Understanding the capabilities and limitations of this component is important when creating and managing an Intelligent Document Processing pipeline.
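As a rough illustration of how later steps depend on this foundational data, the sketch below looks up the value printed to the right of a label and ignores words the engine was unsure about. The dictionary keys, the 0 to 100 confidence scale, the thresholds and the `value_right_of` helper are assumptions for this example, not the API of any specific engine or of Sherpa.

```python
def value_right_of(words, label, min_conf=80, row_tolerance=10):
    """Return the nearest confident word to the right of `label`, or None."""
    anchor = next((w for w in words if w["text"] == label), None)
    if anchor is None:
        return None
    candidates = [
        w for w in words
        if w["left"] > anchor["left"]                        # to the right of the label
        and abs(w["top"] - anchor["top"]) <= row_tolerance  # on the same row
        and w["conf"] >= min_conf                            # engine was reasonably sure
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda w: w["left"])["text"]


# Made-up OCR output: the total was misread ("$l25.O0") and has low confidence.
words = [
    {"text": "Date:", "left": 40, "top": 60, "conf": 96},
    {"text": "2021-05-17", "left": 100, "top": 61, "conf": 93},
    {"text": "Total:", "left": 40, "top": 120, "conf": 95},
    {"text": "$l25.O0", "left": 110, "top": 121, "conf": 41},
]

invoice_date = value_right_of(words, "Date:")
invoice_total = value_right_of(words, "Total:")
print(invoice_date, invoice_total)
```

Returning `None` for the low-confidence total, rather than a misread amount, is one way a pipeline can account for the engine's limitations and route the field to a person for review.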

WRITTEN BY

Tom Wilger
