Image.Ocr

Description

This Task takes images and runs them through the Tesseract engine for OCR (optical character recognition). You’ll get a text file containing the results.

Image.Ocr works with the PIL/pillow package internally and supports about 30 different file formats for input.

Arguments

  • tesseract_executable (optional) is a string of the full path to your tesseract binary.
# tesseract_executable = '/usr/local/bin/tesseract'  # default
tesseract_executable = '/my/non-default/tesseract'
  • language (optional) is a string of the language code to use for text recognition.
# language = 'eng'  # default
language = 'deu'
  • boxes (optional) is a boolean that determines whether batch.nochop makebox gets added to the Tesseract call. This may provide better results in some cases.
# boxes = False  # default
boxes = True
  • config (optional) is a string of additional options that get added to the Tesseract call.
# config = ''  # default
config = '--tessdata-dir "/my/special/tessdata"'
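To make the arguments above more concrete, here is a minimal sketch of how they could map onto a Tesseract command line. It is not the Task's actual implementation: the function name build_tesseract_command and the image_path and output_base parameters are hypothetical, and the ordering (options first, config file names such as batch.nochop makebox last) simply follows the usual tesseract CLI layout.

```python
import shlex


def build_tesseract_command(image_path, output_base,
                            tesseract_executable='/usr/local/bin/tesseract',
                            language='eng', boxes=False, config=''):
    """Assemble a tesseract call as an argument list (hypothetical helper)."""
    # Basic call: tesseract <image> <output base name> -l <language code>
    cmd = [tesseract_executable, image_path, output_base, '-l', language]
    # Split the free-form config string the way a shell would,
    # so quoted paths stay together as one argument.
    cmd.extend(shlex.split(config))
    # Assumption: with boxes enabled, the two config names go at the end.
    if boxes:
        cmd.extend(['batch.nochop', 'makebox'])
    return cmd


if __name__ == '__main__':
    command = build_tesseract_command(
        'scan.png', 'scan',
        language='deu',
        boxes=True,
        config='--tessdata-dir "/my/special/tessdata"',
    )
    print(command)
```

The resulting list could be handed to subprocess.run. Building a list rather than a single string avoids shell quoting problems with paths that contain spaces.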

Requirements

Python (Non-Standard-Library Packages)

External Executables

API Credentials

None.

Resources