
OCR Providers

Luís Manuel Maia edited this page Mar 26, 2020 · 6 revisions

SmartDocumentor includes native optical character recognition (OCR) extraction. It also supports other providers to improve performance on certain types of documents.

GOOGLE API

We recently added Google API as an external OCR provider for better performance on natural image documents.

If you have an account and want to start using this provider, follow the steps below.

1. Go to workspace.config.xml and change OcrEngine to Google.

    <Setting Name="OcrEngine" Value="Google" />
    <!-- OcrEngineCustomParams: plafondPath; saveGoogleOcr; settingsGoogleOcrPageModelPrefixFieldName -->
    <Setting Name="OcrEngineCustomParams" Value=";False" />

Optionally, you can set the following additional parameters in OcrEngineCustomParams:

  • plafondPath - path to the Google credentials JSON file. The default is a file named GooglePlafond.json inside SmartDocumentor's installation folder. Ex: C:\Program Files (x86)\DevScope\SmartDocumentor\GooglePlafond.json
  • saveGoogleOcr - whether to save the original Google API result in the task.
  • settingsGoogleOcrPageModelPrefixFieldName - if you are saving the Google OCR result, the key under which it is stored in the task. The default is _GoogleOcrPageModel_.
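
As an illustration, the parameters appear to be positional and separated by semicolons, in the order given in the comment beside the setting. The sketch below assumes that format: it leaves the first position empty to keep the default credentials path, turns on saving of the Google result (assuming True is accepted the same way False is), and stores it under a custom key. The key name _MyGoogleOcr_ is hypothetical.

```xml
<!-- Assumed order: plafondPath;saveGoogleOcr;settingsGoogleOcrPageModelPrefixFieldName -->
<!-- Empty first value keeps the default GooglePlafond.json; _MyGoogleOcr_ is a made-up key name -->
<Setting Name="OcrEngineCustomParams" Value=";True;_MyGoogleOcr_" />
```

Trailing parameters can apparently be omitted, as in the original value ";False", in which case their defaults should apply.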

2. Get your credentials from Google so that SmartDocumentor can access your account plafond.

From the Google Cloud Platform, download the access keys in JSON format.

First, make sure you have a valid payment method.

[Screenshots: gcloudplatform.png]

Then generate the credentials.

[Screenshots: gcloudplatform.png]

Once you have the JSON file, you can either rename it to GooglePlafond.json and move it to SmartDocumentor's installation folder, or pass the path to the file in the OcrEngineCustomParams option.
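
For the second option, a minimal sketch of passing the path explicitly. It assumes the path goes in the first position of the semicolon-separated value, as the comment beside the setting suggests. The location C:\Keys\my-project-key.json is hypothetical; substitute wherever you saved the downloaded file.

```xml
<!-- Hypothetical path to the JSON key file downloaded from Google Cloud Platform -->
<Setting Name="OcrEngineCustomParams" Value="C:\Keys\my-project-key.json;False" />
```

With this variant the file keeps its original name and does not need to live in the installation folder.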
