Tesseract has Unicode (UTF-8) support and can recognize more than 100
languages "out of the box". It can be trained to recognize other languages.
Tesseract supports various output formats: plain text, hOCR (HTML), and PDF.
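
As a rough illustration of how the output formats are selected, the sketch below assumes the usual command-line form `tesseract IMAGE OUTPUTBASE [CONFIG]`, where the config name `hocr` or `pdf` picks the format and omitting it gives plain text. The helper name `tesseract_cmd` and the file names `scan.png` and `result` are hypothetical, chosen only for this example. The helper only prints the command it would run, so it can be tried before Tesseract is installed.

```shell
# Hypothetical helper (not part of Tesseract): print the tesseract command
# line for a given output format without running it.
# Assumes the CLI form: tesseract IMAGE OUTPUTBASE [CONFIG]
tesseract_cmd() {
    image="$1"          # input image, e.g. scan.png (example name)
    base="$2"           # output base name; tesseract appends the extension
    format="${3:-txt}"  # txt (default), hocr, or pdf

    case "$format" in
        txt)
            # Plain text is the default output, so no config name is passed.
            printf 'tesseract %s %s\n' "$image" "$base"
            ;;
        hocr|pdf)
            printf 'tesseract %s %s %s\n' "$image" "$base" "$format"
            ;;
        *)
            echo "unsupported format: $format" >&2
            return 1
            ;;
    esac
}

# Show the command that would produce hOCR output for scan.png.
tesseract_cmd scan.png result hocr
```

Once Tesseract is installed, the printed line can be run as-is; the output file is named after the output base, with an extension matching the chosen format.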
If you want to access files under /media/* or /run/media/*, you'll have
to connect the snap to the core snap's removable-media interface:
sudo snap connect tesseract:removable-media
For versions of Ubuntu between 14.04 LTS (Trusty Tahr) and 15.10 (Wily Werewolf), as well as Ubuntu flavours that don’t include snap by default, snap can be installed from the Ubuntu Software Centre by searching for snapd.
Alternatively, snapd can be installed from the command line:
sudo apt update
sudo apt install snapd
Either log out and back in again, or restart your system, to ensure snap’s paths are updated correctly.
To install tesseract, simply use the following command: