OCR apps you cannot live without!

Hi everyone. Need to extract text from a PDF, but the page is really just an image?
Enter … ocrmypdf.

My day job (not for much longer - yay!) is modifying texts for students with reduced or no vision - it’s frustrating using Okular (a viewer for PDFs and lots of other formats) only to find you cannot extract text.

Install ocrmypdf - it’s a command line utility. For example, say you have a PDF called text.pdf (which really only contains an image of text) in your Downloads folder. Open a terminal and:

cd Downloads
ocrmypdf text.pdf output.pdf

The output.pdf now has a text layer, so you can select and copy the text in Okular! Yay!
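
If you have a whole folder of scanned PDFs, a small loop saves typing the command each time. This is only a rough sketch: the helper names ocr_name and ocr_folder are made up for illustration, the _ocr.pdf suffix is my own choice, and it assumes ocrmypdf is installed along with the English language pack (change -l eng to suit).

```shell
#!/bin/sh
# Sketch: OCR every PDF in a folder, writing <name>_ocr.pdf beside each original.
# Assumes ocrmypdf is already installed and on your PATH.

# Hypothetical helper: text.pdf -> text_ocr.pdf
ocr_name() {
    printf '%s_ocr.pdf\n' "${1%.pdf}"
}

# Hypothetical helper: run ocrmypdf over each PDF in the given folder
ocr_folder() {
    dir=${1:-$HOME/Downloads}
    for f in "$dir"/*.pdf; do
        # No PDFs at all: the glob stays literal, so skip it
        [ -e "$f" ] || continue
        # Do not OCR files this script produced on an earlier run
        case $f in *_ocr.pdf) continue ;; esac
        # --skip-text leaves pages that already have text alone
        # -l eng tells the OCR engine to expect English
        ocrmypdf --skip-text -l eng "$f" "$(ocr_name "$f")"
    done
}

# Usage (after saving this as a script or pasting it into a terminal):
#   ocr_folder ~/Downloads
```

The originals are left untouched, so if the OCR goes wrong on a file you can simply delete the _ocr.pdf copy and try again with different options.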

DRM PDF? Use LIOS - Linux-Intelligent-OCR-Solution.

Just open the PDF in LIOS - it calls them files, and the pages that then appear in the left pane are termed ‘images’. Choose recognise all images from the menu, then in the centre bottom pane do any deleting that is necessary, select all, and copy and paste into your text processor - job done.