Highest Voted 'ocr' Questions - Ask Ubuntu Stack Exchange

98

votes

9 answers

What's the best, simplest OCR solution?

I'd like to scan a good amount of papers I have lying around, with the least possible hassle. I would like to convert them to images using Simple Scan, then convert them to text using OCR. Is there a good OCR app with a GUI that will give me good…

software-recommendation scanning ocr

asked Dec 05 '10 at 10:32

Bou

4,472
5
24
28

47

votes

5 answers

How can I extract text from images?

How can I extract text from images? I am not talking about scanned files, but garden variety images, such as when you take a high-def picture of a blackboard at class, and it is nicely handwritten; or when you photograph a page from a recipe book…

software-recommendation ocr

asked Aug 31 '11 at 08:33

Strapakowsky

11,664
15
36
40

39

votes

1 answer

How do I install a new language pack for Tesseract on 16.04?

Just installed gscan2pdf v1.3.9 as well as Tesseract. As for the latter, first it appeared at the bottom of my Installed Software list, but now it seems to be gone, although still working (I think). Anyway, I'm trying to turn a pdf of a scanned…

language-support ocr

asked Jul 01 '16 at 16:37

m.a.a.

615
1
7
14

37

votes

9 answers

How can I instantaneously extract text from a screen area using OCR tools?

In Ubuntu 12.10, if I type gnome-screenshot -a | tesseract output it returns: ** Message: Unable to use GNOME Shell's builtin screenshot interface, resorting to fallback X11. How can I select a text from the screen and convert it to text…

12.10 software-recommendation screenshot ocr

asked Apr 11 '13 at 22:11

Erling

457
2
6
9

36

votes

7 answers

How to turn a pdf into a text searchable pdf?

I have a number of scanned documents in pdf and I want to be able to search them. How can I do that? Essentially I have to OCR the pdf and then blend the extracted text back into a new pdf. I have unsuccessfully tried a number of different solutions…

software-recommendation pdf ocr

asked May 29 '14 at 09:37

don.joey

28,402
17
82
104

33

votes

9 answers

Adding OCR info to a PDF

I have a good quality scan of a document; the scan is in pdf format. How can I add OCR information to the pdf, so that it becomes searchable? By searchable I mean that when viewing the pdf with evince, CTRL-F actually allows me to…

pdf scanning ocr

asked Jun 07 '12 at 08:56

fdierre

1,003
5
14
24

15

votes

5 answers

How do I edit text in a scanned .jpeg?

I need to upload a scanned image as a PDF document. After scanning the document, I have a .jpeg with small text that I want to edit before converting to PDF for the upload. I have never done this before so I'm really stuck. How can I do this?

software-recommendation pdf scanning ocr

asked Dec 05 '12 at 20:36

Mysterio

11,838
28
85
119

6

votes

1 answer

How do I produce a multi-page sandwich pdf with hocr2pdf?

I used tesseract to produce the special html to use with hocr2pdf starting from a multi-page tif. I tried using hocr2pdf to produce a "sandwich pdf" (image + hidden text layer). Hocr2pdf produces a one page pdf with all the pages superimposed. Is…

pdf ocr

asked Mar 22 '13 at 15:50

To Do

15,172
12
70
116

6

votes

2 answers

Document management for private users

I am searching for a document management system that supports: bulk scanning of documents; automatic OCR of scanned documents; data storage on my local HD / external server of my choice; automatic backups (not that important); proper full text search…

document-management ocr

asked Mar 19 '13 at 00:15

Alex

429
1
4
11

6

votes

1 answer

How can I specify the language to be used by Tesseract when using OCRFeeder?

I'm using the OCR utility OCRFeeder, which uses the tesseract engine. I have installed the several language packs needed for tesseract. How can I set the language such that tesseract will use the right language file for converting the…

ocr

asked Feb 10 '11 at 18:44

Bernard Decock

466
1
6
16

5

votes

2 answers

ABBYY FineReader-like application for Ubuntu 13.04

I have a lot of images, and I want to scan them and get the output as an MS Word file that can be edited later. On Windows, I have ABBYY FineReader, but I don't want to go back to Windows. Please tell me if there is any application…

files conversion ocr

asked May 19 '13 at 07:52

Faisal Aslam

101
3
14

4

votes

1 answer

How to create high fidelity PDFs with copyable text from scans?

Some companies provide software for Windows with their scanners* that can create PDFs from scanned pages which look exactly like the scanned material (as if it were just full-page images) but the text is recognized and copyable. How can I create…

pdf scanner text ocr

asked Sep 24 '17 at 11:16

Damn Terminal

2,636
7
27
36

4

votes

1 answer

ocrfeeder doesn't detect anything

When I try to detect text on my jpeg, it shows correctly all areas where it suspects text and images, but when I export it to ODT it only creates an ODT with empty text- and imageframes. Do I have to configure tesseract somehow? (I use Ubuntu 14.10…

ocr tesseract

asked Jul 03 '15 at 23:29

rubo77

31,573
49
159
281

4

votes

1 answer

How do I prevent hocr2pdf from using a large font from a tesseract-generated .hocr file?

Tesseract now creates an .hocr file rather than an .html file for ocr output, but this is not exactly what is at issue here. When hocr2pdf uses this output it uses a large text size with small bounding boxes since the upgrade. Most of the text…

ocr

asked Jul 02 '14 at 19:23

user299889

41
5

4

votes

0 answers

How to add OCRed text to original pdf in gscan2pdf?

I am new to gscan2pdf 0.9.31, and just used it to OCR a scanned pdf. After saving the pdf, the OCRed text is stored on the top left corner. However I wish each OCRed character to be added to exactly where it was OCRed from, to make the pdf file…

pdf ocr

asked May 10 '11 at 01:56

Tim

24,657
62
151
245
