Xsane and Tesseract OCR

Asked by David Boyd

I have downloaded and installed Tesseract 2.04 OCR. It works well with gscan2pdf, but I can't get it to work with Xsane. What OCR command do I need to type into Xsane Setup? Do I need to download anything else to make it work? Please explain in simple terms how to get these two programmes to work together so I can scan with Xsane and then convert to text with Tesseract OCR. I'm using Ubuntu 9.04 Jaunty with an AMD 64 dual core processor.

Larry Jordan (larryjor) said :
#1

     Personally, I never had much trouble with gocr, though I don't use it much. It would be nice to hear back from you on how well (how much better?) Tesseract works once you have it running.
     I found a page on using it with xsane at http://ubuntuforums.org/showthread.php?p=4304463 that suggests you may need an additional program. The page links to setup instructions.
     With gocr, the xsane setup lines are:

OCR Command: gocr
Inputfile option: -i
Outputfile option: -o

     The gocr options state that -i (file) reads input from (file) and -o (file) sends output to (file) instead of stdout. As for the command line to use with Tesseract, I don't have it installed yet and don't see any documentation. I do see some references to a man page (try 'man tesseract' in a terminal and see if you have it).
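
     To make the three setup fields concrete, here is a minimal Python sketch of how a command in the "command -i in -o out" shape fits together. It assumes the OCR program is called with the input option followed by the image file, then the output option followed by the text file, as the gocr options above describe. The function name and the file names scan.pnm and scan.txt are made up for illustration.

```python
def build_ocr_command(ocr_command, input_option, output_option, image_path, text_path):
    """Return the argument list for an OCR call shaped like 'command -i in -o out'."""
    # Assumed order: command, input option, image file, output option, text file.
    return [ocr_command, input_option, image_path, output_option, text_path]


# Hypothetical file names; the three settings are the gocr values listed above.
print(" ".join(build_ocr_command("gocr", "-i", "-o", "scan.pnm", "scan.txt")))
# prints: gocr -i scan.pnm -o scan.txt
```

     Any OCR program that accepts its input and output files this way should fit the same three fields.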

      Again, please let us know how well it works out for you once you get it set up.

Larry Jordan (larryjor) said :
#3

     Sorry, the additional program you need is called xsane2tess. It is referenced in the forum post.

David Boyd (daboyd) said :
#4

Thanks Larry. I looked up the page http://doc.ubuntu-fr.org/xsane2tess, but as I can't read French, I don't understand how to get the scripts mentioned for xsane2tess. If anyone can translate, again in simple terms, please let me know.

Larry Jordan (larryjor) said :
#5

    Oh, sorry; I didn't realize the page at that location would be in French, and I thought you might do a web search for it. I searched for you and found a better location (in English); maybe this will solve it:

http://aur.archlinux.org/packages.php?ID=24702

    There is a link to it at the bottom of that page. It looks like a Bash script, which should make it fairly straightforward to correct (if necessary).

David Boyd (daboyd) said :
#6

OK Larry. I went to archlinux.org/packages, downloaded the xsane2tess tarball and unpacked it. Now what do I do?

Larry Jordan (larryjor) said :
#7

     It looks as though it is set up to work much the same as gocr. You can test it in a terminal with 'xsane2tess -i (input file in graphic format) -o (whatever file name you want for the text file)'. You should then be able to set it up in xsane with xsane2tess as the OCR Command, -i as the Inputfile option and -o as the Outputfile option.
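
     For anyone wondering why a wrapper is needed at all: Tesseract's own command line takes the image and an output base name as plain arguments (tesseract image outputbase) and adds .txt to the base name itself, so it does not fit the -i / -o fields directly. Below is a rough Python sketch of the general idea behind such a wrapper. It is not the xsane2tess script itself (that one is Bash and I have not reproduced it here); the function names and the temporary base name "ocr" are my own, and it assumes tesseract is on your PATH.

```python
import argparse
import os
import shutil
import subprocess
import tempfile


def parse_args(argv):
    """Accept the same -i / -o style that the xsane setup fields use."""
    parser = argparse.ArgumentParser(description="Sketch of a -i/-o front end for tesseract")
    parser.add_argument("-i", dest="infile", required=True, help="scanned image file")
    parser.add_argument("-o", dest="outfile", required=True, help="text file to write")
    return parser.parse_args(argv)


def tesseract_command(image, outbase):
    """tesseract takes the image, then an output base name (it appends .txt)."""
    return ["tesseract", image, outbase]


def run(argv):
    args = parse_args(argv)
    workdir = tempfile.mkdtemp()
    try:
        # "ocr" is an arbitrary base name; tesseract writes ocr.txt next to it.
        outbase = os.path.join(workdir, "ocr")
        subprocess.check_call(tesseract_command(args.infile, outbase))
        # Move the result to the file name that was given with -o.
        shutil.move(outbase + ".txt", args.outfile)
    finally:
        shutil.rmtree(workdir, ignore_errors=True)
```

     To try it as a command you would add a call to run(sys.argv[1:]) at the bottom (with import sys) and make the file executable. As far as I know Tesseract 2.x only reads TIFF images, so a real wrapper may also need to convert the scan to TIFF before calling tesseract; that step is left out of this sketch.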
