Wednesday, January 23, 2013

INSTALL OCR TESSERACT


Download the Tesseract-ocr package at
http://tesseract-ocr.googlecode.com/files/tesseract-2.00.tar.gz

Download the Tesseract-ocr language data package you want here:
http://code.google.com/p/tesseract-ocr/downloads/list

Choose the packages you need. They are named "tesseract-2.00.XX.tar.gz", where "XX"
is the code of the language you want.
Now open a terminal and go to the folder containing your latest downloads with the command "cd".
Run the following commands to install OCR Tesseract ($> represents the prompt of your terminal):
$>tar -xvzf tesseract-2.00.tar.gz
$>cd tesseract-2.00
$>./configure
$>make
$>sudo make install
$>tar -xvzf tesseract-2.00.XX.tar.gz
$>sudo cp ./tessdata/* /usr/local/share/tessdata/
$>rm -rf tessdata
Repeat these three commands for each language archive ("XX" represents the language code).
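
If you downloaded several language archives, the per-language steps above can be wrapped in a small loop. This is only a minimal sketch: it assumes every archive is named "tesseract-2.00.XX.tar.gz" and unpacks into a "tessdata" folder, as described above. The function name install_lang_data and its two arguments are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of Tesseract): unpack every language
# archive found in ARCHIVE_DIR and copy its data files into DEST_DIR.
# Usage: install_lang_data ARCHIVE_DIR DEST_DIR
install_lang_data() {
    archive_dir=$1
    dest_dir=$2
    for archive in "$archive_dir"/tesseract-2.00.*.tar.gz; do
        # When nothing matches, the pattern stays literal: skip it.
        [ -e "$archive" ] || continue
        # Unpack into a scratch folder so leftovers never pile up.
        workdir=$(mktemp -d) || return 1
        tar -xzf "$archive" -C "$workdir" || return 1
        cp "$workdir"/tessdata/* "$dest_dir"/ || return 1
        rm -rf "$workdir"
    done
}
```

For example, "install_lang_data . /usr/local/share/tessdata" would process every language archive in the current folder. Writing to /usr/local/share/tessdata normally needs root rights, so a script containing this function would have to be run with sudo.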

Now that you have OCR tesseract on your computer, you can try it with these commands:
$>tesseract phototest.tif test
$>cat test.txt
"test.txt" should contain the same text as the picture "phototest.tif".
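
The two test commands above can also be combined into one step that prints the recognized text directly. The following is a minimal sketch, not an official tool: the function name ocr_image is hypothetical, and it assumes the "-l" option of the tesseract command selects one of the language packages installed earlier (with "eng" used when no language is given).

```shell
#!/bin/sh
# Hypothetical helper: run OCR on one image and print the recognized text.
# Usage: ocr_image IMAGE [LANG]
ocr_image() {
    image=$1
    lang=${2:-eng}   # assumed default language code
    # tesseract adds ".txt" to the output base name by itself.
    outbase=$(mktemp -d)/result || return 1
    tesseract "$image" "$outbase" -l "$lang" || return 1
    cat "$outbase.txt"
    # Remove the scratch folder that held the result file.
    rm -rf "$(dirname "$outbase")"
}
```

For example, "ocr_image phototest.tif" would print the text that the commands above write to "test.txt", without leaving a file behind.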
