Created
October 2, 2023 04:37
Hello Alexgarciaarb,
You're seeing this error because Tesseract OCR is not installed or is not on your system's PATH. pytesseract is only a Python wrapper, so pip install pytesseract does not install the Tesseract program itself. To resolve this, follow these steps to install Tesseract OCR on Google Colab:
- Install Tesseract OCR in Colab:
!apt install tesseract-ocr
!apt install libtesseract-dev
!pip install pytesseract
- Import the pytesseract module in your Colab notebook:
from pytesseract import image_to_string
- Verify Tesseract installation:
After running the above commands, you can check if Tesseract is correctly installed and accessible by running:
import pytesseract
print(pytesseract.get_tesseract_version())
Make sure there are no errors, and the Tesseract version is displayed.
- Update the Tesseract path:
If the problem persists, you might need to explicitly specify the Tesseract executable path in your Colab notebook. You can do this by adding the following line:
pytesseract.pytesseract.tesseract_cmd = '/usr/bin/tesseract'
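The verification and path steps above can be combined into one small helper. This is only a minimal sketch, not part of the original gist: the function names locate_tesseract and configure_pytesseract are hypothetical, and /usr/bin/tesseract is assumed as the fallback because that is where the apt package places the binary on Colab.

```python
import os
import shutil


def locate_tesseract(fallback="/usr/bin/tesseract"):
    """Return the path of the tesseract executable, or None if it cannot be found.

    `fallback` is an assumed install location, checked only when the
    executable is not found on PATH.
    """
    found = shutil.which("tesseract")  # search every directory listed in PATH
    if found:
        return found
    if os.path.isfile(fallback) and os.access(fallback, os.X_OK):
        return fallback
    return None


def configure_pytesseract(fallback="/usr/bin/tesseract"):
    """Point pytesseract at the located executable (hypothetical helper)."""
    path = locate_tesseract(fallback)
    if path is None:
        raise RuntimeError(
            "Tesseract was not found; install it first "
            "(on Colab: !apt install tesseract-ocr)."
        )
    import pytesseract  # imported here so locate_tesseract works without it

    pytesseract.pytesseract.tesseract_cmd = path
    return path
```

Calling configure_pytesseract() once, before any call to image_to_string, should be enough. If it raises RuntimeError, the install commands from the first step have not taken effect in the current runtime.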
Now, try running your code again after these steps. It should resolve the issue, and you should be able to use Tesseract OCR in your Colab environment.
Let me know if you encounter any further issues or if you need additional assistance!
Hi Jagrat, I have been trying to run your code, but without success. I'll share the error I got. I am using Colab. I installed pytesseract with !pip install pytesseract and imported it with from pytesseract import image_to_string.
The error comes after executing these lines of code:
text_with_pytesseract = extract_text_with_pytesseract(convert_pdf_to_images)
print(text_with_pytesseract)
"tesseract is not installed or it's not in your PATH. See README file for more information".
Thank you for your help.