Here's some code that tries to solve a problem I've run into: converting tables in images into dataframes that we can work with programmatically.
Right now, I see two main solutions on the market. One is to use third-party APIs, such as Amazon's and Google's products in this area, which can get expensive at scale. The other is to use or build upon complex code that relies on image processing libraries like OpenCV to find gridlines, then uses those gridlines to determine the table's rows and columns.
My hypothesis is that we can read tables using only Pytesseract, since it provides the coordinates of each piece of text in an image, and tables follow a standard structure (rows and columns) that those coordinates should reveal. I've been working on the code here accordingly.
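To make the idea concrete, here is a minimal sketch of how text coordinates alone might be turned into rows and columns. It is not the code in this repo: the function name `words_to_table`, the `min_col_gap` parameter, and the `sample` data are all made up for illustration. The input is assumed to be a dict shaped like the output of `pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)`, and the sketch assumes an unrotated table whose columns are separated by clear horizontal gaps.

```python
from statistics import median


def words_to_table(data, min_col_gap=20):
    """Group OCR word boxes into a list of rows (hypothetical helper).

    `data` is assumed to be a dict with parallel lists under the keys
    "text", "left", "top", "width" and "height".
    """
    # Keep only boxes that actually contain text.
    words = []
    for text, left, top, width, height in zip(
        data["text"], data["left"], data["top"], data["width"], data["height"]
    ):
        if text.strip():
            words.append({"text": text.strip(), "x0": left, "x1": left + width,
                          "yc": top + height / 2, "h": height})
    if not words:
        return []

    # Rows: sort by vertical centre and start a new row whenever the
    # centre jumps by more than half the median word height.
    row_tol = median(w["h"] for w in words) / 2
    words.sort(key=lambda w: w["yc"])
    rows = [[words[0]]]
    for w in words[1:]:
        if w["yc"] - rows[-1][-1]["yc"] > row_tol:
            rows.append([w])
        else:
            rows[-1].append(w)

    # Columns: merge the horizontal extents of all words; extents closer
    # together than min_col_gap pixels are treated as the same column.
    spans = []
    for w in sorted(words, key=lambda w: w["x0"]):
        if spans and w["x0"] - spans[-1][1] < min_col_gap:
            spans[-1][1] = max(spans[-1][1], w["x1"])
        else:
            spans.append([w["x0"], w["x1"]])

    def col_index(w):
        # Every word's left edge lies inside exactly one merged span.
        return next(i for i, (x0, x1) in enumerate(spans) if x0 <= w["x0"] <= x1)

    # Fill one cell per (row, column); words sharing a cell are joined.
    table = []
    for row in rows:
        cells = [[] for _ in spans]
        for w in sorted(row, key=lambda w: w["x0"]):
            cells[col_index(w)].append(w["text"])
        table.append([" ".join(c) for c in cells])
    return table


# Synthetic boxes standing in for real OCR output (assumed values).
sample = {
    "text":   ["", "Name", "Age", "Ada", "Lovelace", "36"],
    "left":   [0, 10, 200, 10, 45, 200],
    "top":    [0, 10, 11, 40, 41, 40],
    "width":  [300, 40, 30, 30, 60, 20],
    "height": [100, 12, 12, 12, 12, 12],
}
table = words_to_table(sample)
print(table)  # [['Name', 'Age'], ['Ada Lovelace', '36']]
```

The resulting list of rows can be handed to pandas, for example `pd.DataFrame(table[1:], columns=table[0])`. The gap threshold is the fragile part of this sketch: it would likely need tuning per image resolution and font size.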
Usage is very simple: