I'm a full-time Python developer.
My key skills are:
Web Scraping, Web Crawling
Data Analysis, Data Extraction (from web pages, PDFs, and images)
Image Processing, Image OCR
I can extract data from a variety of sources, such as:
Websites
PDFs
Images, etc.
Depending on your requirements, the output can be delivered as:
JSON
CSV, Spreadsheet, Excel sheet
pandas DataFrame, or any other format you prefer.
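
As a rough illustration of the delivery formats above, here is a minimal sketch that writes the same scraped records as JSON and as CSV using only the Python standard library. The records, field names, and helper functions (`to_json`, `to_csv`) are hypothetical and exist only for this example; a real project would use whatever fields you need.

```python
import csv
import io
import json

# Hypothetical scraped records; the field names are illustrative only.
records = [
    {"title": "Widget A", "price": 9.99},
    {"title": "Widget B", "price": 14.50},
]


def to_json(rows):
    """Serialize a list of dicts as a pretty-printed JSON string."""
    return json.dumps(rows, indent=2)


def to_csv(rows):
    """Serialize a list of dicts as CSV text with a header row."""
    buffer = io.StringIO()
    # Column order follows the keys of the first record.
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


json_text = to_json(records)
csv_text = to_csv(records)
```

If a pandas DataFrame is preferred, the same list of dicts can be passed to `pandas.DataFrame(records)`, and the resulting frame can be exported to Excel or other spreadsheet formats from there.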
I will use:
lxml, requests, urllib, Scrapy, Beautiful Soup, Selenium, and similar libraries for web scraping, chosen according to the requirements and nature of the website
pdftotext for PDF data extraction, together with Tesseract when the PDF requires OCR
OpenCV, Tesseract, and similar tools for image processing and optical character recognition (OCR): cleaning and splitting images, then extracting data from them
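
To give a feel for the web scraping step, here is a minimal sketch that pulls link URLs and link text out of an HTML page. It uses the standard library's `html.parser` as a stand-in for lxml or Beautiful Soup so that it runs without any installation. The sample HTML, the `LinkExtractor` class, and the `extract_links` helper are assumptions made for this example; a real job would first fetch the page (for instance with requests) and would target whatever elements you need.

```python
from html.parser import HTMLParser

# Inline sample page so the sketch runs offline; a real job would fetch HTML first.
SAMPLE_HTML = """
<html><body>
  <a href="/item/1">First item</a>
  <a href="/item/2">Second item</a>
  <p>No link here</p>
</body></html>
"""


class LinkExtractor(HTMLParser):
    """Collect (href, text) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open <a href=...> tag.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, text) tuples found in the given HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


links = extract_links(SAMPLE_HTML)
```

The resulting list of tuples can then be converted to any of the output formats listed earlier.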