As a Data Engineer focused on web scraping, I will be responsible for extracting and ingesting data from websites using web crawling tools. I will own the creation of these tools, services, and workflows to improve crawl/scrape analysis, reporting, and data management. I will test the data and the scrapes to ensure accuracy and quality. I will also own the process of identifying and fixing broken scrapes, and of scaling scrapes as needed.
What do I have that fits your requirements?
* Experience running large-scale web scrapes
* Solid Python knowledge
* Familiarity with Linux/UNIX, HTTP, HTML, JavaScript, and networking
* Familiarity with techniques and tools for crawling, extracting, and processing data (e.g. Scrapy, pandas, MapReduce, SQL, Beautiful Soup)
* Experience with system monitoring/administration tools
* Experience with version control, open source practices, and code review
* Experience with applications designed to display archived web content
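To illustrate the extract-and-validate step mentioned above, here is a minimal sketch in Python. It is only an example under stated assumptions: the sample HTML, the `price` class name, and the helper names `extract_prices` and `validate_prices` are hypothetical, and it uses the standard library's `html.parser` rather than Scrapy or Beautiful Soup so that it runs without extra installs.

```python
from html.parser import HTMLParser


class PriceParser(HTMLParser):
    """Collects the text of every <span class="price"> element (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples
        if tag == "span" and ("class", "price") in attrs:
            self._in_price = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False

    def handle_data(self, data):
        if self._in_price and data.strip():
            self.prices.append(data.strip())


def extract_prices(html):
    """Return the raw price strings found in the page."""
    parser = PriceParser()
    parser.feed(html)
    return parser.prices


def validate_prices(raw_prices):
    """Split scraped strings into parsed floats and rejected values."""
    valid, rejected = [], []
    for raw in raw_prices:
        try:
            valid.append(float(raw.replace("$", "").replace(",", "")))
        except ValueError:
            # Keep the bad value so the broken scrape can be investigated
            rejected.append(raw)
    return valid, rejected


# Made-up sample page, standing in for a real response body
SAMPLE_HTML = """
<ul>
  <li><span class="price">$1,299.00</span></li>
  <li><span class="price">N/A</span></li>
  <li><span class="price">$45.50</span></li>
</ul>
"""

valid, rejected = validate_prices(extract_prices(SAMPLE_HTML))
```

Keeping the rejected values next to the valid ones is one way to spot a scrape that has started to break, for example after a site changes its layout.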
Technology
JavaScript, Python, Google Sheets, Excel, Scrapy, Selenium, UiPath, Beautiful Soup
Scraping technique
Information type
- Competitor research
- Contact information
- Content marketing
- Currency & stocks
- Listings
- News & events
- Price comparison
- Products & reviews
- Social media