Skills

  • API
  • Artificial Intelligence
  • Automation Engineering
  • Chatbots
  • Data Extraction
  • Data Management
  • Financial Services
  • JavaScript
  • JSON
  • Management
  • Programming
  • Python
  • Software Development
  • SQL
  • Web Development

Services

  • Python, Web Scraping and Automation

    $20/hr, starting at $100, ongoing

    Dedicated Resource

    Hi, I am Mark, a web scraping expert with extensive experience in web scraping techniques and libraries such as Selenium, requests, Beautiful Soup and Scrapy. With me, you can get your data scraped and...

    API, Artificial Intelligence, Automation Engineering, Chatbots, Data Extraction

About

Python/Django/Flask, Web scraping and Browser Automation

Hi, I am Mark, a web scraping expert with extensive experience in web scraping techniques and libraries such as Selenium, requests, Beautiful Soup and Scrapy.

With me, you can get your data scraped and tasks automated within the first 24 hours of the contract, using my 5+ years of experience in Python Selenium Bot Development and Web Scraping.

Scraping Projects:
1️⃣ Login-protected sites, e.g. SEMrush
2️⃣ Puzzle-captcha-protected sites, e.g. Google, claimittexas
3️⃣ Real estate and e-commerce site data, e.g. Zillow, ambalaza
4️⃣ JavaScript / dynamic data scraping, e.g. Genius, Similarweb
5️⃣ Static data downloading and parsing. 100 million+ rows extracted since February 28, 2020.

📋 Python Tech Stack:
1. Python: Core Python, Django, Flask
2. BeautifulSoup: Requests, Cookies, Headers, Session
3. Selenium: webdriver, undetected webdriver, customized webdriver
4. SQL: PySql, MySQL, MongoDB, JSON, CSV, Excel
5. Pandas: DataFrame

Skills and Experiences:
✔ IP Rotation
✔ Concurrent Threads
✔ Bypassing Captcha and Cloudflare
✔ Fast and Efficient Processing for Bulk Data (Using Python Advanced Data Structures)
✔ HTML Parsing and DOM manipulation
✔ Data Consistency (Free from unwanted strings and AI fillers)
✔ Part-of-speech tagging on millions of tokens
✔ Grammatical error checking on 20 million lines of text
✔ Explicit Image tagging
✔ HTML to Image Conversion
✔ PDF Extraction

🗄️ Databases:
1. MongoDB
2. PostgreSQL
3. MySQL
4. MS SQL

Output Formats:
JSON, CSV, XLSX, TXT

Bots:
1. Reddit posts aggregator: 24/7 extraction and organization
2. YouTube: Automation and statistics collection
3. Browser Automation: Automated web form submission and actions

Cloud Infrastructure - Linux and Windows VPS:
1. Bind domain with machine
2. User permissions
3. Resource Management - Storage Volumes
4. Cron Jobs Setup
5. Database Backups and Restore

Java: Jsoup, Firefox & Chrome Driver

Communication lines are open 7 days a week via text, voice, or video meetup.
Mon - Thurs 7:00AM - 10:00PM EST
Fri - Sun 8:00AM - 11:00PM EST