Web Data Analysis

$5/hr Starting at $25

I analyze the Common Crawl dataset, an archive of pages crawled from the World Wide Web, using the November-December 2023 crawl. I have downloaded and processed 9 terabytes of compressed data, and I now have a database of more than 3 billion records, each holding a URL, the page title, and the number of characters on the page.

Example record (URL, number of characters, title):

https://guru.com 751 Dashboard - Freelancers - Guru. 
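
To make the record layout concrete, here is a minimal Python sketch of how one such line could be split into its three fields. It assumes the fields are tab-separated in the order shown in the example; the names `PageRecord` and `parse_record` are hypothetical and only illustrate the idea.

```python
from typing import NamedTuple


class PageRecord(NamedTuple):
    """One row of the database (hypothetical name, for illustration only)."""
    url: str
    char_count: int
    title: str


def parse_record(line: str) -> PageRecord:
    # Split on the first two tabs only, so a title that itself
    # contains a tab character is kept in one piece.
    url, char_count, title = line.rstrip("\n").split("\t", 2)
    return PageRecord(url, int(char_count), title)


record = parse_record("https://guru.com\t751\tDashboard - Freelancers - Guru.\n")
print(record.url, record.char_count, record.title)
```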

Fields are usually separated by tabs. I scan the entire database for the keywords you need: for example, pages with a specific string in the URL, or pages on a specific topic. Further analysis of the matching pages is also possible. For example, I can retrieve the HTML code of each page found, but for a large number of pages (100,000 to 300,000 or more) this may take several days.
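
A keyword scan over tab-delimited records could look roughly like the Python sketch below. It is an illustration under assumptions, not the actual pipeline: the function name `find_matches`, the `in_url_only` switch, and case-insensitive substring matching are all hypothetical choices, and the sample rows are made up apart from the example record.

```python
from typing import Iterable, Iterator


def find_matches(
    lines: Iterable[str],
    keywords: list[str],
    in_url_only: bool = False,
) -> Iterator[str]:
    """Yield rows whose URL (or URL and title) contains any keyword."""
    lowered = [k.lower() for k in keywords]
    for line in lines:
        row = line.rstrip("\n")
        parts = row.split("\t", 2)
        if len(parts) != 3:
            continue  # skip malformed rows instead of failing the whole scan
        url, _char_count, title = parts
        # Assumption: matching is a case-insensitive substring test.
        haystack = url.lower() if in_url_only else f"{url}\t{title}".lower()
        if any(k in haystack for k in lowered):
            yield row


sample_rows = [
    "https://guru.com\t751\tDashboard - Freelancers - Guru.\n",
    "https://example.com/blog\t1200\tExample Blog\n",
]
for match in find_matches(sample_rows, ["freelancers"]):
    print(match)
```

Because the function takes any iterable of lines, the same loop can be fed an open file object, so a large file is streamed row by row rather than loaded into memory.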

The cost of my services: 

Fewer than 100,000 pages found: 12 USD

More than 100,000 pages found: 12 USD + 0.2 USD for every 10,000 pages

Parsing the HTML code of each page: 4 USD for every 10,000 pages

Analysis of the HTML page code: 3 USD for every 10,000 pages
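
For a rough idea of what an order might cost, the price list above can be turned into a small Python estimate. This is only a sketch under assumptions that the listing does not spell out: the per-10,000 rates are pro-rated rather than rounded to whole blocks, the 0.2 USD surcharge is applied to the total number of pages found, and exactly 100,000 pages is billed at the base price. The function and parameter names are hypothetical.

```python
from decimal import Decimal

BASE_USD = Decimal("12")
SURCHARGE_PER_BLOCK_USD = Decimal("0.2")   # search surcharge above the threshold
PARSE_PER_BLOCK_USD = Decimal("4")         # parsing the HTML code of each page
ANALYSIS_PER_BLOCK_USD = Decimal("3")      # analysis of the HTML page code
BLOCK = 10_000
THRESHOLD = 100_000


def estimate_cost(pages_found: int, parse_html: bool = False,
                  analyze_html: bool = False) -> Decimal:
    # Assumption: rates are pro-rated per page, not rounded to whole blocks.
    blocks = Decimal(pages_found) / BLOCK
    total = BASE_USD
    # Assumption: the surcharge covers all pages found, and only applies
    # when the count is strictly above the threshold.
    if pages_found > THRESHOLD:
        total += SURCHARGE_PER_BLOCK_USD * blocks
    if parse_html:
        total += PARSE_PER_BLOCK_USD * blocks
    if analyze_html:
        total += ANALYSIS_PER_BLOCK_USD * blocks
    return total


print(estimate_cost(250_000))
print(estimate_cost(250_000, parse_html=True, analyze_html=True))
```

The actual price for a given order may differ if the rates are applied differently from these assumptions.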


About

$5/hr Ongoing


Skills & Expertise

Analytics, Data Analysis, HTML, Information Technology, Scanning

0 Reviews

This Freelancer has not received any feedback.