We usually use Scrapy to harvest data from websites. In more complicated cases we apply various tricks and techniques, in particular to avoid bans: rotating user agents, rotating proxies, CAPTCHA solving, and so on. We build only fast, concurrent crawling solutions, and we can post-process the harvested data if necessary.
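As a minimal sketch of one of these techniques, user-agent rotation in Scrapy is typically done with a small downloader middleware that stamps a random `User-Agent` header on each outgoing request. The `USER_AGENTS` pool and the middleware name below are illustrative assumptions, not part of Scrapy itself:

```python
import random

# Illustrative pool of user-agent strings (assumption: a real project
# would load a larger, regularly updated list from a file or service).
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36",
    "Mozilla/5.0 (X11; Linux x86_64; rv:124.0) Gecko/20100101 Firefox/124.0",
]

class RotateUserAgentMiddleware:
    """Scrapy downloader middleware that assigns a random
    User-Agent header to every outgoing request."""

    def process_request(self, request, spider):
        # Scrapy calls this hook for each request before downloading;
        # returning None lets the request continue through the chain.
        request.headers["User-Agent"] = random.choice(USER_AGENTS)
        return None
```

To activate it, the class would be registered under `DOWNLOADER_MIDDLEWARES` in the project's `settings.py` (the module path and priority number depend on your project layout).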