I have a crawler that crawls several different URLs through proxies, but sometimes the website blocks the crawl and returns a temporary block message. If I relaunch the entire crawler, the same URL may succeed.
I would like to know whether it is possible to retry a start URL with a different proxy until the result is good, and only then move on to the next start URL. I think the site blocks the crawl because the proxy is blacklisted.
I am not sure that is clear, so here is an example.
Let's say I have 3 URLs to crawl. The first one is blocked on the first attempt. Before moving on to the second URL, I want to retry the first one with a new proxy.
In the end, I just want to be sure that all URLs have been crawled successfully.
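
To make the behaviour I am after more concrete, here is a rough sketch of the loop in plain Python, independent of any crawling framework. Everything named here is an assumption made for illustration: `crawl_all`, `is_blocked` and `fetch_via_proxy` are hypothetical helpers, the "temporarily blocked" marker stands in for the real block message, and `max_attempts` is an arbitrary cap so that a URL which is blocked on every proxy does not loop forever.

```python
import itertools
import urllib.request
from typing import Callable, Dict, List, Sequence, Tuple


def is_blocked(body: str) -> bool:
    # Hypothetical check: replace the marker with the site's actual block message.
    return "temporarily blocked" in body.lower()


def fetch_via_proxy(url: str, proxy: str) -> str:
    # One possible fetcher using only the standard library.
    # `proxy` is assumed to look like "http://host:port".
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    opener = urllib.request.build_opener(handler)
    with opener.open(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl_all(
    start_urls: Sequence[str],
    proxies: Sequence[str],
    fetch: Callable[[str, str], str] = fetch_via_proxy,
    max_attempts: int = 5,
) -> Tuple[Dict[str, str], List[str]]:
    """Crawl each start URL in order, switching proxy after every blocked attempt.

    Returns the successful pages keyed by URL, plus the list of URLs that
    were still blocked (or failing) after `max_attempts` tries.
    """
    if not proxies:
        raise ValueError("at least one proxy is required")

    proxy_cycle = itertools.cycle(proxies)  # rotate through proxies endlessly
    results: Dict[str, str] = {}
    failed: List[str] = []

    for url in start_urls:
        for _ in range(max_attempts):
            proxy = next(proxy_cycle)
            try:
                body = fetch(url, proxy)
            except OSError:
                # Network or proxy error: treat it like a block and try the next proxy.
                continue
            if not is_blocked(body):
                results[url] = body
                break  # this URL is done, move on to the next start URL
        else:
            # Every attempt for this URL was blocked or errored.
            failed.append(url)

    return results, failed
```

Calling `crawl_all(urls, proxies)` would only move on to the second URL once the first has either succeeded or used up its attempts, and an empty `failed` list at the end means every URL was crawled successfully. Passing `fetch` as a parameter is just a convenience so the retry logic can be tried with a fake fetcher before pointing it at the real site.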