Over the next week, @atlasharry and @PattonYin will prepare the crawl of sites for the countries above. We will then start the crawl around December 27, 2024, and finish it within a week or so.
Once testing is done (#9), we will create a release (#23) and start the crawl per this issue.
1. Countries
The countries to crawl are:
2. Sites
The top 525 sites for each country are listed in this repo.
In addition to each country's top 525 sites, we also crawl the United States top 525 list for each country as a general list.
This will lead to 19*525 = 9,975 crawled sites in total.
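As a quick sanity check on the total, here is a minimal sketch of the arithmetic. The count of 19 lists and the 525 sites per list are taken from the figures in this issue; the constant and function names are hypothetical and not part of the crawler.

```python
# Figures taken from this issue; names are illustrative only.
SITES_PER_LIST = 525  # top sites per list
NUM_LISTS = 19        # number of 525-site lists, per the issue's arithmetic


def total_crawled_sites(num_lists: int = NUM_LISTS,
                        sites_per_list: int = SITES_PER_LIST) -> int:
    """Return the total number of site visits across all lists."""
    return num_lists * sites_per_list


if __name__ == "__main__":
    print(f"{total_crawled_sites():,} sites in total")
```

If the number of lists or the list length changes, passing the new values to `total_crawled_sites` gives the updated total.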
3. Google Cloud VM
@atlasharry identified the Google Cloud VMs for each country:
@atlasharry will take the lead on the crawl and organize how others can help (however it makes sense).