# scraper_dominios

Exploring GitHub Actions to scrape data from Argentine internet domain registrations.

The data is found here; the files are named in the form Year Month Day .csv and are updated every night when there is new data in the official bulletin.

The fun part is that the data is updated via a GitHub Action (found here) that runs every night on a cron schedule and commits the new data to the repo if there is any.
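
A minimal sketch of what such a workflow could look like. The schedule, the `scraper.py` entry point, the `requirements.txt` file, and the `data/` directory are illustrative assumptions, not this repo's actual layout:

```yaml
name: scrape-dominios
on:
  schedule:
    - cron: "0 3 * * *"         # every night at 03:00 UTC (assumed time)
  workflow_dispatch:            # also allow manual runs

permissions:
  contents: write               # so the job can push the new data

jobs:
  scrape:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: "3.10"
      - run: pip install -r requirements.txt
      - run: python scraper.py  # hypothetical scraper entry point
      - name: Commit new data, if any
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add data/
          git diff --cached --quiet || (git commit -m "Add scraped data $(date -u +%F)" && git push)
```

The last step only commits and pushes when `git diff --cached --quiet` reports staged changes, which is what lets the scheduled run be a no-op on nights with no new bulletin data.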

## Limitations of using GitHub Actions as scrapers

The biggest limitation is the cap of 2,000 Actions minutes per account per month on the free plan. So it is not useful for scrapers that run for a long time or very often, but it is a good option for small scrapers (a nightly job that takes 5 minutes uses roughly 150 minutes a month).

The minimum cron interval is 5 minutes (you might be able to work around this by defining multiple cron schedules).
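
If that workaround is worth testing, `on.schedule` does accept a list of cron expressions, so offset entries could in principle trigger the workflow more often than any single one does. This is an unverified workaround, shown only as a sketch:

```yaml
on:
  schedule:
    - cron: "0-59/5 * * * *"    # fires at :00, :05, :10, ...
    - cron: "2-59/5 * * * *"    # fires at :02, :07, :12, ...
```

Each expression on its own respects the 5-minute minimum, but together they would run the workflow every 2 to 3 minutes.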

You have to install the dependencies on every run of the action (which can be time-consuming for large projects), but you can cache the dependencies to save time.
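
For a Python scraper with a `requirements.txt` (again an assumption about this repo's setup), `actions/setup-python` can handle the pip cache by itself; this fragment replaces the setup step in the job's `steps` list:

```yaml
      - uses: actions/setup-python@v4
        with:
          python-version: "3.10"
          cache: "pip"          # restores/saves the pip cache, keyed on requirements.txt
      - run: pip install -r requirements.txt
```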

The maximum repository size (including the full history) is 100 GB, and the maximum file size is 100 MB.

## TODO

- [ ] Scrape old domains.