This repository has been archived by the owner on Apr 7, 2022. It is now read-only.
Exploring GitHub Actions to scrape data from Argentine internet domain registrations.

lbellomo/scraper_dominios

scraper_dominios

Exploring GitHub Actions to scrape data from Argentine internet domain registrations.

The data is found here. Files are named in the form Year Month Day .csv, and the data is updated every night when there is new data in the official bulletin.
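As a rough illustration of the one-file-per-day layout, here is a minimal sketch in Python. The exact file name pattern is not spelled out above, so the `%Y%m%d` default, the helper names `csv_name_for` and `write_daily_csv`, and the `domain` column used in the usage note are all assumptions, not the repo's actual code.

```python
import csv
from datetime import date
from pathlib import Path


def csv_name_for(day: date, pattern: str = "%Y%m%d") -> str:
    """Build the file name for one day's data.

    Assumption: the README only says "Year Month Day .csv", so the
    zero-padded, separator-free pattern used here is a guess.
    """
    return day.strftime(pattern) + ".csv"


def write_daily_csv(rows, day: date, data_dir: Path) -> Path:
    """Write one day's rows (a non-empty list of dicts sharing the same keys)."""
    data_dir.mkdir(parents=True, exist_ok=True)
    target = data_dir / csv_name_for(day)
    with target.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return target
```

For example, `write_daily_csv([{"domain": "example.com.ar"}], date(2020, 1, 31), Path("data"))` would create `data/20200131.csv` under these assumptions.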

The fun part is that the data is updated via a GitHub Action (found here) that runs every night on a cron schedule and commits the new data to the repo if there is any.
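The "commit only if there is new data" step could look roughly like the sketch below. This is not the workflow from this repo (which is only linked, not shown); the function names and the commit message are hypothetical, and a real workflow would also need a `git push` afterwards.

```python
import subprocess


def has_changes(porcelain_output: str) -> bool:
    """True when `git status --porcelain` printed at least one entry."""
    return any(line.strip() for line in porcelain_output.splitlines())


def commit_if_changed(repo_dir: str, message: str = "Add new data") -> bool:
    """Stage everything and commit only when the working tree is dirty."""
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    if not has_changes(status.stdout):
        return False  # nothing new tonight, so no commit is made
    subprocess.run(["git", "add", "--all"], cwd=repo_dir, check=True)
    subprocess.run(["git", "commit", "-m", message], cwd=repo_dir, check=True)
    return True
```

Checking the status first avoids a failing `git commit` on nights when the official bulletin has nothing new.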

Limitations of using GitHub Actions as scrapers

The biggest limitation is a maximum of 2000 minutes of job run time per account per month. So Actions is not useful for scrapers that run for a long time or very often, but it is a good option for small scrapers.
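To get a feel for the budget, here is a small back-of-the-envelope sketch. It assumes the 2000 minutes are counted per month, that each run is billed rounded up to a whole minute, and that a run takes 90 seconds; all three are assumptions for illustration, and `billed_minutes` is a hypothetical helper.

```python
import math

MONTHLY_QUOTA_MINUTES = 2000  # figure quoted in the text above


def billed_minutes(runs_per_day: int, seconds_per_run: float, days: int = 30) -> int:
    """Estimate the minutes used in a month of scheduled runs."""
    # Assumption: every run is rounded up to the next whole minute.
    per_run = math.ceil(seconds_per_run / 60)
    return runs_per_day * per_run * days


nightly = billed_minutes(runs_per_day=1, seconds_per_run=90)        # 1 * 2 * 30 = 60
every_5_min = billed_minutes(runs_per_day=288, seconds_per_run=90)  # 288 * 2 * 30 = 17280

print(nightly <= MONTHLY_QUOTA_MINUTES)      # True: a nightly scraper fits easily
print(every_5_min <= MONTHLY_QUOTA_MINUTES)  # False: running every 5 minutes does not
```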

The minimum cron interval is 5 minutes (you may be able to work around this by defining several cron schedules with different offsets).
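The "more crons" idea could be sketched as generating several cron expressions that each fire every 5 minutes but start at different minutes. This is untested against GitHub Actions: whether the combined schedules really run more often than every 5 minutes is an open question, and scheduled runs may be delayed. `staggered_schedules` is a hypothetical helper.

```python
def staggered_schedules(interval: int = 5, offsets=(0, 2)):
    """Build one cron expression per offset, each firing every `interval` minutes."""
    if interval < 5:
        raise ValueError("each schedule must respect the 5-minute minimum")
    schedules = []
    for offset in offsets:
        # Spell out the minutes explicitly so the expression is plain cron syntax.
        minutes = range(offset % interval, 60, interval)
        schedules.append(",".join(str(m) for m in minutes) + " * * * *")
    return schedules


for expression in staggered_schedules():
    print(expression)
# 0,5,10,15,20,25,30,35,40,45,50,55 * * * *
# 2,7,12,17,22,27,32,37,42,47,52,57 * * * *
```

Each printed line would go into its own `cron` entry of the workflow's schedule. Keep in mind that running more often also uses up the minutes quota faster.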

You have to install the dependencies every time the action runs (which can be time-consuming for large projects), but you can cache the dependencies to save time.
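In a workflow, caching is normally set up in the workflow file itself, not in Python. The sketch below only illustrates the underlying idea: the cache key is derived from the contents of the dependency file, so the cache is reused until the dependencies change. The helper name and the `pip` prefix are assumptions.

```python
import hashlib
from pathlib import Path


def dependency_cache_key(lock_file: Path, prefix: str = "pip") -> str:
    """Derive a cache key that changes only when the dependency file changes."""
    digest = hashlib.sha256(lock_file.read_bytes()).hexdigest()
    # A shortened digest keeps the key readable while still tracking content changes.
    return f"{prefix}-{digest[:16]}"
```

For example, `dependency_cache_key(Path("requirements.txt"))` returns the same key on every run until `requirements.txt` is edited, at which point the dependencies are installed again and cached under the new key.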

The maximum size of the repo (including its whole history) is 100 GB, and the maximum size of a single file is 100 MB.
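A scraper that commits its own output can guard against the per-file limit before committing. Here is a minimal sketch, assuming the 100 MB figure above and reading it as 100 * 1024 * 1024 bytes; `oversized_files` is a hypothetical helper.

```python
from pathlib import Path

# Per-file limit quoted above; interpreting "MB" as 1024 * 1024 bytes is an assumption.
MAX_FILE_BYTES = 100 * 1024 * 1024


def oversized_files(root: Path, limit: int = MAX_FILE_BYTES):
    """Return files under root (skipping .git) that are larger than limit."""
    found = []
    for path in root.rglob("*"):
        if ".git" in path.relative_to(root).parts:
            continue  # git's own objects are not files we are about to add
        if path.is_file() and path.stat().st_size > limit:
            found.append(path)
    return sorted(found)
```

If `oversized_files(Path("."))` returns anything, the scraper could split the CSV or skip the commit instead of failing on push.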

TODO

  • [ ]: Scrape old domains.
