InvincibleKnight/GSoC_Scraper

GSoC Student Scraper

This project scrapes a GSoC project archive URL and converts the scraped data into a CSV file. It also filters the students of IIT Kanpur out of the scraped data.

General Instructions

The default webpage to be scraped is the GSoC '19 archive. If you wish to scrape another Google Summer of Code archive, enter the relevant URL when prompted.
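The prompt-with-default behaviour described above can be sketched roughly as follows. This is only an illustration, not the code in scrape_data.py: the function name `resolve_archive_url` and the constant `DEFAULT_ARCHIVE_URL` are hypothetical, and the default URL shown is an assumption about where the GSoC '19 archive lives.

```python
# Assumed location of the GSoC '19 archive; the real default is whatever
# scrape_data.py defines.
DEFAULT_ARCHIVE_URL = "https://summerofcode.withgoogle.com/archive/2019/projects/"


def resolve_archive_url(user_input, default=DEFAULT_ARCHIVE_URL):
    """Return the URL the user typed, or the default when the input is blank."""
    url = user_input.strip()
    return url if url else default
```

A script could call it as `resolve_archive_url(input("Archive URL (blank for default): "))`, so pressing Enter keeps the default archive.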

The 'student.json' file is used as given.
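As a rough illustration of the filtering and CSV steps, the sketch below matches scraped rows against a list of known student names and writes the matches as CSV. Everything here is an assumption: the layout of 'student.json' (taken to be a JSON array of objects with a "name" key), the column names, and the helper names are hypothetical, and the actual scripts may work differently.

```python
import csv
import io
import json


def normalize(name):
    """Lower-case a name and collapse repeated whitespace for comparison."""
    return " ".join(name.lower().split())


def load_known_names(json_text):
    """Build a set of normalized names from JSON text (assumed layout)."""
    return {normalize(entry["name"]) for entry in json.loads(json_text)}


def filter_students(rows, known_names):
    """Keep only the scraped rows whose student name is in known_names."""
    return [row for row in rows if normalize(row["student"]) in known_names]


def to_csv(rows, fieldnames=("student", "organization", "project")):
    """Serialize rows (dicts keyed by the assumed column names) to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Normalizing names before comparing them is a design choice made for this sketch, so that differences in case or spacing between the archive page and the JSON file do not cause missed matches.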

Install packages

Use the pip package manager to install the requests and bs4 packages:

pip install requests
pip install bs4

Usage

First enter the 'scrape' directory.

cd scrape

Then run the following commands:

python scrape_data.py
python sanitize_and_combine_data.py
