GSoC Student Scraper

This project scrapes a GSoC project archive URL and converts the scraped data into a CSV file. It also filters the students of IIT Kanpur out of the scraped data.
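As a rough illustration of the scrape-to-CSV step, the sketch below writes a list of already-scraped records to CSV text using Python's standard csv module. The column names (student, organization, project), the helper records_to_csv, and the sample row are all hypothetical; the columns that scrape_data.py actually produces may differ.

```python
import csv
import io

# Hypothetical column names; the real script may use different ones.
FIELDS = ["student", "organization", "project"]

def records_to_csv(records):
    """Return CSV text: a header row followed by one row per record."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return buffer.getvalue()

# Made-up sample data, only to show the shape of the output.
sample = [
    {"student": "A. Student", "organization": "Example Org", "project": "Demo, with comma"},
]
csv_text = records_to_csv(sample)
```

Using csv.DictWriter rather than joining strings by hand means that fields containing commas or quotes are escaped correctly.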

General Instructions

The default webpage to be scraped is the GSoC '19 archive. If you wish to scrape another Google Summer of Code archive, enter the relevant URL when prompted.
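The prompt-with-default behaviour described above could look roughly like the following sketch. The DEFAULT_URL value and the choose_url helper are assumptions made for illustration; the URL and prompt wording actually used in scrape_data.py may differ.

```python
# Assumed default; the URL hard-coded in scrape_data.py may differ.
DEFAULT_URL = "https://summerofcode.withgoogle.com/archive/2019/projects/"

def choose_url(user_input, default=DEFAULT_URL):
    """Return the stripped user input, or the default when it is blank."""
    cleaned = user_input.strip()
    return cleaned if cleaned else default

# Interactive use would look like:
#   url = choose_url(input("Archive URL (press Enter for the default): "))
```

Keeping the decision in a small function, separate from the call to input(), makes the fallback easy to check without typing anything.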

The 'student.json' file is used as provided, without modification.
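The structure of 'student.json' is not documented here, so the following is only a sketch of how a JSON roster might be used to filter scraped records. It assumes, purely for illustration, that the file holds a list of objects with a "name" key and that scraped records carry a "student" field; load_names, filter_records, and the sample names are hypothetical.

```python
import json

def load_names(json_text):
    """Parse JSON text and return a set of normalized student names.

    Assumes a list of objects with a "name" key (an unverified guess).
    """
    entries = json.loads(json_text)
    return {entry["name"].strip().lower() for entry in entries}

def filter_records(records, names):
    """Keep only the records whose 'student' field matches a known name."""
    return [r for r in records if r["student"].strip().lower() in names]

# Made-up data standing in for student.json and the scraped archive.
roster = load_names('[{"name": "Asha Rao"}, {"name": "Vikram Singh"}]')
scraped = [{"student": "asha rao "}, {"student": "Someone Else"}]
matches = filter_records(scraped, roster)
```

Normalizing case and surrounding whitespace on both sides avoids missing a match because of trivial formatting differences between the two sources.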

Install packages

Use the package manager pip to install the requests and bs4 packages:

pip install requests
pip install bs4
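To confirm that the installation worked, a small check like the one below can be run. It is not part of the project; missing_packages is a hypothetical helper that relies only on the standard library's importlib.util.find_spec.

```python
import importlib.util

def missing_packages(names):
    """Return the names that cannot be found as importable modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# "requests" and "bs4" are the import names of the packages installed above.
required = ["requests", "bs4"]
missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```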

Usage

First, enter the 'scrape' directory:

cd scrape

Then run the following commands in order:

python scrape_data.py
python sanitize_and_combine_data.py

InvincibleKnight/GSoC_Scraper
