HITS_Algorithm

Implementations of the Hubs and Authorities algorithm (HITS, Hyperlink-Induced Topic Search) in Apache Spark and Apache Pig.
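
For orientation, the iteration that HITS performs can be sketched in a few lines of plain Python. This is a minimal single-machine illustration, not the Spark or Pig code in this repository; the function name `hits`, the `iterations` parameter, and the choice of L2 normalization are assumptions made for the example.

```python
import math


def hits(edges, iterations=20):
    """Run HITS on (source, target) edges; return (hubs, authorities) dicts."""
    nodes = {node for edge in edges for node in edge}
    hubs = {node: 1.0 for node in nodes}
    auths = {node: 1.0 for node in nodes}

    def normalize(scores):
        # Scale scores to unit L2 norm; leave an all-zero vector untouched.
        norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
        return {node: scores.get(node, 0.0) / norm for node in nodes}

    for _ in range(iterations):
        # Authority update: sum the hub scores of pages that link to the page.
        new_auths = {}
        for src, dst in edges:
            new_auths[dst] = new_auths.get(dst, 0.0) + hubs[src]
        auths = normalize(new_auths)

        # Hub update: sum the authority scores of pages the page links to.
        new_hubs = {}
        for src, dst in edges:
            new_hubs[src] = new_hubs.get(src, 0.0) + auths[dst]
        hubs = normalize(new_hubs)

    return hubs, auths


if __name__ == "__main__":
    hub_scores, auth_scores = hits([("a", "c"), ("b", "c")])
    print(hub_scores, auth_scores)
```

In the distributed versions, each of the two updates corresponds to a join of the edge list with the current scores followed by a group-and-sum.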

Data

The data is the set of page-to-page links between Wikipedia pages. The source and a description are at the link below:

http://haselgrove.id.au/wikipedia.htm

Overview

The Hive directory contains code for reading the data into Hive tables and transforming those tables into an edge list.

The Pig_Implementation directory contains code implementing the algorithm in Apache Pig.

The Spark_Implementation directory contains code implementing the algorithm in Apache Spark.
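
The edge-list transformation done in the Hive directory can be illustrated in plain Python. This sketch assumes the input is an adjacency list with one line per page of the form `source: target1 target2 ...`; that format, and the helper name `to_edge_list`, are assumptions for illustration, and the Hive scripts themselves may differ.

```python
def to_edge_list(lines):
    """Yield (source, target) integer pairs from adjacency-list lines.

    Assumed line format (hypothetical): "source: target1 target2 ...".
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        source, _, targets = line.partition(":")
        # A page with no outgoing links produces no edges.
        for target in targets.split():
            yield int(source), int(target)


if __name__ == "__main__":
    sample = ["1: 2 3", "4: 1"]
    for edge in to_edge_list(sample):
        print(edge)
```

One row per link is the shape that a join-based hub and authority update operates on.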