Implementations of the Hubs and Authorities algorithm (HITS, Hyperlink-Induced Topic Search) in Apache Spark and Apache Pig.
The data is the set of page-to-page links between Wikipedia pages. The source and a description are at the link below:
http://haselgrove.id.au/wikipedia.htm
The Hive directory contains code for reading the data into Hive tables and transforming those tables into an edge list.
The Pig_Implementation directory contains code implementing the algorithm in Apache Pig.
The Spark_Implementation directory contains code implementing the algorithm in Apache Spark.
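
For orientation, the iteration that HITS performs on an edge list can be sketched in plain Python. This is a minimal single-machine sketch, not the code found in the Spark_Implementation or Pig_Implementation directories. The function name `hits`, the `iterations` parameter, and the sample page IDs are assumptions made for illustration, and L2 normalization after each step is one common choice rather than the only one.

```python
from collections import defaultdict
from math import sqrt


def hits(edges, iterations=20):
    """Return (hub, authority) score dicts for a directed edge list.

    `edges` is a list of (source, target) pairs, one per page link.
    """
    nodes = {node for edge in edges for node in edge}
    hub = {node: 1.0 for node in nodes}
    auth = {node: 1.0 for node in nodes}

    for _ in range(iterations):
        # Authority update: sum the hub scores of the pages linking in.
        new_auth = defaultdict(float)
        for src, dst in edges:
            new_auth[dst] += hub[src]
        norm = sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {node: new_auth[node] / norm for node in nodes}

        # Hub update: sum the authority scores of the pages linked to.
        new_hub = defaultdict(float)
        for src, dst in edges:
            new_hub[src] += auth[dst]
        norm = sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {node: new_hub[node] / norm for node in nodes}

    return hub, auth


# Hypothetical page IDs: pages 1 and 2 link to page 3, which links to page 4.
EDGES = [(1, 3), (2, 3), (3, 4)]
hub_scores, auth_scores = hits(EDGES)
```

In a distributed setting, each of the two loops over `edges` would typically become a join between the edge list and the current scores followed by a group-and-sum on the page ID, which is why the data is first reduced to an edge list.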