
Different implementations and comparisons of HITS (Hubs and Authorities) Algorithm in Pig and Spark, using Hive


SuYoungHong/HITS_Algorithm


HITS_Algorithm

Implementations of the Hubs and Authorities Algorithm (HITS) in Apache Spark and Pig.

Data

The data consists of the page links between Wikipedia pages. The source and a description of the data are at the link below:

http://haselgrove.id.au/wikipedia.htm
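As a rough illustration of what turning link data into an edge list involves, here is a minimal Python sketch. It assumes each input line has the form `from: to1 to2 to3` with integer page ids; that layout is an assumption about the dataset, not something stated in this README. The function names `parse_adjacency_line` and `to_edge_list` are hypothetical, and this is not the Hive code from this repository.

```python
def parse_adjacency_line(line):
    """Split one 'from: to1 to2 ...' line into (source, target) pairs.

    The 'from: to1 to2 ...' layout is an assumed input format.
    """
    source, _, targets = line.partition(":")
    return [(int(source), int(target)) for target in targets.split()]


def to_edge_list(lines):
    """Flatten an iterable of adjacency lines into one list of edges."""
    edges = []
    for line in lines:
        if line.strip():  # skip blank lines
            edges.extend(parse_adjacency_line(line))
    return edges
```

A page with no outgoing links (a line such as `4:`) contributes no edges under this sketch.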

Overview

The Hive directory contains code for reading the data into Hive tables and transforming those tables into an edge list.

The Pig_Implementation directory contains code implementing the algorithm in Apache Pig.

The Spark_Implementation directory contains code implementing the algorithm in Apache Spark.
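For readers unfamiliar with HITS, the iteration that the Pig and Spark code distribute can be sketched on a single machine. In each round, a page's authority score becomes the sum of the hub scores of the pages linking to it, its hub score becomes the sum of the authority scores of the pages it links to, and both score vectors are normalized. The Python sketch below is illustrative only and is not the code in this repository; the function name `hits`, the fixed iteration count, and the choice of L2 normalization are assumptions.

```python
from collections import defaultdict
from math import sqrt


def hits(edges, iterations=20):
    """Run a fixed number of HITS rounds over (source, target) edges.

    Returns two dicts: hub scores and authority scores, each L2-normalized.
    """
    nodes = {node for edge in edges for node in edge}
    hub = {node: 1.0 for node in nodes}
    auth = {node: 1.0 for node in nodes}

    for _ in range(iterations):
        # Authority update: sum hub scores over incoming links.
        new_auth = defaultdict(float)
        for src, dst in edges:
            new_auth[dst] += hub[src]
        norm = sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {node: new_auth[node] / norm for node in nodes}

        # Hub update: sum the fresh authority scores over outgoing links.
        new_hub = defaultdict(float)
        for src, dst in edges:
            new_hub[src] += auth[dst]
        norm = sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {node: new_hub[node] / norm for node in nodes}

    return hub, auth
```

On a tiny graph such as `[(1, 3), (2, 3), (1, 4)]`, page 3 ends up with the highest authority score (two pages link to it) and page 1 with the highest hub score (it links to both authorities).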
