WikiMapper

Parse Wikipedia XML data dumps to create an interactive graph to explore relationships between Wikipedia pages.

This project has been split into separate, smaller projects in the subfolders explorer/ and dbLoader/. The READMEs for these projects are linked below.

Overall Project Aims

  • Parse the XML file, extracting page names and links
  • Analyse performance using GProf
  • Parallelise the parser
  • Import data into Neo4j using Neo4j-Admin-Import
  • Set up a third-party visual Neo4j graph database explorer
  • Develop a custom graph storage library
  • Develop a custom viewer to visualise the data
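
As a rough illustration of the first aim (extracting page names and links), here is a minimal Python sketch. It is not the project's parser. It assumes a MediaWiki-style dump in which each `<page>` holds a `<title>` and a `<text>` element, and that links are written as `[[Target]]` or `[[Target|label]]`. The names `extract_pages`, `local_name`, `LINK_RE` and `SAMPLE` are hypothetical and exist only for this example.

```python
import io
import re
import xml.etree.ElementTree as ET

# Matches [[Target]] and [[Target|label]]; captures only the target
# and stops before any '|' label or '#section' suffix.
LINK_RE = re.compile(r"\[\[([^\]|#]+)")


def local_name(tag):
    """Strip an XML namespace prefix such as '{http://...}page'."""
    return tag.rsplit("}", 1)[-1]


def extract_pages(source):
    """Yield (title, [link targets]) for each <page> in the dump."""
    title, text = None, ""
    # iterparse streams the file, so the whole dump is never held in memory.
    for _, elem in ET.iterparse(source, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            links = [target.strip() for target in LINK_RE.findall(text)]
            yield title, links
            title, text = None, ""
            elem.clear()  # release the finished page's children


# Tiny hand-written sample standing in for a real dump (assumption).
SAMPLE = """<mediawiki>
  <page>
    <title>Graph database</title>
    <revision>
      <text>A [[database]] that uses [[Graph theory|graphs]]. See [[Neo4j#History]].</text>
    </revision>
  </page>
</mediawiki>"""

pages = list(extract_pages(io.BytesIO(SAMPLE.encode("utf-8"))))
for page_title, page_links in pages:
    print(page_title, "->", page_links)
```

Streaming with `iterparse` and clearing each finished page keeps memory use low, which matters because full Wikipedia dumps are far too large to load at once. A real parser would also need to handle redirects, namespaces such as `File:` or `Category:`, and links inside templates, none of which this sketch attempts.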

Progress Log

To keep track of what I am doing on this project, I have tried to keep a progress log. Some sections are missing or very sparse. See the Progress Log.