- /data contains the original source data for the data mart.
  - Metadata about external data sources can be found in the /data/sources.csv file.
- /data-mart contains the scripts used to initialize the data mart, as well as the OLAP queries written to execute on the data mart.
- /data-mining contains the Jupyter notebooks used to explore the data and perform data mining operations.
- /deprecated contains files that were used at some point in the project, but are not currently in use.
- /seeds contains the seed files for the data mart that are generated using the seed_generate.ipynb notebook.
- Open the seed_generate.ipynb Jupyter notebook and run all the cells (this should take around 1 minute). This notebook is responsible for all data staging, and is separated into different code blocks for each step of the staging process.
- Use Docker to create and run a PostgreSQL database server.
- Initializing the data mart:
  - For UNIX systems:
    - Assign the name of your Docker server to the db variable at the top of the /data-mart/init.sh file.
    - Execute the initialization script by running bash data-mart/init.sh. This script copies all the data from the generated seed files to the PostgreSQL database.
    - In the event you wish to wipe the contents of the database, simply run bash data-mart/
  - For Windows systems:
    - Assign the name of your Docker server to the db variable at the top of the /data-mart/init.bat and data-mart/refresh.bat files.
    - Navigate to the data-mart directory by executing cd data-mart
    - Execute the initialization script by running init.bat. This script copies all the data from the generated seed files to the PostgreSQL database.
    - In the event you wish to wipe the contents of the database, simply run refresh.bat
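
The Docker step above does not spell out a command, so the following is only a minimal sketch of one way to start a PostgreSQL server with the official postgres image. The container name data-mart-db, the password change-me, and the published port 5432 are all assumptions for illustration and do not come from this project. The script prints the command by default so it can be reviewed first, and only runs it when RUN=1 is set.

```shell
#!/bin/sh
# Hypothetical values: choose your own container name and password.
CONTAINER_NAME="data-mart-db"
POSTGRES_PASSWORD="change-me"

# Assemble the docker command as a string so it can be inspected before use.
# -e POSTGRES_PASSWORD is required by the official postgres image;
# -p 5432:5432 publishes the default PostgreSQL port (an assumption here).
DOCKER_CMD="docker run --name $CONTAINER_NAME -e POSTGRES_PASSWORD=$POSTGRES_PASSWORD -p 5432:5432 -d postgres"

if [ "${RUN:-0}" = "1" ]; then
    # Actually start the container in the background.
    $DOCKER_CMD
else
    # Dry run: only show what would be executed.
    echo "$DOCKER_CMD"
fi
```

If you start the server this way, the value you chose for CONTAINER_NAME would presumably be the name to assign to the db variable in the initialization scripts.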
This data mart was created by Logan Rose, Lilian Ly, and Jonathan Brar for CSI4106 at the University of Ottawa in Winter 2022.