Task 1: Set Up Your Local Development Environment
- Clone the quantium-starter-repo repository using this command:
git clone https://github.com/vagabond-systems/quantium-starter-repo.git
- Download and install PyCharm Community Edition using this link: https://www.jetbrains.com/pycharm/download/#section=windows
- Create a new virtual environment
python -m venv quantium
- Activate your virtual environment
.\quantium\Scripts\activate # Windows
- Upgrade pip, install ipykernel, and register the virtual environment as a Jupyter kernel
python -m pip install --upgrade pip
pip install ipykernel
python -m ipykernel install --user --name=quantiumj
- Add the dash and pandas packages as dependencies to your virtual environment
pip install dash
pip install pandas
- Check your installed dependencies using this command:
pip list
- With your virtual environment active, install all the necessary dash testing dependencies
pip install dash[testing]
- Commit your changes and push them to GitHub. To add a large library in git, first download Git Large File Storage (LFS) using this link: https://git-lfs.github.com/
- Then compress the library folder into a zip file. Open the command prompt and run these commands (the "*.zip" pattern selects which files LFS tracks; replace filename.zip with the name of your archive):
git lfs install
git lfs track "*.zip"
git add .gitattributes
git add filename.zip
git commit -m "Add Zip File"
git push origin main
- Submit a link to your repo in the right module.
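As a complement to `pip list`, the environment can also be checked from Python itself. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of the starter repo.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the two packages this project depends on.
for name in ("dash", "pandas"):
    version = installed_version(name)
    print(name, version if version is not None else "NOT INSTALLED")
```

Run it with the virtual environment active; a "NOT INSTALLED" line suggests the package went into a different interpreter than the one currently in use.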
Task 2: Data Processing
- First, combine the multiple daily sales CSV files into a single CSV file. To do that, import the packages and set the working directory
import os
import glob
import pandas as pd
os.chdir(r"D:\quantium\quantium\data") # path of the data folder (the raw string keeps the backslashes literal)
- Use glob to match every file with the 'csv' extension
extension = 'csv'
all_filenames = glob.glob('*.{}'.format(extension))
- Combine all files in the list and export as CSV
# combine all files in the list
combined_csv = pd.concat([pd.read_csv(f) for f in all_filenames])
# export to csv
combined_csv.to_csv("combined_daily_sales_data.csv", index=False, encoding='utf-8-sig')
- Remove every row whose product column contains a product other than 'pink morsel'
df = pd.read_csv(r"D:\quantium\quantium\data\combined_daily_sales_data.csv") # file path
df.head()
# drop the rows that contain other types of product
df = df.drop(df.index[df['product'].isin(['chartreuse morsel', 'gold morsel', 'lapis morsel', 'magenta morsel', 'periwinkle morsel', 'vermilion morsel'])])
df.value_counts('product')
# save the modified csv file (index=False avoids writing an extra index column)
df.to_csv('combined_daily_sales_data.csv', index=False)
- Combine "quantity" and "price" into a single field, "sales", by multiplying them together
df.info()
# remove the $ symbol and convert the price to float (regex=False treats '$' as a literal character, not as the end-of-string anchor)
df['price'] = df.price.str.replace('$', '', regex=False).str.replace(',', '.', regex=False).astype(float)
# multiply the values and store the result in a single field, 'sales'
df['sales'] = df['price'] * df['quantity']
# save the modified csv file
df.to_csv('combined_daily_sales_data.csv', index=False)
- Commit and push your changes, then submit a link to your repo
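The data-processing steps of Task 2 can also be collected into one function. The following is a minimal sketch, not the reference solution: the function name `build_pink_morsel_sales` is hypothetical, and it assumes each daily file has `product`, `price` and `quantity` columns with prices written like `$3.00`. Unlike the step-by-step version, it skips the output file when globbing, so running it twice does not fold the previous result back into the data.

```python
from pathlib import Path

import pandas as pd


def build_pink_morsel_sales(data_dir, output_name="combined_daily_sales_data.csv"):
    """Combine the daily CSVs in data_dir, keep pink morsel rows, add a sales column."""
    data_dir = Path(data_dir)
    output_path = data_dir / output_name

    # Skip the output file so a re-run does not read its own earlier result.
    sources = sorted(p for p in data_dir.glob("*.csv") if p.name != output_name)
    combined = pd.concat([pd.read_csv(p) for p in sources], ignore_index=True)

    # Keep only the product of interest.
    pink = combined[combined["product"] == "pink morsel"].copy()

    # Strip the literal "$" (assumption: prices look like "$3.00").
    pink["price"] = pink["price"].str.replace("$", "", regex=False).astype(float)
    pink["sales"] = pink["price"] * pink["quantity"]

    pink.to_csv(output_path, index=False)
    return pink
```

On Windows, pass the folder as a raw string, for example `build_pink_morsel_sales(r"D:\quantium\quantium\data")`, so the backslashes are not read as escape sequences.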
Task 3: Create a Dash Application
- Import dependencies
from dash import Dash, dcc, html
# Using plotly.express
import plotly.express as px
import pandas as pd
- Read the CSV file data
df = pd.read_csv(r'D:\quantium\quantium\data\combined_data.csv')
- Time Series Plot with Daily Sales Data in Date Range
fig = px.line(df, x='date', y='sales', range_x=['2018-02-06','2022-02-14'], title='Dash app')
fig.show()
- Import the dependencies for the JupyterDash app
import plotly.express as px
from jupyter_dash import JupyterDash
from dash import dcc, html # replaces the deprecated dash_core_components and dash_html_components packages
from dash.dependencies import Input, Output
import pandas as pd
- Load the CSV file data
# the path to the formatted data file
DATA_PATH = r"D:\quantium\quantium\data\combined_data.csv"
# load in data
df = pd.read_csv(DATA_PATH)
df = df.sort_values(by="date")
- Making region-specific sales data for Pink Morsel
# Build App
app = JupyterDash(__name__)
colors = {
    'background': '#111111',
    'text': '#7FDBFF'
}
fig = px.bar(df, x="date", y="sales", color="region", barmode="group")
fig.update_layout(
    plot_bgcolor=colors['background'],
    paper_bgcolor=colors['background'],
    font_color=colors['text']
)
app.layout = html.Div(style={'backgroundColor': colors['background']}, children=[
    html.H1(
        children='JupyterDash App',
        style={
            'textAlign': 'center',
            'color': colors['text']
        }
    ),
    html.Div(children='Dash: A web application framework for your data.', style={
        'textAlign': 'center',
        'color': colors['text']
    }),
    dcc.Graph(
        id='example-graph-2',
        figure=fig
    )
])
# Run app and display result inline in the notebook
app.run_server(mode='inline')
- Create a pink morsel test file and submit it
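The `dash[testing]` extras installed in Task 1 provide a `dash_duo` pytest fixture for browser-driven tests. As a lighter starting point, the sketch below checks the layout without a browser by walking the component tree and collecting every `id` it finds. The helper name `collect_ids` and the module name `app` in the commented usage are hypothetical; the id `example-graph-2` is the one used in the layout above.

```python
def collect_ids(component):
    """Return the ids of a component and of everything nested in its children."""
    found = []
    component_id = getattr(component, "id", None)
    if component_id is not None:
        found.append(component_id)

    children = getattr(component, "children", None)
    if children is None or isinstance(children, str):
        return found  # leaf node or plain text content
    if not isinstance(children, (list, tuple)):
        children = [children]  # a single nested component
    for child in children:
        found.extend(collect_ids(child))
    return found


# Possible use in a pytest file (assumes the Dash app object lives in app.py):
#
# from app import app
#
# def test_graph_is_present():
#     assert "example-graph-2" in collect_ids(app.layout)
```

This only confirms that the components exist in the layout; checking what the page actually renders still calls for a browser-based test.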