- We don’t expect you to spend more than 120 minutes on these two exercises, so just work through the questions within the time limit and see how far you get
- In the interest of respecting your free time, we suggest limiting any packages or modules imported to those referenced in the job description
- We think the provided data is sufficient to answer some really interesting questions and do not expect you to incorporate any additional data sources
- We don’t expect you to be familiar with our filing data - please make assumptions, but be sure to document them in your code so that we can follow your thought process.
- When you’re ready, send us back a text file with the refined SQL query (for exercise 1) and an attached file containing the data visualization you generated and the corresponding analysis (for exercise 2).
Exercise 1
Objective: Refactor this SQL query so that it is easily readable by your teammates, and fix any syntax errors you come across.
Background:
- We store our data in a Redshift database. We’ve done a lot of good work modeling this raw resource to help ourselves and teams across the organization quickly answer basic questions about fundraising patterns on our site through a BI tool, but many of the deeper explorations our team does for the organization start with manipulating the data stored in our Redshift cluster using SQL.
- This query is a messy version of the query we used to pull the data for exercise two of this take-home. All of the WHERE clauses and table sources are correct and don’t need to be updated, but this is obviously not a great work product to share with colleagues: it’s hard to scan for quick understanding, the syntax is inconsistent, and there are some syntax errors that would cause the query to fail.
Suggested Time: 30 minutes
Notes:
- If you are not familiar with Common Table Expressions (CTEs), check out this resource; a minimal runnable example also follows these notes
- Everyone has different SQL styles, and we are intentionally not sharing our internal style guide, so prioritize consistency over a specific style
- We highly suggest leaving comments throughout the file to explain your work, express uncertainty, or ask questions
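For orientation, here is a minimal sketch of the CTE pattern, wrapped in Python's built-in sqlite3 module so it runs as-is. The table and column names (contributions, committee_id, amount, contribution_date) are invented for illustration; they are not taken from our actual schema.

```python
import sqlite3

# Build a throwaway in-memory table so the CTE query below is runnable.
# All table and column names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contributions (committee_id TEXT, amount REAL, contribution_date TEXT);
    INSERT INTO contributions VALUES
        ('C001', 25.0, '2020-02-10'),
        ('C001', 10.0, '2020-03-05'),
        ('C002', 50.0, '2020-03-07');
""")

# A CTE (the WITH clause) gives a name to an intermediate result set,
# so the final SELECT reads top to bottom instead of nesting subqueries.
query = """
WITH monthly_totals AS (
    SELECT
        committee_id,
        strftime('%Y-%m', contribution_date) AS contribution_month,
        SUM(amount) AS total_amount
    FROM contributions
    GROUP BY committee_id, contribution_month
)
SELECT
    committee_id,
    contribution_month,
    total_amount
FROM monthly_totals
ORDER BY contribution_month, committee_id;
"""
for row in conn.execute(query):
    print(row)
```

The same pattern scales to several chained CTEs, which is usually what turns a deeply nested query into something a teammate can scan top to bottom.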
Exercise 2
Objective: Draw insights from ActBlue’s (publicly available) FEC filings using a BI tool of your choice, e.g., Tableau Public, Looker Data Studio, etc.
Materials:
Dataset #1: A sample of contribution data from ActBlue’s FEC filings.
Data_Science_Technical_FEC_Filing_Sample.csv contains records derived from 3 of ActBlue's monthly filings in the 2020 cycle: February Monthly 2020, March Monthly 2020, and April Monthly 2020.
Dataset #2: FEC committee data from the 2020 cycle.
Data Science_Technical_FEC_Committee_Data_2020.csv is derived from the Committee Master file for 2019-2020 provided on the FEC Bulk Data page.
Tools: Python or R (required) and a BI tool
Instructions:
- Execute and present the data integrity checks you run before analyzing a new data set, and include details about how you cleaned the data before analyzing it (a first-pass sketch appears after these instructions)
- Merge the Data_Science_Technical_FEC_Filing_Sample data with the Data Science_Technical_FEC_Committee_Data_2020 data (see the merge sketch after these instructions)
- Document and present any statistical analysis that you performed on this data (e.g., cohort analysis). For data science candidates, we do expect some demonstration of statistical knowledge, modeling, or machine learning
- Using a BI tool of your choice, visualize the FEC datasets. Then ask and answer one to two questions of your own to tell us an interesting story with this data!
- When using a BI tool to build your dashboard, include a screenshot along with the attached dashboard to ensure we can open it upon delivery
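As a starting point for the integrity checks, here is a minimal pandas sketch. The checks shown are generic first-pass examples, not an exhaustive or required list; the filenames come from the Materials section above.

```python
import pandas as pd

# Load both datasets exactly as provided (filenames from the Materials section).
filing = pd.read_csv("Data_Science_Technical_FEC_Filing_Sample.csv")
committees = pd.read_csv("Data Science_Technical_FEC_Committee_Data_2020.csv")

# First-pass integrity checks: size, inferred types, missingness, duplicates.
for name, df in [("filing sample", filing), ("committee data", committees)]:
    print(f"--- {name} ---")
    print("shape:", df.shape)                    # row and column counts
    print(df.dtypes)                             # types as pandas parsed them
    print(df.isna().sum())                       # missing values per column
    print("duplicate rows:", df.duplicated().sum())
```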
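And here is one possible shape for the merge and a simple cohort-style cut. The join keys and analysis columns used below (committee_id, CMTE_ID, contribution_date, contribution_amount, CMTE_TP) are assumptions about the CSV headers; substitute the actual column names from the files.

```python
import pandas as pd

filing = pd.read_csv("Data_Science_Technical_FEC_Filing_Sample.csv")
committees = pd.read_csv("Data Science_Technical_FEC_Committee_Data_2020.csv")

# Left join so every contribution is kept even if its committee is unmatched;
# the join key names below are assumptions about the CSV headers.
merged = filing.merge(
    committees,
    left_on="committee_id",   # assumed column name on the filing side
    right_on="CMTE_ID",       # assumed column name on the committee side
    how="left",
    indicator=True,           # adds a _merge column flagging unmatched rows
)
print(merged["_merge"].value_counts())

# Example cohort-style cut: monthly contribution counts and totals by
# committee type (column names again assumed).
merged["contribution_month"] = (
    pd.to_datetime(merged["contribution_date"]).dt.to_period("M")
)
cohorts = (
    merged.groupby(["contribution_month", "CMTE_TP"])["contribution_amount"]
          .agg(["count", "sum"])
)
print(cohorts)
```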
Suggested Time: 90 minutes