Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Add script for making descriptive bar plots #18

Merged
merged 5 commits into from
May 23, 2024

Conversation

michellewang
Copy link
Contributor

@michellewang michellewang commented May 11, 2024

Closes #5.

Feedback welcome, especially about the layout since I don't think it looks great at the moment and we will have to see how it integrates into the dashboard. Also this is using the default colours but we probably want to customize them

Example (whole sample):

Screenshot 2024-05-11 at 7 32 10 PM

@alyssadai alyssadai self-requested a review May 21, 2024 18:46
Copy link
Collaborator

@alyssadai alyssadai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking great, thanks @michellewang!

I left some suggestions to simplify the code a bit; feel free to take what makes sense to you.

In terms of layout, I'm sure Eric will have his own opinions (so we may want to wait to implement some of the following until Friday), but some ideas from me:

  • Now that I'm looking at the bar plots, I'm wondering if it's more visually intuitive to have the y-axis be %, and show the actual counts on hover instead. Wdyt?
  • If we do want to keep counts on the y-axis, might be worth having a y-axis title/titles that say "count (1000s)" or similar, and just make the axis tick labels integers, if that makes sense / is not too hard
  • For the demographic variables w/ longer categories (political party, employment status, household origin), maybe we can wrap the x-axis tick labels so they don't extend as far. Here's a function I used to do this in the digest dashboard, for reference (there might be easier ways). If that seems like too much hassle however, we could just add more vertical space between plots so there's no overlap. totally up to you
  • Since q2 is a bit unique but still has long category labels, maybe exceptionally for this plot, we could switch the orientation of x and y such that the likert responses are on the y, and the counts/proportions along the x. Perhaps that could help to a) distinguish it from the other demo variables 2) make it more readable (?)
  • In general, I think a slightly bigger gap b/w subplots on the same row may be nice (the bars themselves can be fairly thin), for more visual whitespace?
  • Minor, but I prefer a white background to the default seaborn (?) purplish grid background for cleaner-looking plots. Maybe there is a simpler white theme we can use?

If you strongly agree with any of the above, maybe you can implement them before merging. For the rest, we can revisit on Friday :)

Also, if you find any of the customization challenging using plotly graph_objects, I suggest checking out plotly express, which is more high-level but should have a simpler syntax (this is what I used for the dashboard)

climate_emotions_map/make_descriptive_plots.py Outdated Show resolved Hide resolved
Comment on lines +12 to +15
COL_DEMOGRAPHIC_VARIABLE = "demographic_variable"
COL_CATEGORY = "category"
COL_N = "n"
COL_PERCENTAGE = "percentage"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These column names can be expected to always be the same in the TSV. For simplicity/code readability, what do you think about getting rid of these global COL_ vars and just referring to the column name directly in the functions below?

return bar_plot


def make_impact_plot_traces(data: pd.DataFrame):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def make_impact_plot_traces(data: pd.DataFrame):
def make_impact_plot_traces(df: pd.DataFrame):

for consistency, since you have df as the input param for the other functions

for category in ["No", "Yes"]:
data_category = data_impact.loc[data_impact[COL_CATEGORY] == category]
traces.append(
go.Bar(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the grouped bar plot especially, you might find it easier to use plotly express instead of the more low-level graph_objects. You should be able to achieve the same goal in much fewer lines (and it may be more straightforward to customize as well). Up to you, but check out https://plotly.com/python/graph-objects/#comparing-graph-objects-and-plotly-express for an example!

return fig


if __name__ == "__main__":
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just for testing things out, right? If so, I'd say we comment this out before merging

Copy link
Collaborator

@alyssadai alyssadai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(gonna approve this guy since it's functional !)

@alyssadai
Copy link
Collaborator

Hey @Remi-Gau, pinging you here in case you have more feedback on the sample descriptive plots :)

These will live in a drawer component that can be toggled visible or not in the app, see GIF preview in:

@michellewang
Copy link
Contributor Author

Merging since I won't have time to change this before the meeting tomorrow

@michellewang michellewang merged commit d11a479 into neurodatascience:main May 23, 2024
1 check passed
@michellewang michellewang deleted the 5/histograms branch May 23, 2024 21:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create function to generate a sample description plot
2 participants