New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

[ENH] Add script for making descriptive bar plots #18

Merged

michellewang merged 5 commits into neurodatascience:main from michellewang:5/histograms

May 23, 2024

Contributor

michellewang commented May 11, 2024 •

edited

Loading

Closes #5.

Feedback welcome, especially about the layout since I don't think it looks great at the moment and we will have to see how it integrates into the dashboard. Also this is using the default colours but we probably want to customize them

Example (whole sample):


          add script for making descriptive bar plots

alyssadai self-requested a review

May 21, 2024 18:46


          Merge branch 'main' into 5/histograms

25a4f71

alyssadai reviewed

View reviewed changes

Collaborator

alyssadai left a comment •

edited

Loading

This is looking great, thanks @michellewang!

I left some suggestions to simplify the code a bit; feel free to take what makes sense to you.

In terms of layout, I'm sure Eric will have his own opinions (so we may want to wait to implement some of the following until Friday), but some ideas from me:

Now that I'm looking at the bar plots, I'm wondering if it's more visually intuitive to have the y-axis be %, and show the actual counts on hover instead. Wdyt?
If we do want to keep counts on the y-axis, might be worth having a y-axis title/titles that say "count (1000s)" or similar, and just make the axis tick labels integers, if that makes sense / is not too hard
For the demographic variables w/ longer categories (political party, employment status, household origin), maybe we can wrap the x-axis tick labels so they don't extend as far. Here's a function I used to do this in the digest dashboard, for reference (there might be easier ways). If that seems like too much hassle however, we could just add more vertical space between plots so there's no overlap. totally up to you
Since q2 is a bit unique but still has long category labels, maybe exceptionally for this plot, we could switch the orientation of x and y such that the likert responses are on the y, and the counts/proportions along the x. Perhaps that could help to a) distinguish it from the other demo variables 2) make it more readable (?)
In general, I think a slightly bigger gap b/w subplots on the same row may be nice (the bars themselves can be fairly thin), for more visual whitespace?
Minor, but I prefer a white background to the default seaborn (?) purplish grid background for cleaner-looking plots. Maybe there is a simpler white theme we can use?

If you strongly agree with any of the above, maybe you can implement them before merging. For the rest, we can revisit on Friday :)

Also, if you find any of the customization challenging using plotly graph_objects, I suggest checking out plotly express, which is more high-level but should have a simpler syntax (this is what I used for the dashboard)

climate_emotions_map/make_descriptive_plots.py Outdated Show resolved Hide resolved

climate_emotions_map/make_descriptive_plots.py

Comment on lines +12 to +15

+              COL_DEMOGRAPHIC_VARIABLE = "demographic_variable"
+              COL_CATEGORY = "category"
+              COL_N = "n"
+              COL_PERCENTAGE = "percentage"

Collaborator

alyssadai May 22, 2024

These column names can be expected to always be the same in the TSV. For simplicity/code readability, what do you think about getting rid of these global COL_ vars and just referring to the column name directly in the functions below?

climate_emotions_map/make_descriptive_plots.py Outdated

		return bar_plot


		def make_impact_plot_traces(data: pd.DataFrame):

Collaborator

alyssadai May 22, 2024

Suggested change

      
            def make_impact_plot_traces(data: pd.DataFrame):
          
            def make_impact_plot_traces(df: pd.DataFrame):

for consistency, since you have df as the input param for the other functions

climate_emotions_map/make_descriptive_plots.py

+                  for category in ["No", "Yes"]:
+                      data_category = data_impact.loc[data_impact[COL_CATEGORY] == category]
+                      traces.append(
+                          go.Bar(

Collaborator

alyssadai May 22, 2024

For the grouped bar plot especially, you might find it easier to use plotly express instead of the more low-level graph_objects. You should be able to achieve the same goal in much fewer lines (and it may be more straightforward to customize as well). Up to you, but check out https://plotly.com/python/graph-objects/#comparing-graph-objects-and-plotly-express for an example!

climate_emotions_map/make_descriptive_plots.py Outdated

		return fig


		if __name__ == "__main__":

Collaborator

alyssadai May 22, 2024

This is just for testing things out, right? If so, I'd say we comment this out before merging

alyssadai approved these changes

View reviewed changes

Collaborator

alyssadai left a comment

(gonna approve this guy since it's functional !)

Collaborator

alyssadai commented May 22, 2024

Hey @Remi-Gau, pinging you here in case you have more feedback on the sample descriptive plots :)

These will live in a drawer component that can be toggled visible or not in the app, see GIF preview in:

Implemented sample description drawer #19

michellewang and others added 3 commits

May 23, 2024 17:05


          Merge branch 'main' into 5/histograms

0b90655


          address easy comments/suggestions

97da2f3


          change "household origin" to "SES of household of origin"

6eeb2a9

Co-authored-by: Alyssa Dai <[email protected]>

Contributor Author

michellewang commented May 23, 2024

Merging since I won't have time to change this before the meeting tomorrow

michellewang merged commit d11a479 into neurodatascience:main

1 check passed

michellewang deleted the 5/histograms branch

May 23, 2024 21:14

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet