Converting pie charts into stacked bar charts #83
Comments
It is actually "Confirmed mode" or "Confirmed purpose" as the case may be.
As I said in
I know what stacked bar charts look like. There is an implementation of a visualization using stacked bar charts in #78 and the related paper.
In our case, however, the numeric values are not as important as the proportions, e.g. the proportions of a whole. I look forward to your recommended design, along with backing references, on how best to proceed.
E.g. search for "how should I replace pie charts?", read through all the results, and make a recommendation that matches our specific use case and is well regarded in the literature, along with the justification. If you find strong evidence that regular bar charts are the best way to show proportions of a whole, I am willing to see it, but I don't think that is the case.
While planning out the replacement, you should also consider that we want to eventually include error bars for these values, e.g. instead of having 20% e-bike trips and 30% car trips, we will have 15-25% e-bike trips and 25-35% car trips. That is one of the reasons that we are switching away from pie charts: I am not sure how to put error bars on pie charts.
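One way to picture the error-bar requirement, as a rough sketch rather than a settled design: treat each range (e.g. 15-25%) as a midpoint plus a half-width, stack the midpoints, and draw the half-width as a horizontal error bar at each segment's right edge. The function names, the "other" range, and the midpoint assumption are all hypothetical; only the e-bike and car ranges come from the comment.

```python
def stacked_intervals(ranges):
    """Turn {label: (low, high)} percent ranges into stacked-bar geometry.

    Assumption: the point estimate is the midpoint of each range.
    """
    rows = []
    left = 0.0
    for label, (low, high) in ranges.items():
        center = (low + high) / 2
        rows.append({
            "label": label,
            "left": left,              # where this segment starts
            "width": center,           # segment length = point estimate
            "edge": left + center,     # right edge, where the error bar sits
            "xerr": (high - low) / 2,  # half-width of the range
        })
        left += center
    return rows


def plot_stacked_intervals(ranges):
    """Draw one horizontal stacked bar with an error bar per segment."""
    import matplotlib
    matplotlib.use("Agg")  # headless backend; drop this for interactive use
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(8, 1.8))
    for row in stacked_intervals(ranges):
        ax.barh(0, row["width"], left=row["left"], label=row["label"])
        ax.errorbar(row["edge"], 0, xerr=row["xerr"],
                    fmt="none", ecolor="black", capsize=4)
    ax.set_yticks([])
    ax.set_xlabel("Share of trips (%)")
    ax.legend(loc="upper center", bbox_to_anchor=(0.5, -0.4), ncol=len(ranges))
    return fig


# The e-bike and car ranges are from the comment; "other" is made up so that
# the midpoints sum to 100.
example = {"e-bike": (15, 25), "car": (25, 35), "other": (45, 55)}
rows = stacked_intervals(example)
```

Note that every segment after the first sits on a baseline that is itself uncertain, so error bars on a stacked bar may be harder to read than error bars on bars that share a common baseline.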
Visualization is about telling a story. You might want to read some of the material we have published on this to see the story that we want to tell, and then figure out how to represent that story in the graphs. Note again that the story is typically not "participants took 2000 e-bike trips" but rather that the program achieved a 30% e-bike mode share (e.g. https://www.osti.gov/biblio/1778194). I am also sending you the in-progress version via email.
Pie charts vs. bar charts:
One of the major problems with pie charts is that it is almost impossible to directly compare the sizes of two slices unless the differences are very large. With bar charts, our eyes compare the end points: because the bars are aligned at a common baseline, it is very easy to assess relative size. This makes it easy to see not only which segment is the largest but also how much larger it is than the other segments.
Many articles do suggest using a bar chart as a replacement for pie charts, but because our use case shows proportions of a whole, stacked bar charts are a better choice. For example, we want to see 'The number of trips for each replaced transport mode for e-bike only'. In this case, we want to know which transport mode was most replaced by e-bike, which the biggest segment in a stacked bar chart will tell us at a glance.
After going through the materials provided above, I have come up with a few alternatives.
Percentage, or stacked bar charts:
It will be a single bar, with each category having a coloured section within the bar accounting for a proportion of the total. The labels can sit inside the chart itself, providing an overall cleaner visual. There is also room for more categories than the 3 or 4 recommended for pie charts. Something like this: [image]
If our eventual goal is to add error bars to the stacked bar chart, then we can do something like this: [image]
Implementation:
I have also experimented with the implementation using the Python libraries numpy, pandas and matplotlib, as mentioned in the article.
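For reference, a minimal sketch of the single percentage bar described in this comment. This is not the implementation from the article or from #78: it uses a plain dict where that code may use pandas, and the counts and function names are hypothetical.

```python
def percent_segments(counts):
    """Convert {label: count} into (label, left, width) tuples in percent."""
    total = sum(counts.values())
    segments = []
    left = 0.0
    for label, count in counts.items():
        width = 100.0 * count / total
        segments.append((label, left, width))
        left += width
    return segments


def plot_percent_bar(counts, title):
    """Draw one horizontal bar whose coloured sections add up to 100%."""
    import matplotlib
    matplotlib.use("Agg")  # headless backend; drop this for interactive use
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(8, 1.8))
    for label, left, width in percent_segments(counts):
        ax.barh(0, width, left=left, label=label)
        # Put the percentage label inside its own section of the bar.
        ax.text(left + width / 2, 0, f"{width:.0f}%",
                ha="center", va="center")
    ax.set_xlim(0, 100)
    ax.set_yticks([])
    ax.set_xlabel("Share of trips (%)")
    ax.set_title(title)
    ax.legend(loc="upper center", bbox_to_anchor=(0.5, -0.4), ncol=len(counts))
    return fig


# Hypothetical counts of trips by the mode that e-bike replaced.
trip_counts = {"car": 50, "walk": 30, "bus": 20}
segments = percent_segments(trip_counts)
```

Calling `plot_percent_bar(trip_counts, "Replaced mode (e-bike only)")` returns a figure that can be saved with `fig.savefig(...)`.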
All that sounds good wrt stacked bar charts. But:
I would like to take an example and explain the alternatives to pie charts and their pros and cons. Here is a pie chart displaying the distribution of world population in the year 2021. We have 3 drawbacks when displaying this information in a pie chart:
For the human eye, it is easier to compare two rectangular surfaces. Therefore, bar charts and stacked bar charts are 2 options.
Horizontal bar charts:
These charts essentially solve issues 1 and 2. Each feature can be displayed as a horizontal bar. The reader can readily understand the population of one country and compare it to another country. If a large number of countries is to be displayed, there will simply be more horizontal bars, without affecting the reader's ability to read the data.
Tree maps:
They are good for representing hierarchical data. Our use case does not have dependent data (our categories are e.g. work, home, shopping etc.). Also, they are not user friendly.
Donut chart:
It has very similar drawbacks to the pie chart.
Stacked bar chart:
These charts solve issues 1 and 3. Because of the rectangular shapes, it is easier to compare surfaces between two countries or between two time frames. In our use case we do not have a lot of characteristics to display, so issue 2 is not really a problem for us. But there are scenarios where we want to compare 2 pie charts like this, and there stacked bar charts are a better choice.
As our use case shows proportions of a whole, stacked bar charts are a better choice. We can easily get an understanding of how each category contributes to the overall picture. For example, we want to see 'The number of trips for each replaced transport mode for e-bike only'. In this case, we want to know which transport mode was most replaced by e-bike, which the biggest segment in a stacked bar chart will tell us at a glance. The stacked bar chart will just be a single bar showing each purpose of the trip stacked onto one another.
I will use the implementation in #78
@swastis10 this still does not address all my questions and raises new ones.
New questions:
existing questions:
Concretely, I think you can come up with a design that addresses both new question
new questions
As stated in https://www.nngroup.com/articles/treemaps/, bar charts should be preferred over Treemaps if possible.
Trends in mode share shifted over time. E-bike mode share dropped as the weather grew colder. But even in the height of winter in Colorado, the e-bike mode share was a respectable 25%.
Or number of trips for each mode over a time period for different programs.
existing questions
There were other alternatives which did not seem valid in our case. For example: [image] Source: https://slidebazaar.com/blog/alternatives-to-pie-charts/ Here, we can see the number of trips for each purpose and also compare it to the specific mode (say e-bike) and the purposes for which it is being used.
@swastis10 for this:
we have timeseries visualizations in the public dashboard.
Do we have multiple programs in the dashboard now? How would we handle this with the current metrics? Note that the public dashboard is updated daily.
This is roughly what I had in mind when I said "Concretely, I think you can come up with a design that addresses both new question #2 and existing question number #2 if you think hard enough. " But not exactly. What would you do for studies, for example?
What about non-purpose pie charts?
@shankari, because we are working with a time series of correlated data points, it is good practice to plot the data as a line plot. Had the data been uncorrelated, bar charts would have been a better option than line plots. So timeseries is good.
We do not have multiple programs in the dashboard. (Default only) Are we testing studies only?
This was a rhetorical question that was designed to guide you towards a better design, as are almost all my questions from the comment. I know that we don't have multiple programs now - I wrote the dashboard code (both in the CanBikeCO and NREL OpenPATH versions). The goal was to make you think about the second part of that question: "How would we handle this with the current metrics?" We don't have program specific metrics any more. Rhetorical question: What metrics do we have? How would you want to handle them particularly as it comes to comparisons? More concretely: You had said "Or number of trips for each mode over a time period for different programs." as a reason for using stacked bar instead of regular bar graphs. But we don't have multiple programs in the dashboard now. So we will not want to show the number of trips for each mode over a time period for different programs. Rhetorical question: What are some metrics that we might be able to compare?
Read the code. Rhetorical question: Does the code only work with studies?
But this has only one bar. In prior comments, you have said that the advantage of stacked bars over regular bar graphs is that it supports comparisons. But if you don't do any comparisons, then what is the advantage? I am going to wait for one more response and then just give you a design to implement because I feel like we are just discussing in circles and this is taking too much time.
much better, but still not a very well-researched or well-understood plan. At this point, I will take over and provide a detailed, step-by-step design and you can focus on the implementation. At a high level, we need to support both studies and programs. All the e-bike comparisons that you have outlined are only relevant for programs. "What does an e-bike replace" is not relevant unless you are providing e-bikes to the people who are collecting the data. So the answer to:
is that OpenPATH is used for both studies and programs. Programs have a mode of interest and studies do not. So you cannot just copy-paste analysis results from a program-oriented paper for your design.
This is much closer to what we are looking for, since it is relevant for both programs and studies.
Proposed design:
The current list of metrics for studies is defined in https://github.com/e-mission/em-public-dashboard/blob/main/frontend/metrics_study.html
Common pie charts between studies and programs, grouped into similar bins for comparison:
Group #1 (number of trips):
Group #2 (trip mileage):
Group #3 (trip purpose):
Additional pie charts only for programs:
So a basic new design would be:
A better new design would be:
Group 5a:
Group 5b:
This design meets all of the requirements:
You will also need to change the frontend to display the new metrics, update the dropdowns and make sure that the proportions of the boxes match the proportions of the new figures.
Now tracked in #86
As we plan to convert pie charts into bar charts, I have a few suggestions:
Existing pie chart example:
1. Bar Chart:
A standard bar chart will compare numeric values between levels of a categorical variable, which in our case is "Confirmed trip". One bar will be plotted for each level of the categorical variable, with each bar's length indicating the numeric value. As we are looking at numeric values across one categorical variable ("Confirmed mode"), I feel using a bar chart is a better option than using a stacked bar chart. We can have something like -
I will also add proportions on top of each bar or change the y-axis to percentages.
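As an illustration of the "proportions on top of each bar" idea, here is a small sketch. The mode names come from this issue, but the counts and function names are made up, and `Axes.bar_label` assumes matplotlib 3.4 or newer.

```python
def percent_labels(counts):
    """Format each count as its share of the total, e.g. '20.0%'."""
    total = sum(counts.values())
    return [f"{100 * count / total:.1f}%" for count in counts.values()]


def plot_bar_with_percentages(counts):
    """Regular bar chart of counts with the percentage printed above each bar."""
    import matplotlib
    matplotlib.use("Agg")  # headless backend; drop this for interactive use
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    bars = ax.bar(list(counts.keys()), list(counts.values()))
    ax.bar_label(bars, labels=percent_labels(counts))  # needs matplotlib >= 3.4
    ax.set_xlabel("Confirmed mode")
    ax.set_ylabel("Number of trips")
    return fig


# Hypothetical trip counts per confirmed mode.
mode_counts = {"Bus": 20, "walk": 30, "e-bike": 50}
labels = percent_labels(mode_counts)
```

The alternative mentioned above, a percentage y-axis, would amount to plotting `100 * count / total` instead of the raw counts.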
2. Stacked Bar Chart:
In a stacked bar chart, we will have only one bar (representing confirmed mode) divided into a number of sub-bars stacked end to end, each one corresponding to a level of the second categorical variable (the different confirmed modes, e.g. Bus, walk, e-bike).
It will also not be straightforward to compare the values of the divisions within the one bar we have (confirmed trip), except for the one division plotted against the baseline.
It will look something like this -
or
@shankari any thoughts?