Both articles start with a distribution graph showing, by age, why people aren’t in the labor force. There are problems with each:
Solution: Below is how I would create this graph. Note that the lines make the patterns in the data clearer, and not having to scroll makes the big picture more available.
Perhaps a better way (feel free to comment) would be to separate out each issue on its own like below. We kind of want to see how each reason breaks down per age, so we don’t necessarily need them in the same graph:
In the above version of the distribution graph, you can focus on each labor force status one by one as an evolution over one’s age. “In school” starts high and drops quickly. “Unemployed looking” has a smoother curve after age 20. Disability grows slowly and peaks around 60. Employed starts dropping in the late 40’s and then more precipitously in the mid 50’s. This gives us a better basic understanding of how each labor force status evolves over our aggregate lifetimes.
While we now understand the distribution, we don’t really see any trends over time. A big problem with the AggregatedData post is that it looks at “year” as a fade out. This is interesting as an animation (and a beautiful visual, which is why I was drawn to write about it), but when we’re looking at time, we’re looking at trends. Trends are patterns best represented by lines.
Below is another visual from MacroBlog which attempts to look at trends over time as a stacked bar graph. You can kind of see the aggregate growth in unemployed people ages 16-24, but how much is the orange segment of each bar changing? Can you tell?
In fact, I suspect from the bars that the trends are fairly flat when compared to “in school”, which dominates the rest in terms of numbers. Therefore, I’m going to look at this using a small-multiples graph:
Here it’s easy to see which reasons for unemployment are growing or shrinking. Beware though, some of the smaller changes look larger than they are (like 0.4% to 0.5% from 2007 to 2013 of 15-24 year olds being retired — lucky bastards). In order to mitigate this, I’ve added the start and end percentages as labels so you can clearly see the underlying data.
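To make the labeling idea concrete, here is a minimal sketch of how I might build those start and end labels. The function names are my own, and the only data point used is the 0.4% to 0.5% retired figure mentioned above. It shows why a small panel can mislead: a change of a tenth of a percentage point is also a 25% relative increase, and a panel with its own y-axis shows you the latter.

```python
def endpoint_label(start_pct: float, end_pct: float) -> str:
    """Format the first and last values of a series plus the change in points."""
    change = end_pct - start_pct
    return f"{start_pct:.1f}% to {end_pct:.1f}% ({change:+.1f} pts)"


def relative_change(start_pct: float, end_pct: float) -> float:
    """Change as a fraction of the starting value (what a zoomed-in axis exaggerates)."""
    return (end_pct - start_pct) / start_pct


# Retired 15-24 year olds, 2007 to 2013 (the figures quoted in the text above).
print(endpoint_label(0.4, 0.5))
print(f"relative change: {relative_change(0.4, 0.5):.0%}")
```

Putting the absolute label next to each panel keeps the reader anchored to the real size of the change, whatever the axis suggests.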
Looking at distributions by age is interesting, but we’re not really getting to the point of the data. The question on everyone’s mind is, why is labor force participation shrinking? Age is one of the variables that might contribute to labor force status, but how about we just look at the overall trend by status?
The employment rate is down 4%, and it doesn’t look like it’s coming back. Why? Retirements are certainly climbing, but that only explains 30%+ of the problem. More folks staying in school (or enrolling for the first time) explains another 20% or so of the decline and disability explains another 20%+. The rest is still-high unemployment, which has already fallen in 2014.
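The arithmetic behind "explains X% of the decline" is simple enough to sketch. The numbers below are invented for illustration and are not the MacroBlog data; I picked them only so they land roughly in line with the proportions described above. The idea is to divide each status's gain in population share by the size of the drop in employment.

```python
# Hypothetical changes in population share, 2007 to 2013, in percentage points.
# These values are made up for illustration; they are NOT the MacroBlog figures.
changes = {
    "Employed": -4.0,
    "Retired": 1.3,
    "In school": 0.8,
    "Disabled": 0.9,
    "Unemployed": 1.0,
}


def decline_shares(changes: dict, declining: str = "Employed") -> dict:
    """Return each other status's change as a fraction of the decline."""
    drop = -changes[declining]
    if drop <= 0:
        raise ValueError(f"{declining!r} did not decline")
    return {s: delta / drop for s, delta in changes.items() if s != declining}


shares = decline_shares(changes)
# List the largest contributors first.
for status, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{status:<12}{share:>6.0%}")
```

This only works cleanly because the shares of all statuses sum to 100% of the population, so the gains elsewhere have to add up to the loss in employment.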
As a complement, we shouldn’t be afraid of tables with the actual data. If the table seems too cumbersome, then coloring can help readers pick out outliers. Below is a chart that shows the change in # of people between 2007 & 2013 by labor force status & age group:
Here we’re seeing the real movement of people through our dimensions. There are more 15-34-year-olds, and they’ve had a net negative effect on employment. More people in every age group are saying they’re retired, but the big numbers come after 55. With the colors, I hardly have to explain what’s happening because your eyes naturally go to outliers, even slight ones like the fewer 35-54-year-olds counting as employed-absent.
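For anyone who wants to color a table like this themselves, here is a rough sketch of one way to do it: scale each cell by the largest absolute change in the table and fade from white toward red for losses or blue for gains. The color choice, the function name, and the numbers are all my own assumptions for illustration, not the data or palette used in the table above.

```python
def diverging_color(value: float, max_abs: float) -> str:
    """Map a value to a hex color: white at zero, red for negative, blue for positive."""
    if max_abs <= 0:
        return "#ffffff"
    t = max(-1.0, min(1.0, value / max_abs))  # clamp to [-1, 1]
    fade = round(255 * (1 - abs(t)))  # 255 = white, 0 = fully saturated
    if t < 0:
        return f"#ff{fade:02x}{fade:02x}"
    return f"#{fade:02x}{fade:02x}ff"


# Hypothetical change in thousands of people, 2007 to 2013 (made-up values).
table = {
    "Employed": {"15-34": -900, "35-54": -1500, "55+": 2400},
    "Retired": {"15-34": 50, "35-54": 100, "55+": 3000},
}

# One shared scale for the whole table, so colors are comparable across cells.
max_abs = max(abs(v) for row in table.values() for v in row.values())

for status, row in table.items():
    cells = ", ".join(
        f"{age}: {v:+d} {diverging_color(v, max_abs)}" for age, v in row.items()
    )
    print(f"{status:<10}{cells}")
```

Using a single scale across the whole table is a deliberate choice: it lets the biggest movements dominate visually, which is what you want if the goal is to pull the eye toward outliers.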
To conclude, MacroBlog has some great data and an interesting story. But most of the data explanation comes from the writing, not the graphs. By using better data visualization techniques, you can do better at both discovering and explaining complex stories, hopefully while using fewer words than I just did.