False positives are a staple of QR educators who want to scare their audience into appreciating the importance of being quantitatively literate. This article provides the data to scare the pants off us. Quick test: A test has a false positive rate of 5%, meaning 5% of the healthy people tested will be identified as having the disease even though they do not have it. If you get a positive test result, how likely is it that you are sick? Over half the doctors surveyed on similar questions say it is 95% likely you are sick. Obvious, right?
Unfortunately this is crazy wrong! Your likelihood of being ill is much lower and depends on the actual prevalence of the disease in the population. If the disease occurs in 1 out of 1,000 people, your chance of actually being sick is a meager 2%! This grabs your students’ attention, as it should, and makes them sit up and learn about conditional probabilities like their lives depended on it. The article gives a quick way of getting the 2%. I encourage you to read it before you try explaining how the probability of being sick given a positive result differs from the probability of getting a positive result given you are actually sick.
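If you want to check the 2% yourself, here is a minimal Python sketch of the calculation using Bayes' rule. It is my own illustration, not the article's method. The function name `prob_sick_given_positive` is made up, and I am assuming the test catches every sick person (a sensitivity of 100%), which the example above does not actually state.

```python
def prob_sick_given_positive(prevalence, false_positive_rate, sensitivity=1.0):
    """Return P(sick | positive test) using Bayes' rule.

    sensitivity=1.0 is an ASSUMPTION: every sick person tests positive.
    """
    # Share of the whole population that is sick AND tests positive.
    true_positives = prevalence * sensitivity
    # Share of the whole population that is healthy AND tests positive.
    false_positives = (1 - prevalence) * false_positive_rate
    # Among everyone who tests positive, the fraction who are really sick.
    return true_positives / (true_positives + false_positives)


# Numbers from the example: 1 in 1,000 prevalence, 5% false positive rate.
p = prob_sick_given_positive(prevalence=1 / 1000, false_positive_rate=0.05)
print(f"Chance of being sick given a positive test: {p:.1%}")  # roughly 2%
```

Try changing `prevalence` to 1 in 10 and watch the answer jump. That dependence on the base rate is exactly what the 95% answer ignores.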
Some more low-lights in case you are not convinced: “We found that nearly 90 percent of the patients received at least one unnecessary test and that, overall, nearly one-third of all the tests were superfluous.” Or: “In another paper, from 2016, my colleagues and I interviewed more than 100 doctors to gauge their understanding of the risks and benefits of 10 common medical tests or treatments. We found that nearly 80 percent of our subjects overestimated the benefits. Strangely, the doctors themselves acknowledged this, with two-thirds rating themselves as not confident in their understanding of tests and probability. Eight out of 10 said they rarely, if ever, talked to patients about the probability of test results being accurate.”
Here is the quiz. You probably won’t be surprised that, even with a well-laid-out example in the article on how to calculate false positives, my students still struggled with question #2:
[1] https://www.washingtonpost.com/news/posteverything/wp/2018/10/05/feature/doctors-are-surprisingly-bad-at-reading-lab-results-its-putting-us-all-at-risk/?noredirect=on&utm_term=.adb14ca26351
]]>
The article, Overhyped Media Narratives About America’s Fading White Majority Fuel Anxiety, does a nice job tackling the implications of divisive rhetoric surrounding this issue. In particular, if we allow more flexible multi-racial identities in defining “white,” then we see the majority-minority point disappearing into the distant future.
A study that randomly assigned white people to read different stories on this issue found startling differences in how much hopefulness or anxiety the stories generated, depending on whether the story emphasized diversity, the inclusive definition, or the exclusive definition.
Diversity had a polarizing effect, making Republicans anxious and Democrats hopeful; whereas the inclusive definition made both groups more hopeful. This gives me hope.
Q4 Majority-Minority
So the federal government needs to ensure that students taking out loans are being provided an education that will generate a livable income enabling debt repayment. They do this by tracking student loan defaults for 3 years and penalizing schools with high default rates. Clearly we need to define “high”, but we also need to consider the “3 years”. Why 3 years? Who do they track for 3 years? How is “default” defined versus “delinquency” or “serious delinquency”? All of these definitions and organizational practices play a role in constructing the statistics about, and the scope of, the student loan problem. The graphic above shows only 2.1% of schools having “high” “default” rates after three years, but after five years there are 13.1% of schools in this category! How do they magically avoid being sanctioned during the three-year window? By encouraging students to take advantage of deferments provided by the government, which sounds altruistic until you realize that often they are just postponing the inevitable default until the student has passed safely beyond the three-year tracking window.
This article is loaded with subtle quantitative arguments for your students to grapple with and weigh in on. Basic quantitative literacy questions also abound. For example, asking what the 15.5% in the graphic above represents (“15.5% of what group?”) stumped most of my students. They simply responded that it represented the share of students defaulting on loans, as the graphic states, but that does not answer the question. Which schools see the highest increase in defaults from year 3 to year 5?
Obviously the for-profit schools are the most egregious, but the private nonprofit schools have an identical percentage increase over this two-year period! I highly recommend the article; my students were engaged in the topic and enjoyed the discussions. Here is the take-home quiz.
Q2 Student Loans
https://www.nytimes.com/interactive/2018/08/25/opinion/sunday/student-debt-loan-default-college.html
I had to include these graphics here; it took me a bit to figure out what the scales on the vertical axes represent! LFPR = labor force participation rate. The point is that the recession knocked younger men out of the workforce in “large” numbers and they have not returned. These are good graphics to discuss with your class: ask them what story they think the graphics are telling.
The article, The First Count of Fentanyl Deaths in 2016: Up 540% in Three Years, highlights the growing opioid epidemic in this country. I naturally was attracted to the 540% in the title like a moth to a flame. There are nice graphics that help convey the magnitude of our drug problem.
I like this one in particular because the comparisons are a great example of effective communication with numbers. My students struggled, however, to correctly interpret the references to car crashes, gun deaths and H.I.V. deaths. It nicely puts the drug epidemic in perspective by letting us know we have surpassed previous “epidemic” death totals. Another great question for your class is whether this comparison is fair given that the population has changed over time. What would make for a more accurate comparison?
Here is the quiz:
Q4 Opiods
Please type answers if possible and print this document with extra space so you do not have to mush all your work together. Change font for your answers so they stand out. Thanks!
I know my students always struggle to interpret histograms, so this was a great chance to give them practice and to discuss the record-setting year of natural disasters by which 2017 may be best remembered.
The distributions also appear to be spreading out or getting more variable. The authors claim “this effect is mainly a reflection that some parts of the world are warming faster than others. There is no evidence that temperatures are becoming more variable in most parts of the world after warming has been accounted for.” This was challenging for me to wrap my head around, and looking up the research paper did little to clarify. There was this line from the paper that seemed helpful: “In terms of relative magnitude, an increase in variance for regions of low standard deviation will outweigh a similar decrease for a region of high variation.” This sounds a little like the classic problem in which increasing $100 by $20 is a 20% increase, but decreasing $200 by $20 is only a 10% decrease.
I did include some of the graphics from the research paper in my weekly take-home quiz, but wasn’t able to offer a fully convincing rationale for this argument. Please post comments if you have a good way of explaining why we can conclude there is no evidence of temperatures becoming more variable after warming has been accounted for. In any case, it was a perfect topic for our discussions of distributions of data, and z-scores!
Q3 Hotter
So the distribution of average daily temperatures (at a given location) has a mean and standard deviation over a period of time. They then convert the average daily temperatures to z-scores by subtracting the mean from each daily value and dividing by the standard deviation. The distributions in Figure a show z-scores for different periods, always computed by subtracting the 1958-1970 mean. If instead you subtract each period’s own mean, you get the distributions in Figure b and the shift disappears. Mathemagic!
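To see why the choice of mean matters, here is a small Python sketch with made-up toy temperatures (these are NOT data from the paper). The helper `z_scores` and the two lists are hypothetical, and I am assuming the baseline standard deviation is used as the divisor in both cases, which may not match exactly what the authors did.

```python
from statistics import mean, pstdev


def z_scores(values, center, spread):
    """Standardize each value: subtract a center, divide by a spread."""
    return [(v - center) / spread for v in values]


# Made-up toy temperatures in degrees C, for illustration only.
baseline = [14.0, 15.0, 16.0, 15.0, 15.0]  # stands in for 1958-1970
later = [15.0, 16.0, 17.0, 16.0, 16.0]     # same shape, one degree warmer

base_mean = mean(baseline)
base_sd = pstdev(baseline)

# Figure a style: the later period is centered on the baseline mean.
z_fixed = z_scores(later, base_mean, base_sd)

# Figure b style: the later period is centered on its own mean.
z_own = z_scores(later, mean(later), base_sd)

print("Centered on baseline mean:", round(mean(z_fixed), 2))  # positive shift
print("Centered on its own mean: ", round(mean(z_own), 2))    # shift is gone
```

Same temperatures, two different stories, depending only on which mean you subtract.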
Further math shenanigans yield Figure g, showing the global standard deviation getting smaller over time:
In DLS Joel Best discusses organizational practices and how choices must be made that affect the statistics computed. Discuss how organizational practices impact whether we conclude temperatures are getting more or less variable.
]]>
How to put all this water in perspective? The article, A Texas Farmer on Harvey, Bad Planning and Runaway Growth, discusses how long it would take the U.S. to consume the estimated 15 trillion gallons of water that had fallen. The answer they give leads to a surprising rate of water usage per person per day! Good discussion in class about why this rate is so high :O)
How else can we put 15 trillion gallons of water into perspective???
Q2 Harvey Flooding
Link to the article is: https://www.nytimes.com/2017/08/30/opinion/texas-farmers-floods-planning.html
]]>
Understanding that super wealthy investors will spend more on art investments when the economy is strong makes intuitive sense, and allows us to discuss correlation in the context of investing. Asking students to translate time-series line graphs like the ones above into more traditional scatterplots is a great exercise!
The authors discuss whether art “leads or lags” the market, and this made for a difficult question on the take-home quiz. Students struggled with this phrase more than I anticipated. Once again the concept of co-variation proves much more subtle and complicated than most of us appreciate.
Q9 Art Market
[1] http://fortune.com/2016/09/22/investors-sothebys-art-market/
]]>
It starts easily enough: 1.7 million people over 26.2 miles is about 65,000 per mile, or 32,000 people per mile on each side. Already the “each side” is starting to clog up my students’ short-term memory, making the next calculations even harder to follow. A mile is then divided into full blocks (each one-eighth of a mile), or 16 short blocks, each 110 yards long. We have quickly introduced 2 rates in terms of people per mile (or people per mile both sides), and 4 different units for distance. The full block is given as a fraction of a mile while the short block is introduced as the number in a mile, and only the short blocks are converted to yards.
Now the author does try to simplify all of this by saying: “Divide those 32,000 people per mile by 16, and we get 2,000 spectators in a block 110 yards long. Allow 2 to 3 feet of space per viewer. To get 2,000 spectators in those 110 yards would require packing them in, shoulder to shoulder, 12 to 18 people deep!” But once again so much is packed into these sentences that my students were not able to process everything meaningfully. The density of 2-3 feet per viewer was particularly challenging, especially translating that into 12-18 people deep. Before proceeding, and definitely before assigning this article :O), compute the 12-18 people deep from the 2-3 feet per viewer.
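Once you have tried it by hand, here is one way to check the chain of calculations in Python. This is my own sketch, not the author's. The variable names and the helper `rows_deep` are made up, and I am assuming that “2 to 3 feet of space per viewer” means width along the curb, so that a 110-yard block (330 feet) holds 330 divided by that width in a single row.

```python
SPECTATORS = 1_700_000        # claimed crowd size from the article
COURSE_MILES = 26.2           # marathon length
SHORT_BLOCKS_PER_MILE = 16    # each short block is 110 yards
BLOCK_FEET = 110 * 3          # 110 yards converted to feet

per_mile = SPECTATORS / COURSE_MILES           # about 65,000 people per mile
per_side = per_mile / 2                        # about 32,000 on each side
per_block = per_side / SHORT_BLOCKS_PER_MILE   # about 2,000 per short block


def rows_deep(people, block_feet, feet_per_viewer):
    """How many rows are needed to fit `people` along one block.

    ASSUMPTION: feet_per_viewer is the width each person takes along the curb.
    """
    people_per_row = block_feet / feet_per_viewer
    return people / people_per_row


deep_at_2ft = rows_deep(per_block, BLOCK_FEET, 2)  # tightly packed rows
deep_at_3ft = rows_deep(per_block, BLOCK_FEET, 3)  # roomier rows

print(f"People per short block: {per_block:,.0f}")
print(f"Rows deep at 2 ft each: {deep_at_2ft:.0f}")
print(f"Rows deep at 3 ft each: {deep_at_3ft:.0f}")
```

Writing each step as its own named quantity is the point: it forces the units (per mile, per side, per block, per row) into the open, which is exactly where my students got lost.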
I asked the following questions and my students struggled mightily. The first one could have been solved with a simple proportion (1.7 million is 1.7 times 100,000)! The second question where I ask them to run the computation in reverse was like watching people try to run a marathon in reverse, pretty comical.
Quiz 7 Marathon Numbers
[1] http://www.chicagotribune.com/news/opinion/commentary/ct-chicago-marathon-numbers-spectators-false-perspec-1013-20161012-story.html
]]>
I like this map showing just the areas that voted for Clinton in 2016, basically small islands surrounding every major city!
So do liberals choose to live in metropolitan areas and conservatives choose the great outdoors/suburbia? Or do cities create a liberal mindset? These are the questions addressed in the article, What’s Your Ideal Community? The Answer is Political. Maybe wide open spaces give the sense that the environment is just fine and government is really only needed for running the military. I know when we lived in rural western New York the volunteer fire department took care of the local community, and people owned guns because no one was going to come take care of a rabid raccoon for you.
People living in cities are acutely aware of the need for bigger government with strong public works and governmental infrastructure:
“a large transit agency to move people around, intricate parking rules to govern scarce spaces, a garbage truck armada to keep the streets clean. New York City, with its 24,000 restaurants and bars, needs a system of publicly posted health grades. A town with two restaurants may not. New York needs some colossal bridges connecting Manhattan and Brooklyn. A smaller community doesn’t need public-works projects on that scale. New York requires a large police force. A rural resident may need self-reliance when the closest officer is 10 miles away.”
In addition people in cities share public spaces, whereas suburbia is all about private land and yards, with local control of schools and social services that benefit only those who can afford to live in these communities. City dwellers are more exposed to diversity and appreciate the value added to their lives that such diversity brings. Income inequality is also prominently on display on the city streets with homeless people panhandling from the very wealthy. This close proximity of the haves and have-nots makes for a more liberal mindset concerning welfare. There are not many homeless people in rural Maine, so citizens away from city centers possibly don’t appreciate the need for such services.
On the take home quiz I first ask my students to grapple with this question of which came first, the city or the liberal. I was surprised that many of them struggled with the chicken or the egg argument. The first graphic below exposed their weaknesses when confronted with a sophisticated x-axis, and points out the need for using articles like this with QR courses.
Q8 Political Nbhds
[1] http://www.nytimes.com/2016/11/04/upshot/whats-your-ideal-community-the-answer-is-political.html
]]>
This was a great article on contrasts, and it also highlighted how changing measures (for income and poverty) impacts the statistics we collect. A new supplemental poverty measure (see definition of SPM in quiz below) radically changed the distribution of people by income-to-poverty threshold ratios:
Yes I couldn’t resist this graphic of changing ratios. Try to figure out who is poor based on these ratios :O)
Q5 Income Inequality
Please type answers when possible and leave adequate space for computation work.
[1] http://www.nytimes.com/2016/09/14/business/economy/americas-inequality-problem-real-income-gains-are-brief-and-hard-to-find.html
]]>