Showing posts with label linear models. Show all posts

Wednesday, September 5, 2018

Social Problem Analysis: Suicides from 1999 to 2016 and Beyond

To show what can be done with data collected from a public source such as the Centers for Disease Control, I produced a video about suicides. Suicide has been on the public radar recently because of the scrutiny it has received in the press and the concerns expressed there about its rising numbers and rate.

When I was in college, I was taught that suicide was largely based on the victim's emotional state and that the answers to how to address it were to be found in psychology and psychiatry. That may have been true then, but the data that I analyzed, which you will see in this movie, suggest that suicide may now be more of a socio-economic problem.

Here is the link to the video:



You tell me what you think after you've seen it.


Thursday, August 2, 2018

Relationship Between Carbon Dioxide Concentration and Global Temperature


For a bit of drama, I produced a short movie to show the relationship between global carbon dioxide concentration and average worldwide temperature. The data were obtained from NOAA.


I was surprised by how close the relationship is between the two measures. It is rare for two measures to correlate so closely.

Here's an updated version of the video.



Tuesday, July 31, 2018

US Bureau of Economic Analysis Data: Year to year increases in personal income from 1929 to 2017

Let's move from public health to public policy and US economics.

Over the last several years there has been a lot of discussion in the news regarding increases -- or the lack thereof -- in year-to-year income. To look into this, I went to the US Bureau of Economic Analysis (bea.gov) website and pulled down all the data regarding year-to-year differences in personal income from 1929 to 2017.


The fuzzy line in the middle of the drawing is zero, so everything above it is an increase and everything below it is a decrease in income. By clicking on the image, you'll be able to see the full-size image. In addition, the value of the dollar has been normalized to 2017 dollar value. If that were not the case, the increase in the early 1940s would have been substantially greater than the base wages at the time.

What's interesting about this curve beyond the fact that it's incredibly complex (with a trend line that is a 6th order polynomial) is that it appears to have several curves attached together.

The years of the Depression, World War II and the immediate post-World War II period, including the Korean War, were incredibly turbulent times for the economy. So I decided to take a look at two time frames: the first from the Eisenhower years up to the beginning of the second term of the Nixon Administration, and the second from the beginning of the second term of the Nixon Administration to 2017.

Here's the data from the first time frame from 1951 to 1974.


Although this is not the best-fitting trend line, it does make the point I want to make: year-to-year wages are increasing during this time period.

Let's take a look at the time frame from 1972 to 2017:


Over this time frame, instead of increasing as it was during the earlier years shown above, the year-to-year change in personal income is decreasing. Consider also that those who have the greatest amount of wealth (the highest 1%) receive an increasing percentage of the year-to-year increase. (In 1993 the highest 1% received 45% of the benefit of the increase in personal income. In 2017, the 1% received 65% of the increase in personal income.)

20 Year View: Last 10 years, Projected 10 years

Let's take a look at the last 10 years and project forward 10 years using the linear trend line equation shown in the graph above.


The last point on the graph showing an actual data point is 2016-2017. All the points to the right of 2016-2017 are projected values based on the linear trend equation y=-0.1852x+10.684.
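
As a rough sketch of how the projection works, the snippet below evaluates the trend equation from the graph and solves for the point where it reaches zero. The post does not say how x is numbered, so the snippet assumes, purely for illustration, that x = 1 is the first year-to-year interval of the fitted range (1972-1973), which would put 2016-2017 at x = 45. The function names are mine.

```python
# Linear trend line reported in the post: y = -0.1852x + 10.684
SLOPE = -0.1852
INTERCEPT = 10.684

# ASSUMPTION (not stated in the post): x = 1 is the 1972-1973 interval,
# so the last actual data point, 2016-2017, is x = 45.
LAST_ACTUAL_X = 45


def trend(x):
    """Projected year-to-year change in personal income at position x."""
    return SLOPE * x + INTERCEPT


def zero_crossing():
    """Value of x at which the trend line reaches zero growth."""
    return -INTERCEPT / SLOPE


if __name__ == "__main__":
    # Project the ten intervals after the last actual data point.
    for x in range(LAST_ACTUAL_X + 1, LAST_ACTUAL_X + 11):
        print(f"x = {x}: projected value {trend(x):.3f}")
    print(f"Trend line reaches zero near x = {zero_crossing():.1f}")
```

Under that numbering assumption, the ten projected intervals run from x = 46 to x = 55, where the line is still slightly above zero, and it reaches zero between x = 57 and x = 58.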

What is important is not the individual values but the projected trend for the future. The trend line is approaching the fuzzy line, which is zero growth in year-to-year personal income.

I'm not an economist. I had one year of college economics. (I seriously considered changing my major to economics.) I continue to read books about economics and economic trends. From what I know, this could be a formula for unrest. This is something for social scientists to contemplate. If you have read Thomas Piketty's Capital, these curves and the projection for the future should not be surprising, but expected.

What I do find interesting is that the people who are probably the most dissatisfied with our economic situation are those who have experienced both the steady rise in year-to-year personal income (they probably grew up with the increasing trend and considered it the norm, something to be expected) and the steady loss in year-to-year personal income.

I may have something more to say on this topic in the future. But for now, I leave it to you to contemplate and consider what might happen in the future.


Friday, July 27, 2018

Drugs Deaths: 1999 to 2016 and Predicting Outcomes in Future Years



The Centers for Disease Control (CDC) has a comprehensive online database known as Wonder (https://wonder.cdc.gov) that is accessible to all. So if you have public health related questions, the data to answer them can be found in Wonder.

Unless you've been living under a rock, you know that deaths from drug overdoses, particularly opioid related deaths, have been steadily increasing. I am interested not only in the number of deaths but also in the rate of increase and what that suggests for the future. I believe you will find the results of my analysis both interesting and troubling, particularly for the future.

Here's a chart showing the number of deaths from 1999 to 2016:

From 1999 to 2016, drug related deaths increased from 19,128 to 63,632. I calculated multiple trend lines. The best-fitting model is the polynomial you see above (based on the R-squared value: the closer to 1, the better the fit). That means the number of deaths is not just rising but rising faster over time.
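
For readers who want to try the same kind of comparison, here is a minimal sketch of fitting a linear and a polynomial trend line and comparing their R-squared values. It assumes NumPy is available; the function names are mine, and the demo numbers are placeholders, not the CDC figures.

```python
import numpy as np


def r_squared(y, y_fit):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y = np.asarray(y, dtype=float)
    y_fit = np.asarray(y_fit, dtype=float)
    ss_res = np.sum((y - y_fit) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def compare_trend_lines(x, y, degrees=(1, 2)):
    """Fit a polynomial of each degree and return {degree: R squared}."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    results = {}
    for degree in degrees:
        coeffs = np.polyfit(x, y, degree)   # least-squares polynomial fit
        results[degree] = r_squared(y, np.polyval(coeffs, x))
    return results


if __name__ == "__main__":
    # PLACEHOLDER values, not CDC data: a series that accelerates.
    x = np.arange(10)
    y = 5.0 + 2.0 * x + 0.5 * x ** 2
    for degree, r2 in compare_trend_lines(x, y).items():
        print(f"degree {degree}: R^2 = {r2:.4f}")
```

One thing to keep in mind when reading the result: on the data being fitted, a higher-degree polynomial never scores a lower R-squared than a lower-degree one, so a small improvement is weak evidence on its own.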

There's a problem with using the actual number of deaths when determining trends. The US population is increasing and the number of deaths does not account for that. The number of deaths provides us with an understanding of just how bad the problem is, but it's not the measure to use when constructing a predictive model.

The CDC calculates what they call a "crude rate," that is, the number of deaths in any year per 100,000 people. The crude rate allows us to control for population growth in our analysis.
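
The arithmetic behind a crude rate is simple enough to sketch. The function name is mine, and the population figure in the demo is a round approximation I am assuming for illustration; it does not come from the post.

```python
def crude_rate(deaths, population, per=100_000):
    """Deaths per `per` people in the population for one year."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths / population * per


if __name__ == "__main__":
    deaths_2016 = 63_632              # figure quoted in the post
    # ASSUMPTION: rough 2016 US population, not taken from the post.
    population_2016 = 323_000_000
    print(f"{crude_rate(deaths_2016, population_2016):.1f} per 100,000")
```

With these inputs the result is about 19.7 deaths per 100,000.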

Let's look at the graph of the data using "crude rate" as the measure:



The curves are similar. The best-fitting trend line is a polynomial; however, it does not differ greatly from a linear trend line. Any way you view this, the trends are concerning.

Let's use the trend line to make predictions about the future: from 1999 to 2025.

This uses the trend line equation to project the crude rate out to 2025. It shows a steady increase in the drug related death rate. At this rate of increase, the number of deaths would exceed 100,000 per year in 2023 to 2024. If this model holds, expect at least 250,000 drug related deaths during the 2020s. However, as bad as this is, it can get worse. Consider the analysis discussed below.

Some of the earlier years may be hiding a trend that in more recent times may be much worse. Allow me to show you:
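
A projection like this can be sketched in a few lines. The snippet is only an illustration under stated assumptions: it fits a straight line by ordinary least squares (the post does not give the trend line equation that was used), and the function names, the `population_for` callback and the demo numbers are all hypothetical placeholders rather than CDC data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def project_deaths(years, rates, future_years, population_for):
    """Extrapolate the crude rate and convert it to a death count.

    population_for: function mapping a year to a projected population
    (hypothetical; population projections are not part of the post).
    """
    slope, intercept = fit_line(years, rates)
    projections = {}
    for year in future_years:
        rate = slope * year + intercept               # deaths per 100,000
        projections[year] = rate * population_for(year) / 100_000
    return projections


def constant_population(year):
    """PLACEHOLDER population model: the same value every year."""
    return 1_000_000


if __name__ == "__main__":
    # PLACEHOLDER inputs for illustration only, not CDC figures.
    years = [2000, 2001, 2002, 2003]
    rates = [10.0, 11.0, 12.0, 13.0]
    result = project_deaths(years, rates, [2005, 2010], constant_population)
    for year, deaths in result.items():
        print(year, round(deaths))
```

The conversion back to a death count depends entirely on the population projection that is plugged in, which is why it is passed in as a separate function here.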

Instead of looking at all years, I decided to look at the more recent years and determine if the trend line had developed a steeper rise in recent years. The calculated trend line from 2008 to 2016 shows a much steeper rise than the trend line calculated from 1999 to 2016. Using this calculated trend line and extrapolating to 2025, this is what appears:

Using the equation derived from the 2008 to 2016 data, the picture that arises is much more concerning. In fact the crude rate in 2025 is twice the rate predicted by the trend line equation derived from the 1999 to 2016 data. This suggests that the number of drug related deaths would be near 500,000 by the mid 2020s and that the number of drug related deaths during the 2020s would be closer to 1 million to 1,500,000 where the number of deaths per year would be no less than 100,000 and possibly up to 150,000 each year. Most of these deaths would come about as a result of opioid overdoses.
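
The comparison between the two fitting windows can be sketched as follows. This assumes straight-line fits for both windows, which the post does not confirm, and the series in the demo is a made-up placeholder that simply rises faster in later years; the function names are mine.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_from_window(series, start_year, target_year):
    """Fit a line to the years >= start_year and evaluate it at target_year.

    series: dict mapping year -> crude rate.
    Returns (slope, predicted rate at target_year).
    """
    years = [year for year in sorted(series) if year >= start_year]
    rates = [series[year] for year in years]
    slope, intercept = fit_line(years, rates)
    return slope, slope * target_year + intercept


if __name__ == "__main__":
    # PLACEHOLDER series for 1999..2016, not CDC data.
    series = {1999 + i: 6.0 + 0.05 * i * i for i in range(18)}
    full_slope, full_2025 = predict_from_window(series, 1999, 2025)
    recent_slope, recent_2025 = predict_from_window(series, 2008, 2025)
    print(f"1999-2016 fit: slope {full_slope:.2f}, 2025 rate {full_2025:.1f}")
    print(f"2008-2016 fit: slope {recent_slope:.2f}, 2025 rate {recent_2025:.1f}")
```

When the underlying series is accelerating, the shorter recent window gives the steeper slope and the higher 2025 value, which is the pattern described above.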

I have read reports from others who suggest that 500,000 drug related deaths for the 2020s would make for a terrible crisis. All indications are that this crisis will be far, far worse. Powerful synthetic and more deadly opioids such as Fentanyl and Carfentanil have shown increasing usage. They are cheap, easily produced and easily smuggled.

Approximately 1,264,000 soldiers have died in all of America's wars, from the Revolutionary War through the Civil War and World War II to today. We could see the same number of deaths or more in the 2020s, largely as a result of opioid overdoses.

Caveats

As a rule, the more years of data available, the more confident you can be in the results. Thus, in spite of what appears to be a clear acceleration in the crude rate, one should have greater confidence in the trend line equation, our predictive model, derived from the data collected between 1999 and 2016 than in the equation derived from the data collected from 2008 to 2016.

However, the data from 2008 to 2016 cannot be ignored. Although there is less of it, it is the more recent data and may be indicative of an intensification of underlying processes driving towards an increasing death rate. This is the difference between a predictive model and an explanatory one: the difference in the trend lines is only suggestive. But it's probably worth the effort to determine whether the causal forces driving a possible dramatic shift in the rate of increase have somehow changed. That is out of my area of expertise.

So I leave that to the experts ... I'm just running the numbers. 

remoteprogrammingguru@gmail.com