Wednesday, September 5, 2018

Social Problem Analysis: Suicides from 1999 to 2016 and Beyond

To show what can be done with data collected from a public source such as the Centers for Disease Control, I produced a video about suicides. Suicide has been on the public radar recently because of the scrutiny it has received in the press and the concerns expressed there about the rising number and rate of suicides.

When I was in college, I was taught that suicide was largely based on the victim's emotional state and that the answers to how to address it were to be found in psychology and psychiatry. That may have been true then, but the data that I analyzed, which you will see in this movie, suggest that suicide may now be more of a socio-economic problem.

Here is the link to the video:



You tell me what you think after you've seen it.


Monday, September 3, 2018

Behavioral Analysis of Network Traffic to Detect Intruders

I'm providing this video as an introduction to some of the work I have done on network traffic analysis as a means to detect intruders in real time. It is a forensic analysis of network traffic in real time that detects intruders using proprietary statistical traffic analysis methods that I developed. There are some aspects of my work on predictive statistical models in this work, and a real-time, automated intruder defense capability as well.

Unfortunately, I can't describe how the system operates. That is proprietary information. I can say that it uses an approach that detects specific network activities that should either not be present or violate baseline patterns of behavior. All this is done on a per connection basis and does not affect transmission speeds.

I'll let you imagine how this system works once you've seen the video.

Here's the video.


Thursday, August 2, 2018

Relationship Between Carbon Dioxide Concentration and Global Temperature


For a bit of drama, I produced a short movie to show the relationship between global carbon dioxide concentration and average worldwide temperature. The data were obtained from NOAA.


I was surprised how close the relationship is between the two measures. It is rare that two measures correlate so closely. 

Here's an updated version of the video.



Tuesday, July 31, 2018

US Bureau of Economic Analysis Data: Year to year increases in personal income from 1929 to 2017

Let's move from public health to public policy and US economics.

Over the last several years there has been a lot of discussion in the news regarding increases -- or the lack thereof -- in year to year income. To look into this, I went to the US Bureau of Economic Analysis (bea.gov) website and pulled down all the data regarding year-to-year differences in personal income from 1929 to 2017.


The fuzzy line in the middle of the drawing is zero, so everything above is an increase and everything below is a decrease in income. By clicking on the image, you'll be able to see the full size image. In addition, the value of the dollar has been normalized to 2017 dollar value. If that were not the case, the increase in the early 1940s would have been substantially greater than the base wages at the time.
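As a rough illustration of what normalizing to 2017 dollars involves, here is a minimal sketch. It assumes a price index is used as the deflator; the post does not say which index or method was actually applied, and the function name and index values below are hypothetical placeholders, not real data.

```python
def to_2017_dollars(nominal: float, index_in_year: float, index_2017: float) -> float:
    """Rescale a nominal dollar amount to 2017 dollars using a price index.

    The index arguments are assumptions: any consistent price index would do.
    """
    if index_in_year <= 0:
        raise ValueError("index_in_year must be positive")
    return nominal * index_2017 / index_in_year


# Placeholder index values chosen for easy arithmetic. They are NOT real data.
adjusted = to_2017_dollars(1_000.0, index_in_year=100.0, index_2017=250.0)
print(adjusted)
```

With every year expressed in the same dollars, a year-to-year difference reflects a change in purchasing power and not just a change in prices.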

What's interesting about this curve beyond the fact that it's incredibly complex (with a trend line that is a 6th order polynomial) is that it appears to have several curves attached together.

The years of the Depression, World War II and the immediate post-World War II period, including the Korean War, were incredibly turbulent times for the economy. So I decided to take a look at two time frames: first, the Eisenhower years up to the beginning of the second term of the Nixon Administration, and second, the years from the beginning of the second term of the Nixon Administration to 2017.

Here's the data from the first time frame from 1951 to 1974.


Although this is not the best fitting trend line, it does make the point I want to make and that is, year to year wages are increasing during this time period.

Let's take a look at the time frame from 1972 to 2017:


Over this time frame, instead of increasing as it did during the earlier years shown above, the year to year growth in personal income is decreasing. Consider, too, that those who have the greatest amount of wealth (the highest 1%) receive an increasing percentage of the year to year increase. (In 1993 the highest 1% received 45% of the benefit of the increase in personal income. In 2017, the 1% received 65% of the increase in personal income.)

20 Year View: Last 10 years, Projected 10 years

Let's take a look at the last 10 years and project forward 10 years using the linear trend line equation shown in the graph above.


The last point on the graph showing an actual data point is 2016-2017. All the points to the right of 2016-2017 are projected values based on the linear trend equation y=-0.1852x+10.684.

What is important is not the individual values, but the projected trend for the future. The trend line is approaching the fuzzy line, which is zero growth in year to year personal income.
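To make "approaching zero" concrete, the linear trend equation y=-0.1852x+10.684 can be solved for the point where y reaches zero. This is only a sketch: the post does not state what one unit of x represents, so the result is left in the chart's own x units and is not converted to a calendar year.

```python
# Coefficients taken from the trend equation quoted in the post.
SLOPE = -0.1852
INTERCEPT = 10.684


def trend(x: float) -> float:
    """Value of the linear trend line at position x on the chart's x axis."""
    return SLOPE * x + INTERCEPT


def zero_crossing() -> float:
    """x position at which the trend line reaches zero growth."""
    return -INTERCEPT / SLOPE


x_zero = zero_crossing()
print(f"Trend line reaches zero at x = {x_zero:.2f}")
```

The crossing falls a little before x = 58 in whatever units the chart uses for its horizontal axis.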

I'm not an economist. I had one year of college economics. (I seriously considered changing my major to economics.) I continue to read books about economics and economic trends. From what I know, this could be a formula for unrest. This is something for social scientists to contemplate. If you have read Thomas Piketty's Capital, these curves and the projection for the future should not be surprising, but expected.

What I do find interesting is that the people who are probably the most dissatisfied with our economic situation are those who have experienced both the steady rise in year to year personal income (they probably grew up experiencing the increasing trend and considered it the norm, something to be expected) and the steady loss in year to year personal income.

I may have something more to say on this topic in the future. But for now, I leave it to you to contemplate and consider what might happen in the future.







Saturday, July 28, 2018

Firearm Deaths and the Relationship to Firearm Ownership in the United States 1999 to 2016

For those readers who live outside of the US and who follow US controversies, probably one of the most mysterious and mystifying controversies is the US focus on firearms and their value. For better or for worse, the US Supreme Court has interpreted the Second Amendment of the Constitution as stating that firearm ownership is an individual right. The entire reason for existence of organizations such as the National Rifle Association (NRA) is to ensure that firearm ownership is as widespread as possible.

Having said that, I'm not going to get into the politics of firearms and firearm ownership. What I shall do is present the results of the analysis of data that I have collected regarding firearm deaths from 1999 to 2016. The data come from CDC Wonder and from Statista.com. The data were categorized by state and the analysis was on normed data (rates and percentages, not raw numbers).


Death by Firearm

In the United States firearm deaths fall by and large into two categories: 1) Suicides and 2) Homicides. The CDC has three additional categories: Unintentional, Undetermined and Legal Intervention/Operations of War. Firearm deaths falling into these three categories are negligible.

The overall suicide rate by firearm in the United States is 7.4 per 100,000 population. It ranges from a high of 14.0 per 100,000 (Wyoming) to a low of 1.8 per 100,000 (Massachusetts). The percent of overall suicides by firearm is 52.2%. It ranges from a high of 70.7% (Mississippi) to a low of 20.4% (Hawaii).

The overall homicide rate by firearm in the United States is 3.3 per 100,000 population. It ranges from a high of 9.8 per 100,000 (Louisiana) to a low of 0.6 per 100,000 (New Hampshire). The percent of homicides by firearm overall is 48.3%. It ranges from a high of 74.8% (Alabama) to a low of 30.5% (South Dakota).

Suicides and Homicides by Firearm: Relationship to Household Firearm Ownership

From Statista.com I was able to find by state the percent of households that own one or more guns. I was interested to determine whether or not suicides and homicides have any relationship to firearm ownership or, in this case, the percentage of households in a state that own one or more firearms. The estimated per capita firearm ownership in the US is about 91%, meaning that for every 10 people there are 9 firearms. And this does not vary widely from state to state. Firearm ownership in the US tends to be concentrated in a relatively small number of households.

In the US the average percentage of households that own one or more guns is 37.6%. It ranges from a high of 57.9% (Wyoming) to a low of 6.7% (Hawaii).

Relationship of Household Firearm Ownership to Suicides

I decided to examine whether or not there was a relationship between these two measures in two ways. First, I determined whether there is a correlation between the two factors. Second, I divided the states into two equal groups, one where the percentage of household ownership was above the median and one where it was below the median, and determined whether there was a significant difference in the suicide rates.

Correlations

The calculated correlation: (Pearson's r)
  • Between all suicides no matter how performed (crude rate by state) and percent of households that own 1+ firearms: .67 (p < .05).
  • Between crude rate (by state) of firearm suicides and percent of households that own 1+ firearms: .84 (p < .05)
People are more likely to complete a suicide attempt when they're part of a household that owns a firearm. And when that suicide is carried out by a firearm, there's an even stronger relationship between household ownership and the rate of suicide.
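For readers who want to reproduce this kind of state-level correlation, here is a minimal sketch of Pearson's r using only the standard library. The function name and the four sample values are illustrative placeholders, not the CDC Wonder or Statista.com figures, and the significance test (p value) is not computed here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's product-moment correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross products and sums of squares of the deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)


# Made-up values for four imaginary states (NOT real data):
ownership_pct = [10.0, 25.0, 40.0, 55.0]   # percent of households with 1+ firearms
suicide_rate = [3.0, 5.5, 7.0, 11.0]       # crude rate per 100,000

print(round(pearson_r(ownership_pct, suicide_rate), 2))
```

In the real analysis each list would hold one value per state, in the same state order.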

High and Low Household Ownership

Dividing the states into two groups of high and low household firearm ownership, I found the following:
  • Low household ownership suicide rate: 5.8 per 100,000.
  • High household ownership suicide rate: 9.0 per 100,000.
    • t test significant (p < .05)
Therefore, states with a higher percentage of households that own firearms have a significantly higher rate of suicide.
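The median split and group comparison described above can be sketched as follows. The post does not say which form of the t test was used, so Welch's unequal-variance version is assumed here. The function names and sample values are placeholders, and only the t statistic and degrees of freedom are computed (looking up the p value needs a t distribution table or a statistics package).

```python
from math import sqrt
from statistics import mean, median, variance


def median_split(ownership, rates):
    """Split rates into (low, high) groups by the median of ownership.

    A state exactly at the median goes into the high group (an arbitrary choice).
    """
    cut = median(ownership)
    low = [r for o, r in zip(ownership, rates) if o < cut]
    high = [r for o, r in zip(ownership, rates) if o >= cut]
    return low, high


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up values for six imaginary states (NOT real data):
ownership_pct = [10, 20, 30, 40, 50, 60]
suicide_rate = [4.0, 5.0, 6.0, 8.0, 9.0, 10.0]

low_group, high_group = median_split(ownership_pct, suicide_rate)
t_stat, dof = welch_t(high_group, low_group)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```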

Relationship of Household Firearm Ownership to Homicides

As a preview, the relationship between firearm ownership and homicides differs from what I found with suicides. I examined the data the same way as I did with the suicides and I found no relationship. 

Here are my findings ...


  • Homicide rate by household ownership: Correlation, .03 (Non significant)
  • High and low rates of household ownership:
    • High group, 3.5 per 100,000
    • Low group, 3.2 per 100,000 
      • t test (non significant)

Conclusions

  1. Having a gun in your household may be bad for your health. Not by the hand of another, but by your own hand. 
  2. Homicides and firearms: Whether you live in a high percentage ownership state or a low ownership state, your chances of being a homicide statistic by way of a firearm are about the same. 
The US has the 87th highest homicide rate (out of 219) in the world. We have the 48th highest suicide rate (but we are lower than Sweden).

Comments?

remoteprogrammingguru@gmail.com














Friday, July 27, 2018

Drug Deaths: 1999 to 2016 and Predicting Outcomes in Future Years



The Centers for Disease Control (CDC) has a comprehensive online database known as Wonder (https://wonder.cdc.gov) that is accessible to all. So if you have public health related questions, the data to answer them can be found in Wonder.

Unless you've been living under a rock, you know that deaths from drug overdoses, particularly opioid related deaths, have been steadily increasing. I am interested not only in the number of deaths, but also in the rate of increase and what that suggests for the future. I believe you will find the results of my analysis both interesting and troubling, particularly for the future.

Here's a chart showing the number of deaths from 1999 to 2016:











From 1999 to 2016 the drug related deaths increased from 19,128 to 63,632. I calculated multiple trend lines. The best fitting model is the polynomial you see above (based on the R squared value, the closer to 1, the better the fit). That means that the rate of increase is accelerating.
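As a rough complement to the trend lines, the two endpoint counts quoted above can be turned into an average annual growth rate. This is a sketch of a different summary measure (constant compound growth), not the polynomial model fitted in the chart, and the function name is my own.

```python
def average_annual_growth(start: float, end: float, years: int) -> float:
    """Constant yearly growth rate that takes `start` to `end` over `years` years."""
    return (end / start) ** (1 / years) - 1


# Endpoint counts as quoted in the post: 19,128 deaths (1999) and 63,632 (2016).
growth = average_annual_growth(19_128, 63_632, 2016 - 1999)
print(f"{growth:.1%}")  # roughly 7.3% per year
```

An accelerating trend means the actual year to year growth was lower than this average in the early years and higher in the later ones.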

There's a problem with using actual number of deaths when determining trends. The US population is increasing and the number of deaths does not account for that. The number of deaths provides us with an understanding of just how bad the problem is, but it's not the measure to use when constructing a predictive model.

The CDC calculates what they call a "crude rate," that is, the number of deaths in any year per 100,000 population. The crude rate allows us to control for population growth in our analysis.
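The crude rate is simple arithmetic, sketched below. The death count is the 2016 figure quoted earlier in this post; the population of 323 million is my own approximate assumption for 2016, not a number taken from Wonder.

```python
def crude_rate(deaths: int, population: int, per: int = 100_000) -> float:
    """Deaths per `per` people in the population for a given year."""
    return deaths / population * per


# 63,632 deaths is from the post; 323,000,000 is an assumed approximate
# 2016 US population, used here only for illustration.
rate_2016 = crude_rate(63_632, 323_000_000)
print(f"{rate_2016:.1f} per 100,000")
```

Because the population appears in the denominator, a rising crude rate cannot be explained away by population growth alone.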

Let's look at the graph of the data using "crude rate" as the measure:



The curves are similar. The best fitting trend line is a polynomial; however, it does not differ greatly from a linear trend line. Any way you view this, the trends are concerning.

Let's use the trend line to make predictions about the future: from 1999 to 2025.
























This uses the trend line equation to predict the crude rate out to 2025. It shows a steady increase in the drug related death rate. With this rate of increase, the number of deaths per year would exceed 100,000 in 2023 to 2024. If this model holds, expect at least 250,000 drug related deaths during the 2020s. However, as bad as this is, it can get worse. Consider the analysis discussed below.

Some of the earlier years may be hiding a trend line that in more recent times may be much worse. Allow me to show you:
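For anyone who wants to try this kind of projection, here is a minimal sketch. It swaps in a straight-line least-squares fit for simplicity instead of the polynomial from the chart. The function names, the five sample rates and the future population figure are all placeholders, not the Wonder data or an official population projection.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept


def deaths_from_rate(rate_per_100k: float, population: float) -> float:
    """Convert a crude rate back into an absolute number of deaths."""
    return rate_per_100k * population / 100_000


# Made-up crude rates for five years (NOT real data):
years = [2012, 2013, 2014, 2015, 2016]
rates = [13.0, 14.0, 15.0, 16.0, 17.0]

slope, intercept = fit_line(years, rates)
projected_rate_2025 = slope * 2025 + intercept
# 340 million is a placeholder population for 2025.
print(deaths_from_rate(projected_rate_2025, 340_000_000))
```

Turning a projected rate into a projected death count always requires a separate assumption about the future population.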







Instead of looking at all years, I decided to look at the more recent years and determine if the trend line had developed a steeper rise in recent years. The calculated trend line from 2008 to 2016 shows a much steeper rise than the trend line calculated from 1999 to 2016. Using this calculated trend line and extrapolating to 2025, this is what appears:
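The comparison between the full period and the recent years comes down to fitting the same kind of trend line on two different windows of the data. The sketch below assumes a straight-line fit. The helper name and the sample series are invented for illustration and are not the actual crude rates.

```python
def slope_from(years, values, first_year):
    """Least-squares slope using only the data points from first_year onward."""
    pts = [(y, v) for y, v in zip(years, values) if y >= first_year]
    n = len(pts)
    mean_x = sum(y for y, _ in pts) / n
    mean_y = sum(v for _, v in pts) / n
    return (sum((y - mean_x) * (v - mean_y) for y, v in pts)
            / sum((y - mean_x) ** 2 for y, _ in pts))


# Made-up series that grows faster in its second half (NOT real data):
years = list(range(2000, 2010))
values = [1, 2, 3, 4, 5, 7, 9, 11, 13, 15]

full_slope = slope_from(years, values, 2000)
recent_slope = slope_from(years, values, 2005)
print(f"all years: {full_slope:.2f} per year, recent years: {recent_slope:.2f} per year")
```

When the recent window gives a clearly larger slope than the full window, the early years are flattening the overall trend line.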
























Using the equation derived from the 2008 to 2016 data, the picture that arises is much more concerning. In fact, the crude rate in 2025 is twice the rate predicted by the trend line equation derived from the 1999 to 2016 data. This suggests that the number of drug related deaths would be near 500,000 by the mid 2020s and that the number during the 2020s as a whole would be closer to 1 million to 1,500,000, with no fewer than 100,000 and possibly up to 150,000 deaths each year. Most of these deaths would come about as a result of opioid overdoses.

I have read reports from others who suggest that 500,000 drug related deaths for the 2020s would make for a terrible crisis. All indications are that this crisis will be far, far worse. Powerful synthetic and more deadly opioids such as Fentanyl and Carfentanil have shown increasing usage. They are cheap, easily produced and easily smuggled.

Approximately 1,264,000 soldiers have died in all of America's wars, spanning the Revolutionary War, the Civil War, World War II and up to today. We could see the same number of deaths or more in the 2020s, largely as a result of opioid overdoses.

Caveats

As a rule, the more years of data available, the more confident you can be in the results. Thus, in spite of what appears to be a clear acceleration in the crude rate, one should have greater confidence in the trend line equation, our predictive model, derived from the data collected between 1999 and 2016 than in the equation derived from the data collected from 2008 to 2016.

However, the data from 2008 to 2016 cannot be ignored. Although there is less of it, it is the more recent data and may be indicative of an intensification of underlying processes driving towards an increasing death rate. This is the difference between predictive and explanatory: the difference in the trend lines is only suggestive. But it's probably worth the effort to determine whether the causal forces driving a possible dramatic shift in the rate of increase have somehow changed. That is out of my area of expertise. 

So I leave that to the experts ... I'm just running the numbers. 

remoteprogrammingguru@gmail.com

Predictive vs. Explanatory Models


Before discussing specific topics, I want to be clear about the difference between predictive and explanatory models.

  • Predictive models predict outcomes. 
  • Explanatory models both predict outcomes and define the causal relationships or processes that underlie those predictions. They explain why things happen as they do.
Oftentimes the predictive model does appear to explain the predicted outcome. It may suggest a possible causal relationship. However, suggestion of a causal relationship is just that, a suggestion.

remoteprogrammingguru@gmail.com