Why is Statistics Important? (7 Reasons Statistics Matters!)
Data collection, analysis, interpretation, and presentation are the main areas of focus in the study of statistics.
An unprecedented amount of data is being generated and gathered as technology becomes more and more integrated into our daily lives.
Statistics plays a crucial role in utilizing this data for the following purposes:
1. Gain a comprehensive understanding of the world around us.
2. Make informed decisions based on the data.
3. Predict future outcomes using data.
In this article, I will discuss seven reasons why statistics is essential in modern life.
Reason 1: Utilizing Descriptive Statistics for a Deeper Understanding
Descriptive statistics are used to summarize a set of raw data. There are three main types of descriptive statistics:
- Summary statistics
- Charts
- Tables
Each of these can help in gaining a better understanding of the available data.
For instance, imagine we have a dataset representing the batting averages of 500 cricket players in a particular league. By employing descriptive statistics, we can:
- Calculate the average batting average and the standard deviation to understand the overall performance level of the players.
- Visualize the distribution of batting averages through a histogram or boxplot, which can help identify patterns and outliers in player performance.
- Comprehend the distribution of batting averages by creating a frequency table, which shows the frequency of players falling into different batting average ranges.
Using descriptive statistics, we can understand the batting averages of these cricket players much more easily than by staring at the raw data.
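As a rough illustration of the three steps above, here is a minimal Python sketch. The 500 batting averages are synthetic values generated for the example (the mean of 30 and spread of 10 are assumptions, not real league data), and the bin width of 10 is an arbitrary choice.

```python
import random
import statistics
from collections import Counter

# Synthetic stand-in for the 500 batting averages (illustrative values only).
random.seed(0)
batting_averages = [max(0.0, random.gauss(30, 10)) for _ in range(500)]

# Summary statistics: center and spread of the data.
mean_avg = statistics.mean(batting_averages)
std_avg = statistics.stdev(batting_averages)
print(f"mean = {mean_avg:.2f}, standard deviation = {std_avg:.2f}")

# Frequency table: number of players in each bin of width 10 (0-10, 10-20, ...).
bin_width = 10
frequency = Counter(int(avg // bin_width) * bin_width for avg in batting_averages)

# Text histogram: one '#' for every 5 players in a bin.
for lower in sorted(frequency):
    count = frequency[lower]
    print(f"{lower:>3}-{lower + bin_width:<3} {count:>4} {'#' * (count // 5)}")
```

The printed table doubles as a crude histogram, so bins with unusually few players (possible outliers) stand out at a glance.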
Reason 2: Recognizing Misleading Charts
Charts are more common than ever in today’s publications, including magazines, newspapers, and online articles. Unfortunately, charts can be deceptive if the underlying data is not understood.
For example, suppose a journal publishes a chart from a study that finds a negative correlation between GPA and ACT scores for students at a certain university.
However, this negative correlation occurs only because of how the sample was selected: students who have both a high GPA and a high ACT score may go to an elite university instead, while students who have both a low GPA and a low ACT score do not get admitted at all.
In the general population, there is a positive correlation between ACT and GPA, but the restricted sample suggests a negative correlation.
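The effect can be reproduced with a small simulation. This is only a sketch under stated assumptions: the scores are synthetic and standardized, and the "admission band" (a combined score between 1.0 and 2.5) is a hypothetical rule invented to mimic a university that loses the strongest applicants and rejects the weakest.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

random.seed(1)

# Synthetic population: GPA and ACT (standardized) share a common "ability"
# component, so they are positively correlated overall.
population = []
for _ in range(20000):
    ability = random.gauss(0, 1)
    gpa = ability + random.gauss(0, 1)
    act = ability + random.gauss(0, 1)
    population.append((gpa, act))

# Hypothetical selection rule: only students whose combined score falls in a
# middle band end up enrolled at this particular university.
enrolled = [(g, a) for g, a in population if 1.0 <= g + a <= 2.5]

pop_corr = pearson([g for g, _ in population], [a for _, a in population])
sample_corr = pearson([g for g, _ in enrolled], [a for _, a in enrolled])
print(f"population correlation: {pop_corr:.2f}")
print(f"enrolled-sample correlation: {sample_corr:.2f}")
```

Within the band, a student with a higher GPA must have a lower ACT score to stay inside it, which is what flips the sign of the correlation.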
Reason 3: To Make Better Decisions Using Probability
Probability is one of the most significant subfields of statistics. This area of study examines how likely certain events are to occur.
By having a basic understanding of probability, you can make more informed decisions in the real world.
Consider a high school student who knows that their chance of being admitted to any given university is 10%. Using the formula for the probability of at least one success, this student can find the probability that they'll get accepted to at least one of the universities they apply to and adjust the number of applications accordingly.
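The formula in question is P(at least one success) = 1 - (1 - p)^n, where p is the chance of success on each attempt and n is the number of attempts. A minimal sketch follows; it assumes the admission decisions are independent and that the 10% chance is the same at every university, which is a simplification. The function names and the 50% target are illustrative.

```python
def prob_at_least_one(p, n):
    """Probability of at least one success in n independent tries,
    each with success probability p."""
    return 1 - (1 - p) ** n

def applications_needed(p, target):
    """Smallest number of applications whose at-least-one-acceptance
    probability reaches the target (requires 0 < p <= 1 and 0 <= target < 1)."""
    if not (0 < p <= 1) or not (0 <= target < 1):
        raise ValueError("need 0 < p <= 1 and 0 <= target < 1")
    n = 1
    while prob_at_least_one(p, n) < target:
        n += 1
    return n

# Chance of at least one acceptance for different numbers of applications.
for n in (1, 5, 10, 20):
    print(n, round(prob_at_least_one(0.10, n), 4))

# Applications needed for at least a 50% chance of one acceptance.
print(applications_needed(0.10, 0.50))
```

Under these assumptions, a single application gives a 10% chance, while the probability climbs steadily with each additional application.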
Reason 4: To Be Wary of Confounding Variables
The idea of confounding variables is one of the key concepts you will learn in statistics.
These uncontrolled factors can skew the results of an experiment and produce unreliable conclusions.
For example, suppose a researcher collects data on ice cream sales and shark attacks and finds that the two variables are highly correlated. Does this mean that increased ice cream sales cause more shark attacks?
That seems unlikely. The more likely cause is the confounding variable, temperature. When the weather is warmer, more people buy ice cream and more people swim in the ocean.
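A small simulation can make this concrete. In the sketch below, all numbers are synthetic: temperature drives both ice cream sales and a made-up shark attack index, and the two never influence each other directly. Removing the part of each variable that temperature explains (by taking residuals from a straight-line fit) should leave little correlation behind.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def residuals(ys, xs):
    """What is left of ys after a least-squares straight-line fit on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [y - (mean_y + slope * (x - mean_x)) for x, y in zip(xs, ys)]

random.seed(2)

# Synthetic data: temperature is the only thing linking the two variables.
temperature = [random.uniform(10, 35) for _ in range(5000)]
ice_cream_sales = [20 * t + random.gauss(0, 50) for t in temperature]
shark_attacks = [0.3 * t + random.gauss(0, 1) for t in temperature]  # made-up index

raw_corr = pearson(ice_cream_sales, shark_attacks)
adjusted_corr = pearson(residuals(ice_cream_sales, temperature),
                        residuals(shark_attacks, temperature))
print(f"raw correlation: {raw_corr:.2f}")
print(f"correlation after removing temperature: {adjusted_corr:.2f}")
```

The raw correlation is strong, but once temperature is accounted for the remaining correlation is close to zero, which is what we would expect if temperature is the confounder.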
Reason 5: To Understand P-Values in Research
Another important concept that we will learn about in statistics is p-values. The textbook definition of a p-value is:
A p-value is the probability of observing a sample statistic that is at least as extreme as your sample statistic, given that the null hypothesis is true.
For example, suppose a factory claims that it produces tires with a mean weight of 200 pounds. An auditor hypothesizes that the true mean weight of tires produced at this factory is different from 200 pounds, so the auditor runs a hypothesis test and finds that the p-value of the test is 0.04.
Here is how to interpret this p-value:
If the factory does indeed produce tires that have a mean weight of 200 pounds, then about 4% of all audits would observe an effect at least as large as the one in this sample purely because of random sampling error. This tells us that obtaining the sample data that the auditor did would be fairly rare if the factory really produced tires with a mean weight of 200 pounds.
Thus, at the commonly used 0.05 significance level, the auditor would reject the null hypothesis that the true mean weight of tires produced at this factory is 200 pounds.
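To show where such a p-value can come from, here is a minimal sketch of a two-sided z-test. The article does not say which test the auditor ran or what the sample looked like, so everything here is an assumption: the sample size (100 tires), sample mean (202.05 pounds), and sample standard deviation (10 pounds) are hypothetical values chosen so the result lands near 0.04, and the z-test is used as a large-sample approximation.

```python
import math
from statistics import NormalDist

def two_sided_p_value(sample_mean, null_mean, sample_sd, n):
    """Two-sided p-value from a z-test (large-sample approximation)."""
    standard_error = sample_sd / math.sqrt(n)
    z = (sample_mean - null_mean) / standard_error
    # Probability of a test statistic at least this extreme in either tail,
    # assuming the null hypothesis is true.
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical audit sample (not from the article).
p_value = two_sided_p_value(sample_mean=202.05, null_mean=200, sample_sd=10, n=100)
print(f"p-value = {p_value:.3f}")

alpha = 0.05  # conventional significance level
print("reject the null hypothesis" if p_value < alpha else "fail to reject the null hypothesis")
```

With a small sample, a t-test would usually be the more appropriate choice; the z-test is used here only because it needs nothing beyond the standard library.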
Reason 6: To Understand Correlation
Another important concept that we will learn about in statistics is correlation, which measures the strength and direction of the linear association between two variables.
The value for a correlation coefficient always ranges between -1 and 1 where:
- -1 indicates a perfect negative linear correlation between two variables
- 0 indicates no linear correlation between two variables
- 1 indicates a perfect positive linear correlation between two variables
By understanding these values, we can understand the relationship between variables in the real world.
For example, if the correlation between advertising spending and revenue is 0.87, then we know there is a strong positive linear relationship between the two variables. As more money is spent on advertising, revenue tends to increase in a fairly predictable way, although correlation alone does not prove that the advertising caused the increase.
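For readers who want to see how a correlation coefficient is calculated, here is a minimal sketch of the Pearson correlation coefficient. The advertising and revenue figures are hypothetical values made up for the example and are not meant to reproduce the 0.87 mentioned above.

```python
import math

def pearson(xs, ys):
    """Pearson correlation: covariance of x and y divided by the product
    of their standard deviations (the 1/n factors cancel out)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical monthly figures, in thousands of dollars.
ad_spend = [10, 15, 20, 25, 30, 35, 40]
revenue = [120, 150, 145, 190, 185, 230, 220]

r = pearson(ad_spend, revenue)
print(f"correlation between ad spend and revenue: {r:.2f}")
```

A value close to 1 means the points lie near an upward-sloping straight line, and a value close to -1 means they lie near a downward-sloping one.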
Reason 7: To Make Predictions About the Future
Another important reason to learn statistics is to understand basic regression models such as:
- Simple Linear Regression
- Multiple Linear Regression
- Logistic Regression
Each of these models allows us to make predictions about the future value of some response variable based on the values of certain predictor variables in the model.
For example, multiple linear regression models are used all the time in the real world by businesses when they use predictor variables such as age, income, ethnicity, etc. to predict how much customers will spend at their stores.
Similarly, logistics companies use predictor variables like total demand, population size, etc. to forecast future sales.
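As an illustration of the simplest of these models, here is a minimal sketch of simple linear regression fitted by ordinary least squares. The income and spending figures are hypothetical, and the function names are chosen for the example; a real analysis would normally use a statistics library and check the model's assumptions.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for the model y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Predicted response for a new predictor value."""
    return intercept + slope * x

# Hypothetical customers: annual income (thousands of dollars) and
# annual spending at the store (dollars).
income = [30, 45, 50, 60, 75, 90]
spending = [400, 520, 580, 640, 790, 900]

slope, intercept = fit_line(income, spending)
print(f"spending = {intercept:.1f} + {slope:.2f} * income")
print(f"predicted spending at income 70: {predict(70, slope, intercept):.0f}")
```

Multiple linear regression extends the same idea to several predictor variables at once, and logistic regression applies it to yes/no outcomes.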