Hello everyone! Today I’ll be writing a bit about descriptive and inferential statistics.
Data scientists invest a lot of time in pre-processing data, and doing that well requires a good understanding of statistics. Statistics is the branch of mathematics concerned with the collection, presentation, analysis, and interpretation of data. Statistical modeling lies at the heart of data science and analysis!
Two main statistical methods are used in data analysis: descriptive statistics and inferential statistics. Descriptive statistics is solely concerned with the properties of the observed data, and it does not rest on the assumption that the data come from a larger population. A quick note on terminology: in machine learning, the term ‘inference’ is sometimes used in a different sense, to mean ‘make a prediction by evaluating an already trained model’. That is not the statistical inference discussed in this post.
Some of the main concepts that fall under descriptive statistics are as follows:
- Measures of central tendency and position (Mean, Median, Mode, Percentiles and Quantiles)
- Measures of variability (Range, Variance and Standard Deviation, Standard Error)
- Measures of association between two variables (Covariance, Correlation Coefficient)
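To make the list above a bit more concrete, here is a minimal sketch that computes most of these measures with Python's standard library. The two data lists (`views` and `signups`) are made-up numbers purely for illustration, and the helper functions `sample_covariance` and `pearson_correlation` are my own names, not part of any library.

```python
import statistics

# Hypothetical data: daily page views and sign-ups over ten days (made-up numbers).
views = [12, 15, 11, 15, 18, 20, 15, 13, 17, 14]
signups = [2, 3, 1, 3, 4, 5, 3, 2, 4, 2]

# Measures of central tendency and position
mean_views = statistics.mean(views)
median_views = statistics.median(views)
mode_views = statistics.mode(views)
quartiles = statistics.quantiles(views, n=4)  # three cut points: Q1, Q2, Q3

# Measures of variability
value_range = max(views) - min(views)
sample_variance = statistics.variance(views)   # divides by n - 1
sample_std = statistics.stdev(views)
standard_error = sample_std / len(views) ** 0.5  # standard error of the mean


def sample_covariance(xs, ys):
    """Sample covariance of two equally long sequences (divides by n - 1)."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return sum(products) / (len(xs) - 1)


def pearson_correlation(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return sample_covariance(xs, ys) / (statistics.stdev(xs) * statistics.stdev(ys))


# Measures of association between two variables
cov = sample_covariance(views, signups)
corr = pearson_correlation(views, signups)

print(f"mean={mean_views}, median={median_views}, mode={mode_views}")
print(f"range={value_range}, variance={sample_variance:.2f}, std={sample_std:.2f}")
print(f"covariance={cov:.2f}, correlation={corr:.2f}")
```

Note that everything here only describes the ten observed days. Nothing is claimed about days that were not observed, which is exactly what separates descriptive from inferential statistics.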
Inferential statistics, on the other hand, is the process of using data analysis to infer properties of an underlying probability distribution. The observed data set is assumed to be sampled from a larger population, and the goal is to draw conclusions about that population from the sample, e.g., by testing hypotheses and deriving estimates.
Inferential statistics uses probability theory to estimate population parameters, test hypotheses, and make predictions about future outcomes. It involves analyzing data to determine whether any observed differences or relationships are statistically significant, meaning that a result at least as extreme would be unlikely to occur by chance alone if there were no real effect.
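As a small illustration of estimating a population parameter from a sample, here is a sketch of a confidence interval for a population mean. The function name `mean_confidence_interval` is hypothetical, the data are simulated, and the interval uses the normal (z) approximation, which is an assumption that suits larger samples. For small samples a t-distribution is usually the better choice.

```python
import random
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Approximate confidence interval for the population mean.

    Assumes the normal (z) approximation is acceptable, i.e. the sample
    is reasonably large.
    """
    n = len(sample)
    centre = mean(sample)
    standard_error = stdev(sample) / n ** 0.5
    # Two-sided critical value, roughly 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return centre - margin, centre + margin


# Simulated sample standing in for real measurements (made-up population).
random.seed(42)
sample = [random.gauss(170, 10) for _ in range(200)]

low, high = mean_confidence_interval(sample)
print(f"sample mean: {mean(sample):.2f}")
print(f"95% CI for the population mean: ({low:.2f}, {high:.2f})")
```

Asking for more confidence (say `confidence=0.99`) widens the interval, because a wider range is needed to capture the true mean more often.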
Some of the main concepts that fall under inferential statistics are as follows (all falling under the topic of Test Statistics and Statistical Significance):
- Confidence Intervals
- Chi-Squared Tests
- A/B Testing
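Two of the items above fit together nicely: one common way to analyse an A/B test on conversion counts is a chi-squared test of independence on a 2x2 table. The sketch below is one possible approach, not the only one. The function name `chi_squared_2x2` and the visitor numbers are made up for illustration, and no continuity correction is applied.

```python
from statistics import NormalDist


def chi_squared_2x2(conv_a, total_a, conv_b, total_b):
    """Pearson chi-squared test of independence on a 2x2 table.

    Rows are the two variants, columns are converted / not converted.
    Assumes every expected cell count is positive (and ideally not tiny).
    """
    observed = [
        [conv_a, total_a - conv_a],
        [conv_b, total_b - conv_b],
    ]
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if variant and conversion were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, chi2 is the square of a standard normal
    # variable, so the p-value can be read off the normal CDF.
    p_value = 2 * (1 - NormalDist().cdf(chi2 ** 0.5))
    return chi2, p_value


# Hypothetical A/B test: A converts 120 of 1000 visitors, B converts 160 of 1000.
chi2, p_value = chi_squared_2x2(120, 1000, 160, 1000)
print(f"chi-squared = {chi2:.3f}, p-value = {p_value:.4f}")
```

A small p-value suggests the difference in conversion rates would be unlikely if both variants truly performed the same. The threshold you compare it against (0.05 is a common convention) should be chosen before looking at the data.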
Don’t forget to comment and let me know what you think!