Real-life Application of Statistics and Probability

Winning or losing a lottery is one of the most interesting examples of probability. In a typical lottery game, each player chooses six distinct numbers from a particular range. If all six numbers on a ticket match those of the winning draw, the ticket holder is a jackpot winner, regardless of the order of the numbers.
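To make the lottery example concrete, here is a minimal sketch of how the jackpot probability could be computed. The range of 1 to 49 is an assumption for illustration only (the text does not name a range), and the function name is hypothetical. Because order does not matter, the number of possible tickets is the number of combinations of 6 numbers from the range.

```python
from math import comb

def jackpot_probability(range_size: int, picks: int = 6) -> float:
    """Probability that a single ticket matches every winning number (order ignored)."""
    # Number of equally likely unordered selections of `picks` distinct numbers.
    total_combinations = comb(range_size, picks)
    return 1 / total_combinations

# Assumed game for illustration: choose 6 distinct numbers from 1..49.
print(comb(49, 6))               # 13983816 possible tickets
print(jackpot_probability(49))   # roughly 7.15e-08
```

Under that assumption, a single ticket has about one chance in fourteen million of winning the jackpot.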

Probability helps in choosing the insurance plan that best suits you and your family. For example, if you are an active smoker, your chances of developing lung disease are higher. So, instead of first choosing an insurance scheme for your vehicle or house, you may opt for health insurance first, because your chance of getting sick is higher. Similarly, many people now insure their mobile phones because they know that the chances of a phone being damaged or lost are high.

Many political analysts use probability to predict the outcome of elections. For example, they may predict that a certain political party will come into power based on the results of exit polls.
Sub-Topics List:
- Random Variables
- Values of a Random Variable
- Probability Distribution for a Discrete Random Variable and its Properties
- Probabilities Corresponding to a Given Random Variable
- The Mean and Variance of a Discrete Random Variable
- Normal Random Variables
- Regions Under the Normal Curve Corresponding to Different Standard Normal Values
- Conversion of Normal Random Variable to a Standard Normal Variable and Vice Versa
- Probabilities and Percentiles Using the Standard Normal Table
- Parameter and Statistics
- Sampling Distributions of Statistics
- The Mean and Variance of the Sampling Distribution of the Sample Mean
- Central Limit Theorem
- Sampling Distribution of the Sample Mean Using the Central Limit Theorem
- The T-Distribution
- Identifying Percentiles Using the T-table
- Length of a Confidence Interval
- Sample Size Using the Length of the Interval
- (a) null hypothesis; (b) alternative hypothesis; (c) level of significance; (d) rejection region; and (e) types of errors in hypothesis testing
- Null and Alternative Hypotheses on a Population Mean
- Z-Test
- T-Test
- Using the Central Limit Theorem
- Rejection Region
- Unknown Variance
- Known Variance
- Hypothesis Testing for Population Mean with Known and Unknown Population Standard Deviation
- Rejection Value and Rejection Region
- Hypothesis Test for a Population Proportion
- Population Proportion
- Test-Statistic
- Independent and Dependent Variables
- The Value of the Dependent Variable Given the Value of the Independent Variable
- Bivariate Data
- Scatter Plot
- Pearson’s Correlation Coefficient
- Regression Slope Intercept

Facts are stubborn, but
statistics are more pliable.
Mark Twain
Sub-Topics
Random Variables
A discrete random variable is one which may take on only a countable number of distinct values, such as 0, 1, 2, 3, 4, and so on. Discrete random variables are usually (but not necessarily) counts. If a random variable can take only a finite number of distinct values, then it must be discrete.
Examples of discrete random variables include the number of children in a family, the Friday night attendance at a cinema, the number of patients in a doctor’s surgery, and the number of defective light bulbs in a box of ten.
Continuous
A continuous random variable is one which can take any of infinitely many possible values within an interval. Continuous random variables are usually measurements. Examples include height, weight, the amount of sugar in an orange, and the time required to run a mile. A continuous random variable does not assign positive probability to any single specific value; probabilities are defined over intervals of values.
When do we use a continuous random variable? When the data can take infinitely many values within a range. For example, a random variable measuring the time taken to complete a task is continuous, since there are infinitely many possible times.
Values of a Random Variable
Probability Distribution for a Discrete Random Variable and its Properties
The function p(x) = P(X = x), defined for each x within the range of X, is called the probability distribution of X. It is often called the probability mass function of the discrete random variable X. Every value satisfies 0 ≤ p(x) ≤ 1, and the values of p(x) sum to 1 over the range of X.
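As an illustration, the sketch below stores a probability mass function as a dictionary and checks the two standard properties of such a function: each probability lies between 0 and 1, and the probabilities sum to 1. The example variable (number of heads in two fair coin tosses) and the helper name `is_valid_pmf` are assumptions chosen for illustration.

```python
from fractions import Fraction

# Assumed example: X = number of heads in two fair coin tosses.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def is_valid_pmf(p: dict) -> bool:
    """Check that every probability is in [0, 1] and that they sum to exactly 1."""
    each_in_range = all(0 <= prob <= 1 for prob in p.values())
    sums_to_one = sum(p.values()) == 1
    return each_in_range and sums_to_one

print(is_valid_pmf(pmf))    # True
print(pmf[1] + pmf[2])      # P(X >= 1) = 3/4
```

Exact fractions are used here so that the sum-to-one check is not affected by floating-point rounding.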
Probabilities Corresponding to a Given Random Variable
The Mean and Variance of a Discrete Random Variable
For a discrete random variable X with probability function p_X(x), the mean is μ = ∑ x p_X(x), and the variance of X is obtained as follows: var(X) = ∑ (x − μ)^2 p_X(x), where each sum is taken over all values of x for which p_X(x) > 0. The variance of X is therefore the weighted average of the squared deviations from the mean μ, where the weights are given by the probability function p_X(x) of X.
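The weighted-average computation can be sketched as follows. The fair six-sided die is an assumed example, and the function names are hypothetical; the mean is computed as the probability-weighted sum of the values, which is the μ used in the variance.

```python
from fractions import Fraction

def mean(pmf: dict) -> Fraction:
    """Mean: sum of each value times its probability."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf: dict) -> Fraction:
    """Variance: probability-weighted average of squared deviations from the mean."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

# Assumed example: one roll of a fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}

print(mean(die))       # 7/2
print(variance(die))   # 35/12
```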
Normal Random Variables
Regions Under the Normal Curve Corresponding to Different Standard Normal Values
Conversion of Normal Random Variable to a Standard Normal Variable and Vice Versa
Probabilities and Percentiles Using the Standard Normal Table
Parameter and Statistics
Sampling Distributions of Statistics
The Mean and Variance of the Sampling Distribution of the Sample Mean
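Formula: μ_M = μ, where μ_M is the mean of the sampling distribution of the sample mean and μ is the population mean.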
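A small exact check of this relationship is sketched below. It enumerates every possible sample of size n drawn with replacement from a tiny population and compares the mean of the sample means with the population mean. It also checks the standard companion result for the variance, namely that the variance of the sample mean equals the population variance divided by n; that result is an addition here, and it holds for independent draws. The population values and function names are assumptions for illustration.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4, 5]   # assumed toy population
n = 2                          # assumed sample size

def pop_mean(values):
    """Mean of a complete list of equally likely values."""
    return Fraction(sum(values), len(values))

def pop_variance(values):
    """Variance of a complete list of equally likely values (divide by N)."""
    mu = pop_mean(values)
    return sum((Fraction(v) - mu) ** 2 for v in values) / len(values)

# Every equally likely ordered sample of size n, drawn with replacement.
sample_means = [Fraction(sum(s), n) for s in product(population, repeat=n)]

print(pop_mean(sample_means) == pop_mean(population))               # True
print(pop_variance(sample_means) == pop_variance(population) / n)   # True
```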
Central Limit Theorem
Sampling Distribution of the Sample Mean Using the Central Limit Theorem
The T-Distribution
Identifying Percentiles Using the T-table
Length of a Confidence Interval
Sample Size Using the Length of the Interval
(a) null hypothesis; (b) alternative hypothesis; (c) level of significance; (d) rejection region; and (e) types of errors in hypothesis testing
Null and Alternative Hypotheses on a Population Mean
Z-Test
When the population variance is known, the z-test is used. The test statistic is assumed to follow a normal distribution, and nuisance parameters such as the standard deviation must be known for the z-test to be accurate.
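Formula: z = (x̄ − μ0) / (σ / √n), where x̄ is the sample mean, μ0 is the hypothesized population mean, σ is the known population standard deviation, and n is the sample size.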
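As a rough sketch, the z statistic for a single population mean might be computed as below, assuming the standard one-sample form z = (x̄ − μ0) / (σ / √n) with a known population standard deviation σ. The function names and the numbers are hypothetical; `NormalDist` comes from Python's standard `statistics` module.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(sample_mean: float, mu0: float, sigma: float, n: int) -> float:
    """One-sample z statistic, assuming the population standard deviation is known."""
    return (sample_mean - mu0) / (sigma / sqrt(n))

def two_sided_p_value(z: float) -> float:
    """Probability of a standard normal value at least as extreme as z, in either tail."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical data: sample mean 52 from n = 36 observations, sigma = 6, testing mu0 = 50.
z = z_statistic(sample_mean=52.0, mu0=50.0, sigma=6.0, n=36)
print(z)                                # 2.0
print(round(two_sided_p_value(z), 4))   # 0.0455
```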

T-Test
When the population variance is unknown, however, the t-test is used. It is a type of inferential statistic used to determine whether there is a significant difference between the means of two groups, which may be related in certain features. It is mostly used when the data, such as the outcomes recorded from flipping a coin 100 times, are expected to follow a normal distribution and may have unknown variances.
Formula:

Using the Central Limit Theorem
It is important for you to understand when to use the central limit theorem. If you are being asked to find the probability of the mean, use the CLT for the mean. If you are being asked to find the probability of a sum or total, use the CLT for sums. This also applies to percentiles for means and sums.
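Formula: for the mean, z = (x̄ − μ) / (σ / √n); for the sum, z = (Σx − nμ) / (√n σ), where μ and σ are the population mean and standard deviation and n is the sample size.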
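The distinction between the CLT for means and the CLT for sums can be sketched as follows. The sketch assumes the usual normal approximations: the sample mean is approximately normal with mean μ and standard deviation σ / √n, and the sum is approximately normal with mean nμ and standard deviation √n σ. The numbers and function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def mean_distribution(mu: float, sigma: float, n: int) -> NormalDist:
    """Approximate distribution of the sample mean under the CLT."""
    return NormalDist(mu, sigma / sqrt(n))

def sum_distribution(mu: float, sigma: float, n: int) -> NormalDist:
    """Approximate distribution of the sample sum under the CLT."""
    return NormalDist(n * mu, sqrt(n) * sigma)

# Hypothetical population: mean 90, standard deviation 15, samples of size 25.
xbar = mean_distribution(90, 15, 25)
total = sum_distribution(90, 15, 25)

# Probability that the sample mean falls between 85 and 92.
p_mean = xbar.cdf(92) - xbar.cdf(85)
print(round(p_mean, 4))

# 90th percentile of the sample sum.
print(round(total.inv_cdf(0.90), 1))
```

The same two objects answer both probability and percentile questions: `cdf` for probabilities and `inv_cdf` for percentiles.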

Rejection Region
After the test statistic in a significance test is realized (that is, a posteriori) and compared to the rejection region, the test decision is either correct or in error; there is no “probability” of correctness. However, within this limitation in the interpretation of classical statistical testing, p-values are a frequently used method for quantifying the “weight of evidence” against the working hypothesis.
Unknown Variance
If the variance is unknown, the t statistic is used in place of the z statistic.
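Formula: t = (x̄ − μ0) / (s / √n), where s is the sample standard deviation; the statistic has n − 1 degrees of freedom.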
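A minimal sketch of the t statistic for a single population mean is given below, assuming the standard one-sample form t = (x̄ − μ0) / (s / √n) with n − 1 degrees of freedom. The sample values and the function name are hypothetical. The Python standard library has no t-distribution, so the sketch stops at the statistic; the critical value would be read from a t-table.

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic(sample, mu0):
    """Return the one-sample t statistic and its degrees of freedom."""
    n = len(sample)
    s = stdev(sample)   # sample standard deviation (divides by n - 1)
    t = (mean(sample) - mu0) / (s / sqrt(n))
    return t, n - 1

# Hypothetical sample, testing the claim that the population mean is 2.
sample = [1, 2, 3, 4, 5]
t, df = t_statistic(sample, mu0=2)
print(round(t, 4), df)   # 1.4142 4
```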

Known Variance
Formula: z = (x̄ − μ0) / (σ / √n), where σ is the known population standard deviation.

Hypothesis Testing for Population Mean with Known and Unknown Population Standard Deviation
There are two approaches for conducting a hypothesis test for a population mean: the critical value approach and the P-value approach.
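The two approaches can be compared side by side. The sketch below assumes a two-sided z-test at significance level α = 0.05; the function names and the example z values are hypothetical. For any given test statistic, both approaches lead to the same decision.

```python
from statistics import NormalDist

def critical_value_decision(z: float, alpha: float) -> bool:
    """Reject the null hypothesis if |z| exceeds the two-sided critical value."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > z_crit

def p_value_decision(z: float, alpha: float) -> bool:
    """Reject the null hypothesis if the two-sided P-value is below alpha."""
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha

# Each row: z, decision by critical value, decision by P-value (True means reject).
for z in (1.5, 2.0, 2.8):
    print(z, critical_value_decision(z, 0.05), p_value_decision(z, 0.05))
# 1.5 False False
# 2.0 True True
# 2.8 True True
```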
Rejection Value and Rejection Region
The rejection value (critical value) at a given significance level can be thought of as a cut-off point. If a test statistic on one side of the critical value results in failing to reject the null hypothesis, a test statistic on the other side will result in rejecting it.
Hypothesis Test for a Population Proportion
The null hypothesis is a hypothesis that the proportion equals a specific value, p0. The alternative hypothesis is the competing claim that the parameter is less than, greater than, or not equal to p0.
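A test of this kind might be sketched as follows, assuming the usual large-sample z statistic z = (p̂ − p0) / √(p0(1 − p0) / n), where p̂ is the sample proportion. The function names, the `alternative` labels, and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z(successes: int, n: int, p0: float) -> float:
    """Large-sample z statistic for testing the null hypothesis p = p0."""
    p_hat = successes / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

def p_value(z: float, alternative: str = "two-sided") -> float:
    """P-value for the alternative p < p0 ('less'), p > p0 ('greater'), or p != p0."""
    std_normal = NormalDist()
    if alternative == "less":
        return std_normal.cdf(z)
    if alternative == "greater":
        return 1 - std_normal.cdf(z)
    return 2 * (1 - std_normal.cdf(abs(z)))

# Hypothetical data: 60 successes in 100 trials, testing p0 = 0.5.
z = proportion_z(successes=60, n=100, p0=0.5)
print(round(z, 6))                         # 2.0
print(round(p_value(z, "greater"), 4))     # 0.0228
print(round(p_value(z, "two-sided"), 4))   # 0.0455
```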
Population Proportion
Test-Statistic
Independent and Dependent Variables
The Value of the Dependent Variable Given the Value of the Independent Variable
Bivariate Data
For example, if you are studying a group of students to find out how their science scores relate to their age, you have two variables (science score and age), one treated as the independent variable and the other as the dependent variable. Data of this kind are bivariate.
If, on the other hand, you are studying only one variable, such as the science scores of those students, then you have univariate data.
Scatter Plot
Pearson’s Correlation Coefficient
Formula: r = [NΣxy − (Σx)(Σy)] / √([NΣx^2 − (Σx)^2][NΣy^2 − (Σy)^2])
In the formula, r refers to the correlation coefficient, N to the number of pairs of scores, Σxy to the sum of the products of paired scores, Σx to the sum of x scores, Σy to the sum of y scores, Σx^2 to the sum of squared x scores, and Σy^2 to the sum of squared y scores.
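The sums named above can be combined in code as sketched below, assuming the computational form r = [NΣxy − (Σx)(Σy)] / √([NΣx^2 − (Σx)^2][NΣy^2 − (Σy)^2]). The function name and the age and score data are hypothetical; the data were chosen to be perfectly linear so the result is easy to check by hand.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient from the raw sums over paired scores."""
    n = len(xs)                                    # N: number of pairs
    sum_x, sum_y = sum(xs), sum(ys)                # Σx and Σy
    sum_xy = sum(x * y for x, y in zip(xs, ys))    # Σxy
    sum_x2 = sum(x * x for x in xs)                # Σx^2
    sum_y2 = sum(y * y for y in ys)                # Σy^2
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical bivariate data: student ages and science scores.
ages = [12, 13, 14, 15, 16]
scores = [70, 74, 78, 82, 86]

print(pearson_r(ages, scores))         # 1.0 (perfect positive linear relationship)
print(pearson_r(ages, scores[::-1]))   # -1.0 (perfect negative linear relationship)
```

The function divides by zero if either variable is constant, since the correlation is undefined in that case.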
Regression Slope Intercept


