
# How to Calculate Confidence Intervals in Python?

• Last Updated : 20 Feb, 2022

In this article, we will look at different ways to calculate confidence intervals in Python using the t distribution and the normal distribution. A confidence interval for a mean is a range of values that is likely to contain the population mean at a stated level of confidence.

Formula:

`Confidence Interval = x̄ ± t * (s / √n)`
• x̄: sample mean
• t: critical t-value that corresponds to the confidence level (with n - 1 degrees of freedom)
• s: sample standard deviation
• n: sample size
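To make the formula concrete, here is a minimal sketch that evaluates each term by hand for the 20-value dataset used in the examples of this article. The variable names (`x_bar`, `t_crit`, `margin`, `manual_ci`) are illustrative choices, and the critical value is looked up with `scipy.stats.t.ppf`.

```python
import numpy as np
import scipy.stats as st

# same 20 values as the article's small-sample examples
data = [1, 1, 1, 2, 2, 2, 3, 3, 3, 3,
        3, 4, 4, 5, 5, 5, 6, 7, 8, 10]

confidence = 0.90
n = len(data)
x_bar = np.mean(data)        # sample mean
s = np.std(data, ddof=1)     # sample standard deviation (n - 1 in the denominator)

# two-sided critical value: (1 - confidence) / 2 is left in each tail
t_crit = st.t.ppf((1 + confidence) / 2, df=n - 1)

margin = t_crit * s / np.sqrt(n)   # t * (s / sqrt(n))
manual_ci = (x_bar - margin, x_bar + margin)
print(manual_ci)
```

The result should agree with what `t.interval()` returns for the same data and confidence level, since that function applies the same formula.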

## Method 1: Calculate Confidence Intervals Using the t Distribution

Use this approach for small datasets (n <= 30), where the population standard deviation is estimated from the sample. Call the t.interval() function from the scipy.stats library to get a confidence interval for the population mean of the given dataset.

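Syntax: st.t.interval(confidence, df, loc, scale)

Parameters:

• confidence: Probability that a random variable will be drawn from the returned range, i.e. the confidence level. Older SciPy releases name this parameter alpha, so passing it positionally works with either name.
• df: degrees of freedom, here the sample size minus one (n - 1)
• loc: location parameter, here the sample mean
• scale: scale parameter, here the standard error of the mean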

Example 1:

In this example, we use a dataset of size n=20 and calculate the 90% confidence interval with the t distribution by calling t.interval() with a confidence level of 0.90.

```python
import numpy as np
import scipy.stats as st

# define sample data
gfg_data = [1, 1, 1, 2, 2, 2, 3, 3, 3, 3,
            3, 4, 4, 5, 5, 5, 6, 7, 8, 10]

# create 90% confidence interval
# (the confidence level is passed positionally because newer SciPy
# releases call it `confidence` and older ones call it `alpha`)
ci = st.t.interval(0.90, df=len(gfg_data) - 1,
                   loc=np.mean(gfg_data),
                   scale=st.sem(gfg_data))
print(ci)
```

Output:

`(2.962098014195961, 4.837901985804038)`

Example 2:

In this example, we use the same dataset of size n=20 and calculate the 99% confidence interval with the t distribution by calling t.interval() with a confidence level of 0.99.

```python
import numpy as np
import scipy.stats as st

# define sample data
gfg_data = [1, 1, 1, 2, 2, 2, 3, 3, 3,
            3, 3, 4, 4, 5, 5, 5, 6,
            7, 8, 10]

# create 99% confidence interval
# (confidence level passed positionally; see the parameter list above)
ci = st.t.interval(0.99,
                   df=len(gfg_data) - 1,
                   loc=np.mean(gfg_data),
                   scale=st.sem(gfg_data))
print(ci)
```

Output:

`(2.3481954013214263, 5.4518045986785735)`

Interpretation from example 1 and example 2:

In example 1, the 90% confidence interval for the population mean is (2.96, 4.84), and in example 2, the 99% confidence interval is (2.35, 5.45). Both intervals are computed from the same data and are centered on the sample mean of 3.9, but the 99% interval is wider: being more confident that the interval captures the true population mean requires a wider range. We can say we are 99% confident that the interval [2.35, 5.45] contains the true population mean.
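The widening effect can be seen directly by computing several confidence levels on the same data. The following is a small sketch; the 95% level and the `widths` dictionary are additions for illustration and are not part of the examples above.

```python
import numpy as np
import scipy.stats as st

data = [1, 1, 1, 2, 2, 2, 3, 3, 3, 3,
        3, 4, 4, 5, 5, 5, 6, 7, 8, 10]

mean = np.mean(data)
sem = st.sem(data)       # standard error of the mean, s / sqrt(n)
df = len(data) - 1

widths = {}
for level in (0.90, 0.95, 0.99):
    # confidence level passed positionally to work across SciPy versions
    low, high = st.t.interval(level, df=df, loc=mean, scale=sem)
    widths[level] = high - low
    print(f"{level:.0%} interval: ({low:.2f}, {high:.2f}), width {high - low:.2f}")
```

Each interval is centered on the same sample mean; only the margin of error grows with the confidence level.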

## Method 2: Calculate Confidence Intervals Using the Normal Distribution

Use this approach for large datasets (n > 30), where the sampling distribution of the mean is approximately normal. Call the norm.interval() function from the scipy.stats library to get a confidence interval for the population mean of the given dataset.

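Syntax: st.norm.interval(confidence, loc, scale)

Parameters:

• confidence: Probability that a random variable will be drawn from the returned range, i.e. the confidence level. Older SciPy releases name this parameter alpha, so passing it positionally works with either name.
• loc: location parameter, here the sample mean
• scale: scale parameter, here the standard error of the mean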
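As a rough check of what these parameters mean, the sketch below compares `norm.interval()` with the hand-computed `mean ± z * sem`. The seeded generator and the names `sample`, `z_crit`, `manual_ci` and `scipy_ci` are assumptions made for this illustration only.

```python
import numpy as np
import scipy.stats as st

rng = np.random.default_rng(0)            # arbitrary seed, only for reproducibility
sample = rng.integers(5, 10, size=100)    # illustrative data: integers from 5 to 9

confidence = 0.90
mean = np.mean(sample)      # used as loc
sem = st.sem(sample)        # used as scale

# two-sided critical z-value (about 1.645 for a 90% level)
z_crit = st.norm.ppf((1 + confidence) / 2)

manual_ci = (mean - z_crit * sem, mean + z_crit * sem)
scipy_ci = st.norm.interval(confidence, loc=mean, scale=sem)
print(manual_ci)
print(scipy_ci)
```

The two results should match up to floating-point rounding, which shows that loc shifts the interval and scale stretches it.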

Example 3:

In this example, we use a random dataset of size n=100 and calculate the 90% confidence interval with the normal distribution by calling norm.interval() with a confidence level of 0.90. Because the data are random and no seed is set, the output changes from run to run.

```python
import numpy as np
import scipy.stats as st

# define sample data: 100 random integers from 5 to 9
gfg_data = np.random.randint(5, 10, 100)

# create 90% confidence interval for the population mean
# (confidence level passed positionally; see the parameter list above)
ci = st.norm.interval(0.90,
                      loc=np.mean(gfg_data),
                      scale=st.sem(gfg_data))
print(ci)
```

Output:

`(6.920661262464349, 7.3593387375356505)`

Example 4:

In this example, we use a random dataset of size n=100 and calculate the 99% confidence interval with the normal distribution by calling norm.interval() with a confidence level of 0.99. As before, the data are random, so the output changes from run to run.

```python
import numpy as np
import scipy.stats as st

# define sample data: 100 random integers from 5 to 9
gfg_data = np.random.randint(5, 10, 100)

# create 99% confidence interval for the population mean
# (confidence level passed positionally; see the parameter list above)
ci = st.norm.interval(0.99,
                      loc=np.mean(gfg_data),
                      scale=st.sem(gfg_data))
print(ci)
```

Output:

`(6.689075889330163, 7.450924110669837)`

Interpretation from example 3 and example 4:

In example 3, the 90% confidence interval for the population mean is (6.92, 7.36), and in example 4, the 99% confidence interval is (6.69, 7.45). The two examples draw separate random samples, so their centers differ slightly (about 7.14 and 7.07), but the pattern is the same as before: the 99% interval is wider than the 90% interval, because being more confident that the interval captures the true population mean requires a wider range. We can say we are 99% confident that the interval [6.69, 7.45] contains the true population mean.
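Since examples 3 and 4 each draw a fresh random sample, a cleaner comparison computes both levels on one fixed sample. The sketch below does that with a seeded generator (the seed 42 and the variable names are arbitrary assumptions) and also computes the t-based interval for the same data to show how close the two methods are at n=100.

```python
import numpy as np
import scipy.stats as st

rng = np.random.default_rng(42)           # arbitrary seed so the sample is repeatable
sample = rng.integers(5, 10, size=100)    # integers from 5 to 9, as in the examples

mean = np.mean(sample)
sem = st.sem(sample)

# both confidence levels on the SAME sample
ci_90 = st.norm.interval(0.90, loc=mean, scale=sem)
ci_99 = st.norm.interval(0.99, loc=mean, scale=sem)

# t-based 99% interval for comparison with the normal-based one
t_ci_99 = st.t.interval(0.99, df=len(sample) - 1, loc=mean, scale=sem)

print("90% (normal):", ci_90)
print("99% (normal):", ci_99)
print("99% (t):     ", t_ci_99)
```

With the sample held fixed, the 90% interval sits strictly inside the 99% interval, and the t-based interval is only slightly wider than the normal-based one because the t distribution approaches the normal as n grows.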
