Sampling Distribution: Definition, How It's Used, and Example

By

The Investopedia Team

Updated July 27, 2023

Reviewed by

Thomas Brock

Fact checked by Kirsten Rohrs Schmitt

Investopedia / Ryan Oakley

What Is a Sampling Distribution?

A sampling distribution is a concept used in statistics. It is the probability distribution of a statistic obtained from a large number of samples drawn from a specific population. The sampling distribution of a given statistic shows how frequently each of its possible values occurs across those samples. This allows entities like governments and businesses to make more well-informed decisions based on the information they gather. There are a few types of sampling distribution used by researchers, including the sampling distribution of the mean.

Key Takeaways

A sampling distribution is a probability distribution of a statistic that is obtained through repeated sampling of a specific population.
It describes a range of possible outcomes for a statistic, such as the mean or mode of some variable, of a population.
The majority of data analyzed by researchers are actually samples, not populations.

How Sampling Distributions Work

Data allows statisticians, researchers, marketers, analysts, and academics to draw important conclusions about specific topics and information. It can help businesses make decisions about their future and boost their performance, or it can help governments plan for services needed by a group of people. Much of the data drawn and used are actually samples rather than populations. A sample is a subset of a population. Put simply, a sample is a smaller part of a larger group. As such, this smaller portion is meant to be representative of the population as a whole.

Sampling distributions describe how a statistic varies from one sample to the next, and therefore how likely a particular value of that statistic is. This distribution depends on a few different factors, including the sample size, the sampling process involved, and the population as a whole. There are a few steps involved in building a sampling distribution. These include:

Choosing a random sample from the overall population
Determining a certain statistic from that group, which could be the standard deviation, median, or mean
Repeating the first two steps for many samples and establishing a frequency distribution of the resulting statistics
Mapping out the distribution on a graph
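The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population is a hypothetical set of 100,000 values with a mean of 50 and a standard deviation of 10, and the function name, sample size, and number of samples are choices made for the example rather than anything prescribed by the method.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population (an assumption for illustration only).
population = [random.gauss(50, 10) for _ in range(100_000)]

def sampling_distribution(population, sample_size, num_samples,
                          statistic=statistics.mean):
    """Steps 1 and 2, repeated: draw a random sample, compute its statistic."""
    return [
        statistic(random.sample(population, sample_size))
        for _ in range(num_samples)
    ]

sample_means = sampling_distribution(population, sample_size=30, num_samples=1_000)

# Step 3: frequency distribution of the sample means, in bins one unit wide.
frequencies = {}
for m in sample_means:
    bin_start = math.floor(m)
    frequencies[bin_start] = frequencies.get(bin_start, 0) + 1

# Step 4: a crude text "graph" (one '#' per 10 sample means in the bin).
for bin_start in sorted(frequencies):
    print(f"{bin_start:>3}-{bin_start + 1:<3} {'#' * (frequencies[bin_start] // 10)}")
```

Passing a different function as statistic, such as statistics.median or statistics.stdev, would give the sampling distribution of that statistic instead.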

Once the information is gathered, plotted, and analyzed, researchers can make inferences and conclusions. This can help them make decisions about what to expect in the future. For instance, governments may be able to invest in infrastructure projects based on the needs of a certain community or a company may decide to proceed with a new business venture if the sampling distribution suggests a positive outcome.

Each sample has its own sample mean, and the distribution of the sample means is known as the sampling distribution of the mean.

Special Considerations

The number of observations in a population, the number of observations in a sample, and the procedure used to draw the sample sets determine the variability of a sampling distribution. The standard deviation of a sampling distribution is called the standard error.

While the mean of a sampling distribution is equal to the mean of the population, the standard error depends on the standard deviation of the population, the size of the population, and the size of the sample.

Knowing how far apart the means of the sample sets are from each other and from the population mean gives an indication of how close a given sample mean is likely to be to the population mean. The standard error of the sampling distribution decreases as the sample size increases.
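As a rough illustration of that relationship, the sketch below uses the standard formula for the standard error of the sample mean, the population standard deviation divided by the square root of the sample size, with an optional finite population correction for sampling without replacement. The function name and the standard deviation of 1.2 pounds are assumptions made for the example.

```python
import math

def standard_error(population_sd, sample_size, population_size=None):
    """Standard error of the sample mean.

    If population_size is given, the finite population correction for
    sampling without replacement is applied.
    """
    se = population_sd / math.sqrt(sample_size)
    if population_size is not None:
        se *= math.sqrt((population_size - sample_size) / (population_size - 1))
    return se

# Hypothetical population standard deviation of 1.2 pounds.
for n in (25, 100, 400):
    print(n, round(standard_error(1.2, n), 3))
# prints:
# 25 0.24
# 100 0.12
# 400 0.06
```

Quadrupling the sample size halves the standard error, which is why larger samples give sample means that sit closer to the population mean.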

Determining a Sampling Distribution

Let's say a medical researcher wants to compare the average weight of all babies born in North America from 1995 to 2005 to those from South America within the same time period. Since they cannot draw the data for the entire population within a reasonable amount of time, they would only use 100 babies from each continent to make a conclusion. The data used is the sample and the average weight calculated is the sample mean.

Now suppose they take repeated random samples from the general population and compute the sample mean for each sample group instead. So, for North America, they pull newborn weights recorded in the U.S., Canada, and Mexico as follows:

Four samples of 100 records each from select hospitals in the U.S.
Five samples of 70 records each from Canada
Three samples of 150 records each from Mexico

The researcher ends up with a total of 1,200 weights of newborn babies grouped in 12 sets. They also collect sample data of 100 birth weights from each of the 12 countries in South America.

The average weights computed for the sample sets, taken together, form the sampling distribution of the mean. The mean is not the only statistic that can be calculated from a sample. Other statistics, such as the standard deviation, variance, proportion, and range can be calculated from sample data. The standard deviation and variance measure the variability of the sampling distribution.
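The North America example can be mocked up as follows. The sample sizes come from the example itself, but the weights are simulated stand-ins: they assume, purely for illustration, that birth weights are normally distributed with a mean of 7 pounds and a standard deviation of 1.2 pounds. No real hospital data is used.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Sample sizes from the example: four sets of 100, five of 70, three of 150.
set_sizes = [100] * 4 + [70] * 5 + [150] * 3
print(len(set_sizes), "sets,", sum(set_sizes), "weights in total")
# prints: 12 sets, 1200 weights in total

# Simulated weights in pounds (hypothetical mean 7.0 and sd 1.2, not real data).
sample_sets = [
    [random.gauss(7.0, 1.2) for _ in range(size)]
    for size in set_sizes
]

# One sample mean per set; together these 12 values sketch the
# sampling distribution of the mean.
sample_means = [statistics.mean(weights) for weights in sample_sets]

# The standard deviation of the sample means measures their variability.
spread_of_means = statistics.stdev(sample_means)

for size, mean in zip(set_sizes, sample_means):
    print(f"n={size:>3}  sample mean={mean:.2f} lb")
print(f"standard deviation of the sample means: {spread_of_means:.3f}")
```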

Types of Sampling Distributions

Here is a brief description of the types of sampling distributions:

Sampling Distribution of the Mean: This type approximates a normal distribution when the samples are reasonably large, and its center is the mean of the sampling distribution. As such, it represents the mean of the overall population. To get to this point, the researcher must figure out the mean of each sample group and map out those means.
Sampling Distribution of Proportion: This type involves drawing sample sets from the overall population and computing the proportion of each sample that has a given characteristic. The mean of the sample proportions approximates the proportion in the larger group.
T-Distribution: This type of sampling distribution is common in cases of small sample sizes. It may also be used when little is known about the entire population, such as when its standard deviation is unknown. T-distributions are used to make estimates about the mean and other statistical points.
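To make the sampling distribution of a proportion more concrete, here is a small simulation. It assumes a hypothetical population in which 30% of members have some characteristic, and it compares the spread of the simulated sample proportions with the textbook standard error formula, the square root of p(1 - p) / n. All names and parameter values are illustrative.

```python
import math
import random
import statistics

random.seed(2)  # fixed seed so the sketch is reproducible

TRUE_P = 0.3         # hypothetical population proportion (an assumption)
SAMPLE_SIZE = 200    # observations per sample
NUM_SAMPLES = 2_000  # number of repeated samples

def sample_proportion(p, n):
    """Proportion of successes in one simulated random sample of size n."""
    return sum(random.random() < p for _ in range(n)) / n

proportions = [sample_proportion(TRUE_P, SAMPLE_SIZE) for _ in range(NUM_SAMPLES)]

mean_of_proportions = statistics.mean(proportions)
observed_se = statistics.stdev(proportions)
theoretical_se = math.sqrt(TRUE_P * (1 - TRUE_P) / SAMPLE_SIZE)

print(f"mean of sample proportions: {mean_of_proportions:.3f}")
print(f"observed standard error:    {observed_se:.4f}")
print(f"theoretical standard error: {theoretical_se:.4f}")
```

The mean of the sample proportions should land close to the assumed population proportion, and the observed spread should be close to the theoretical value.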

In statistics, a population is the entire pool from which a statistical sample is drawn. A population may refer to an entire group of people, objects, events, hospital visits, or measurements. A population can thus be said to be an aggregate observation of subjects grouped together by a common feature.

Plotting Sampling Distributions

A population or a single sample set of numbers may have a normal distribution, as birth weights roughly do, but it does not have to. The sampling distribution of the mean, by contrast, tends toward a bell-curved shape as the samples get larger, even when the population itself is not normal. This result is known as the central limit theorem.

Following our example, the weights of babies in North America and in South America are roughly normally distributed because some babies will be underweight (below the mean) or overweight (above the mean), with most babies falling in between (around the mean). If the average weight of newborns in North America is seven pounds, the sample mean weight in each of the 12 sets of sample observations recorded for North America will be close to seven pounds as well. But if you graph only those 12 averages, it is difficult to predict with certainty what the actual shape will turn out to be; with so few points, the graph may look closer to a uniform distribution than a bell curve. The more samples the researcher draws from the population of over a million weight figures, the more the graph of sample means will start forming a normal distribution.
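A small simulation can illustrate how the shape of a sampling distribution of the mean moves toward a bell curve. The sketch below deliberately assumes a skewed, hypothetical population (an exponential distribution, not birth weights) so that the effect is visible, and it uses skewness as a rough measure of how far the shape is from symmetric. The helper names are made up for the example.

```python
import random
import statistics

random.seed(3)  # fixed seed so the sketch is reproducible

def skewness(values):
    """Population skewness: 0 for a symmetric shape such as a bell curve."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

def simulate_sample_means(sample_size, num_samples=2_000):
    """Means of repeated samples from a skewed (exponential) population."""
    return [
        statistics.mean([random.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(num_samples)
    ]

# Skewness of the sampling distribution for increasing sample sizes.
skew_by_size = {n: skewness(simulate_sample_means(n)) for n in (2, 10, 50)}

for n, skew in skew_by_size.items():
    print(f"sample size {n:>2}: skewness of sample means = {skew:.2f}")
```

The skewness should shrink toward zero as the sample size grows, meaning the graph of sample means looks more and more like a normal distribution.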

Why Is Sampling Used to Gather Population Data?

Sampling is a way to gather and analyze information about a larger group. It is done because researchers aren't able to study entire populations due to the sheer volume of subjects involved. As such, not everyone in the larger group can be included as it may take too long to study and analyze the data. It allows entities like governments and businesses to make important decisions about the future, whether that means investing in an infrastructure project, social service program, or new product.

Why Are Sampling Distributions Used?

Sampling distributions are used in statistics and research. They show how likely a statistic, such as a sample mean, is to take a particular value. This is based on data gathered from small groups within a larger population.

What Is a Mean?

A mean is a metric used in statistics and research. It is the average for at least two numbers. The mean may be determined by adding up all the numbers and dividing the result by the number of numbers in that set. This is known as the arithmetic mean. You can determine the geometric mean by multiplying the values of a data set and taking the nth root of the product, where n is the number of values within that data set.
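Both calculations can be written out directly. The function names and the two-value data set below are chosen only for illustration; Python's standard statistics module also provides mean and geometric_mean functions that could be used instead.

```python
import math

def arithmetic_mean(values):
    """Sum of the values divided by how many there are."""
    return sum(values) / len(values)

def geometric_mean(values):
    """nth root of the product of n positive values."""
    return math.prod(values) ** (1 / len(values))

values = [2, 8]
print(arithmetic_mean(values))  # prints 5.0
print(geometric_mean(values))   # prints 4.0
```

For 2 and 8, the arithmetic mean is (2 + 8) / 2 = 5, while the geometric mean is the square root of 2 × 8 = 16, which is 4.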

The Bottom Line

Researchers often aren't able to study every member of a very large group because of the number of subjects involved. That's why they use sampling. Sampling allows them to take a small group from a large population and analyze data. Once that data is collected, researchers can plot out sampling distributions, which allow them to estimate how likely a certain outcome is within a population. This may include business growth or population trends, which can help businesses, governments, and other entities make better decisions for the future.
