Sampling: Probability Sample & Non-Probability Sample (Unit 4)

Sampling: Basic Concept: Defining the Universe
Sampling is a process used in statistical analysis in which a predetermined number of
observations are taken from a larger population. The methodology used to sample from a larger
population depends on the type of analysis being performed but may include simple random
sampling or systematic sampling.
In business, a CPA performing an audit uses sampling to determine the accuracy of account
balances in the financial statements, and managers use sampling to assess the success of the
firm’s marketing efforts.
The sample should be a representation of the entire population. When taking a sample from a
larger population, it is important to consider how the sample is chosen. To get a representative
sample, the sample must be drawn randomly and encompass the whole population. For example,
a lottery system could be used to determine the average age of students in a university by
sampling 10% of the student body.
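The lottery idea above can be sketched in Python. This is only a minimal illustration under stated
assumptions: the student ages are made-up data, and the function name lottery_sample_mean and the
fixed seeds are hypothetical choices made so the draw is repeatable.

```python
import random

def lottery_sample_mean(ages, fraction=0.10, seed=0):
    """Draw a simple random sample of the given fraction; return (sample, mean)."""
    rng = random.Random(seed)            # fixed seed so the draw is repeatable
    n = max(1, round(len(ages) * fraction))
    sample = rng.sample(ages, n)         # without replacement, like a lottery
    return sample, sum(sample) / n

# Hypothetical student body: 1,000 ages between 17 and 30 (invented data).
data_rng = random.Random(42)
student_ages = [data_rng.randint(17, 30) for _ in range(1000)]

sample, estimate = lottery_sample_mean(student_ages)
true_mean = sum(student_ages) / len(student_ages)
print(len(sample), round(estimate, 2), round(true_mean, 2))
```

The sample mean will usually be close to, but not exactly equal to, the mean of the whole student body.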
A good sample is one that satisfies all or most of the following conditions:
(i) Representativeness: When a researcher adopts a sampling method, the basic assumption is
that the samples selected from the population are the best representatives of the population
under study. Good samples are therefore those that accurately represent the population.
Probability sampling techniques tend to yield representative samples. In measurement terms, the
sample must be valid, and the validity of a sample depends upon its accuracy.
(ii) Accuracy: Accuracy is defined as the degree to which bias is absent from the sample. An
accurate (unbiased) sample is one which exactly represents the population. It is free from any
influence that causes any differences between sample value and population value.
(iii) Size: A good sample must be adequate in size and reliable. The sample size should be such
that the inferences drawn from the sample are accurate to a given level of confidence to represent
the entire population under study.
The size of the sample depends on a number of factors. Some important ones are:

(i) Homogeneity or heterogeneity of the universe: The sample size depends on the nature of the
universe. If the universe is homogeneous, a small sample can represent the behavior of the entire
universe, so a small sample size is sufficient. If, on the other hand, the universe is
heterogeneous, the sample must include units from each distinct group, which calls for a larger
sample.
(ii) Number of classes proposed: If a large number of class intervals is to be formed, the
sample should be larger, because it has to represent the entire universe. With a small sample,
some classes may not be represented at all.
(iii) Nature of study: The size of the sample also depends on the nature of the study. For an
intensive study that runs for a long time, large samples are to be chosen. Similarly, for general
studies a large number of respondents may be appropriate, but if the study is technical in
nature, selecting a large number of respondents may cause difficulty while gathering
information.
Sampling is the act, process, or technique of selecting a representative part of a population for
the purpose of determining the characteristics of the whole population. In other words, the
process of selecting a sample from a population using special sampling techniques is called
sampling. The sampling process itself should ensure that the sample selected is
representative of the population.
Examples of Sample Tests for Marketing
Businesses aim to sell their products and/or services to target markets. Before presenting
products to the market, companies generally identify the needs and wants of their target
audience. To do so, they may survey a sample of the population to gain a better
understanding of those needs and later create a product and/or service that meets them.
Gathering the opinions of the sample helps to identify the needs of the whole.
UNIVERSE OR POPULATION
The population or universe represents the entire group of units which is the focus of the study.
Thus, the population could consist of all the persons in the country, or those in a particular
geographical location, or a special ethnic or economic group, depending on the purpose and
coverage of the study. A population could also consist of non-human units such as farms, houses
or business establishments.
The entire aggregation of items from which samples can be drawn is known as a population. In
sampling, the population refers to the units from which the sample is drawn. "Population" and
"population of interest" are interchangeable terms. The term “unit” is used because, in a business
research process, samples are not always people. A population of interest may be
the universe of nations or cities. Defining it properly is one of the first things the analyst
needs to do when conducting business research. Population therefore has a much broader meaning
in sampling than its everyday sense of a nation’s entire population. “N” represents
the size of the population.
Concept of Statistical Population
In statistics, a population is a set of similar items or events which is of interest for some
question or experiment. A statistical population can be a group of existing objects (e.g. the set of
all stars within the Milky Way galaxy) or a hypothetical and potentially infinite group of objects
conceived as a generalization from experience (e.g. the set of all possible hands in a game of
poker). A common aim of statistical analysis is to produce information about some chosen
population.
In statistical inference, a subset of the population (a statistical sample) is chosen to represent the
population in a statistical analysis. The ratio of the size of this statistical sample to the size of the
population is called a sampling fraction. It is then possible to estimate the population parameters
using the appropriate sample statistics.
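As a small worked illustration of these two ideas, the sketch below computes a sampling fraction
and uses a sample mean to estimate a population total. The household figures and both function
names are hypothetical, introduced only for this example.

```python
def sampling_fraction(sample_size, population_size):
    """Ratio of sample size n to population size N."""
    if not 0 < sample_size <= population_size:
        raise ValueError("need 0 < n <= N")
    return sample_size / population_size

def estimate_population_total(sample_values, population_size):
    """Estimate a population total as N times the sample mean."""
    sample_mean = sum(sample_values) / len(sample_values)
    return population_size * sample_mean

# Hypothetical figures: weekly spend of 5 sampled households out of 200.
spend = [40, 55, 35, 60, 50]
f = sampling_fraction(len(spend), 200)
total = estimate_population_total(spend, 200)
print(f, total)   # 0.025 9600.0
```

Here the sample mean (a sample statistic) stands in for the unknown population mean (a population parameter).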
In statistics, the term population is used to describe the subjects of a particular study—everything
or everyone who is the subject of a statistical observation. Populations can be large or small in
size and defined by any number of characteristics, though these groups are typically defined
specifically rather than vaguely—for instance, a population of women over 18 who buy coffee at
Starbucks rather than a population of women over 18.

Statistical populations are used to observe behaviors, trends, and patterns in the way individuals
in a defined group interact with the world around them, allowing statisticians to draw
conclusions about the characteristics of the subjects of study. These subjects are most
often humans, animals, and plants, but they can also be objects such as stars.
Overview: Statistical Population
Function: Statistical analysis
Definition: A set of observations that share a property or set of properties.
Example: Coffee drinkers in France
Value: Targeting a set of data for the purposes of analysis.
Related Techniques: Statistical model, probability distribution
Sample, Characteristics of a Good Sample
Sample
A sample is a smaller, manageable version of a larger group. It is a subset containing the
characteristics of a larger population. Samples are used in statistical testing when population
sizes are too large for the test to include all possible members or observations. A sample should
represent the whole population and not reflect bias toward a specific attribute.
In basic terms, a population is the total number of individuals, animals, items, observations,
data, etc. of any given subject. For example, as of 2017, the population of the world was 7.5 billion of
which 49.6% were female and 50.4% were male. The total number of people in any given
country can also be a population size. The total number of students in a city can be taken as a
population, and the total number of dogs in a city is also a population size. Scientists,
researchers, marketers, academicians, and any related or interested party trying to draw data from
a group will find that a population size may be too large to monitor. Consider a team of academic
researchers that want to, say, know the number of students that studied for less than 40 hours for
the CFA exam in 2016 and still passed. Since more than 200,000 people globally take the exam
each year, reaching out to each and every exam participant might be extremely tedious and time
consuming. In fact, by the time the data from the population has been collected and analyzed, a
couple of years would have passed, making the analysis worthless since a new population would
have emerged.
Characteristics of a Good Sample
(1) Goal-oriented: A sample design should be goal oriented. It should be oriented
to the research objectives and fitted to the survey conditions.
(2) Accurate representative of the universe: A sample should be an accurate representative of
the universe from which it is taken. There are different methods for selecting a sample. It will be
truly representative only when it represents all types of units or groups in the total population in
fair proportions. In brief, the sample should be selected carefully, as improper sampling is a
source of error in the survey.
(3) Proportional: A sample should be proportional. It should be large enough to represent the
universe properly. The sample size should be sufficiently large to provide statistical stability or
reliability, and it should give the accuracy required for the purpose of the particular study.

(4) Random selection: A sample should be selected at random. This means that any item in the
group has a full and equal chance of being selected and included in the sample. This makes the
selected sample truly representative in character.
(5) Economical: A sample should be economical. The objectives of the survey should be
achieved with minimum cost and effort.
(6) Practical: A sample design should be practical. The sample design should be simple i.e. it
should be capable of being understood and followed in the fieldwork.
(7) Actual information provider: A sample should be designed so as to provide actual
information required for the study and also provide an adequate basis for the measurement of its
own reliability.
In brief, a good sample should be truly representative in character. It should be selected at
random and should be adequately proportional. These, in fact, are the attributes of a good
sample.

Sampling Frame (Practical Approach for Determining the Sample Frame Expected)
When developing a research study, one of the first things that you need to do is clarify all of
the units (also referred to as cases) that you are interested in studying. Units could be people,
organizations, or existing documents. In research, these units make up the population of interest.
When defining the population, it’s really important to be as specific as possible.

The problem is it’s not always possible or feasible to study every unit in a population. For
example, you might be interested in American college students’ attitudes about owning houses. It
would obviously be too time-consuming and costly to collect information from every college
student in the United States. In cases like these, you can study a portion or subset of the
population called a sample. The process of selecting a sample needs to be deliberate, and there
are various sampling techniques that you can use depending upon the purpose of the research.
Prior to selecting a sample you need to define a sampling frame, which is a list of all the units of
the population of interest. You can only apply your research findings to the population defined
by the sampling frame.
Qualities of a Good Sampling Frame
You can’t just use any list you come across! Care must be taken to make sure your sampling
frame is adequate for your needs. A good sampling frame should:
• Include all individuals in the target population.
• Exclude all individuals not in the target population.
• Include accurate information that can be used to contact selected individuals.
Other general factors that you would want to make sure you have:
• A unique identifier for each member. This could be a simple numerical identifier (e.g. from 1 to
1000). Check to make sure there are no duplicates in the frame.
• A logical organization to the list. For example, put them in alphabetical order.
• Up-to-date information. This may need to be checked periodically (e.g. for address changes).
In some cases, it might be impossible, or very difficult, to get a sampling frame. For example,
getting a list of prostitutes in your city isn’t likely (mostly because most of them won’t want
to be found). Sometimes, techniques like snowball sampling must be used
to make up for the lack of a sampling frame. In snowball sampling, you find one person (or a
few people) for your survey or experiment. You then ask them to find someone else who would
be willing to participate. Then that person finds someone else, and so on, until you have enough
people for your needs.
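The frame qualities described in this section (complete coverage, no outsiders, no duplicate
identifiers) can be checked mechanically when a reference list of the target population exists.
The sketch below is a hedged illustration only: the function name check_frame and the numbered
units are hypothetical, and in practice a full reference list is often unavailable.

```python
from collections import Counter

def check_frame(frame, target_ids):
    """Report duplicate, missing, and out-of-scope unit identifiers in a frame."""
    counts = Counter(frame)
    duplicates = sorted(uid for uid, c in counts.items() if c > 1)
    missing = sorted(set(target_ids) - set(frame))        # target units left out
    out_of_scope = sorted(set(frame) - set(target_ids))   # units that do not belong
    return {"duplicates": duplicates, "missing": missing, "out_of_scope": out_of_scope}

# Hypothetical target population: units numbered 1 to 10.
target = range(1, 11)
frame = [1, 2, 3, 3, 4, 5, 6, 8, 9, 10, 11]
report = check_frame(frame, target)
print(report)   # {'duplicates': [3], 'missing': [7], 'out_of_scope': [11]}
```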
Sampling Errors, Non-Sampling Errors, Methods to Reduce the Error
A sampling error is a statistical error that occurs when an analyst does not select a sample that
represents the entire population of data and the results found in the sample do not represent the
results that would be obtained from the entire population. Sampling is an analysis performed by
selecting a number of observations from a larger population, and the selection can produce both
sampling errors and non-sampling errors.
Sampling error can be reduced, though not eliminated, by increasing the sample size and by
ensuring that the sample adequately represents the entire population. Assume, for example, that XYZ Company
provides a subscription-based service that allows consumers to pay a monthly fee to stream
videos and other programming over the web. The firm wants to survey homeowners who watch
at least 10 hours of programming over the web each week and pay for an existing video
streaming service. XYZ wants to determine what percentage of the population is interested in a
lower-priced subscription service. If XYZ does not think carefully about the sampling process,
several types of sampling errors may occur.

Examples of Sampling Error
A population specification error means that XYZ does not understand the specific types of
consumers who should be included in the sample. If, for example, XYZ creates a population of
people between the ages of 15 and 25 years old, many of those consumers do not make the
purchasing decision about a video streaming service because they do not work full-time. On the
other hand, if XYZ put together a sample of working adults who make purchase decisions, the
consumers in this group may not watch 10 hours of video programming each week.
Selection error also causes distortions in the results of a sample, and a common example is a
survey that only relies on a small portion of people who immediately respond. If XYZ makes an
effort to follow up with consumers who don’t initially respond, the results of the survey may
change. Furthermore, if XYZ excludes consumers who don’t respond right away, the sample
results may not reflect the preferences of the entire population.
Sample Size and Sampling Error
Given two otherwise identical studies, with the same sampling methods and the same population,
the study with the larger sample size will have less sampling error than the study with the
smaller sample size. Keep in mind that as the sample size increases, it approaches the size of
the entire population and therefore also approaches the characteristics of the population, which
decreases sampling error.
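One common way to quantify this relationship is the standard error of a sample mean. The sketch
below is an assumption-laden illustration: the formula sigma / sqrt(n) with a finite population
correction is a standard textbook result rather than something stated in these notes, and the
population size and standard deviation are hypothetical.

```python
import math

def standard_error(sigma, n, population_size=None):
    """Standard error of a sample mean; applies the finite population
    correction when the population size N is given."""
    se = sigma / math.sqrt(n)
    if population_size is not None:
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return se

# Hypothetical population: N = 10,000 units with standard deviation 15.
for n in (100, 400, 1600, 10_000):
    print(n, round(standard_error(15, n, 10_000), 3))
```

The printed values shrink as n grows and reach zero when the sample is the whole population, which matches the argument in the paragraph above.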
Non-Sampling Errors
A non-sampling error is an error that results during data collection, causing the data to differ
from the true values. Non-sampling error differs from sampling error. A sampling error is limited
to any differences between sample values and universe values that arise because the entire
universe was not sampled. Sampling error can result even when no mistakes of any kind are
made. The “errors” result from the mere fact that data in a sample are unlikely to match
perfectly the data in the universe from which the sample is taken. This “error” can be minimized
by increasing the sample size. Non-sampling errors cover all other discrepancies, including those
that arise from a poor sampling technique.
Non-sampling errors may be present both in samples and in censuses, in which an entire population
is surveyed, and they may be random or systematic. Random errors tend to offset each other
and are therefore of less concern. Systematic errors, on the other hand, affect the entire sample
and therefore present a greater issue. Non-sampling errors include, but are not limited to,
data entry errors, biased survey questions, biased processing or decision making, non-responses,
inappropriate analysis conclusions, and false information provided by respondents.
While increasing sample size will help minimize sampling error, it will not have any effect on
reducing non-sampling error. Unfortunately, non-sampling errors are often difficult to detect, and
it is virtually impossible to eliminate them entirely.
Methods to Reduce Sampling Error
Of the two types of errors, sampling error is easier to identify. The main techniques for
reducing sampling error are:
(i) Increase the sample size.
A larger sample size leads to a more precise result because the study gets closer to the actual
population size.
(ii) Divide the population into groups.
Instead of drawing a single simple random sample, sample each group in proportion to its size in
the population (stratified sampling). For example, if people of a certain demographic make up 35%
of the population, make sure 35% of the study sample is made up of this group.

(iii) Know your population.
A population specification error occurs when a research team selects an inappropriate population
from which to obtain data. Know who buys your product, uses it, works with you, and so forth. With
basic socio-economic information, it is possible to reach a consistent sample of the population. In
cases like marketing research, studies often relate to one specific population like Facebook users,
Baby Boomers, or even homeowners.
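The idea in (ii) of sampling groups according to their share of the population can be sketched
as proportional allocation. The group labels and sizes below are hypothetical, and the
largest-remainder rounding is one reasonable choice among several, not a method prescribed by
these notes.

```python
def proportional_allocation(stratum_sizes, sample_size):
    """Allocate a total sample across groups in proportion to group size.
    Largest-remainder rounding keeps the allocations summing to sample_size."""
    total = sum(stratum_sizes.values())
    exact = {k: sample_size * v / total for k, v in stratum_sizes.items()}
    alloc = {k: int(x) for k, x in exact.items()}      # round down first
    leftover = sample_size - sum(alloc.values())
    # give the remaining units to the groups with the largest fractional parts
    by_remainder = sorted(exact, key=lambda k: exact[k] - alloc[k], reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1
    return alloc

# Hypothetical population of 2,000 in which group "A" makes up 35%.
strata = {"A": 700, "B": 900, "C": 400}
print(proportional_allocation(strata, 200))   # {'A': 70, 'B': 90, 'C': 40}
```

With a sample of 200, group A receives 70 places, which is 35% of the sample.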
Methods to Reduce Non-Sampling Error
(i) Thoroughly Pretest your Survey Mediums
It is very important to ensure that your survey and its invites
run smoothly through any medium or on any device your potential respondents might use.
People are much more likely to ignore survey requests if loading times are long, questions do not
fit properly on their screens, or they have to work to make the survey compatible with their
device. The best advice is to identify the different communication software and devices your
sample uses and pre-test your surveys and invites on each, ensuring your survey runs
smoothly for all your respondents.
(ii) Avoid Rushed or Short Data Collection Periods
One of the worst things a researcher can do is limit their data collection time in order to comply
with a strict deadline. Your study’s level of nonresponse bias will climb dramatically if you are
not flexible with the time frames respondents have to answer your survey. Fortunately, flexibility
is one of the main advantages to online surveys since they do not require interviews (phone or in
person) that must be completed at certain times of the day. However, keeping your survey live
for only a few days can still severely limit a potential respondent’s ability to answer. Instead, it is
recommended to extend a survey collection period to at least two weeks so that participants can
choose any day of the week to respond according to their own busy schedule.
(iii) Send Reminders to Potential Respondents

Sending a few reminder emails throughout your data collection period has been shown to
effectively gather more completed responses. It is best to send your first reminder email midway
through the collection period and the second near the end of the collection period. Make sure you
do not harass the people on your email list who have already completed your survey! You can
manage your reminders and invites on FluidSurveys through the trigger options found in the
invite tool.
(iv) Ensure Confidentiality
Any survey that requires information that is personal in nature should include reassurance to
respondents that the data collected will be kept completely confidential. This is especially the
case in surveys that are focused on sensitive issues. Make certain someone reading your invite
understands that the information they provide will be viewed as part of the whole sample and not
individually scrutinized.
(v) Use Incentives
Many people refuse to respond to surveys because they feel they do not have the time to spend
answering questions. An incentive is often necessary to motivate people to take part in
your study. Depending on the length of the survey, the difficulty in finding the correct
respondents (e.g. one-legged, 15th-century spoon collectors), and the information being asked, the
incentive can range from minimal to substantial in value. Remember, most respondents won’t
have a vested interest in your study and must feel that the survey is worth their time!
Sample Size Constraints, Non-Response
Effects of Small Sample Size
In the standard sample size formula, the required sample size increases with the square of the
Z-score and decreases with the square of the margin of error. Consequently, for a fixed margin of
error, reducing the sample size reduces the confidence level of the study, which is related to the
Z-score. For a fixed confidence level, decreasing the sample size increases the margin of
error.
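The formula referred to above is not reproduced in these notes. One widely used candidate,
assumed here purely for illustration, is the sample size formula for estimating a proportion,
n = z^2 * p * (1 - p) / e^2, where z is the Z-score for the chosen confidence level, p is the
expected proportion and e is the margin of error. The function names and default values below
are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(confidence=0.95, margin_of_error=0.05, p=0.5):
    """n = z^2 * p * (1 - p) / e^2, rounded up to a whole respondent."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # two-sided Z-score
    return math.ceil(z**2 * p * (1 - p) / margin_of_error**2)

def margin_of_error(n, confidence=0.95, p=0.5):
    """Margin of error implied by a given sample size (same formula, solved for e)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt(p * (1 - p) / n)

print(sample_size_for_proportion())        # 95% confidence, 5% margin: 385
print(round(margin_of_error(100), 3))      # a sample of only 100 widens the margin
print(round(margin_of_error(385), 3))
```

Under these assumptions, cutting the sample from 385 to 100 roughly doubles the margin of error at the same confidence level.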

In short, when researchers are constrained to a small sample size for economic or logistical
reasons, they may have to settle for less conclusive results. Whether or not this is an important
issue depends ultimately on the size of the effect they are studying. For example, a small sample
size would give more meaningful results in a poll of people living near an airport who are
affected negatively by air traffic than it would in a poll of their education levels.
Effect of Large Sample Size
There is a widespread belief that large samples are ideal for research or statistical analysis.
However, this is not always true. Taking a clinical study of a new therapy as an example, very
large samples that exceed the value estimated by a sample size calculation present several hurdles.
The first is ethical. Should a study be performed with more patients than necessary? This means
that more people than needed are exposed to the new therapy. Potentially, this implies increased
hassle and risk. Obviously the problem is compounded if the new protocol is inferior to the
traditional method: More patients are involved in a new, uncomfortable therapy that yields
inferior results.
The second obstacle is that the use of a larger number of cases can also involve more financial
and human resources than necessary to obtain the desired response.
In addition to these factors, there is another noteworthy issue that has to do with statistics.
Statistical tests were developed to handle samples, not populations. When numerous cases are
included in the analysis, statistical power is substantially increased. This implies an exaggerated
tendency to reject null hypotheses for clinically negligible differences: what is clinically
insignificant becomes statistically significant. Thus, a statistically significant difference of
0.1° in the ANB angle between two treatment groups would obviously produce no clinical difference
in the effects of wearing an appliance.
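The effect of sample size on statistical significance can be illustrated with a two-sample
z-test. This is a hedged sketch: the 0.1° difference comes from the text, while the standard
deviation of 2°, the group sizes, and the function name are hypothetical assumptions.

```python
import math
from statistics import NormalDist

def two_sample_z_p_value(mean_diff, sd, n_per_group):
    """Two-sided p-value for a difference in means, assuming equal SD and group sizes."""
    se = sd * math.sqrt(2 / n_per_group)   # standard error of the difference
    z = mean_diff / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Assumed values: a 0.1 degree difference, SD of 2 degrees in both groups.
for n in (50, 5_000, 50_000):
    print(n, round(two_sample_z_p_value(0.1, 2.0, n), 4))
```

With 50 cases per group the difference is far from significant, while with thousands of cases per group the same negligible difference falls below the usual 0.05 threshold.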
When very large samples are available in a retrospective study, the researcher needs first to
collect subsamples randomly, and only then perform the statistical test. If it is a prospective
study, the researcher should collect only what is necessary, and include a few more individuals to
compensate for subjects that leave the study.
CONCLUSIONS
In designing a study, sample size calculation is important for methodological and ethical reasons,
as well as for reasons of human and financial resources. When reading an article, the reader
should be on the alert to ascertain that the study they are reading was subjected to sample size
calculation. In the absence of this calculation, the findings of the study should be interpreted with
caution.
An appropriate sample renders the research more efficient: Data generated are reliable, resource
investment is as limited as possible, while conforming to ethical principles. The use of sample
size calculation directly influences research findings. Very small samples undermine the internal
and external validity of a study. Very large samples tend to transform small differences into
statistically significant differences, even when they are clinically insignificant. As a result, both
researchers and clinicians are misled, which may lead to failure in treatment decisions.
NON-RESPONSE
A lot of things can go wrong in a survey. One of the most important problems is non-response.
It is the phenomenon that the required information is not obtained from the persons selected in
the sample.
The consequences of non-response
One effect of non-response is that it reduces the sample size. By itself, this does not lead to
wrong conclusions, but because of the smaller sample size the estimators will be less precise and
the margins of error will be larger.
A more serious effect of non-response is that it can be selective. This occurs if, due to
non-response, specific groups are under- or over-represented in the survey. If these groups behave
differently with respect to the survey variables, this causes estimators to be biased. In
other words, estimates are systematically too high or too low.
Example: surveys of Statistics Netherlands
Selective non-response is not uncommon. It occurs in a number of surveys of Statistics
Netherlands. A follow-up study of the Dutch Victimization Survey showed that persons who are
afraid to be home alone at night are less inclined to participate in the survey. In the Dutch
Housing Demand Survey, it turned out that people who refused to participate have lesser
housing demands than people who responded. And for the Survey of Mobility of the Dutch
Population, it was obvious that the more mobile people were under-represented among the
respondents.
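The bias caused by selective non-response can be shown with simple arithmetic. The sketch below
is loosely modelled on the mobility example, but all shares, response rates, and distances are
invented for illustration and are not figures from Statistics Netherlands.

```python
def population_mean(groups):
    """True mean: each group is (population_share, response_rate, group_mean)."""
    return sum(share * mean for share, _, mean in groups)

def respondent_mean(groups):
    """Expected mean among respondents when groups respond at different rates."""
    responding = sum(share * rate for share, rate, _ in groups)
    weighted = sum(share * rate * mean for share, rate, mean in groups)
    return weighted / responding

# Hypothetical survey: highly mobile people travel more but respond less often.
groups = [
    (0.5, 0.30, 60.0),   # highly mobile: 30% respond, 60 km per day on average
    (0.5, 0.70, 20.0),   # less mobile: 70% respond, 20 km per day on average
]
print(round(population_mean(groups), 1), round(respondent_mean(groups), 1))   # 40.0 32.0
```

The respondents alone suggest 32 km per day against a true mean of 40 km per day, and collecting more responses at the same response rates would not remove this gap.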
Probability Sampling
Probability sampling is a sampling technique in which a sample is chosen from a larger population
using a method based on the theory of probability. For a participant to be part of a
probability sample, he/she must be selected using random selection.
The most important requirement of probability sampling is that everyone in your population has
a known, non-zero chance of being selected. In the simplest designs this chance is also equal:
for example, if you draw one person at random from a population of 100 people, every person has
a 1 in 100 chance of being selected. Probability sampling gives
you the best chance to create a sample that is truly representative of the population.
Probability sampling uses statistical theory to randomly select a small group of people (a sample)
from an existing large population and then infer that their combined responses will approximate
those of the overall population.
Probability Sampling Example
Let us take an example to understand this sampling technique. The population of the US alone is
about 330 million. It is practically impossible to send a survey to every individual to gather
information, but you can use probability sampling to get data that is nearly as good even though
it is collected from a much smaller group.
For example, consider a hypothetical organization with 500,000 employees at different
geographic locations. The organization wishes to make certain amendments to its human resource
policy, but before rolling out the change it wishes to know whether the employees will be happy
with it. Reaching out to all 500,000 employees is a tedious task. This is
where probability sampling comes in handy. A sample can be chosen from the larger population of
500,000 employees. This sample will represent the population, and a survey can then be
deployed to the sample.
From the responses received, management will be able to estimate whether employees in the
organization are happy about the amendment.
Steps involved in Probability Sampling
1. Choose your population of interest carefully: Think carefully about whose opinions should be
collected, and define the population of interest so that it includes them.
2. Determine a suitable sample frame: Your frame should list everyone in your population
of interest and no one from outside it, so that the data collected are accurate.
3. Select your sample and start your survey: It can sometimes be challenging to find the right
sample and determine a suitable sample frame. Even if all factors are in your favor, there still
might be unforeseen issues such as cost, quality of respondents, and speed of response.
Getting a sample to respond to a true probability survey might be difficult, but it is not impossible.
In most cases, however, drawing a probability sample will save you time, money, and a lot of
frustration. You probably can’t send surveys to everyone, but you can always give everyone a
chance to participate. This is what a probability sample is all about.
When to use Probability Sampling
1. When sampling bias has to be reduced: This sampling method is used when bias has to
be kept to a minimum. How researchers select their sample largely determines the quality of their
findings and inferences. Probability sampling leads to higher-quality findings because it provides
an unbiased representation of the population.
2. When the population is diverse: When your population is large and diverse, this
sampling method is used extensively because it helps researchers create
samples that represent the population well. Say we want to find out how many people prefer
medical tourism over getting treated in their own country. This sampling method will help pick
samples from various socio-economic strata, backgrounds, etc. to represent the larger population.
3. To create an accurate sample: Probability sampling helps researchers create an accurate sample
of their population. Researchers can use proven statistical methods to determine an appropriate
sample size and obtain well-defined data.
Advantages
1. It is cost-effective: This process is both cost- and time-effective. A large sample can be
drawn simply by assigning numbers to the units of the population and then choosing random
numbers.
2. It is simple and easy: Probability sampling is an easy way of sampling as it does not involve a
complicated process. It is quick and saves time. The time saved can thus be used to analyze the
data and draw conclusions.
3. It is non-technical: This method of sampling does not require specialized technical knowledge,
because the procedure is simple and not lengthy.
Types of Probability Sampling: Simple Random Sampling, Systematic Sampling, Stratified
Random sampling, Area sampling, Cluster Sampling
Types of Probability Sampling
1. Simple Random Sample

Simple random sampling, as the name suggests, is a completely random method of selecting the
sample. It is as easy as assigning numbers to the individuals in the population and
then randomly choosing from those numbers, typically through an automated process. The
individuals whose numbers are chosen are included in the sample.
There are two ways in which samples are chosen in this method of sampling: a lottery system,
or number-generating software or a random number table. This sampling technique is usually
applied to large populations and has its fair share of advantages and disadvantages.
Simple Random Sample Advantages
Ease of use represents the biggest advantage of simple random sampling. Unlike more
complicated sampling methods such as stratified random sampling, there is no need to divide
the population into sub-populations or take any other additional steps before selecting
members of the population at random.
A simple random sample is meant to be an unbiased representation of a group. It is considered a
fair way to select a sample from a larger population, since every member of the population has
an equal chance of getting selected.
Simple Random Sample Disadvantages
A sampling error can occur with a simple random sample if the sample does not end up
accurately reflecting the population it is supposed to represent. For example, in a simple
random sample of 25 employees, it would be possible to draw 25 men even if the population
consisted of 125 women and 125 men. For this reason, simple random sampling is more
commonly used when the researcher knows little about the population. If the researcher knew
more, it would be better to use a different sampling technique, such as stratified random
sampling, which helps to account for differences within the population, such as age, race or
gender. Another disadvantage is that, when sampling from large populations, the process can be
time-consuming and costly compared to other methods.
2. Systematic Sample

Systematic sampling is when you choose every “nth” individual to be a part of the sample. For
example, you can choose every 5th person to be in the sample. It is a variation on simple
random sampling in which members of the group are selected at regular intervals, usually
after a random starting point, to form a sample. When the starting point is chosen at random,
every member of the population has an equal opportunity to be selected using this technique.
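As a rough sketch of the “every nth individual” rule, the snippet below computes the sampling interval from the population size and picks a random starting point inside the first interval. The function name `systematic_sample` and the numbered list of 100 people are assumptions for illustration; the chances are exactly equal only when the population size is a multiple of the sample size.

```python
import random

def systematic_sample(population, k, seed=None):
    """Pick every step-th member after a random start in the first interval."""
    step = len(population) // k                   # the sampling interval ("n")
    if step < 1:
        raise ValueError("sample size exceeds population size")
    start = random.Random(seed).randrange(step)   # random start: 0 .. step-1
    return population[start::step][:k]

# Hypothetical list of 100 people numbered 1 to 100; choose every 5th person.
people = list(range(1, 101))
picked = systematic_sample(people, 20, seed=3)
print(picked)
```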
Risks Associated With Systematic Sampling
One risk that statisticians must consider when conducting systematic sampling involves how the
list used with the sampling interval is organized. If the population placed on the list is organized
in a cyclical pattern that matches the sampling interval, the selected sample may be biased. For
example, a company’s human resources department wants to pick a sample of employees and ask
how they feel about company policies. Employees are grouped in teams of 20, with each team
headed by a manager. If the list used to pick the sample is organized with teams clustered
together, the statistician risks picking only managers (or no managers at all) depending on the
sampling interval.
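The risk can be reproduced with a small simulation. The roster layout below (10 teams of 20 people, with the manager listed first in each team) is an assumption made for the example, as is the helper name `systematic_pick`.

```python
def systematic_pick(items, start, step):
    """Return every step-th item beginning at index start."""
    return items[start::step]

# Hypothetical roster: 10 teams of 20, the manager listed first in each team.
roster = ["manager" if i % 20 == 0 else "staff" for i in range(200)]

for start in (0, 7):
    picked = systematic_pick(roster, start, 20)
    print(start, picked.count("manager"), "managers out of", len(picked))
# start 0 -> 10 managers out of 10
# start 7 -> 0 managers out of 10
```

Because the sampling interval equals the team size, the starting point alone decides whether the sample contains only managers or none at all. Shuffling the list or choosing an interval that does not match the cycle avoids the problem.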
3. Stratified Random Sample
Stratified random sampling is a method in which a larger population is divided into smaller
groups (strata) that do not overlap but together represent the entire population. The
researcher organizes these groups and then draws a sample from each group separately.
A common method is to classify subjects by sex, age, ethnicity and similar attributes,
splitting them into mutually exclusive groups and then using simple random sampling to choose
members from each group.
The groups should be distinct, so that each member belongs to exactly one group and every
member of a group has an equal opportunity of being selected by simple random sampling. This
sampling method is also called “random quota sampling”.
Advantages of Stratified Random Sampling

The main advantage of stratified random sampling is that it captures key population
characteristics in the sample. Similar to a weighted average, this method of sampling produces
characteristics in the sample that are proportional to the overall population. Stratified random
sampling works well for populations with a variety of attributes but is otherwise ineffective if
subgroups cannot be formed.
Stratification gives a smaller error in estimation and greater precision than the simple random
sampling method. The greater the differences between the strata, the greater the gain in
precision.
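A minimal sketch of proportional stratified sampling follows: the population is split into strata and the same fraction is drawn from each by simple random sampling. The function name `stratified_sample`, the staff list of 150 women and 100 men, and the 10% fraction are all assumptions for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, fraction, seed=None):
    """Draw the same fraction from every stratum by simple random sampling."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)     # mutually exclusive groups
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)     # proportional allocation
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical staff list: 150 women and 100 men, identified by (sex, id).
staff = [("F", i) for i in range(150)] + [("M", i) for i in range(100)]
sample = stratified_sample(staff, key=lambda r: r[0], fraction=0.1, seed=1)
print(sum(1 for s in sample if s[0] == "F"), "women,",
      sum(1 for s in sample if s[0] == "M"), "men")
```

Unlike a simple random sample, the result always contains 15 women and 10 men, matching the 60/40 split of the population.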
4. Area Sampling
Area sampling is a method of sampling used when no complete frame of reference is available.
The total area under investigation is divided into small sub-areas which are sampled at random or
according to a restricted process (stratification of sampling). Each of the chosen sub-areas is then
fully inspected and enumerated, and may form the basis for further sampling if desired.
Application of Area sampling
The basic idea of area sampling is both simple and powerful. It enjoys wide usage in situations
where very high quality data are wanted but for which no list of universe items exists. For
instance, many governmental agencies (e.g. Bureau of Labor Statistics) use area sampling.
However, the practical execution of a large scale area sample is highly complex. Typically an
area sampling is conducted in multiple stages, with successively smaller area clusters being sub-
sampled at each stage.
Example: A national sample of households is often constructed in a series of steps like this:
(i) Create geographic strata, each consisting of a group of counties in more or less close
proximity. Fifty or more such strata, containing all of the roughly 3,000 US counties, are
commonly used.

(ii) Within each geographic stratum, choose a probability sample of one or more counties (or
groups of counties such as metropolitan areas).
(iii) Within each sample county (or group of counties), choose a probability sample of places
(cities, towns, etc).
(iv) Within each sample place, select a probability sample of area segments (blocks in cities,
areas with identifiable boundaries in other places, etc.).
(v) Finally, within sample segments choose a probability sample of households.
5. Cluster Sampling
Cluster sampling is a way to randomly select participants when they are geographically spread
out. For example, if you wanted to choose 100 participants from the entire population of the
U.S., it is likely impossible to get a complete list of everyone. Instead, the researcher randomly
selects areas (e.g. cities or counties) and randomly selects from within those boundaries.
In cluster sampling the sampling unit is a group of elements rather than a single element, for
example a city, a family or a university. The clusters are formed by dividing the larger
population into smaller sections, and a sample of these clusters is then selected.
Cluster Sampling: Steps
Some steps and tips for using cluster sampling in market research are:
• Sample: Decide the target audience and also the size of the sample.
• Create and evaluate sampling frames: Create a sampling frame by using either an existing
frame or creating a new one for the target audience. Evaluate frames on the basis of coverage and
clustering and make adjustments accordingly. The resulting groups will vary with the
population and should be mutually exclusive and comprehensive. Members of a sample are
selected individually.
• Determine groups: Determine the number of groups so that each group contains roughly the same
number of members. Make sure each of these groups is distinct from the others.
• Select clusters: Choose clusters randomly for sampling.
• Geographic segmentation: Geographic segmentation is the most commonly used form of cluster
sampling.
• Sub-types: Cluster sampling is divided into one-stage and multi-stage subtypes on the basis of
the number of steps followed by researchers to form clusters.
Cluster Sampling Methods with Examples
There are two ways to classify cluster sampling. The first way is based on the number of stages
followed to obtain the cluster sample and the second way is the representation of the groups in
the entire cluster.
The first classification is the most used in cluster sampling. In most cases, sampling by clusters
happens over multiple stages. A stage is considered to be the steps taken to get to a desired
sample and cluster sampling is divided into single-stage, two-stage, and multiple stages.
(I) Single Stage Cluster Sampling: As the name suggests, sampling will be done just once. An
example of Single Stage Cluster Sampling –An NGO wants to create a sample of girls across 5
neighboring towns to provide education. Using single-stage cluster sampling, the NGO can
randomly select towns (clusters) to form a sample and extend help to the girls deprived of
education in those towns.
(II) Two-Stage Cluster Sampling: A sample created in two stages is often preferable to a
sample created in a single stage, because a more selective set of elements can be chosen, which
can lead to improved results from the sample. In two-stage cluster sampling, instead of selecting
all the elements of a cluster, only a handful of members are selected from each cluster by
implementing systematic or simple random sampling. An example of Two-Stage Cluster
Sampling – A business owner wants to explore the statistical performance of her plants, which
are spread across various parts of the U.S. Considering the number of plants, the number of
employees per plant and the work done at each plant, single-stage sampling would be
time-consuming and costly. This is when she decides to conduct two-stage sampling. The owner
groups employees by plant to form clusters and then divides the clusters by the size or
operation status of the plant. Within this two-level cluster sample, other sampling techniques
such as simple random sampling are applied to proceed with the calculations.
(III) Multiple Stage Cluster Sampling: For effective research to be conducted across multiple
geographies, one needs to form complicated clusters that can be achieved only by using the
multiple-stage cluster sampling technique. Steps of listing and sampling are repeated in this
sampling method. An example of Multiple Stage Cluster Sampling – Geographic cluster sampling is
one of the most extensively implemented cluster sampling techniques. If an organization intends
to conduct a survey to analyze the performance of smartphones across Germany, it can divide
the entire country’s population into cities (clusters), then select the cities with the highest
population, and finally filter for those who use mobile devices.
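The difference between single-stage and two-stage cluster sampling can be sketched as follows. The function name `cluster_sample`, its parameters and the five invented towns (loosely following the NGO example) are assumptions for illustration only.

```python
import random

def cluster_sample(clusters, n_clusters, per_cluster=None, seed=None):
    """clusters maps a cluster name to the list of its members.

    per_cluster=None: single-stage, keep every member of the chosen clusters.
    per_cluster=m:    two-stage, simple random sample of up to m members
                      from each chosen cluster.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # stage 1: pick clusters
    sample = []
    for name in chosen:
        members = clusters[name]
        if per_cluster is None:
            sample.extend(members)
        else:                                           # stage 2: pick members
            sample.extend(rng.sample(members, min(per_cluster, len(members))))
    return chosen, sample

# Hypothetical data: 5 neighboring towns with 30 girls each.
towns = {f"town_{t}": [f"town_{t}_girl_{i}" for i in range(30)] for t in range(5)}
chosen, one_stage = cluster_sample(towns, n_clusters=2, seed=0)
_, two_stage = cluster_sample(towns, n_clusters=2, per_cluster=5, seed=0)
print(chosen, len(one_stage), len(two_stage))
```

With two towns selected, the single-stage sample holds all 60 girls from those towns, while the two-stage sample holds only 10, which is where the saving in time and cost comes from.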
Cluster Sampling Advantages
There are multiple advantages of using cluster sampling:
(I) Consumes less time and cost: Sampling of geographically divided groups requires less work,
time and cost. It is highly economical to observe a limited number of selected clusters and
allocate resources to them, instead of sampling randomly throughout a particular region.
(II) Convenient access: Large samples can be chosen with this sampling technique and that’ll
increase accessibility to various clusters.
(III) Least loss in accuracy of data: Since there can be large samples in each cluster, loss of
accuracy in information per individual can be compensated.
(IV) Ease of implementation: Since cluster sampling draws information from various areas
and groups, it can be easily implemented in practical situations in comparison to other
probability sampling methods such as simple random sampling, systematic sampling, and
stratified sampling or non-probability sampling methods such as convenience sampling.
In comparison to simple random sampling, cluster sampling can be effective in estimating the
characteristics of a group such as a population, and it can also be implemented without having a
sampling frame for all the elements of the entire population.
Non Probability Sample
Non-probability sampling is a sampling technique in which the researcher selects samples based
on the subjective judgment of the researcher rather than random selection.
In non-probability sampling, not all members of the population have a chance of participating in
the study unlike probability sampling, where each member of the population has a known chance
of being selected.
Non-probability sampling is most useful for exploratory studies such as a pilot survey (a survey
that is deployed to a smaller sample than the pre-determined sample size). Non-probability
sampling is also used in studies where it is not possible to draw a random probability sample
due to time or cost considerations.
Non-probability sampling is a less stringent method that depends heavily on the expertise of
the researchers. It is carried out by methods of observation and is widely used in qualitative
research.
Advantages of non-probability sampling
(i) Non-probability sampling is a more convenient and practical method for researchers deploying
surveys in the real world. Statisticians prefer probability sampling because it allows the
sampling error to be quantified. However, if done carefully, non-probability sampling can yield
results of useful, and sometimes comparable, quality.

(ii) Getting responses using non-probability sampling is faster and more cost-effective than
with probability sampling: because the sample is known to the researcher, respondents are
motivated to respond quickly compared to people who are randomly selected.
Disadvantages of non-probability sampling
(i) In non-probability sampling, the researcher needs to think through potential sources of
bias. It is important to have a sample that closely represents the population.
(ii) While choosing a sample in non-probability sampling, researchers need to be careful about
recruits distorting data. At the end of the day, research is carried out to obtain meaningful
insights and useful data.
When to use non-probability sampling?
• This type of sampling is used to indicate if a particular trait or characteristic exists in a
population.
• This sampling technique is widely used when researchers aim at conducting qualitative research,
pilot studies or exploratory research.
• Non-probability sampling is used when researchers have limited time to conduct research or
have budget constraints.
• Non-probability sampling is conducted to observe if a particular issue needs in-depth analysis.
Types of Non-Probability Sampling: Judgmental or Purposive Sampling, Convenience Sampling,
Quota Sampling, Snowball Sampling, Consecutive Sampling
1. Judgmental or Purposive Sampling
In judgmental sampling, the samples are selected based purely on the researcher’s knowledge and
credibility. In other words, the researcher chooses only those who he or she feels are the right
fit (with respect to attributes and representation of a population) to participate in the
research study.

This is not a scientific method of sampling and the downside to this sampling technique is that
the results can be influenced by the preconceived notions of a researcher. Thus, there is a high
amount of ambiguity involved in this research technique.
For example, this type of sampling method can be used in pilot studies.
2. Convenience Sampling
Convenience sampling is a non-probability sampling technique where samples are selected from
the population only because they are conveniently available to the researcher. These samples are
selected only because they are easy to recruit, and the researcher does not consider selecting a
sample that represents the entire population.
Ideally, in research, it is good to test a sample that represents the population. But in some
research, the population is too large to test in its entirety. This is one of the reasons why
researchers rely on convenience sampling, which is the most common non-probability sampling
technique, because of its speed, cost-effectiveness, and ease of availability of the sample.
An example of convenience sampling would be using student volunteers known to the researcher.
The researcher can send the survey to the students, and they would act as the sample in this
situation.
3. Quota Sampling
Consider, hypothetically, a researcher who wants to study the career goals of male and female
employees in an organization. There are 500 employees in the organization. These 500
employees are known as the population. To understand the population better, the researcher
needs only a sample, not the entire population. Further, the researcher is interested in
particular strata within the population. This is where quota sampling helps, by dividing the
population into strata or groups.

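To study the career goals of the 500 employees, the sample should technically contain males and
females in proportion to their numbers in the population; if the workforce were split evenly
(250 males and 250 females), the sample would contain equal numbers of each. Since an
unrestricted selection is unlikely to produce these proportions, the researcher fixes a quota
for each group or stratum and fills it using quota sampling.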
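One way to picture quota sampling is as filling fixed quotas from whoever is encountered first. The sketch below assumes a hypothetical quota of 10 males and 10 females and an invented order in which employees are approached; the function name `quota_sample` is likewise an assumption.

```python
def quota_sample(candidates, key, quotas):
    """Accept candidates in the order met until every group's quota is full."""
    remaining = dict(quotas)
    sample = []
    for person in candidates:
        group = key(person)
        if remaining.get(group, 0) > 0:
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):     # all quotas filled
            break
    return sample

# Hypothetical order in which employees happen to be approached: (sex, id).
approached = [("M", i) for i in range(30)] + [("F", i) for i in range(30)]
sample = quota_sample(approached, key=lambda p: p[0], quotas={"M": 10, "F": 10})
print(len(sample), sample[0], sample[-1])
```

The group sizes are controlled, but selection within each group is not random: the first ten people met in each group are taken, which is what separates quota sampling from stratified random sampling.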
4. Snowball Sampling
Snowball sampling helps researchers find subjects when they are difficult to locate. Researchers
use this technique when the population of interest is small and not easily accessible. This
sampling system works like a referral program. Once the researchers find suitable subjects, they
ask them for assistance in finding similar subjects, until a sample of reasonable size is formed.
For example, this type of sampling can be used to conduct research involving a particular illness
in patients or a rare disease. Researchers can seek help from subjects to refer other subjects
suffering from the same ailment to form a subjective sample to carry out the study.
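The referral process can be sketched as a walk over a referral network, starting from the first subjects the researcher finds. The network below (`referrals`), the seed subject "A" and the function name `snowball_sample` are invented for illustration.

```python
from collections import deque

def snowball_sample(referrals, seeds, target_size):
    """Follow referrals outward from the seed subjects until the sample is full."""
    sample, queue, seen = [], deque(seeds), set(seeds)
    while queue and len(sample) < target_size:
        person = queue.popleft()
        sample.append(person)
        for contact in referrals.get(person, []):   # ask the subject for referrals
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return sample

# Hypothetical referral network: who each subject can refer.
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": [], "E": ["F"]}
print(snowball_sample(referrals, ["A"], 5))   # ['A', 'B', 'C', 'D', 'E']
```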
5. Consecutive Sampling
This non-probability sampling technique is very similar to convenience sampling, with a slight
variation. Here, the researcher picks a single person or a group of subjects, conducts research
over a period of time, analyzes the results and then moves on to another subject or group of
subjects if needed.
Consecutive sampling gives the researcher a chance to work with many subjects and fine-tune
his/her research by collecting results that have vital insights.
Determining Size of the Sample: Practical Considerations in Sampling and Sample Size
Determining sample size is a very important issue because samples that are too large may waste
time, resources and money, while samples that are too small may lead to inaccurate results. In
many cases, we can easily determine the minimum sample size needed to estimate a process
parameter, such as the population mean.
Sample size determination is the act of choosing the number of observations or replicates to
include in a statistical sample. The sample size is an important feature of any empirical study in
which the goal is to make inferences about a population from a sample. In practice, the sample
size used in a study is determined based on the expense of data collection and the need to have
sufficient statistical power. In complicated studies there may be several different sample sizes
involved in the study: for example, in a stratified survey there would be different sample sizes
for each stratum. In a census, data are collected on the entire population, hence the sample size is
equal to the population size. In experimental design, where a study may be divided into different
treatment groups, there may be different sample sizes for each group.
Sample sizes may be chosen in several different ways:
• Using experience: small sample sizes, though sometimes necessary, can result in wide
confidence intervals or risks of errors in statistical hypothesis testing.
• Using a target variance for an estimate to be derived from the sample eventually obtained, i.e. if
a high precision is required (narrow confidence interval) this translates to a low target variance
of the estimator.
• Using a target for the power of a statistical test to be applied once the sample is collected.
• Using a confidence level, i.e. the larger the required confidence level, the larger the sample size
(given a constant precision requirement).
When sample data is collected and the sample mean is calculated, that sample mean is
typically different from the population mean (µ) . This difference between the sample and
population means can be thought of as an error. The margin of error is the maximum difference
between the observed sample mean and the true value of the population mean (µ):
$$E = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$$
where:
E is the margin of error.
$z_{\alpha/2}$ is known as the critical value, the positive z value that is at the vertical
boundary for the area of $\alpha/2$ in the right tail of the standard normal distribution.
σ is the population standard deviation.
n is the sample size.
Rearranging this formula, we can solve for the sample size necessary to produce results accurate
to a specified confidence and margin of error.
$$n = \left( \frac{z_{\alpha/2} \, \sigma}{E} \right)^2$$
This formula can be used when you know σ and want to determine the sample size necessary to
establish, with a confidence of $1 - \alpha$, the mean value µ to within ±E. You can still use
this formula if you don’t know your population standard deviation and you have a small sample
size. Although it’s unlikely that you know σ when the population mean is not known, you may be
able to determine σ from a similar process or from a pilot test/simulation.
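As a worked illustration, the margin-of-error relation E = z·σ/√n and its rearrangement n = (z·σ/E)² can be evaluated with Python's standard library. The numbers used here (σ = 15, a margin of 2 and 95% confidence) are assumed values chosen for the example, and the function names are illustrative.

```python
import math
from statistics import NormalDist

def critical_value(confidence):
    """Positive z value leaving (1 - confidence) / 2 in the right tail."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def margin_of_error(sigma, n, confidence=0.95):
    """E = z * sigma / sqrt(n)"""
    return critical_value(confidence) * sigma / math.sqrt(n)

def required_sample_size(sigma, margin, confidence=0.95):
    """n = (z * sigma / E)^2, rounded up to the next whole observation."""
    z = critical_value(confidence)
    return math.ceil((z * sigma / margin) ** 2)

# Assumed inputs: sigma = 15, desired margin of error = 2, 95% confidence.
n = required_sample_size(sigma=15, margin=2, confidence=0.95)
print(n)                                   # 217
print(round(margin_of_error(15, n), 3))
```

The result is rounded up because a fractional observation cannot be collected, and rounding down would leave the margin of error slightly above the target.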
