Introduction to the Research Process
Writers
usually treat the research task as a sequential process involving several
clearly defined steps. No one claims that research requires completion of each
step before going to the next. Recycling, circumventing, and skipping occur.
Some steps are begun out of sequence, some are carried out simultaneously, and
some may be omitted. Despite these variations, the idea of a sequence is useful
for developing a project and for keeping the project orderly as it unfolds.
The
exhibit below models the sequence of the research process:
The
research process begins when a management dilemma triggers the need for a
decision. For example, for a laptop maker, a growing number of complaints about post-purchase service may start the process. In other situations, a controversy
arises, a major commitment of resources is called for, or conditions in the
environment signal the need for a decision. For a laptop maker, the critical
event could have been the introduction by a competitor of a new technology that
would revolutionize the processing speed of laptops. Such events cause managers
to reconsider their purposes or objectives, define a problem for solution, or
develop strategies for solutions they have identified.
The Management-Research Question Hierarchy
A useful
way to approach the research process is to state the basic dilemma that prompts
the research and then try to develop other questions by progressively breaking
down the original questions into more specific ones, as represented below.
The
process begins at the most general level with the ‘management dilemma’. This is
usually a symptom of an actual problem, such as:
· Rising costs.
· The discovery of an expensive chemical compound that would increase the efficacy of a drug.
· Increasing tenant move-outs from an apartment complex.
· Declining sales.
· Increasing employee turnover in a restaurant.
· A larger number of product defects during the manufacture of an automobile.
· An increasing number of letters and complaints about post-purchase service.
Identifying the management dilemma is rarely difficult. However, choosing one dilemma on which to focus may be difficult. Choosing incorrectly will direct valuable resources down a path that may not provide critical decision-making information (the
purpose of good research).
The Management Question
The
manager must move from the management dilemma to the management question to
proceed with the research process. The management question restates the dilemma
in question form:
· What should be done to reduce employee turnover?
· What should be done to increase tenant residency and reduce move-outs?
· What should be done to reduce costs?
· What should be done to reduce post-purchase service complaints?
· How can we improve our profit picture?
Refining/Fine-Tuning the Research Question
Fine-tuning the research question is precisely what a skilful practitioner must do after exploring the available published data and gathering inputs and insights from information gatekeepers, trade and industry publications, and similar research conducted in the past. At this point, a clearer picture of the management and research questions begins to emerge. After a preliminary review of the literature, a brief exploratory study, or
both, the project begins to crystallize in one of two ways:
a) It is apparent the question has
been answered and the process is finished.
b) A question different from the one
originally addressed has appeared.
The
research question does not have to be materially different, but it will have
evolved in some fashion. This is not cause for discouragement. The refined
research question(s) will have better focus and will move the research forward
with more clarity than the initially formulated question(s).
STAGE 2: RESEARCH PROPOSAL
A well
planned and adequately documented proposal is vital for any research process.
The proposal process uses two primary documents: the ‘request for proposal
(RFP)’ and the ‘research proposal’. When the organization has research
specialists on the payroll, the internal research proposal is all that is
needed. Often, however, companies do not have adequate capacity, resources or
the specialized talents in-house to execute a project, so they turn to outside
research suppliers (including research specialists, universities, research
centers and consulting firms).
An RFP
is the formal document issued by a corporate research department or a decision maker to solicit services from research suppliers. The researcher invites a
qualified supplier to submit a proposal in accordance with a specific, detailed
format delivered by a deadline. Besides a definition of the technical
requirements of the desired research, critical components of the RFP include
project management, pricing, and contract administration. These sections allow
the potential research supplier to understand and meet the expectations of the
sponsoring management team for the contracted services. Also, a section on
proposal administration, including important dates, is included.
The
research supplier finally submits the research proposal. It is a document
that is typically written by a scientist or academic which describes the ideas
for an investigation on a certain topic. The research proposal outlines the
process from beginning to end and may be used to request financing for the
project, certification for performing certain parts of research or the
experiment, or as a required task before beginning a college dissertation.
STAGE 3: RESEARCH DESIGN STRATEGY
A
research design has to be tailored to an organization’s particular research
needs. A research design is the blueprint for the collection, measurement, and
analysis of data. Some important research design types are exploratory design, descriptive design, and causal design. Research design aids the researcher in
the allocation of limited resources by posing crucial choices in methodology.
Research design is the plan and structure of investigation so conceived as to
obtain answers to research questions. The plan is the overall scheme or program
of the research. It includes an outline of what the investigator will do from
writing hypotheses and their operational implications to the final analysis of
data.
Research
design expresses both the structure of the research problem (the framework, organization, or configuration of the relationships among a study’s variables) and the plan of investigation used to obtain empirical evidence on those relationships.
In short, the essentials of a research design are:
· An activity- and time-based plan.
· A plan always based on the research question.
· A guide for selecting sources and types of information.
· A framework for specifying the relationships among the study’s variables.
· A procedural outline for every research activity.
Exploratory Research Design
In the context of marketing research, every research problem is unique
in its own way, but almost all research problems and objectives can be matched
to one of three types of research designs—exploratory, descriptive, or causal.
The researcher’s choice of design depends on available information such as
nature of the problem, scope of the problem, objectives, and known information.
Exploratory research design is chosen to gain background information and to
define the terms of the research problem. This is used to clarify research
problems and hypotheses and to establish research priorities. A hypothesis is a
statement based on limited evidence which can be proved or disproved and leads
to further investigation. It helps organizations to formulate their problems
clearly.
Exploratory research design is conducted for a research problem when
the researcher has no past data or only a few studies for reference. Sometimes
this research is informal and unstructured. It serves as a tool for initial
research that provides a hypothetical or theoretical idea of the research
problem. It will not offer concrete solutions for the research problem. This
research is conducted in order to determine the nature of the problem and helps
the researcher to develop a better understanding of the problem. Exploratory
research is flexible and provides the initial groundwork for future research.
Exploratory research requires the researcher to investigate different sources
such as published secondary data, data from other surveys, observation of
research items, and opinions about a company, product, or service.
Example of Exploratory Research
Design:
Freshbite is a one-and-a-half-year-old e-commerce start-up that delivers fresh food orders to customers’ doorsteps through its delivery partners. The company operates in multiple cities. Since its inception, the company achieved a high sales growth rate. However, after the first year, sales started declining at a brisk rate. Due to the lack of historical data, the sales director was unsure of the reasons for this decline in sales. Rather than making assumptions, he preferred to appoint a marketing research consultant to conduct an exploratory research study in order to discern the possible reasons. The prime objective of this research was not to
figure out a solution to the declining sales problem, but rather to identify
the possible reasons, such as poor quality of products and services,
competition, or ineffective marketing, and to better understand the factors
affecting sales. Once these potential causes are identified, the strength of
each reason can be tested using causal research.
Descriptive Research Design
Descriptive research is used to “describe” a
situation, subject, behaviour, or phenomenon. It is used to answer questions of
who, what, when, where, and how associated with a particular research question
or problem. Descriptive studies are often described as studies that are
concerned with finding out “what is”.
It attempts to gather quantifiable information that can be used to
statistically analyse a target audience or a particular subject. Descriptive research is used to observe and describe a research subject or problem without influencing or manipulating the variables in any way. Hence, these studies are really correlational or observational, and not truly experimental. This type of research is conclusive in
nature, rather than exploratory.
Therefore, descriptive research does not attempt to answer “why” and is
not used to discover inferences, make predictions or establish causal
relationships.
Descriptive research is used extensively in social
science, psychology and educational research. It can provide a rich data set
that often brings to light new knowledge or awareness that might otherwise have gone unnoticed. It is particularly useful when it is important to gather information without disrupting the subjects or when it is not possible to test and measure large numbers of
samples. It allows researchers to
observe natural behaviours without affecting them in any way. Following is a
list of research questions or problems that may lend themselves to descriptive
research:
- Market researchers may want to
observe the habits of consumers.
- A company may want to evaluate the morale of the staff.
- A school district may research
whether or not students are more likely to access online textbooks than to
use printed copies.
- A school district may wish to
assess teachers’ attitudes about using technology in the classroom.
- An educational software company
may want to know what aspects of the software make it more likely to be
used by students.
- A researcher may wish to study
the impact of hands-on activities and laboratory experiments on students’
perceptions of science.
- A researcher could be studying
whether or not the availability of hiking/biking trails increases the
physical activity levels in a neighborhood.
In
some types of descriptive research, the researcher does not interact with the
subjects. In other types, the researcher does interact with the subjects
and collects information directly from them. Some descriptive studies may
be cross-sectional, whereby the researcher has a one-time interaction with the
test subjects. Other studies may be longitudinal, where the same test
subjects are followed over time. There are three main methods that may be
used in descriptive research:
- Observational Method – Used to review and
record the actions and behaviors of a group of test subjects in their
natural environment. The researcher typically does not have interaction
with the test subject.
- Case Study Method – This is a much more
in-depth study of an individual or small group of individuals. It may or
may not involve interaction with the test subjects.
- Survey Method – Researchers interact
with individual test subjects by collecting information through the use of
surveys or interviews.
The
data collected from descriptive research may be quantitative, qualitative or
both. The quantitative data is typically presented in the form of
descriptive statistics that provide basic information such as the mean, median,
and mode of a data set. Quantitative data may also be tabulated along a
continuum in numerical form, such as scores on a test. It can also be
used to describe categories of information or patterns of interactions.
Such quantitative data is typically represented in tables, graphs, and charts which
makes it user-friendly and easy to interpret. Qualitative data, such as
the type of narrative data collected in a case study, may be organized into
patterns that emerge or it may be classified in some way, but requires more
detailed analysis.
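As a minimal illustration (using made-up test scores rather than data from any study described here), Python's statistics module can produce the basic descriptive measures mentioned above:

```python
from statistics import mean, median, mode

test_scores = [72, 85, 85, 90, 64, 78, 85, 70]  # hypothetical scores on a test

print("mean:", mean(test_scores))      # 78.625
print("median:", median(test_scores))  # 81.5
print("mode:", mode(test_scores))      # 85
```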
Causal Research Design
Causal
research, also known as explanatory research is conducted in order to identify
the extent and nature of cause-and-effect relationships. Causal research can be
conducted in order to assess impacts of specific changes on existing norms,
various processes etc.
Causal
studies focus on an analysis of a situation or a specific problem to explain
the patterns of relationships between variables. Experiments are
the most popular primary data collection methods in studies with causal
research design.
The
presence of cause-and-effect relationships can be confirmed only if specific
causal evidence exists. Causal evidence has three important components:
1.
Temporal sequence.
The cause must occur before the effect. For example, it would not be
appropriate to credit the increase in sales to rebranding efforts if the
increase had started before the rebranding.
2.
Concomitant variation.
The variation must be systematic between the two variables. For example, if a
company doesn’t change its employee training and development practices, then
changes in customer satisfaction cannot be caused by employee training and
development.
3.
Nonspurious association.
Any covariation between a cause and an effect must be genuine and not simply due to another variable. In other words, there should not be a ‘third’ factor that relates to both the cause and the effect.
The following are
examples of research objectives for causal research design:
§ To assess the impacts of foreign direct investment on the levels of economic growth in Taiwan
§ To analyze the effects of re-branding initiatives on the levels of customer loyalty
§ To identify the nature and impact of work process re-engineering on the levels of employee motivation.
Advantages
of Causal Research (Explanatory Research)
§ Causal studies may play an instrumental role in identifying the reasons behind a wide range of processes, as well as in assessing the impacts of changes on existing norms, processes, etc.
§ Causal studies usually offer the advantage of replication if the necessity arises.
§ These types of studies are associated with greater levels of internal validity due to the systematic selection of subjects.
STAGE 4:
INSTRUMENT DEVELOPMENT AND PILOT TESTING:
When there is no instrument available that measures the construct of your
interest, you may decide to develop a measurement instrument yourself.
In that case, the following steps need to be performed:
Step 1: Definition
and elaboration of the construct intended to be measured
The first step in instrument development is conceptualization, which involves defining the construct and the variables to be measured.
Use the International Classification of Functioning, Disability and Health (ICF)
(WHO, 2011) or the model by Wilson and Cleary (1995) as a framework for your
conceptual model. When the construct is not directly observable (latent
variable), the best choice is to develop a multi-item instrument (De Vet et al. 2011). When the observable items are
consequences of (reflecting) the construct, this is called a reflective model. When the observable items are determinants of the construct, this
is called a formative model.
When you are interested in a multidimensional construct, each dimension and its
relation to the other dimensions should be described.
Step 2: Choice of
measurement method (e.g.
questionnaire/physical test)
Some constructs form an indissoluble alliance with a measurement
instrument, e.g. body temperature is measured with a thermometer; and a
sphygmomanometer is usually used to assess blood pressure in clinical practice.
The options are therefore limited in these cases, but in other situations more
options exist. For example, physical functioning can be measured with a
performance test, observations, or with an interview or self-report
questionnaire. With a performance test for physical functioning, information is
obtained about what a person can do, while
by interview or self-report questionnaire information is obtained about what a
person perceives he/she
can do.
Step 3: Selecting and
formulating items
To get input for formulating items for a multi-item questionnaire, you could examine similar existing instruments from the literature that measure a similar construct (e.g. for a different target population) and talk to experts (both clinicians and patients) using in-depth interview techniques. In addition, you should pay careful attention to the formulation of response options and instructions, and to choosing an appropriate recall period (Van den Brink & Mellenbergh, 1998).
Step 4: Scoring
issues
Many multi-item questionnaires contain 5-point item scales, and therefore
are ordinal scales. Often a total score of the instrument is considered to be
an interval scale, which makes the instrument suitable for more statistical
analyses. Several questions are important to answer:
- How can you calculate (sub)scores? Add the items, use the mean score of each item, or calculate Z-scores.
- Are all items equally important, or will you use (implicit) weights? Note that when an instrument has 3 subscales, with 5, 7, and 10 items respectively, the total score calculated as the mean of the mean score of each subscale differs from the total score calculated as the mean of all items (see the sketch after this list).
- How will you deal with missing values? In case of many missing values (>5-10%), consider multiple imputation.
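A minimal numeric sketch of the weighting issue noted in the second question above, using hypothetical item scores for three subscales of 5, 7, and 10 items:

```python
# Hypothetical item scores for an instrument with 3 subscales of 5, 7 and 10 items.
subscale_a = [4, 5, 3, 4, 5]                   # 5 items, mean 4.2
subscale_b = [2, 3, 2, 3, 2, 3, 2]             # 7 items, mean ~2.43
subscale_c = [1, 2, 1, 2, 1, 2, 1, 2, 1, 2]    # 10 items, mean 1.5

def mean(values):
    return sum(values) / len(values)

# Each subscale weighted equally vs. each item weighted equally.
print(mean([mean(subscale_a), mean(subscale_b), mean(subscale_c)]))  # ~2.71
print(mean(subscale_a + subscale_b + subscale_c))                    # ~2.41
```

Because the subscales contain different numbers of items, averaging the subscale means gives each subscale equal weight, while averaging all items gives each item equal weight, so the two totals differ (about 2.71 versus 2.41 here).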
Step 5: Pilot study
Be aware that the first version of the instrument you develop will
(probably) not be the final version. It is sensible to (regularly) test your
instrument in small groups of people. A pilot test is intended to test the
comprehensibility, relevance, acceptability, and feasibility of your
measurement instrument.
Step 6: Field-testing
A field test is typically conducted to have experts in the field
review an untested set of survey/interview questions to ensure credibility,
dependability, validity, and risk level. In a field test, data is not
collected. Considerations include:
- Any
assistants helping with the field test are not considered participants in
the main dissertation study.
- Findings
from the field test can be used to further refine survey/interview
questions for the main dissertation study.
Example
Researcher B is conducting a qualitative study about how
families cope with anorexia and therefore plans to conduct in-depth interviews
with the family members of people with anorexia. Researcher B wants to ensure
that the interview questions adequately capture the coping process and unique
issues of individuals who have a family member experiencing anorexia while also
ensuring that the questions utilize appropriate language. Even though Researcher
B has conducted a comprehensive literature review and that review guided the
development of the interview questions, they want experts in the field to
review those questions. Researcher B has decided to conduct a field test with
social workers who have worked extensively with families and anorexia. The
social workers will review the interview questions and recommend improvements.
Pilot
Testing:
In
research, a pilot test is a small preliminary study used to test a proposed
research study before a full scale performance. This smaller study usually
follows the exact same processes and procedures as its full-scale counterpart.
The primary purpose of a pilot study is to evaluate the feasibility of the
proposed major study. The pilot test may also be used to estimate costs and
necessary sample size of the greater study. A pilot test is sometimes called a
pilot experiment, pilot project, pilot study, feasibility study, or pilot run.
Before
investing in a full-scale research study, it is often advisable to perform a
pilot test. Conducting a smaller scale study permits researchers to identify
problems with the study plan before making a major investment of time and
resources. Results of the pilot study may also be used to estimate the costs
and sample size of the proposed full-size study. The pilot test should be run
once the proposed research project has been fully designed, but before
investing in a final launch of the project. These smaller test runs are
considered an essential component of a good study design.
A pilot
study is a research study conducted before the intended study. Pilot
studies are usually executed as planned for the intended study, but on a
smaller scale. Although a pilot study cannot eliminate all systematic
errors or unexpected problems, it reduces the likelihood of making
a Type I or Type II error. Both types of
errors make the main study a waste of effort, time, and money.
Reasons
to Employ a Pilot Study
There
are many reasons to employ a pilot study before implementing the main study.
Here are a few good reasons:
- To test the research process and/or
protocol. These are often referred to as feasibility studies because
the pilot study tests how feasible the design is in reality. For example,
are the study resources adequate, including time, finances, and materials?
Are there any other logistical problems that need to be addressed?
- To identify variables of interest
and decide how to operationalize each one. For instance, what are the
indicators of composite variables? How will variables be measured and/or
computed?
- To test an intervention strategy and
identify the components that are most important to the facilitation of the
intervention.
- To test methodological changes to
implementation or administration of an instrument and/or train personnel
on the administration of instruments.
- To develop or test the efficacy of
research instruments and protocols. Are there confusing or misleading
questions? Is it possible to maintain maximum objectivity and reduce
observer drift?
- To estimate statistical parameters for later analyses. Certain statistical analyses require that the sample be sufficiently large and contain enough variability to detect differences between groups, given that there are any real differences to be detected (a minimal sample-size sketch follows this list).
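As a hedged illustration of that last point, the sketch below uses the standard normal-approximation formula for comparing two group means, n per group = 2((z(1-α/2) + z(1-β))·σ/δ)². The standard deviation (treated here as a pilot estimate) and the minimal detectable difference are assumed values, not figures from this text:

```python
import math
from statistics import NormalDist

def n_per_group(sigma, delta, alpha=0.05, power=0.80):
    """Approximate n per group to detect a mean difference of `delta`
    between two groups, given a (pilot-estimated) standard deviation `sigma`."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)

# Assumed values: sigma estimated from the pilot, delta = smallest difference worth detecting.
print(n_per_group(sigma=10, delta=5))  # about 63 participants per group
```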
Difference
between Census and Sampling
Census and sampling are two methods of collecting survey
data about the population that are used by many countries. Census refers to
the quantitative research method, in which all the members of the population
are enumerated. On the other hand, sampling is a widely used method in statistical testing, wherein a data set is selected from the larger population to represent the entire group.
Census implies complete enumeration of the study objects,
whereas Sampling connotes enumeration of the subgroup of elements chosen for
participation. These two survey methods are often contrasted with each other,
and the differences between census and sampling are set out in detail below.
| BASIS FOR COMPARISON | CENSUS | SAMPLING |
|---|---|---|
| Meaning | A systematic method that collects and records the data about the members of the population is called census. | Sampling refers to a portion of the population selected to represent the entire group, in all its characteristics. |
| Enumeration | Complete | Partial |
| Study of | Each and every unit of the population. | Only a handful of units of the population. |
| Time required | It is a time-consuming process. | It is a fast process. |
| Cost | Expensive method | Economical method |
| Results | Reliable and accurate | Less reliable and accurate, due to the margin of error in the data collected. |
| Error | Not present. | Depends on the size of the sample. |
| Appropriate for | Population of heterogeneous nature. | Population of homogeneous nature. |
SAMPLING
Sampling is a process used in statistical analysis in which a
predetermined number of observations are taken from a larger population. The
methodology used to sample from a larger population depends on the type of
analysis being performed but may include simple random sampling or systematic
sampling.
In business, a CPA performing an audit uses sampling to determine the
accuracy of account balances in the financial statements, and managers use
sampling to assess the success of the firm’s marketing efforts.
The sample should be a representation of the entire population. When
taking a sample from a larger population, it is important to consider how the
sample is chosen. To get a representative sample, the sample must be drawn
randomly and encompass the whole population. For example, a lottery system
could be used to determine the average age of students in a university by
sampling 10% of the student body.
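A minimal sketch of such a lottery-style draw, assuming a hypothetical sampling frame of student ages (the 10% figure follows the example above):

```python
import random

# Hypothetical sampling frame: (student_id, age) for the whole student body.
population = [(student_id, random.randint(17, 30)) for student_id in range(5000)]

# A 10% lottery-style draw: a simple random sample without replacement.
sample = random.sample(population, k=len(population) // 10)

sample_mean = sum(age for _, age in sample) / len(sample)
true_mean = sum(age for _, age in population) / len(population)
print(f"sample mean age: {sample_mean:.2f}, population mean age: {true_mean:.2f}")
```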
Examples of Sample Tests for
Marketing
Businesses aim to sell their products and/or services to target
markets. Before presenting products to
the market, companies generally identify the needs and wants of their target
audience. To do so, they may employ a sample of the population to gain a
better understanding of those needs to later create a product and/or service
that meets those needs. Gathering the
opinions of the sample helps to identify the needs of the whole.
What is
Non-Probability Sampling?
Non-probability
sampling is a sampling technique where the odds of any member being selected
for a sample cannot be calculated.
It’s the opposite of probability sampling, where
you can calculate the odds. In addition, probability sampling
involves random selection, while non-probability sampling does not–it relies on
the subjective
judgement of the researcher.
The odds do not have to be equal for a method
to be considered probability sampling. For example,
one person could have a 10% chance of being selected and another person could
have a 50% chance of being selected. It’s non-probability sampling when
you can’t calculate the odds at all.
Types of
Non-Probability Sampling
· Convenience Sampling: as the name suggests, this involves collecting a sample from somewhere convenient to you: the mall, your local school, your church. Sometimes called accidental sampling, opportunity sampling or grab sampling.
· Haphazard Sampling: where a researcher chooses items haphazardly, trying to simulate randomness. However, the result may not be random at all and is often tainted by selection bias.
· Purposive Sampling: where the researcher chooses a sample based on their knowledge about the population and the study itself. The study participants are chosen based on the study’s purpose. There are several types of purposive sampling, each with its own advantages and disadvantages.
· Expert Sampling: in this method, the researcher draws the sample from a list of experts in the field.
· Heterogeneity Sampling / Diversity Sampling: a type of sampling where you deliberately choose members so that all views are represented. However, those views may or may not be represented proportionally.
· Modal Instance Sampling: the most “typical” members are chosen from a set.
· Quota Sampling: where the groups (i.e. men and women) in the sample are proportional to the groups in the population.
· Snowball Sampling: where research participants recruit other members for the study. This method is particularly useful when participants might be hard to find, for example in a study of working prostitutes or current heroin users.
How to Develop a Good Research Question:
· Researchers should begin by identifying a broader subject of interest that lends itself to investigation. For example, a researcher may be interested in childhood obesity.
· The next step is to do preliminary research on the general topic to find out what research has already been done and what literature already exists. How much research has been done on childhood obesity? What types of studies? Is there a unique area that has yet to be investigated, or is there a particular question that may be worth replicating?
· Then begin to narrow the topic by asking open-ended "how" and "why" questions. For example, a researcher may want to consider the factors that are contributing to childhood obesity or the success rate of intervention programs. Create a list of potential questions for consideration and choose one that interests you and provides an opportunity for exploration.
· Finally, evaluate the question by using the following list of guidelines:
· Is the research question one that is of interest to the researcher and potentially to others? Is it a new issue or problem that needs to be solved, or is it attempting to shed light on a previously researched topic?
· Is the research question researchable? Consider the available time frame and the required resources. Is the methodology to conduct the research feasible?
· Is the research question measurable, and will the process produce data that can be supported or contradicted?
· Is the research question too broad or too narrow?
Types
of sampling design in Research Methodology
There are different types of sample designs based on
two factors viz., the representation basis and the element selection technique.
On the representation basis, the sample may be probability sampling or it may
be non-probability sampling. Probability sampling is based on the concept of
random selection, whereas non-probability sampling is ‘non-random’ sampling. On
element selection basis, the sample may be either unrestricted or restricted.
When each sample element is drawn individually from the population at large,
then the sample so drawn is known as ‘unrestricted sample’, whereas all other
forms of sampling are covered under the term ‘restricted sampling’. The
following chart exhibits the sample designs as explained above.
Thus, sample designs are basically of two types viz.,
non-probability sampling and probability sampling. We take up these two designs
separately.
CHART SHOWING BASIC
SAMPLING DESIGNS
Non-probability
sampling: Non-probability sampling is that sampling procedure which does not
afford any basis for estimating the probability that each item in the
population has of being included in the sample. Non-probability sampling is
also known by different names such as deliberate sampling, purposive sampling
and judgement sampling. In this type of sampling, items for the sample are
selected deliberately by the researcher; his choice concerning the items
remains supreme. In other words, under non-probability sampling the organizers
of the inquiry purposively choose the particular units of the universe for
constituting a sample on the basis that the small mass that they so select out
of a huge one will be typical or representative of the whole. For instance, if
economic conditions of people living in a state are to be studied, a few towns
and villages may be purposively selected for intensive study on the principle
that they can be representative of the entire state. Thus, the judgement of the
organizers of the study plays an important part in this sampling design.
Probability sampling: Probability sampling is
also known as ‘random sampling’ or ‘chance sampling’. Under this sampling
design, every item of the universe has an equal chance of inclusion in the
sample. It is, so to say, a lottery method in which individual units are picked
up from the whole group not deliberately but by some mechanical process. Here
it is blind chance alone that determines whether one item or the other is
selected. The results obtained from probability or random sampling can be
assured in terms of probability i.e., we can measure the errors of estimation
or the significance of results obtained from a random sample, and this fact
brings out the superiority of random sampling design over the deliberate
sampling design. Random sampling ensures the law of Statistical Regularity
which states that if on an average the sample chosen is a random one, the
sample will have the same composition and characteristics as the universe. This
is the reason why random sampling is considered as the best technique of selecting
a representative sample. In a deliberate (non-probability) sampling design, by contrast, the personal element has a great chance of entering into the selection of the sample. The investigator may select a sample which shall yield results favourable to his point of view, and if that happens, the entire inquiry may get vitiated. Thus, there is always the danger of bias entering into this type of sampling technique. But if the investigators are impartial, work without bias and have the necessary experience to make sound judgements, the results obtained from an analysis of a deliberately selected sample may be tolerably reliable. However, in such sampling there is no assurance that every element has some specifiable chance of being included. Sampling error in this type of sampling cannot be estimated, and the element of bias, great or small, is always there. As such, this sampling design is rarely adopted in large inquiries of importance. However, in small inquiries and research by individuals, this design may be adopted because of the relative advantage of time and money inherent in this method of sampling. Quota
sampling is also an example of non-probability sampling. Under quota
sampling the interviewers are simply given quotas to be filled from the
different strata, with some restrictions on how they are to be filled. In other
words, the actual selection of the items for the sample is left to the
interviewer’s discretion. This type of sampling is very convenient and is
relatively inexpensive. But the samples so selected certainly do not possess
the characteristic of random samples. Quota samples are essentially judgement
samples and inferences drawn on their basis are not amenable to statistical
treatment in a formal way.
What is Sample Design in Research Methodology?
A sample design is made up of two elements. Random
sampling from a finite population refers to that method of sample selection
which gives each possible sample combination an equal probability of being
picked up and each item in the entire population an equal chance of being included in the sample. This applies to sampling without replacement
i.e., once an item is selected for the sample, it cannot appear in the sample
again (Sampling with replacement is used less frequently in which procedure the
element selected for the sample is returned to the population before the next
element is selected. In such a situation the same element could appear twice in
the same sample before the second element is chosen). In brief, the
implications of random sampling (or simple random sampling) are:
· It gives each element in the population an equal probability of getting into the sample; and all choices are independent of one another.
· It gives each possible sample combination an equal probability of being chosen.
Keeping this in view, we can define a simple random sample (or simply a random sample) from a finite population as a sample which is chosen in such a way that each of the C(N, n) possible samples has the same probability, 1/C(N, n), of being selected. To make it clearer, we take a certain finite population consisting of six elements (say a, b, c, d, e, f), i.e., N = 6. Suppose that we want to take a sample of size n = 3 from it. Then there are C(6, 3) = 20 possible distinct samples of the required size, and they consist of the elements abc, abd, abe, abf, acd, ace, acf, ade, adf, aef, bcd, bce, bcf, bde, bdf, bef, cde, cdf, cef, and def. If we choose one of these samples in such a way that each has the probability 1/20 of being chosen, we will then call this a random sample.
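A brief sketch of this enumeration, using Python's itertools to list all C(6, 3) = 20 samples and draw one so that each has probability 1/20:

```python
import itertools
import random

population = ["a", "b", "c", "d", "e", "f"]            # N = 6
samples = list(itertools.combinations(population, 3))  # all C(6, 3) samples of size 3

print(len(samples))            # 20 possible distinct samples
print(random.choice(samples))  # each sample has probability 1/20 of being chosen
```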
8 Important Types of Probability
Sampling
There are eight important types of probability sampling used for conducting social research. The types are: 1. Simple Random Sampling 2.
Systematic Sampling 3. Stratified Random Sampling 4. Proportionate Stratified
Sampling 5. Disproportionate Stratified Sampling 6. Optimum Allocation Sample
7. Cluster sampling 8. Multi-Phase Sampling.
Type # 1. Simple Random Sampling:
Simple random sampling is in a
sense, the basic theme of all scientific sampling. It is the primary
probability sampling design. Indeed, all other methods of scientific sampling
are variations of the simple random sampling. An understanding of any of the
refined or complex variety of sampling procedure presupposes an understanding
of simple random sampling.
A simple random sample is selected
by a process that not only gives to each element in the population an equal
chance of being included in the sample but also makes the selection of every
possible combination of cases in the desired sample size, equally likely.
Suppose, for example, that one has a population of six children, viz., A, B, C,
D, E and F.
There will be the following
possible combinations of cases, each having two elements from this population,
viz., AB, AC, AD, AE, AF, BC, BD, BE, BF, CD, CE, CF, DE, DF, and EF, i.e., in all 15 combinations.
If we write each combination on
equal sized cards, put the cards in a basket, mix them thoroughly and let a
blindfolded person pick one, each of the cards will be afforded the same
chance of being selected/included in the sample.
The two cases (the pair) written
on the card picked up by the blind-folded person thus, will constitute the
desired simple random sample. If one wishes to select simple random samples of
three cases from the above population of six cases, the possible samples, each
of three cases, will be ABC, ABD, ABE, ABF, ACD, ACE, ACF, ADE, ADF, AEF, BCD, BCE, BCF, BDE, BDF, BEF, CDE, CDF, CEF, and DEF, i.e., 20 combinations in all.
Each of these combinations will
have an equal chance of selection in the sample. Using the same method, one can
select a simple random sample of four cases from this population.
In principle, one can use this
method for selecting random samples of any size from a population. But in
practice, it would become a very cumbersome and in certain cases an impossible
task to list out all possible combinations of the desired number of cases. The
very same result may be obtained by selecting individual elements, one by one,
using the above method (lottery) or by using a book of random numbers.
The book of tables comprising a list of random numbers is named after Tippett, who was the first to translate the concept of randomness into a book of random numbers.
This book is prepared by a very
complicated procedure in such a manner that the numbers do not show any
evidence of systematic order, that is, no one can estimate the number
following, on the basis of the preceding number and vice-versa. Let us discuss
the two methods of drawing a simple random sample.
Lottery Method:
This method involves the
following steps:
(a) Each member or item in the
‘population’ is assigned a unique number. That is, no two members have the same
number,
(b) Each number is noted on a
separate card or a chip. Each chip or card should be similar to all the others
with respect to weight, size and shape, etc.,
(c) The cards or chips are
placed in a bowl and mixed thoroughly,
(d) A blind-folded person is
asked to pick up any chip or card from the bowl.
Under these circumstances, the
probability of drawing any one card can be expected to be the same as the
probability of drawing any other card. Since each card represents a member of
the population, the probability of selecting each would be exactly the same.
If after selecting a card (chip)
it was replaced in the bowl and the contents again thoroughly mixed, each chip
would have an equal probability of being selected on the second, third, fourth, or nth
drawing. Such a procedure would ultimately yield a simple random sample.
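A minimal simulation of lottery steps (a) to (d), assuming a hypothetical population of 500 numbered members and a draw of 50:

```python
import random

# Step (a): each of the 500 members of a hypothetical population gets a unique number.
chips = list(range(1, 501))

# Steps (b)-(d): identical chips are mixed thoroughly and drawn blindly, without replacement.
random.shuffle(chips)
sample = chips[:50]

print(sorted(sample))
```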
Selecting Sample with the Help
of Random Numbers:
We have already said what random
numbers are. These numbers help to avoid any bias (unequal chances) to items
comprising a population, of being included in the sample in selecting the
sample.
These random numbers are so
prepared that they fulfill the mathematical criterion of complete randomness.
Any standard book on statistics contains a few pages of random numbers. These
numbers are generally listed in columns on consecutive pages.
The use of the tables of random
numbers involves the following steps:
(a) Each member of the
population is assigned a unique number. For example, one member may have the
number 77 and another 83, etc.
(b) The table of random numbers
is entered at some random point (with a blind mark on any page of the book of
tables) and the cases whose numbers come up as one moves from this point down
the column are included in the sample until the desired number of cases is
obtained.
Suppose our population consists
of five hundred elements and we wish to draw fifty cases as a sample. Suppose
we use the last three digits of each five-digit number (since the universe size is 500, i.e., a three-digit number).
We proceed down the column
starting with 42827; but since we have decided to use only three digits (say
the last three), we start with 827 (ignoring the first two digits). We now note
each number less than 501 (since the population is of 500).
The sample would be taken to
consist of the elements of the population bearing the numbers corresponding to
those chosen. We stop after we have selected 50 (the size decided by us)
elements. On the basis of the illustrative section of the random-number table, we would choose the 12 cases corresponding to the numbers 237, 225, 280, 184, 203, 190, 213, 027, 336, 281, 288, 251.
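A rough simulation of this table-of-random-numbers procedure (the random numbers are generated in code rather than read from a printed table; zero, numbers above 500, and duplicates are skipped):

```python
import random

population_size = 500
sample_size = 50
selected = []

# Each pass mimics reading the next five-digit number down a column of the table.
while len(selected) < sample_size:
    five_digit_number = random.randint(0, 99999)
    last_three = five_digit_number % 1000       # use only the last three digits
    if 1 <= last_three <= population_size and last_three not in selected:
        selected.append(last_three)

print(selected[:12])  # the first dozen case numbers selected
```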
Characteristics of Simple Random Sample:
We shall start by considering
one very important property of simple random samples: the larger the size of the sample, the more likely it is that its mean (average value) will be
close to the ‘population’ mean,
i.e., the true value. Let us illustrate this property by supposing a population
comprising six members (children).
Let the ages of these children
be respectively: A=2 years, B=3 years, C=4 years, D=6 years, E=9 years and F=12
years. Let us draw random samples of one, two, three, four and five members each from this population and see how, in each case, the sample means (averages) behave with reference to the true ‘population’ mean, i.e., (2 + 3 + 4 + 6 + 9 + 12)/6 = 36/6 = 6. The following table illustrates the behaviour of the sample means as associated
with the size of the sample.
Table showing the possible
samples of one, two, three, four and five elements (children) from the population of six children of ages 2, 3, 4, 6, 9 and 12 years respectively:
In the given table, all possible
random samples of various sizes (i.e., 1, 2, 3, 4 and 5) and their
corresponding means are shown. The true (population) mean is 6 years. This mean
can of course be calculated by adding up the mean-values of the total
combinations of the elements in the population for any given sample size.
In the table we see, for
example, that for the sample size of three elements there are 20 possible
combinations of elements, each combination having an equal chance of being
selected as a sample according to the principle of probability.
Adding up the mean-values of
these possible combinations shown in the table, we get the total score of 120.
The mean will be 120 ÷20 = 6, which is also, of course, the population mean.
This holds good for other columns too.
Let us now examine the table
carefully. We shall find that for samples of one element each (column A) there
is only one mean-value which does not deviate by more than 1 unit from the true
population mean of 6 years. That is, all others, viz., 2, 3, 4, 9 and 12,
deviate by more than one unit from the population mean, i.e., 6. As we increase
the size of the sample, e.g., in column B, where the sample size is 2, we find
a greater proportion of means (averages) that do not deviate from the
population mean by more than 1 unit.
The above table shows that for
the sample of two, there are 15 possible combinations and hence 15 possible
means. Out of these 15 means there are 5 means which do not deviate from the
population mean by more than 1 unit.
That is, there are 33% of the
sample means which are close to the population mean within +1 and -1 units. In
column C of the table, we see that there are 20 possible combinations of
elements for the sample-size of three elements, each.
From out of the 20 possible
sample-means, we find that 10, i.e., 50% do not deviate from the population
mean by more than 1 unit. For the sample size of four elements, there are 67%
of means which are within the range of +1 and -1 unit from the true
(population) mean.
Lastly, for the sample size of
five elements, there are much more, i.e., 83% of such means or estimates. The
lesson surfacing out of our observations is quite clear, viz., the larger the
sample, the more likely it is that its mean will be close to the population
mean.
This is the same thing as saying
that the dispersion of estimates (means) decreases as the sample size
increases. We can clearly see this in the above table. For the sample size of
one (column A), the range of means is the largest, i.e., between 2 and 12, a spread of 10. For the sample size of two, the range is between 2.5 and 10.5, a spread of 8. For the sample sizes of three, four and five, the range of variability of means is respectively 3 to 9 (spread 6), 3.8 to 7.8 (spread 4) and 4.8 to 6.8 (spread 2). It will also be seen from the table that the
more a sample mean differs from population-mean the less frequently it is
likely to occur.
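Since the table itself is not reproduced here, its key result can be checked computationally. The sketch below enumerates every possible sample of one to five children from the ages 2, 3, 4, 6, 9 and 12 and reports the share of sample means that fall within one year of the population mean of 6, matching the percentages quoted above (17%, 33%, 50%, 67% and 83%):

```python
from itertools import combinations
from statistics import mean

ages = [2, 3, 4, 6, 9, 12]
population_mean = mean(ages)  # 6 years

for n in range(1, 6):
    sample_means = [mean(sample) for sample in combinations(ages, n)]
    close = sum(1 for m in sample_means if abs(m - population_mean) <= 1)
    print(f"sample size {n}: {len(sample_means)} possible samples, "
          f"{close / len(sample_means):.0%} of means within 1 year of the population mean")
```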
We can represent this phenomenon
relating to simple random sampling clearly with the help of a series of curves
showing the relationship between variability of estimates and the size of
sample. Let us consider a big population of residents. One can imagine that
their ages will range between below 1 year (at the least) and above 80 years
(at the most).
The normal and reasonable
expectation would be that there are fewer cases as one approaches the extremes
and that the number of cases goes on increasing progressively and symmetrically
as we move away from these extremes.
The mean-age of the population
is, let us say, 40 years. Such a distribution of residents can be represented
by a curve known as the normal or bell-shaped curve (A in the diagram
following). Let us now suppose that we take from this population various random
samples of different sizes, e.g., 10,100 and 10,000. For any of the sample-size
we shall get a very large number of samples from the population.
Each of these samples will give
us a particular estimate of the population mean. Some of these means will be
over-estimates and some under-estimates of the population characteristic (mean
or average age). Some means will be very close to it, quite a few rather far.
If we plot such sample means for
a particular sample-size and join these points we shall in each case, get a
normal curve. Different normal curves will thus represent the values of
sample-means for samples of different sizes.
Distribution of Mean-Values
The above diagram approximates a
picture of how the sample-means would behave relative to the size of the
sample. The curve A represents the locations of ages of single individuals. The estimated means of samples of 10 individuals each form the curve B, which shows quite a wide dispersion from the true population mean (40 years). The means of samples of 100 individuals each form a normal curve C, which shows much less deviation from the population mean. Finally, the means of the samples of 10,000 form a curve D that very nearly approximates the vertical line corresponding to the population
mean. The deviation of the values representing curve D from the population mean
would be negligible, as is quite evident from the diagram.
It can also be discerned very
easily from the above figure that for samples of any given size, the most
likely sample-mean is the population-mean. The next most likely are the mean
values close to the population mean.
Thus, we may conclude that the
more a sample mean deviates from the population-mean, the less likely it is to
occur. And lastly, we also see what we have already said about the behaviour of
the samples, namely, the larger the sample the more likely it is that its mean
will be close to the population-mean.
It is this kind of behaviour on
the part of the simple random (probability) samples with respect to the mean as
well as to proportions and other types of statistics, that makes it possible
for us to estimate not only the population-characteristic (e.g., the mean) but
also the likelihood that the sample would differ from the true population value
by some given amount.
One typical feature of simple random sampling is that when the population is large compared to the
sample size (e.g., more than, say, ten times as large), the variabilities of
sampling distributions are influenced more by the absolute number of cases in
the sample than by the proportion of the population that the sample includes.
In other words, the magnitude of
the errors likely to arise consequent upon sampling, depends more upon the
absolute size of the sample rather than the proportion it bears with the
population, that is, on how big or how small a part it is of the population.
The larger the size of the
random sample, the greater the probability that it will give a reasonably good
estimate of the population-characteristic regardless of its proportion compared
to the population.
Thus, the estimation of a
popular vote at a national poll, within the limits of a tolerable margin of
error, would not require a substantially larger sample than the one that would
be required for an estimation of the popular vote in a particular province where the poll outcome is in doubt.
To elaborate the point, a sample
of 500 (100% sample) will give perfect accuracy if a community had only 500
residents. A sample of 500 will give slightly greater accuracy for a township
of 1000 residents than for a city of 10,000 residents. But beyond the point at
which the sample is a large portion of the ‘universe’ there is no appreciable
difference in accuracy with the increases in the size of the ‘universe.’
For any given level of accuracy,
identical sample sizes would give the same level of accuracy for communities of different populations, e.g., ranging from 10,000 to 10 million. The ratio of the sample size to the populations of these communities means nothing,
although this seems to be important if we proceed by intuition.
Type # 2. Systematic Sampling:
This type of sampling is for all
practical purposes, an approximation of simple random sampling. It requires
that the population can be uniquely identified by its order. For example, the
residents of a community may be listed and their names rearranged
alphabetically. Each of these names may be given a unique number. Such an index
is known as the ‘frame’ of the population in question.
Suppose this frame consists of
1,000 members each with a unique number, i.e., from 1 to 1,000. Let us say, we
want to select a sample of 100. We may start by selecting any number between 1
to 10 (both included). Suppose we make a random selection by entering the list
and get 7.
We then proceed to select
members, starting from 7, with a regular interval of 10. The selected sample would thus consist of elements bearing Nos. 7, 17, 27, 37, 47, … 977, 987, 997. These
elements together would constitute a systematic sample.
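A minimal sketch of this systematic selection, with the random start (1 to 10) drawn in code rather than fixed at 7:

```python
import random

frame = list(range(1, 1001))          # a frame of 1,000 uniquely numbered members
interval = 10                         # 1,000 members / desired sample of 100

start = random.randint(1, interval)   # random start between 1 and 10 (e.g. 7)
sample = frame[start - 1::interval]   # every 10th member thereafter

print(len(sample), sample[:5], "...", sample[-2:])
```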
It should be remembered that a
systematic sample may be deemed to be a probability sample only if the first
case (e.g., 7) has been selected randomly and then every tenth case from the frame was selected thereafter.
If the first case is not
selected randomly, the resulting sample will not be a probability sample since,
in the nature of the case, most of the cases which are not at a distance of ten
from the initially chosen number will have a Zero (0) probability of being
included in the sample.
It should be noted that in the
systematic sampling when the first case is drawn randomly, there is, in
advance, no limitation on the chances of any given case to be included in the
sample. But once the first case is selected, the chances of subsequent cases
are decisively affected or altered. In the above example, the cases other than
17, 27, 37, 47… etc., have no chance of being included in the sample.
This means that systematic
sampling plan does not afford all possible combinations of cases, the same
chance of being included in the sample.
Thus, the results may be quite
deceptive if the cases in the list are arranged in some cyclical order or if
the population is not thoroughly mixed with respect to the characteristics
under study (say, income or hours of study), i.e., in a way that each of the
ten members had an equal chance of getting chosen.
Type # 3. Stratified Random Sampling:
In the stratified random
sampling, the population is first divided into a number of strata. Such strata
may be based on a single criterion (e.g., educational level, yielding a number of strata corresponding to the different levels of educational attainment) or on a combination of two or more criteria (e.g., age and sex), yielding strata
such as males under 30 years and males over 30 years, females under 30 years
and females over 30 years.
In stratified random sampling, a
simple random sample is taken from each of the strata and such sub-samples are
brought together to form the total sample.
In general, stratification of
the universe for the purpose of sampling contributes to the efficiency of
sampling if it establishes classes, that is, if it can divide the population
into classes of members or elements that are internally comparatively
homogeneous and relative to one another, heterogeneous, with respect to the
characteristics being studied. Let us suppose that age and sex are two
potential bases of stratification.
Now, should we find that
stratification on the basis of sex (male / female) yields two strata which
differ markedly from each other in respect of scores on other pertinent
characteristics under study while on the other hand, age as a basis of
stratification does not yield strata which are substantially different from one
another in terms of the scores on the other significant characteristics, then
it will be advisable to stratify the population on the basis of sex rather than
age.
In other words, the criterion of
sex will be more effective basis of stratification in this case. It is quite
possible that the process of breaking the population down into strata that are
internally homogeneous and relatively heterogeneous in respect of certain relevant
characteristics is prohibitively costly.
In such a situation, the
researcher may choose to select a large simple random sample and make up for
the high cost by increasing (through a large-sized simple random sample) the
total size of the sample and avoiding hazards attendant upon stratification.
It should be clearly understood
that stratification has hardly anything to do with making the sample a replica
of the population.
In fact, the issues involved in
the decision whether stratification is to be effected are primarily related to
the anticipated homogeneity of the defined strata with respect to the
characteristics under study and the comparative costs of different methods of
achieving precision. Stratified random sampling like the simple random sampling,
involves representative sampling plans.
We now turn to discuss the major
forms of stratified sampling. The number of cases selected within each stratum
may be proportionate to the strength of the stratum or disproportionate
thereto.
The number of cases may be the
same from stratum to stratum or vary from one stratum to another depending upon
the sampling plan. We shall now consider very briefly these two forms, i.e.,
proportionate and the disproportionate stratified samples.
Type # 4. Proportionate Stratified Sampling:
In proportionate sampling, cases are drawn from each stratum in the same proportion as they occur in the universe. Suppose we know that 60% of the ‘population’ is male and 40% is female. Proportionate stratified sampling with reference to this ‘population’ would involve drawing a sample in such a manner that this same 60:40 division between the sexes is reflected in the sample.
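For the 60:40 illustration above, each stratum's sample size is simply the overall sample size multiplied by the stratum's share of the population. A small sketch, assuming a hypothetical population of 1,000 and a sample of 100:

```python
def proportionate_allocation(stratum_sizes, sample_size):
    """Allocate the total sample across strata in proportion to each stratum's size."""
    total = sum(stratum_sizes.values())
    return {label: round(sample_size * size / total)
            for label, size in stratum_sizes.items()}

# the 60:40 split mentioned above
print(proportionate_allocation({"male": 600, "female": 400}, 100))   # {'male': 60, 'female': 40}
```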
If the systematic sampling
procedure is employed in a study, the basis on which the list is made
determines whether or not the resulting sample is a proportionate stratified
sample. For example, if every 7th name is selected in a regular sequence from a
list of alphabetically arranged names, the resulting sample should contain
approximately 1/7th of the names beginning with each letter of the alphabet.
The resulting sample in this
case would be a proportionate stratified alphabetical sample. Of course, if the
alphabetical arrangement is completely unrelated and irrelevant to the problem
being studied, the sample might be considered a random sample with certain
limitations typical of the systematic samples discussed above.
Various reasons may be adduced for sampling the various strata in unequal proportions. Sometimes it is necessary to increase the proportion sampled from strata having a small number of cases in order to guarantee that these strata are represented in the sample at all.
For example, if one were planning a study of retail sales of clothing in a certain city at a given point in time, a simple random sample of retail cloth stores might not give an accurate estimate of the total volume of sales, since a small number of establishments accounting for a very large proportion of the total sales might happen to be excluded from the sample.
In this case, one would be wise to stratify the population of cloth stores so that the few stores with a very large volume of sales constitute the uppermost stratum. The researcher would do well to include all of them in the sample. That is, he may take a 100% sample from this stratum and a much smaller percentage of cases from the other strata, which represent a large number of shops with low or moderate volumes of turnover. Such disproportionate sampling alone is likely to give reliable estimates in respect of the population.
Another reason for taking a
larger proportion of cases from one stratum rather than from others is that the
researcher may want to subdivide cases within each stratum for further
analysis.
The sub-strata thus derived may not all contain enough cases to sample from in the same proportion as the other sub-strata, and hence would not afford enough cases to serve as an adequate basis for further analysis. This being the case, one may have to sample a higher proportion of cases from such a sub-stratum.
In general terms, the greatest precision and representativeness can be obtained if the samples drawn from the various strata adequately reflect their relative variabilities with respect to the characteristics under study, rather than merely their relative sizes in the ‘population.’ It is therefore advisable to sample more heavily in strata where the researcher has reason to believe that the variability of a given characteristic, e.g., attitudes or participation, will be greater.
Hence, in a study undertaken for
predicting the outcome of the national elections employing the method of
stratified sampling, with states as a basis of stratification, a heavier sample
should be taken from the areas or regions where the outcome is severely clouded
and greatly in doubt.
Type # 5. Disproportionate Stratified Sampling:
We have already suggested the
characteristics of the disproportionate sampling and also some of the major
advantage of this sampling procedure. It is clear that a stratified sample in
which the number of elements drawn from various strata is independent of the
sizes of these strata may be called a disproportionate stratified sample.
This same effect may well be
achieved alternatively by drawing from each stratum an equal number of cases,
regardless of how strongly or weakly the stratum is represented in the
population.
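A sketch of the equal-allocation variant just described, again with hypothetical strata of very unequal size, simply ignores the stratum sizes when drawing cases:

```python
import random

def equal_allocation_sample(strata, cases_per_stratum):
    """Draw the same number of cases from every stratum, regardless of stratum size."""
    return {label: random.sample(members, cases_per_stratum)
            for label, members in strata.items()}

# hypothetical strata: a few very large stores and many small ones
strata = {"large_stores": list(range(40)), "small_stores": list(range(960))}
sample = equal_allocation_sample(strata, cases_per_stratum=30)   # 30 cases from each stratum
```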
As a corollary of the way it is selected, one advantage of disproportionate stratified sampling is that all the strata are equally reliable from the point of view of the size of the sample drawn from each. An even more important advantage is economy. This type of sample is economical in that the investigators are spared the trouble of securing an unnecessarily large volume of information from the most prevalent groups in the population.
Such a sample may, however, also suffer from the combined disadvantages of samples with unequal numbers of cases, i.e., smallness and non-representativeness. Besides, a disproportionate sample requires deep knowledge of the pertinent characteristics of the various strata.
Type # 6. Optimum Allocation Sample:
In this sampling procedure, the size of the sample drawn from each stratum is proportionate both to the size of the stratum and to the spread (variability) of values within it. A precise use of this procedure involves certain statistical concepts that have not yet been introduced here.
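For readers who nonetheless want the idea in concrete terms: the allocation usually meant here (often called Neyman or optimum allocation) makes each stratum's sample size proportional to the product of the stratum's size and the standard deviation of the study variable within it. A minimal sketch, with purely hypothetical figures:

```python
def optimum_allocation(stratum_sizes, stratum_sds, sample_size):
    """Neyman-style allocation: n_h proportional to N_h * S_h (stratum size times spread)."""
    weights = {h: stratum_sizes[h] * stratum_sds[h] for h in stratum_sizes}
    total = sum(weights.values())
    return {h: round(sample_size * w / total) for h, w in weights.items()}

# two hypothetical strata of equal size but very different spread of the study variable
print(optimum_allocation({"A": 500, "B": 500}, {"A": 5.0, "B": 20.0}, 100))   # {'A': 20, 'B': 80}
```

The more variable stratum receives the larger share of the sample, which is exactly the intuition behind sampling more heavily where variability is greater.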
We now know something about stratified random sampling and its different forms. Let us now see how the variables or criteria for stratification should be chosen.
The following considerations ideally enter into the selection of
controls for stratification:
(a) The information germane to the formation of strata should be up-to-date, accurate, complete, applicable to the population and available to the researcher.
Many characteristics of the population cannot be used as controls because no satisfactory statistics about them are available. In a highly dynamic society characterized by great upheavals in the population, the researcher employing the strategy of stratification typically runs the risk of going quite wrong in his estimates of the sizes of the strata he effects in his sample.
(b) The researcher should have
reasons to believe that the factors or criteria used for stratification are
significant in the light of the problem under study.
(c) A stratum should not be used unless it is large enough that the sampler and field workers will have no great difficulty locating candidates for it.
(d) When selecting cases for
stratification, the researcher should try to choose those that are homogeneous
with respect to the characteristics that are significant for the problem under
study. As was said earlier, stratification is effective to the extent that the
elements within the stratum are like each other and at the same time different
relative to the elements in other strata.
Let us now consider the merits
and limitations of stratified random sampling in a general way:
(1) In employing the stratified
random sampling procedure, the researcher can remain assured that no essential
groups or categories will be excluded from the sample. Greater
representativeness of the sample is thus assured and the occasional mishaps
that occur in simple random sampling are thus avoided.
(2) In the case of more
homogeneous populations, greater precision can be achieved with fewer cases.
(3) Compared to the simple
random ones, stratified samples are more concentrated geographically, thereby
reducing the costs in terms of time, money and energy in interviewing
respondents.
(4) The samples that an
interviewer chooses may be more representative if his quota is allocated by the
impersonal procedure of stratification than if he is to use his own judgement
(as in quota sampling).
The main limitation of stratified random sampling is that, in order to secure the maximal benefits from it in the course of a study, the researcher needs to know a great deal about the problem of research and its relation to other factors. Such knowledge is not always forthcoming, and obtaining it can often take a long time.
It should be remembered that, from the viewpoint of the theory of probability sampling, it is essentially irrelevant whether stratification is introduced during the procedure of sampling or during the analysis of data, except in so far as the former makes it possible to control the size of the sample obtained from each stratum and thus to increase the efficiency of the sampling design.
In other words, the procedure of drawing a simple random sample and then dividing it into strata is equivalent in effect to drawing a stratified random sample using as the sampling frame within each stratum the population of that stratum which is included in the given simple random sample.
Type # 7. Cluster Sampling:
Typically, simple random
sampling and stratified random sampling entail enormous expenses when dealing
with large and spatially or geographically dispersed populations.
In these types of sampling, the elements chosen for the sample may be so widely dispersed that interviewing them entails heavy expenses, a greater proportion of non-productive time (spent in travelling), a greater likelihood of non-uniformity in interviewers’ questioning and recording, and, lastly, heavy expenditure on supervising the field staff.
There are also other practical considerations that favour cluster sampling. For example, it may be considered less objectionable, and hence permissible, to administer a questionnaire to three or four departments of a factory or office rather than to a sample drawn from all the departments on a simple or stratified random basis, since the latter procedure may be much more disruptive of factory routines.
It is for some of these reasons
that large-scale survey studies seldom make use of simple or stratified random
samples; instead, they make use of the method of cluster sampling.
In cluster sampling, the sampler first samples out from the population certain large groupings, i.e., “clusters.” These clusters may be city wards, households, or other geographical or social units. The sampling of clusters from the population is done by simple or stratified random sampling methods. From these selected clusters, the constituent elements are then sampled by procedures that ensure randomness.
Suppose, for example, that a
researcher wants to conduct a sample study on the problems of undergraduate
students of colleges in Maharashtra.
He may proceed as follows:
(a) First he prepares a list of
all the universities in the state and selects a sample of the universities on a
‘random’ basis.
(b) For each of the universities
of the state included in the sample, he makes a list of colleges under its
jurisdiction and takes a sample of colleges on a ‘random’ basis.
(c) For each of the colleges
that happen to get included in the sample, he makes a list of all undergraduate
students enrolled with it. From out of these students, he selects a sample of
the desired size on a ‘random’ basis (simple or stratified).
In this manner, the researcher gets a probability or random sample of elements that is more or less concentrated geographically. This way he is able to avoid heavy expenditure that would
otherwise have been incurred had he resorted to simple or stratified random
sampling, and yet he need not sacrifice the principles and benefits of
probability sampling.
Characteristically, this sampling procedure moves through a series of stages. Hence it is, in a sense, a ‘multi-stage’ sampling, and it is sometimes known by this name. The procedure moves progressively from the more inclusive to the less inclusive sampling units until the researcher finally arrives at those elements of the population that constitute his desired sample (see the sketch below).
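A compact sketch of the three-stage procedure just described; the frame of universities, colleges and students and the sample sizes at each stage are purely illustrative:

```python
import random

def multistage_sample(frame, n_universities, n_colleges, n_students):
    """Stage 1: sample universities; stage 2: colleges within them; stage 3: students."""
    chosen = []
    for university in random.sample(list(frame), n_universities):
        colleges = frame[university]
        for college in random.sample(list(colleges), n_colleges):
            chosen.extend(random.sample(colleges[college], n_students))
    return chosen

# hypothetical frame: {university: {college: [student ids]}}
frame = {f"U{u}": {f"U{u}-C{c}": [f"U{u}-C{c}-S{s}" for s in range(200)]
                   for c in range(10)}
         for u in range(8)}
students = multistage_sample(frame, n_universities=3, n_colleges=4, n_students=25)
```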
It should be noted that with
cluster sampling, it is no longer true that every combination of the desired
number of elements in the population is equally likely to be selected as the
sample of the population. Hence, the kind of effects that we saw in our
analysis of simple random samples, i.e., the population-value being the most
probable sample-value, cannot be seen here.
But such effects do materialize
in a more complicated way, though, of course, the sampling efficiency is
hampered to some extent. It has been found that on a per case basis, the
cluster sampling is much less efficient in getting information than comparably
effective stratified random sampling.
Relatively speaking, in the
cluster sampling, the margin of error is much greater. This handicap, however,
is more than balanced by associated economies, which permit the sampling of a
sufficiently large number of cases at a smaller total cost.
Depending on the specific
features of the sampling plan attendant upon the objects of survey, cluster
sampling may be more or less efficient than simple random sampling. The
economies associated with cluster sampling generally tilt the balance in favour
of employing cluster sampling in large-scale surveys, although compared to
simple random sampling, more cases are needed for the same level of accuracy.
Type # 8. Multi-Phase Sampling:
It is sometimes convenient to
confine certain questions about specific aspects of the study to a fraction of
the sample, while other information is being collected from the whole sample.
This procedure is known as ‘multi-phase sampling.’
The basic information recorded from the whole sample makes it possible to compare certain characteristics of the sub-sample with those of the whole sample.
One additional point that merits
mention is that multi-phase sampling facilitates stratification of the
sub-sample since the information collected from the first phase sample can
sometimes be gathered before the sub-sampling process takes place. It will be
remembered that panel studies involve multi-phase sampling.
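A minimal sketch of the idea, assuming a hypothetical respondent list and placeholder survey functions: basic questions go to the whole sample, while a sub-sample (which could also be stratified using the first-phase answers) receives the detailed questions:

```python
import random

def multiphase_sample(full_sample, basic_survey, detailed_survey, subsample_size):
    """Phase 1: basic questions for the whole sample; phase 2: detailed questions for a sub-sample."""
    phase1 = {person: basic_survey(person) for person in full_sample}
    subsample = random.sample(full_sample, subsample_size)   # could be stratified using phase-1 answers
    phase2 = {person: detailed_survey(person) for person in subsample}
    return phase1, phase2

# hypothetical respondent list and placeholder survey functions
respondents = [f"R{i}" for i in range(500)]
phase1, phase2 = multiphase_sample(respondents,
                                   lambda p: {"age_group": "?"},
                                   lambda p: {"detailed_attitudes": "?"},
                                   subsample_size=100)
```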
DATA COLLECTION
We have previously seen some major steps of research, such as how to select a topic, what method and approach to adopt, where to find reading materials, and, above all, how to manage time. They all prepared you for the upcoming and equally important stage: data collection. This summary is an attempt to bring together everything related to the data collection process. It will first highlight some access and ethical issues that one may encounter while collecting data and the ways to overcome them. Second, it will present the various sampling techniques. Third, it will go through the different methods and techniques that one could follow in collecting data, such as questionnaires, documents, interviews, etc. Then it will look at possible ways to keep the data recorded. Last but not least, it will offer some tips and advice for avoiding psychological pitfalls while pursuing data collection.
SAMPLING AND SELECTION:
While these terms are usually associated with the ‘survey approach’, some form of sampling and selection exists in any research project. In general, since it is impossible to observe all of the subjects of one’s interest at once, it is important to sample part of the ‘population’ one is focusing on and to select it carefully. The chapter details the different sampling strategies a researcher can pick from. These strategies fall into two large categories: probability and non-probability sampling. If the former is selected, every member of the research population has an equal chance of being selected. The choice of sample is based on the scale of the study (a small-scale study would not allow you to choose from the whole population, so you are forced to use a cluster), on knowledge of the population (probability sampling is used if you don’t know enough about it), and on the topic you are working on. A sensitive issue such as emotional trauma due to sexual abuse, for instance, may lead one to select one’s subjects more carefully.
APPLYING TECHNIQUES FOR COLLECTING DATA:
The data collection procedure follows a certain method in order to sustain consistency in your dissertation. Studies in anthropology, geography, or sociology often require fieldwork, that is to say, using techniques such as observation and questionnaires. It is true that, as first-time researchers, postgraduate students may find it awkward to go down the street asking people they do not know about topics that sometimes sound complicated to them. However, students should overcome such a feeling, because fieldwork in fact requires relative rigour and procedure so that research can be carried out in an optimal way. On the other hand, some disciplines demand different methods. Research in psychology or politics, for instance, may be better served by already-existing data such as documents. Deskwork, i.e. collecting data from libraries, databases, or institutions, is indeed better suited in this case than fieldwork. Still, depending on the chosen approach and methodology, both fieldwork and deskwork can fit either of the aforementioned disciplines, and it is up to the researcher to decide accordingly.
DOCUMENTS
Since most research in the arts and social sciences is based on data collected from documents, it is necessary for the researcher to master analytic and critical reading skills so that he/she can comment on previous research and bring forward his/her own viewpoint on the matter. There are several types of documents one can make use of when carrying out research, among which are library-based documents, computer-based documents, historical archives, etc. As for the sources of the documents, they can be government surveys and government legislation; historical records; media documents such as newspapers, magazine articles, TV and radio programmes; or sometimes personal documents such as diaries and photographs.
Because primary sources are difficult to access and costly, many researchers nowadays opt for secondary sources, that is to say, data that has already been collected and analysed by other people. These types of documents may be cost-efficient and time-saving; however, any rigorous researcher has to be careful in using them. For instance, one must check the conditions of their production, the author’s position, the way they target their readership, and above all their purpose and ends. One also has to verify whether variables have changed over time in the case of quantitative research, and whether the methods are up to date; if not, one has to check whether they are still reliable for the current research.
To insert a piece of information taken from a document into a dissertation, it is a good idea to start with the name of the author, then to put the date of publication in parentheses, and then to proceed with the idea being reported. Usually, ideas are introduced with a reporting verb such as “analysed”, “examined” or “interviewed”; after that, a brief explanation of the methodology of the study is given alongside the aim of the research, as in this example:
Arber and Ginn (1995) used General Household Survey data to explore the relationship between informal care and paid work. They found that it is the norm to be in paid work and also be providing informal care. (From How to Research by Loraine Blaxter, Christina Hughes and Malcolm Tight)
OBSERVATION
The researcher planning an observation has several choices to make: should he/she combine participation and observation? Does he/she need to do a pilot observation? Will he be openly observing or ‘hiding in a corner’, so to speak? Will his presence and his appearance influence the session? And so on. Observation is time-consuming, and in the hope of
saving time, one can pre-structure the observation session but at the risk of
losing important details and flexibility. If the observation technique is
focused on observing the participants’ reaction to stimuli and analyzing it,
the researcher has moved towards the experimental approach. If on the other
hand the researcher actively participates in the process then it looks more
like action research. The latter is a process where a ‘community of practice’
comes together to conduct experiments and exercises as a group in order to find
solutions to a problem or improve the way certain things are handled. This is
mostly used in companies and schools.
QUESTIONNAIRES
Questionnaires are the most widespread method of collecting people’s opinions; at the same time, they are one of the most complicated techniques to design, for many reasons.
Questionnaires can be administered in many ways: by post, via e-mail, face-to-face, or by telephone. Nevertheless, each of these methods has its shortcomings. For instance, posted and e-mailed questionnaires might not receive replies, or the answers provided might be poor because of the lack of interaction between the questionnaire giver and taker. Face-to-face or telephone questionnaires, on the other hand, are time-consuming and sometimes costly. It is thus up to the researcher to decide on the method according to his/her means and capabilities. Either way, while distributing questionnaires, it is crucial that one always introduces him/herself, presents the goal of the questionnaire, provides contact details and is ready to answer any queries about it. Remind your questionnaire takers that their answers will stay anonymous.
Remember to thank them after they finish answering. As mentioned above, questionnaires are complex to design, for there are various techniques for asking the questions. Basically, there are seven question types: quantity or information, category, list or multiple choice, scale, ranking, complex grid or table, and open-ended. Still, in order to obtain good results, one should follow some tips while writing up the questionnaire. Ambiguous, hypothetical or imprecise questions, or those that appeal to emotions or to memories, would give inaccurate, unreliable answers and should therefore be avoided; simpler, shorter questions are recommended instead. For the sake of efficiency, open-ended questions should be limited in number, for they are time-consuming and require more effort to collect, analyse and report. And finally, have your questionnaires translated into several languages if necessary, so as to increase the chances of a higher response rate.
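As a purely illustrative way of keeping the seven question types straight, one might sketch each as a small record; the wording and options below are hypothetical, not taken from any particular questionnaire:

```python
# purely illustrative examples of the seven question types listed above
questions = [
    {"type": "quantity/information", "text": "How many books did you read last month?"},
    {"type": "category", "text": "Do you rent or own your home?", "options": ["rent", "own"]},
    {"type": "list/multiple choice", "text": "Which of these services have you used?",
     "options": ["post", "e-mail", "telephone"]},
    {"type": "scale", "text": "Rate your satisfaction from 1 (low) to 5 (high).",
     "options": list(range(1, 6))},
    {"type": "ranking", "text": "Rank these factors from most to least important.",
     "options": ["price", "quality", "speed"]},
    {"type": "complex grid/table", "text": "For each product below, tick every attribute that applies."},
    {"type": "open-ended", "text": "Is there anything else you would like to add?"},
]
```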
RECORDING YOUR PROGRESS:
Note-taking is a vital step when collecting data, for it is the sine qua non of keeping track of your data. Mastering this skill will enable any researcher to save a considerable amount of time and avoid getting lost amid the clutter of books and articles.
There are various techniques for keeping notes. First of all, there are research diaries; they are widely used to capture any thought that may come to mind. Second, box files are a good way to sort materials into different categories according to the subjects or chapters they belong to. Third, colours are said to enhance memory; it is therefore recommended to choose a different paper colour for each section of your papers so as to spot them more easily. Last but not least, technology can replace all of the aforementioned, so good use of computers can help you manage your data really well. In any case, however, backup copies and up-to-date printed materials should be generated regularly to avoid any accidental loss of the original materials.
THE UPS AND DOWNS OF DATA COLLECTION:
We may call this the ups and
downs of research because it mentions issues that may occur in our case as
well: “There may be days when you really enjoy yourself, when you discover
something interesting… There will also be days when you can barely force yourself
to do the necessary work.” The two most common ‘downs’ in research according to
the authors are loneliness and obsession. As for the first, it seems to be both
inevitable and beneficial in research; it occurs in any process that you have
to carry out alone and from which you have to draw your own conclusions, and in
this case it reveals a lot about who you are and what you are capable of. A
peculiar case is mentioned, that of the researcher carrying out fieldwork; they
are both an insider, someone who is part of the community they are researching,
and an outsider because of their role as observer and ‘judge’ in a sense. This
can be exacerbated by not having a supportive manager, supervisor or
colleagues, especially if you are conducting the research in your workplace. It
is recommended that the researcher seeks out a strong ‘support network’ from
the beginning, and that he /she dedicates some of his/her time to other
activities in order to keep in touch with people. Obsessiveness seems to go
hand in hand with loneliness, as diving into research can obviously force a
person to isolate him/herself to focus solely on the task at hand. It can have
the dual effect of drawing the researcher away from even the people who were
most supportive of them at the start, and also, most dangerously, of no longer
distinguishing between research and daily life. The expression ‘going native’
is employed to describe a phenomenon where the researcher (mostly
anthropologists) becomes “unable to separate their interest from those of the
research subjects.” This implies losing objectivity as well, so it can
jeopardize the research. In order to counteract this problem, the researcher is advised to plan the project rigorously, thus reducing the risk of the heavy workloads which lead to obsession. He/she can also ask a friend or family member to warn him/her if he/she gets too obsessive, and get in touch with
fellow researchers, thus creating a community of support. To enjoy data
collection (and research), it is advised that the researcher combines it with
activities that they enjoy, places that they love, as well as regulating their
research schedule to avoid overwork. Boredom is inevitable at some point so it
should not be cause for alarm. It is important to know when to stop collecting
data in order to find sufficient time for “the analysis and the writing up of
your research findings.” In our case, we may not know when to stop reading and
start piecing together the deductions we have drawn from our reading. One
should keep in mind that their ultimate goal is not to write ‘the ultimate
research paper’; this is both unrealistic and stressful as a target. Small-scale
research has as its purpose to produce a new idea about something that has
already been discussed, re-conduct an experiment using a different method or
another setting in order to test results, or look into a field that hasn’t yet
gained much attention in order to shed light on it. Collecting sufficient data
is the aim, rather than going on forever with reading. It is critical to start
the analyzing process.
CONCLUSION:
To sum it all up, we have gone through all the points mentioned above, that is to say what type of data is available, where to find it, how to select it, and how to design techniques for the sake of backing up research with concrete data and results. Some ethical and psychological pieces of advice have been provided, and we have, generally speaking, seen how to go about data collection effectively. Notwithstanding that, this summary is just an attempt to help students be better prepared for this step and is obviously far from exhaustive; further reading is of course highly encouraged in order to broaden one’s knowledge in this area.