The Response Variable

Introduction

To make the research objectives measurable, researchers must collect data that can actually answer the questions posed in those objectives. Without a good experimental design plan, it is not unusual for researchers to collect data and only afterwards discover that the data do not help answer those questions.

The measurements that are recorded during an experiment and used to help evaluate the objectives are called the response variable (or dependent variable). Although it is possible to analyze data with multiple response variables, we will focus only on methods that require one response variable. The response variable captures the outcome of interest in the research objectives and is used to help us determine how different factors influence that outcome.

Measurement Scales

When defining a response variable, it is important to understand measurement scales. The four possible measurement scales are nominal, ordinal, interval, and ratio. They are discussed below in order from least information provided (nominal) to most information provided (ratio).

Data that are represented by labelled categories are considered nominal. Nominal categories have no quantitative value and no particular order. Gender and color are examples of nominal measurements. A dichotomous variable (one with exactly two categories) is a common type of nominal measurement. Analysis of variance, the common analytical technique used throughout this textbook, is not appropriate when the response variable is nominal.

An ordinal variable is similar to a nominal variable, except that the categories fall in a meaningful order. A Likert scale is a commonly used ordinal measurement. A simple example of a Likert-type ordinal scale would be “bad”, “good”, and “great”. There is an obvious order; however, the distances between the categories are not quantitative. This means that the distance between “bad” and “good” can’t be quantitatively determined, and it is not known whether that distance is similar to or different from the distance between “good” and “great”. Analysis of variance is generally not appropriate when the response variable is ordinal, especially when the researcher is not comfortable calculating a mean across the ordinal values.

Interval measurements are quantitative with a meaningful notion of distance between values; however, the zero point is arbitrary and does not indicate an absence of the quantity being measured. An example of an interval variable is temperature measured in degrees Fahrenheit. Time is also an interval variable when there is no zero point defined. Analysis of variance is generally appropriate when the response variable is interval.

Ratio measurements are also quantitative with a meaningful notion of distance between values, but here zero is a true zero: it indicates an absence of the quantity being measured. Most quantitative measurements are on a ratio scale. Examples of ratio variables include distance, length, weight, and height. Analysis of variance is generally appropriate when the response variable is ratio.
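One practical way to see the difference between interval and ratio scales is to check whether a ratio of two values survives a change of units. The sketch below is only an illustration: the temperatures and weights are made-up values, and the helper names are chosen here for the example.

```python
def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def pounds_to_kilograms(weight_lb: float) -> float:
    """Convert a weight from pounds to kilograms."""
    return weight_lb * 0.45359237


# Interval scale (temperature): made-up values of 80 and 40 degrees Fahrenheit.
ratio_f = 80.0 / 40.0
ratio_c = fahrenheit_to_celsius(80.0) / fahrenheit_to_celsius(40.0)

# Ratio scale (weight): made-up values of 80 and 40 pounds.
weight_ratio_lb = 80.0 / 40.0
weight_ratio_kg = pounds_to_kilograms(80.0) / pounds_to_kilograms(40.0)

print(f"Temperature ratio in Fahrenheit: {ratio_f:.2f}")
print(f"Temperature ratio in Celsius:    {ratio_c:.2f}")
print(f"Weight ratio in pounds:          {weight_ratio_lb:.2f}")
print(f"Weight ratio in kilograms:       {weight_ratio_kg:.2f}")
```

The temperature ratio changes from 2 to about 6 when the units change, so "twice as hot" is not a meaningful statement on an interval scale. The weight ratio stays at 2 in both units because weight has a true zero.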

Note

All response variables used in this textbook will be quantitative, either interval or ratio. This is because we will learn Analysis of Variance (ANOVA) throughout the semester, and ANOVA requires a quantitative response variable.

The example toothbrush study, explained previously, sought to determine whether plaque build-up differed among four toothbrush types. Plaque build-up is a ratio scale measurement and would be the response variable for this study.

An example of a nominal scale to measure plaque build-up would be recording a “yes” if plaque was present, or “no” if plaque wasn’t present. This is a simple response variable, but not very informative. An improvement on this would be to classify each amount of plaque build-up into one of five groups: “no plaque”, “very little plaque”, “moderate plaque”, “heavy plaque”, and “complete plaque”. This would be an ordinal scale, and while it carries more information than the nominal scale, it is still only vaguely informative and is not appropriate for analysis of variance.

The best measurement would be quantitative, either interval or ratio. By using a red dye that indicates the presence of plaque, an oral camera, and software, the percentage of the tooth area covered in plaque could be determined. This response variable would be on a ratio scale and would provide much better information than the nominal or ordinal scales. When possible, it is best to have a response variable that is quantitative (interval or ratio).

Validity

A response variable needs to have a high degree of validity. Validity is the degree to which a study or a measure actually represents what it claims to represent. There are various types of validity (internal, external, construct, etc.) and various methods for checking validity, but we will talk about validity in general terms. How your response variable is defined can greatly affect its validity.

For example, if the response variable for an experiment were a person’s arithmetic skills, we would need to decide carefully how those skills could be measured. If the questions used to assess the skills focused too heavily on addition and ignored subtraction, multiplication, and division, the measure may not be considered valid.

As another example, consider how the validity of a measure (e.g., a survey question) may change when it is translated or applied to another culture. In some countries, socioeconomic status may be measured by the number of lightbulbs in the home, but in more developed countries that may not be a good measure. In some cultures, frequency of reading the Holy Bible could be a measure of religiosity, but in other cultures that would have poor validity as a measure of religiosity.

The above examples deal with a latent (not directly observable) response such as arithmetic skills or religiosity. But validity should be a key consideration in all cases, even with observable responses. Converting your response variable into a measure of change, a rate, or a percentage may drastically change its validity.

For example, suppose an experiment is conducted to study the effectiveness of different therapies at reducing pain. Rather than measure a patient’s pain only after receiving the therapy, researchers can improve the validity by measuring pain before and after therapy. They can then define the response as the difference in pain rating. If pain after therapy is the response, it may actually reflect a patient’s sensitivity to pain rather than the effect of the therapy. By defining the response measure as the difference in pain ratings, we can more accurately assess the therapy’s effectiveness.
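The contrast between the two response definitions can be sketched with a few numbers. The pain ratings below are hypothetical (a 0 to 10 scale is assumed), and the therapy labels and function names are invented for this illustration.

```python
from statistics import mean

# Hypothetical 0-10 pain ratings for three patients per therapy
# (illustrative values only, not data from any study).
pain_before = {"A": [8, 9, 7], "B": [4, 5, 3]}
pain_after = {"A": [5, 6, 4], "B": [3, 4, 2]}


def mean_pain_after(therapy: str) -> float:
    """Response defined as the pain rating after therapy."""
    return mean(pain_after[therapy])


def mean_pain_reduction(therapy: str) -> float:
    """Response defined as the before-minus-after difference for each patient."""
    reductions = [
        before - after
        for before, after in zip(pain_before[therapy], pain_after[therapy])
    ]
    return mean(reductions)


for therapy in ("A", "B"):
    print(
        f"Therapy {therapy}: mean pain after = {mean_pain_after(therapy)}, "
        f"mean reduction = {mean_pain_reduction(therapy)}"
    )
```

With these made-up numbers, therapy B looks better when only pain after therapy is considered, because its patients started with less pain. Therapy A shows the larger reduction once each patient's starting point is taken into account.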

You can take it one step further and convert the change into a percent change. A percent change is a great way to account for differences among the items or people in your study. In our toothbrush study, the effectiveness of a toothbrush is measured in terms of the area of the teeth with plaque. Since total tooth surface area varies from person to person, researchers decided to use the more valid response measure of percent of surface area with plaque.
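A small sketch shows why the percentage can be a fairer response than the raw plaque area. The areas below are hypothetical values in square millimeters, and `percent_covered` is a helper name introduced only for this example.

```python
def percent_covered(plaque_area: float, total_area: float) -> float:
    """Percent of the total tooth surface area that is covered in plaque."""
    if total_area <= 0:
        raise ValueError("total_area must be positive")
    return 100.0 * plaque_area / total_area


# Hypothetical participants: (plaque area, total tooth surface area) in mm^2.
participants = {
    "person_1": (200.0, 1000.0),
    "person_2": (240.0, 1500.0),
}

for name, (plaque_area, total_area) in participants.items():
    pct = percent_covered(plaque_area, total_area)
    print(f"{name}: plaque area = {plaque_area} mm^2, percent covered = {pct:.1f}%")
```

In this made-up case person_2 has more plaque in absolute terms but a smaller share of tooth surface covered, so the two response definitions rank the participants differently.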

As a final example, think back on the COVID pandemic. When comparing the severity of a COVID outbreak across different states, looking at the total number of COVID deaths can be a helpful measure. However, since states like California, New York and Texas have larger populations to begin with, a more valid measure might be to look at the deaths per 100,000 residents, a rate.
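The rate calculation itself is simple, as the following sketch shows. The two states and their counts are entirely hypothetical and are not actual COVID figures; `deaths_per_100k` is a name chosen for this illustration.

```python
def deaths_per_100k(deaths: int, population: int) -> float:
    """Deaths per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths * 100_000 / population


# Hypothetical states (made-up counts, not real data): (deaths, population).
states = {
    "Large State": (20_000, 40_000_000),
    "Small State": (1_500, 1_000_000),
}

for name, (deaths, population) in states.items():
    rate = deaths_per_100k(deaths, population)
    print(f"{name}: {deaths} deaths, {rate:.1f} deaths per 100,000 residents")
```

Here the larger state has far more total deaths, yet the smaller state has the higher rate, which is the comparison that accounts for population size.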

When working with rates and percentages, there are a couple of pitfalls to be aware of. First, if the quantity you are calculating a percentage from is small, the percentage can be highly variable. For example, when comparing the effectiveness of a public health policy like mask wearing at reducing deaths due to COVID, we could consider percent change as our response variable. However, if the locale had only a couple of deaths to start with, then any change from that number will look extreme in percentage terms.
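The small-base problem is easy to demonstrate. In the sketch below the death counts are hypothetical, and `percent_change` is a helper defined only for this example.

```python
def percent_change(before: float, after: float) -> float:
    """Percent change from a starting value to an ending value."""
    if before == 0:
        raise ValueError("percent change is undefined when the starting value is 0")
    return 100.0 * (after - before) / before


# Hypothetical locales (made-up counts): deaths before and after a policy.
locales = {
    "small_locale": (2, 4),
    "large_locale": (200, 202),
}

for name, (before, after) in locales.items():
    change = percent_change(before, after)
    print(f"{name}: {before} -> {after} deaths, percent change = {change:+.1f}%")
```

Both locales gain two deaths, but the percent change is 100% in the small locale and 1% in the large one. The sketch also raises an error for a starting value of zero, where a percent change cannot be computed at all.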

Simpson’s Paradox¹ is another phenomenon to be aware of when working with percentages. It occurs when the percentages (or means) calculated within each subgroup tell a different story than the percentages calculated from the combined data. In other words, the trend observed within each subgroup is different from, or even the reverse of, the trend observed when all the data are looked at together.
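A small numerical sketch can make the reversal concrete. The counts below are invented for illustration (two hypothetical therapies, with patients split into mild and severe cases) and the function names are assumptions of this example.

```python
# Hypothetical (successes, patients) counts, invented for illustration only.
counts = {
    "A": {"mild": (18, 20), "severe": (48, 80)},
    "B": {"mild": (64, 80), "severe": (10, 20)},
}


def success_rate(successes: int, total: int) -> float:
    """Success rate as a percentage."""
    return 100.0 * successes / total


def group_rate(therapy: str, group: str) -> float:
    """Success rate for one therapy within one subgroup."""
    successes, total = counts[therapy][group]
    return success_rate(successes, total)


def overall_rate(therapy: str) -> float:
    """Success rate for one therapy with the subgroups combined."""
    successes = sum(s for s, _ in counts[therapy].values())
    total = sum(n for _, n in counts[therapy].values())
    return success_rate(successes, total)


for group in ("mild", "severe"):
    print(f"{group}: A = {group_rate('A', group):.0f}%, B = {group_rate('B', group):.0f}%")
print(f"overall: A = {overall_rate('A'):.0f}%, B = {overall_rate('B'):.0f}%")
```

With these made-up counts, therapy A has the higher success rate among mild cases and among severe cases, yet therapy B has the higher rate overall. The reversal arises because therapy A was given mostly to severe cases, which have lower success rates under either therapy.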

Reliability

Reliability and validity are often discussed together. Reliability is a key characteristic of a response variable. High reliability means the measure will give the same value each time it is used under identical conditions. A measure that has a lot of random noise is not desirable. One source of noise in the response measure is variability in the experimental units. A response measure’s reliability can be improved by using homogeneous experimental units. However, uniformity in experimental units should not be pursued at the expense of a sample’s representativeness. In other words, the validity of a measure should not be sacrificed to increase its reliability.

Footnotes

¹ More reading about Simpson’s Paradox can be found here and here