EVIDENCE-BASED SURVEY DESIGN: ADDING “MODERATELY” OR “SOMEWHAT” TO LIKERT SCALE OPTIONS AGREE AND DISAGREE TO GET INTERVAL-LIKE DATA
Likert scales, although ordinal, are often treated as interval scales during statistical analysis. There have been attempts to add a modifier (such as moderately, somewhat, or slightly) to the intermediate anchors of Likert scales (i.e., disagree and agree) to make them interval-like scales. For an interval-like 5-point Likert scale, a recent study suggests using strongly disagree, moderately disagree, neutral, moderately agree, and strongly agree (in either ascending or descending response order); for an interval-like 4-point Likert scale, it suggests strongly agree, somewhat agree, somewhat disagree, and strongly disagree (in descending order only). However, practitioners and researchers should be aware that the research evidence in this regard has been inconsistent.
INTRODUCTION
Imagine you have been asked to conduct an employee job satisfaction survey with hundreds of employees and to submit a report on various factors that affect employee performance. You are aware of the “survey fatigue” phenomenon and want to make the survey questionnaire short. Nonetheless, you are still expected to include at least a dozen survey items to measure different aspects of job satisfaction. So, you plan to use closed-ended questions rather than open-ended questions for obvious reasons: closed-ended questions help you collect data from a large number of people at once, and the collected quantitative data can be fairly quickly analyzed.
However, these are benefits of administering closed-ended survey questions and analyzing quantitative data. What practitioners may not be aware of is the amount of work and expertise involved in designing closed-ended questions in self-administered survey questionnaires (see a series of evidence-based survey design articles by Chyung et al., 2017, 2018a, 2018b, 2018c, 2020). Just as rulers are instruments for measuring length, survey questionnaires are instruments for measuring intended aspects of social phenomena, and they should be adequately designed to produce the intended type of data. In other words, practitioners should first determine what type of data they need for their report, and then design their survey questionnaires accordingly. Unfortunately, this “ends in mind” approach is often overlooked during survey design.
DESIGN SURVEY QUESTIONNAIRES WITH THE ENDS IN MIND
Put yourself back in the scenario in which you are asked to conduct an employee job satisfaction survey with hundreds of employees. How would you approach this task? Compare the following two approaches.
Approach A. Not having the ends in mind:
- Design survey questions with response scales you are familiar with or based on samples.
- Collect data.
- Figure out how to analyze the collected data.
- Write a report based on the data you obtained.
Approach B. Having the ends in mind:
- Find out what your client wants to learn from your survey, determine what type of data is needed, and decide how you plan to analyze the data. Even prepare a report template in which you can plug survey results in later.
- Design survey items with appropriate questions and response scales that will generate the intended type of data. Conduct a pilot study and apply the analysis method to the collected data. Adjust the survey questions and response scales if needed.
- Collect data.
- Write a report with analyzed data, as intended.
Certainly, approach B, having the ends in mind, is a better practice than approach A. With approach A, you may discover that the collected data cannot be analyzed in the way you intended, and you may fail to write the report your client asked for. For example, suppose you measured employees' work-related stress levels with the following survey item:
Q: Are you stressed at work?
__Yes __No
The yes or no response scale makes it simple to report the data; however, if you received many yes responses, the data do not support more detailed analyses, such as whether stress levels differ by type of work or job rank. In that case, it would have been better to use a different response scale that increases variability in the data, such as:
Q: How stressed are you at work?
__Not at all __A little bit __Somewhat __Quite a bit __A lot
USE APPROPRIATE RESPONSE SCALES WITH THE ENDS IN MIND
The example above illustrates the importance of having the ends in mind. Doing so requires a good understanding of the different types of data generated by different types of response scales. Response scales may generate dichotomous data (e.g., yes, no; or true, not true), nominal data (e.g., citizen, resident alien, nonresident alien; or female, male, nonbinary), or ordinal data (e.g., not at all, a little bit, somewhat, quite a bit, a lot; or never, seldom, sometimes, often, always). Response options in an ordinal scale are presented in a linear order, but the distances between two consecutive points are not assumed to be equal. Interval scales are similar to ordinal scales, but the distances between two consecutive points are considered equal (e.g., 1:00 pm, 2:00 pm, 3:00 pm, etc.). Numerical response scales such as not satisfied 1 – 2 – 3 – 4 – 5 fully satisfied or not likely 0 – 1 – 2 – 3 – 4 – 5 – 6 – 7 – 8 – 9 – 10 extremely likely are often treated in an interval-like manner. Examples of ratio data are years of education, number of employees supervised, and amount of sales, which you should collect with open-ended survey items.
The type of data determines which statistical analyses are appropriate. For example, if you have interval or ratio data, you use Pearson's r, but if you have ordinal data, you use Spearman's ρ. How about the Likert scale? Based on the original 5-point Likert scale (strongly approve, approve, undecided, disapprove, strongly disapprove) that Rensis Likert developed in the 1930s (Likert, 1932), a typical Likert scale format used these days is strongly agree, agree, neutral (or neither agree nor disagree), disagree, strongly disagree. Sometimes, a 4-point Likert scale is used without the midpoint. Does the survey item with the Likert scale below generate ordinal or interval data?
Q: I am stressed at work.
__Strongly agree __Agree __Neutral __Disagree __Strongly disagree (in descending order)
or
__Strongly disagree __Disagree __Neutral __Agree __Strongly agree (in ascending order)
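To make the distinction between the two correlation statistics concrete, here is a minimal Python sketch with made-up data (not from any study cited here): Pearson's r measures linear association for interval or ratio data, while Spearman's ρ is simply Pearson's r computed on ranks, which is why it suits ordinal data. In practice you would use a statistics package such as scipy.stats; this hand-rolled version only shows the mechanics.

```python
from statistics import mean

def pearson_r(x, y):
    """Pearson's r: linear association for interval/ratio data."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def ranks(values):
    """Convert raw values to 1-based ranks, averaging ranks for ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(values):
        j = i
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg_rank
        i = j + 1
    return out

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the ranks (ordinal data)."""
    return pearson_r(ranks(x), ranks(y))
```

For a monotonic but nonlinear relationship (e.g., y = x³), spearman_rho returns 1.0 while pearson_r is less than 1, which is why the rank-based statistic is the safer choice for ordinal responses.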
Strictly speaking, the 5-point Likert scale is an ordinal scale rather than an interval scale. It shows a certain order among the options (i.e., from disagreement to agreement); however, the equal-interval assumption is questionable: the difference between strongly disagree and disagree, the difference between disagree and neutral, the difference between neutral and agree, and the difference between agree and strongly agree are not necessarily equal. Nonetheless, practitioners and researchers often treat the 5-point Likert scale as an interval scale (as if it had equal intervals).
Here, consider another 5-point response scale such as never, frequently, often, usually, always. This scale is clearly an ordinal scale with unequal intervals (e.g., the difference between never and frequently and the difference between often and usually are conceptually and practically unequal). However, if we change the wording of several options in the scale to never, very infrequently, occasionally, most of the time, always, it may become close to an interval scale (Casper, 2013). Similarly, is there a way to change some of the response options in the 5-point Likert scale to make it close to an interval scale, since practitioners and researchers often treat the 5-point Likert scale as an interval scale?
GET INTERVAL-LIKE DATA FROM LIKERT SCALES
Such research was conducted by Worcester and Burns (1975). The researchers modified the second and fourth response options (i.e., intermediate anchors) in Likert-type scales to see if the modified versions would generate interval-like data. In their study, the researchers asked 1,932 adult participants to complete one of four versions of the survey that used different labels for the response options (see Table 1), and then asked them to indicate their interpretation of their chosen response by marking its location on a continuous line. The researchers found that participants indeed interpreted the intermediate anchors such as tend to agree, agree slightly, and agree differently, and when adding a modifier slightly to the intermediate anchors, agree and disagree (as shown in Scale B in Table 1), the data most closely resembled interval data, showing approximately equal intervals between anchors.
The study by Worcester and Burns (1975) was conducted almost five decades ago using paper survey questionnaires, and surprisingly, few studies have been conducted on this topic since then. An extensive library search using various databases such as Academic Search Premier, JSTOR, and Science Direct located two doctoral dissertations with relevant research: one by Casper (2013) and another by Spratto (2018). Both studies were conducted in online survey environments using Qualtrics. In Casper's (2013) study, the researcher concluded that potential equidistant anchors for a 5-point Likert scale are strongly disagree, disagree, neither agree nor disagree, moderately agree, and very much agree. Whereas Worcester and Burns (1975) and Casper (2013) used 5-point Likert scales in their studies, Spratto (2018) focused on 4-point Likert scales and tested whether the following 4-point Likert scale would have equally spaced response options: completely disagree, moderately disagree, moderately agree, and completely agree. However, the researcher found that the 4-point Likert scale did not produce data that were equally spaced as expected.
ACCOUNT FOR DEFAULT SETTINGS IN ONLINE SURVEY SYSTEMS
Due to the lack of research and inconsistency in research findings, it is difficult to develop an evidence-based recommendation as to exactly which modifier should be used for intermediate anchors if you hope to treat Likert scales as interval-like scales. Furthermore, because surveys are often administered online these days, it is a concern that practitioners and researchers may simply adopt the default settings that the online survey systems provide. Interestingly enough, different online survey systems use different default wording for the intermediate anchors in their Likert scale setting. For example, in Qualtrics, the 5-point Likert scale's default setting shows somewhat disagree and somewhat agree as the intermediate anchors, while SurveyMonkey uses agree and disagree as its default setting (see Table 2). Also, Qualtrics displays the Likert scale options in ascending order while SurveyMonkey uses descending order, although both systems allow users to easily reverse the order.
Because it is easy for users to simply adopt the default setting provided by the online survey systems, it would be prudent to investigate more in the online survey environment to learn whether and how the different modifiers added to the intermediate anchors in Likert scales would affect the type of data we collect. A master's thesis project at Boise State University carried out this investigation (Hutchinson, 2021).
DESIGNING RESEARCH WITH MODERATELY, SLIGHTLY, OR SOMEWHAT AS A MODIFIER
The study started with an assumption that the 5-point Likert scale (strongly agree, agree, neutral, disagree, strongly disagree) is an ordinal scale, and it aimed to answer the following two research questions.
- Does adding a modifier such as moderately, slightly, or somewhat to disagree and agree in the 5-point Likert scale influence people to perceive the scale to be closer to an interval scale (i.e., an interval-like scale) when administered in an online environment?
- Does the order of response options in the Likert scale (ascending versus descending) make a difference in people's perceptions, as tested in research question 1?
The population of this study was native English-speaking adults (18 or older) with a minimum of undergraduate college education. This study used a convenience sample of students, alumni, and faculty of Boise State University's Organizational Performance and Workplace Learning (OPWL) graduate program in the United States. The recruitment email was sent to a total of 327 people; 109 of them responded to a survey (a 33.3% return rate).
For the survey instrument, all items used a slider-style scale. The slider bar was initially placed at the center of the scale, and participants were asked to move it to the location that they believed best represented the response option in question (see Figure 1). Participants were first asked to respond to a set of four disagree-related questions, asking about the locations of disagree, moderately disagree, slightly disagree, and somewhat disagree, as shown below:



Citation: Performance Improvement 62, 1; 10.56811/PFI-22-0012
- Where do you think “Disagree” should be placed on the continuum? Move and place the slider to indicate it.
- Where do you think “Moderately Disagree” should be placed on the continuum? Move and place the slider to indicate it.
- Where do you think “Slightly Disagree” should be placed on the continuum? Move and place the slider to indicate it.
- Where do you think “Somewhat Disagree” should be placed on the continuum? Move and place the slider to indicate it.
Participants were then asked to respond to another set of four agree-related questions (asking about the locations of agree, moderately agree, slightly agree, and somewhat agree). The eight questions were presented in ascending order first (as shown in question 2 in Figure 1). Then, another eight questions were presented in descending order (as shown in question 10 in Figure 1). In total, participants responded to 16 questions, followed by several demographic questions.
After screening the data to remove invalid responses (e.g., participants using the incorrect side of the continuum, leaving the marker on neutral, or not being native English speakers), 70 responses were considered valid for analysis. The valid respondents were 73% female and 26% male (1% preferred not to report), and mostly in their 30s and 40s.
Participants' marked data were recorded as numbers with two decimal places (e.g., −1.24, +0.88). Research question 1 tested whether adding a modifier to disagree and agree helped participants perceive the intermediate anchors to be close to −1 and +1, respectively, on a continuum where strongly disagree is −2, neutral is 0, and strongly agree is +2. This research question was answered by analyzing 95% confidence intervals of the marked data against the interval values of −1 and +1, as shown in Figure 2. For example, a mean value of −1.19 with a 95% confidence interval between −1.29 and −1.09 would be considered significantly different (far) from −1 because the confidence interval does not include −1. On the other hand, a mean value of +0.88 with a 95% confidence interval between +0.68 and +1.08 would not be considered significantly different (far) from +1 because the interval includes +1.
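This decision rule can be sketched in a few lines of Python. The numbers below are hypothetical slider marks, not the study's data, and for simplicity the sketch uses the normal-approximation critical value 1.96 rather than the exact t critical value (with the study's n = 70 the two are nearly identical).

```python
from statistics import mean, stdev

def ci95(marks):
    """95% confidence interval for the mean of slider marks.

    Uses the normal approximation (z = 1.96); for n around 70 the
    exact t critical value (~1.99) gives nearly the same interval.
    """
    m = mean(marks)
    se = stdev(marks) / len(marks) ** 0.5
    return m - 1.96 * se, m + 1.96 * se

def significantly_far(marks, target):
    """An anchor is 'significantly far' from its target interval value
    (e.g., -1 for an intermediate disagree anchor) if the 95%
    confidence interval excludes that target."""
    lo, hi = ci95(marks)
    return not (lo <= target <= hi)

# Hypothetical marks for an intermediate disagree anchor, target -1:
near = [-0.90, -1.10, -1.00, -0.95, -1.05]  # CI straddles -1
far = [-1.30, -1.25, -1.35, -1.30, -1.28]   # CI excludes -1
```

Here significantly_far(near, -1) is False while significantly_far(far, -1) is True, mirroring the +0.88 versus −1.19 examples above.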



To answer research question 2, paired-sample t-tests were performed to see whether the paired data for each response option, when presented in ascending versus descending order, were significantly different from each other.
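As a sketch of this analysis (with hypothetical paired data, not the study's), a paired-sample t statistic is just a one-sample t test on the per-participant differences. Only the t statistic is computed here; the p-value would come from a t distribution with n − 1 degrees of freedom (a library routine such as scipy.stats.ttest_rel reports both).

```python
from statistics import mean, stdev

def paired_t(ascending, descending):
    """Paired-sample t statistic: the mean of the per-participant
    differences divided by the standard error of those differences."""
    diffs = [a - d for a, d in zip(ascending, descending)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / n ** 0.5)

# Hypothetical marks for one anchor from the same participants
# under ascending vs. descending response order:
asc = [0.95, 1.10, 0.88, 1.02, 0.97]
desc = [0.90, 1.12, 0.85, 1.00, 0.99]
```

A t statistic near zero (relative to the critical value for n − 1 degrees of freedom) would indicate that the response order made no significant difference for that anchor.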
Findings 1: Moderately Disagree and Moderately Agree for 5-Point Likert Scales
The data (see Figure 3, Figure 4, and Table 3) showed that moderately disagree and moderately agree were not significantly far from −1 and +1, respectively, regardless of the response order (ascending or descending). However, adding other modifiers (somewhat or slightly) to disagree and agree, or using no modifier, made the intermediate anchors significantly far from −1 and +1 and/or produced inconsistent results depending on the response order. In other words, the data obtained from this study suggest that moderately disagree and moderately agree are the most appropriate and reliable intermediate anchors for 5-point Likert scales, regardless of the response order, when hoping to use the Likert scale as an interval-like scale.






Findings 2: Somewhat Agree and Somewhat Disagree for 4-Point Likert Scales
An unexpected and interesting finding was that although somewhat was not an appropriate modifier for an interval-like 5-point Likert scale, it turned out to be an appropriate modifier for a descending-ordered 4-point Likert scale intended to be interval-like. As presented in Figure 3, Figure 4, and Figure 5, the distances between adjacent anchors among strongly agree, somewhat agree, somewhat disagree, and strongly disagree were approximately equal (mean distances = 1.37, 1.29, and 1.33). To confirm this observation, a single-factor analysis of variance was performed, and it revealed a nonsignificant difference [F(2, 207) = 0.86, p = 0.43] among the distances between adjacent anchors, suggesting that this descending-ordered 4-point Likert scale functions as an interval-like scale. On the other hand, the ascending-ordered 4-point Likert scale (strongly disagree, somewhat disagree, somewhat agree, strongly agree) revealed unequal distances [F(2, 207) = 4.72, p < .01]. However, the use of a small convenience sample limits the generalizability of the findings; thus, findings 1 and 2 should be applied to practice with caution.
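The equidistance check described above can be sketched as a one-way ANOVA on the adjacent-anchor distances. The numbers below are hypothetical distances, not the study's data; the function computes only the F statistic (between-group mean square over within-group mean square), and the p-value would come from an F distribution with (k − 1, n − k) degrees of freedom.

```python
from statistics import mean

def one_way_f(*groups):
    """F statistic for a one-way (single-factor) ANOVA."""
    values = [v for g in groups for v in g]
    grand = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical per-participant distances between the three pairs of
# adjacent anchors on a descending-ordered 4-point scale:
gap1 = [1.35, 1.40, 1.36]  # strongly agree -> somewhat agree
gap2 = [1.28, 1.31, 1.29]  # somewhat agree -> somewhat disagree
gap3 = [1.32, 1.34, 1.33]  # somewhat disagree -> strongly disagree
```

A small, nonsignificant F means the three distances are statistically indistinguishable, i.e., the anchors are approximately equidistant.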



IMPLICATIONS FOR PRACTITIONERS AND RESEARCHERS
As discussed above, there is insufficient evidence on which modifier, added to the intermediate anchors in Likert scales, would help Likert scales become interval-like scales. Almost five decades ago, Worcester and Burns (1975) suggested slightly as a modifier for both agree and disagree when using 5-point Likert-type scales. Moving to online survey environments, Casper (2013) suggested moderately as a modifier for agree but no modifier for disagree when using a Likert-type scale. Spratto (2018), focusing on 4-point Likert scales, was not able to conclude that using modifiers would make a difference. More recently, Hutchinson (2021) suggested using moderately agree and moderately disagree as intermediate anchors for 5-point Likert scales regardless of the response order, and using somewhat agree and somewhat disagree for descending-ordered 4-point Likert scales (see Table 4).
While there has not been consistent evidence on which modifier added to the intermediate anchors in 5-point and 4-point Likert scales would help them become interval-like scales, as of summer 2022, Qualtrics uses somewhat agree and somewhat disagree as its default intermediate anchors in both 5-point and 4-point Likert scales, and SurveyMonkey does not use a modifier. You as a practitioner or researcher should pay attention to these differences in default settings provided by different online survey systems while designing your closed-ended survey items with Likert scales. Until more research is conducted to generate reliable recommendations, you may need to make a decision based on your local experience and expertise as to whether to use a modifier or which modifier to use for the intermediate anchors in Likert scales. If you intend to use Likert scales as ordinal scales and use statistical analyses appropriate for ordinal data, it would be okay to add or not to add a modifier to the intermediate anchors. If you intend to treat Likert scales as interval-like scales by adding moderately or somewhat to the intermediate anchors, you should be aware of the limited evidence that supports your action.
To learn about other evidence-based survey design principles, see the series of articles published in Performance Improvement Journal regarding the use of a midpoint in the Likert scale (Chyung et al., 2017), the use of ascending and descending order of Likert-type response options (Chyung et al., 2018b), the use of negatively worded items in surveys (Chyung et al., 2018a), the use of continuous rating scales in surveys (Chyung et al., 2018c), and ceiling effects associated with response scales (Chyung et al., 2020).

Figure 1. Sample Questions Used in the Survey Instrument
Figure 2. An Illustration of 95% Confidence Intervals
Figure 3. Mean Scores of Survey Items Presented in Ascending Order
Figure 4. Mean Scores of Survey Items Presented in Descending Order
Figure 5. Somewhat Agree and Somewhat Disagree Used in Descending-Ordered 4-Point Likert Scale
Contributor Notes
DOUGLAS HUTCHINSON, M.S., is the Business Operations Manager for the Institute for Pervasive Cybersecurity at Boise State University. He holds a Master of Science degree in organizational performance and workplace learning as well as a Bachelor of Business Administration degree in human resources management and entrepreneurship management from Boise State University. He worked as a graduate assistant while pursuing his master's degree. Email: douglashutchinson@boisestate.edu
SEUNG YOUN (YONNIE) CHYUNG, ED.D., is a professor and Chair of the Department of Organizational Performance and Workplace Learning in the College of Engineering at Boise State University. She teaches graduate courses on program evaluation. In her research lab, she and her research assistants focus on generating evidence-based survey design principles based on research findings. Email: ychyung@boisestate.edu


