Best Practices for Constructing Quantitative Rating Scales that Minimize Scale Use Bias

  1. Because a shorter scale tends to produce less scale use bias than a longer one, use no more scale points than the level of granularity required by the research demands. Whether the number of points is odd, and thus includes a middle point, should depend on whether the attitude or behavior being measured has a natural middle. The typical preference is a 5-point scale for performance rating scales and agree-disagree scales, and a 4-point scale for importance rating scales. When the number of same-scaled items in a battery exceeds approximately 20, it is often appropriate to reduce the number of scale points to minimize overall respondent burden and encourage thoughtful responding.

  2. For clarity of meaning that is consistent across respondents, it is best to label every point on the quantitative scale. This is the only way for respondents to be certain of what each scale point stands for and how it differs from the adjacent point(s). Moreover, when some points are left unlabeled (as in a 5-point scale with anchors only on the 1, 3, and 5), bias is systematically introduced because respondents tend to be slightly more prone to give responses that have associated labels (i.e., there would be slightly fewer 2 and 4 responses than there would be if the 2 and 4 had been anchored).

  3. Anchors should be clear and distinct in meaning and should create inter-scale-point “mentally interpreted distances” that are as equal as possible. For example, using both “somewhat” and “moderately” for two separate points on a scale would almost certainly confuse respondents and ultimately lead to scale use bias. An example of 5 points with good “equal” inter-scale-point distances is the ubiquitous 1-5 agree-disagree scale: 1 = Strongly Disagree, 2 = Somewhat Disagree, 3 = Neither Disagree nor Agree, 4 = Somewhat Agree, and 5 = Strongly Agree.

  4. The anchors for the two endpoints should typically be as extreme as possible. Relatedly, the statements used in agree-disagree scales should themselves be extreme. As an example, a 5 response on a 1-5 agree-disagree scale is much clearer and more distinct in meaning when the 5 means “extremely agree” with the statement “I love ice cream” than when the 5 means merely “agree” with the statement “I like ice cream.”
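
The labeling guidance in points 2 and 3 can be checked mechanically when scales are defined in a survey-programming script. The following is a minimal sketch, not a standard tool: the function names (unlabeled_points, has_middle_point) and the dictionary layout are assumptions made for illustration. It represents a scale as a mapping from point number to anchor text and reports which points lack a label, using the fully labeled agree-disagree scale and the partially anchored 1, 3, 5 case described above.

```python
from typing import Dict, List


def unlabeled_points(n_points: int, labels: Dict[int, str]) -> List[int]:
    """Return the scale points in 1..n_points that have no non-empty label.

    Hypothetical helper for illustration only.
    """
    return [p for p in range(1, n_points + 1) if not labels.get(p, "").strip()]


def has_middle_point(n_points: int) -> bool:
    """An odd number of scale points implies a single middle point."""
    return n_points % 2 == 1


# Fully labeled 1-5 agree-disagree scale from point 3.
AGREE_DISAGREE = {
    1: "Strongly Disagree",
    2: "Somewhat Disagree",
    3: "Neither Disagree nor Agree",
    4: "Somewhat Agree",
    5: "Strongly Agree",
}

# Partially anchored scale from point 2: labels only on the 1, 3, and 5.
PARTIALLY_ANCHORED = {
    1: "Strongly Disagree",
    3: "Neither Disagree nor Agree",
    5: "Strongly Agree",
}

if __name__ == "__main__":
    print(unlabeled_points(5, AGREE_DISAGREE))      # every point is labeled
    print(unlabeled_points(5, PARTIALLY_ANCHORED))  # the 2 and 4 lack labels
    print(has_middle_point(5), has_middle_point(4))
```

A check like this catches only missing labels; whether the anchors are equally spaced in respondents' minds still has to be judged by the researcher.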