Star Reviews and the Danger of Misleading Metrics
by Robin Tinker, on Feb 19, 2016
Star reviews are everywhere on sites like Amazon, Google, OpenTable, and Yelp, and a growing number of people contribute them regularly. The simple act of clicking a star rating can be as absentminded as flipping through an ex's Facebook photos until you mistakenly like a picture from three years and 1,000 photos ago. In short, much like a Facebook like, star reviews, which are usually based on something similar to the Likert scale, can mislead marketers and decision makers who want real insight into the effectiveness of their content, communications, and products.
Intricacies Of The Likert Scale And Its Misleading Data
A true Likert scale presents the respondent with a series of statements. Respondents are then asked to indicate their level of agreement with each statement on a scale with an odd number of points, typically five or seven. While this might seem simple enough, the first pitfall of the Likert scale occurs in its construction.
When a Likert scale is poorly constructed, i.e. it does not ask the right types of questions, it quickly leads to respondent fatigue, also known as "straight-lining": the practice of unwittingly giving the same rating to every item just to finish the survey faster.
Another pitfall occurs when the questions are not properly randomized. Poorly ordered questions can introduce a response bias, in which a respondent bases subsequent answers on earlier ratings.
Finally, poorly constructed Likert scales tend to include yes-or-no questions. Yes-or-no questions have no place on a five- or seven-point scale, since the respondent clearly has only two possible answers rather than a range.
All of this said, five-point scales are the most common type of star review; they are used by Amazon, Netflix, and iTunes to rate products and services. Unfortunately, the metrics that a five-point or five-star scale provides are riddled with complexities that can easily lead to improper analysis.
What do 1-5 star reviews really tell you?
Amazon recently released a study that showed the pitfalls of a non-detailed star review. The study revealed that when respondents aren't required to provide any additional information with their chosen rating, the ratings are more likely to follow a bimodal distribution: they cluster around two different values, for example 1 and 5, rather than a normal distribution in which ratings cluster around a single central value such as 3. In the bimodal case, a central summary such as the mean or median is not only misleading but inaccurate, since it points to a middle value that almost no one actually chose.
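To make the problem concrete, here is a minimal sketch with invented rating counts: a polarized (bimodal) product and a genuinely middling (unimodal) one produce identical central summaries.

```python
from statistics import mean, median

# Hypothetical, invented counts: a polarized product where most
# reviewers award either 1 or 5 stars.
bimodal = [1] * 40 + [2] * 5 + [3] * 10 + [4] * 5 + [5] * 40

# Hypothetical unimodal ratings genuinely clustered around 3.
unimodal = [1] * 5 + [2] * 20 + [3] * 50 + [4] * 20 + [5] * 5

# Both datasets have mean == 3 and median == 3, yet only 10 of the
# 100 polarized reviewers actually chose 3 stars.
print(mean(bimodal), median(bimodal))
print(mean(unimodal), median(unimodal))
print(bimodal.count(3), unimodal.count(3))
```

Looking at the full distribution, rather than a single average, is what distinguishes the two cases.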
Regarding Likert scales, marketers are often mistakenly encouraged to calculate the mean, or average, of the responses. In theory this calculation makes sense; in practice it ignores the ordinal nature of the question-based scale.
Calculating the mean of Likert scale responses relies on the assumption that the psychological distance between "no opinion" and "agree" is the same as the distance between "agree" and "strongly agree." It also implies that the distance between "agree" and "strongly disagree" is three times greater than the distance between "agree" and "strongly agree." While the mathematical model needs these assumptions to function, the psychology does not support them. Can you really give an emotion a set numerical value?
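A short sketch shows what the equal-interval assumption does in practice. The 1-5 coding of the labels below is the conventional one, and it is exactly the assumption under scrutiny; the responses are invented for illustration.

```python
from statistics import mean, median

# Conventional numeric coding of a 5-point agreement scale. Nothing
# guarantees the psychological gap between adjacent labels is one
# equal "unit" -- this coding simply asserts it.
codes = {"strongly disagree": 1, "disagree": 2, "no opinion": 3,
         "agree": 4, "strongly agree": 5}

# Hypothetical responses to a single statement.
responses = (["strongly agree"] * 6 + ["agree"] * 3
             + ["strongly disagree"] * 4)

values = [codes[r] for r in responses]

# The mean lands between "no opinion" and "agree", a mild positivity
# that no individual respondent expressed. The median, which uses
# only the ordering of the labels, reports "agree" (4).
print(round(mean(values), 2))  # 3.54
print(median(values))          # 4
```

Because the data are ordinal, order-based summaries such as the median or mode are usually considered safer than the mean.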
In short, the mathematics are not merely complicated; they rest on assumptions that the psychological reasoning behind a Likert scale or star review does not support.
Randy Farmer of Yahoo discussed these limitations in the context of fan-based star ratings. In his words, "Only the fans of a show evaluate the episodes, and being fans, will never rate an episode one or two stars, ever. I've seen this attempted over and over on the net with the same results every time: Each episode of a show is 4-stars +/- .5 stars. This goes all the way back to the Babylon-5 website, probably the first source for this kind of data." To combat these limitations, a marketer needs first to think about what questions they are trying to answer.
What questions are you trying to answer?
Before analyzing data from a Likert scale or star reviews, first determine what questions you are trying to answer, then choose metrics accordingly. In short, before another star review is hastily assigned a meaning based on its "mean" result, check whether:
- ratings come with an accompanying explanation,
- the number of responses per star rating is available, and
- you know how many, and what type of, users or customers responded.
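Checks like these can be scripted against a review export. A minimal sketch; the records and field names (`stars`, `comment`, `customer`) are invented for illustration:

```python
from collections import Counter

# Hypothetical review records: star rating, optional free-text
# comment, and a customer type.
reviews = [
    {"stars": 5, "comment": "Fast shipping", "customer": "repeat"},
    {"stars": 1, "comment": "", "customer": "new"},
    {"stars": 5, "comment": "", "customer": "repeat"},
    {"stars": 1, "comment": "Broke in a week", "customer": "new"},
]

per_star = Counter(r["stars"] for r in reviews)       # responses per star
explained = sum(1 for r in reviews if r["comment"])   # ratings with a reason
by_customer = Counter(r["customer"] for r in reviews) # who responded

print(dict(per_star))     # {5: 2, 1: 2}
print(explained)          # 2
print(dict(by_customer))  # {'repeat': 2, 'new': 2}
```

Even this tiny sample shows a polarized split (all 1s and 5s) that a single average rating of 3.0 would hide.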
Also, consider that the initial marketing questions may be answered via other methods.
What type of data can better and more accurately answer marketers' questions?
To avoid misinterpreting the data associated with star reviews, marketers should rely on objective data: for example, the time a user or customer spent using or viewing a product or service. What was the purpose of the interaction? Did they achieve the desired result? Did they take the next step? Did a particular page or content piece lead to a closed sale?
Alternatively, take a page out of Mad Men and simply pick up the phone for a good, old-fashioned call with past, current, and potential customers. This "old school" tactic can yield valuable insight into the effectiveness of existing content, communications, and products, and the findings can inform new campaigns and content that better meet customers' needs.
The Bottom Line
While star reviews run rampant on the Internet, the most successful marketers know that their results must be analyzed carefully to avoid drawing incorrect conclusions. Whenever possible, gather objective data to gain true insight into the customer's mindset. Finally, if Likert scales and star reviews are to be used successfully, their limitations must be recognized so that the right types of questions can be asked.
Share your opinions, comment, or tweet us @modusengagement.