A common practice in usability testing is to ask individual test participants to describe what they are doing as they complete the tasks in a test scenario. The “Think-Aloud” protocol, or method, asks test participants to maintain a running commentary about what they are doing and why. By analyzing verbal feedback from test participants, usability professionals gain valuable information about how users experience websites and other user interfaces.
In a recent DigitalGov University webinar, usability expert Erica Olmsted-Hawala (U.S. Census Bureau) discussed the theory behind the original think-aloud method, described a later variant of the method, and noted the various ways in which the think-aloud approach is applied by usability practitioners. Although usability practitioners tend to favor an active approach to probing the test participant, they typically do not report the style of their think-aloud protocols.
Erica went on to describe the conditions in an experiment conducted in 2010 on think-aloud methods (Olmsted-Hawala, Murphy, Hawala, & Ashenfelter, 2010):
- Traditional: This method required the test administrator to say as little as possible to keep the test participant talking. The only permissible verbal cue was “Keep talking.”
- Speech-communication: This method allowed the test administrator to acknowledge the test participant’s verbalizations with “Mm-hmm?” or “Uh-huh?” in addition to “Keep talking.”
- Coaching: The test administrator was trained to ask the participant for feedback and actively intervene with probes, such as “Did you notice that link up here?”, “You’re doing great,” or “Can you explain why you clicked on that link?”
- Silent control: The test participant completed tasks silently, and the test administrator gave no verbal feedback.
The question explored by the 2010 experiment was whether the style of the think-aloud protocol affects the results of usability testing. The results indicate that it definitely does. As Betty Murphy (Human Solutions, Inc.) pointed out, practitioners, as well as test sponsors, need to be aware of the potential effects of protocol style on participant accuracy and satisfaction. Inflated usability findings can mislead decision makers about the actual usability of a user-interface design.
Statistical analysis of the data on test-participant accuracy showed that participants in the coaching condition achieved significantly greater accuracy than those in the other conditions. Conducted in the Census Bureau’s Usability Laboratory, the experiment revealed that participants completed their tasks more accurately, and reported higher satisfaction with the website, when they were coached by a test administrator than they likely would in an unassisted field setting. There were no significant differences, however, among the traditional, speech-communication, and coaching conditions in the amount of time it took participants to complete their tasks.
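To make the idea of a significant between-condition difference concrete, here is a minimal sketch of one way such a comparison can be tested: Welch’s t statistic for two independent samples, computed with only the Python standard library. The accuracy scores below are invented for illustration and are not the experiment’s actual data, and the paper’s own analysis may have used a different test.

```python
import math
import statistics as st

# Invented accuracy scores (fraction of tasks completed correctly).
# These numbers are illustrative only, NOT the 2010 experiment's data.
coaching = [0.78, 0.82, 0.75, 0.80, 0.77]
traditional = [0.55, 0.60, 0.50, 0.65, 0.58]

def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    var_a, var_b = st.variance(a), st.variance(b)          # sample variances
    se = math.sqrt(var_a / len(a) + var_b / len(b))        # standard error of the mean difference
    return (st.mean(a) - st.mean(b)) / se

t = welch_t(coaching, traditional)
print(f"Welch t = {t:.2f}")
# A |t| well above ~2.3 (the 5% critical value for ~8 degrees of freedom)
# would suggest a real difference in mean accuracy between the conditions.
```

In practice the degrees of freedom for Welch’s test are estimated with the Welch–Satterthwaite formula and the p-value is read from a t distribution; the point here is only to show the shape of the comparison the experiment reports.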
Although usability practitioners may continue using active think-aloud methods, they and their sponsors are cautioned against assuming that test results will reflect the actual usability of the user interface. Actual usability is likely to be significantly poorer than such results would indicate. Olmsted-Hawala and Murphy recommend the following practices to help users of test results gauge the accuracy of reported usability findings:
- Use only traditional or speech-communication methods for evidence of how accurately users will complete tasks “in the field.”
- Accurately document the kind of think-aloud protocol used for each usability test.
The presenters urge the usability community to develop and follow standards for the use of think-aloud protocols.
By guest bloggers Erica Olmsted-Hawala, U.S. Census Bureau and Betty Murphy, Human Solutions, Inc.
Olmsted-Hawala, E., Murphy, E., Hawala, S., & Ashenfelter, K. (2010). “Think-Aloud Protocols: A Comparison of Three Think-Aloud Protocols for Use in Testing Data Dissemination Web Sites for Usability.” Proceedings of CHI 2010, ACM Conference on Human Factors in Computing Systems. ACM Press, pp. 2381–2390.