Assignment (12%) -- A Usability Study

How can we tell if a computer system is any good for a person to use? Most developers simply create the system, try it out themselves until they are satisfied with it, and then dump it onto the user audience. The result is usually a product that people have problems with.

One of the easiest methods for getting to "know the user" and for evaluating the human-computer interface is through usability studies. Although these come in many flavors, they all require an observer to watch a typical user try the system out for a real task. It is surprising how many design flaws can be detected this way!

Usability studies are increasingly popular in industry. Many modern software companies now have usability labs staffed by HCI professionals whose job it is to find usability problems in products as they are being developed. Most labs contain all the equipment permanently in place (e.g., computers), and are instrumented with audio, video, screen-capture software, one-way mirrors, and so on.

Usability studies are extremely practical, and you can do them 'on the cheap' without these special usability labs. The simplest studies just require you to pull up a chair next to a typical user, watch them do their work (perhaps having them explain what they are doing as they do it), jot down any noteworthy events that occur, and listen to the user's comments.

Your job. You work for Usability Inc., a consulting firm that specializes in evaluating interfaces. You and your team have been contracted to do a usability study of the system described in the attached handout. Your deliverable will be a report written for the Vice President in charge of that system's use and redevelopment. Your report will describe how you went about looking for design problems, what problems you saw, and what changes you recommend. Depending upon how convincing you are, the VP will use your report to authorize changes in the upcoming version.

You will do this assignment in teams of three, and will follow the major steps below.

1. Things to prepare ahead of time

Subject selection. Select one of your group to be an experimenter. In the initial part of the experiment, the other members of the group will act as subjects 1 & 2. Try to find others to act as subjects 3 & 4 (but subjects 1 & 2 can redo it if necessary).

Pre-test questionnaire. Create a short pre-test questionnaire (~10 questions) testing the subject's experiences and beliefs about the system. It is extremely important that you ask relevant questions that help you understand a subject's background and beliefs, as related to the task and system. Each subject should at least indicate their prior experience with computers, the windowing system, and the system being tested (why is this important?). They should also indicate their expectations. Example experience levels include:

never used it
used it once or twice over the last few years
used it ~3-7 times this year, but not regularly
use it regularly (how often?)

while beliefs may be:

will need personal instruction to get started
will learn it after a bit of playing around
will be able to do simple tasks with no problems
will be able to do complex tasks

Post-test questionnaire. Similarly, create a post-test questionnaire. Good questions will give you information about how participants judge the system's usability, where they think they had most problems, and so on. You may want to leave space after each question for comments, where you would encourage people to say why they answered a question a certain way. For example, here is such a question that uses a rating scale:

I found the system: easy to use 1 2 3 4 5 hard to use. Reason for your rating: ____________________
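Once several subjects have answered a rating-scale question like the one above, the responses can be tallied. The following is only a minimal sketch in Python: the function name `summarize_ratings` and the sample responses are hypothetical, not part of the assignment. With only a handful of subjects, the per-point counts and the written reasons are likely to be more informative than the mean.

```python
from collections import Counter
from statistics import mean, median


def summarize_ratings(ratings, low=1, high=5):
    """Summarize the responses to one rating-scale question."""
    for r in ratings:
        if not low <= r <= high:
            raise ValueError(f"rating {r} is outside the {low}-{high} scale")
    counts = Counter(ratings)
    return {
        "n": len(ratings),
        "mean": mean(ratings),
        "median": median(ratings),
        # Report every scale point, including those nobody chose.
        "counts": {point: counts.get(point, 0) for point in range(low, high + 1)},
    }


# Hypothetical responses from four subjects (1 = easy to use, 5 = hard to use).
summary = summarize_ratings([2, 4, 4, 5])
print(summary)
```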

Selecting a core set of typical tasks. Usability studies require an observer to watch someone go through the paces with 'typical' tasks. It is your job as experimenter to prepare a set of example tasks ahead of time that the subjects will try to perform. These tasks should be realistic ones that typical users would try to do with the system! But how do you discover what those typical tasks are?

The first way is to let subjects use their own real tasks. To do this, you would have to solicit subjects who have a real need, and ask them if you could watch them do their tasks. This is only an option for you if the system being studied is a popular one.

The second way is to ask a random sample of people who are using the system what they typically do with it, and then generalize those as tasks to give to subjects. Again, this is only an option if you can find real users!

The third way is for you to use the system, and contrive a few sample tasks through intuition. Although this will not produce a set of reliable tasks, you may not have any other choice. (By the way, jot down any problems with the system you see as you try it. You can compare these later with the problems you notice in the actual study.)

To get you started, I have enclosed a few example tasks (your TA will give them to you), but you must come up with your own as well.

2. The Usability Study

See the handout on "User Observations" for a basic description of the method.

Pre-test Questionnaire. Take the pre-test questionnaire you created previously and have subjects fill it out before they do the tasks.

Throughout the exercise: Active intervention and conceptual model formation. You can gain a sense of a person's initial conceptual model of the system by having them explain each screen as it appears, what each interface component does, and what they think they can do with it. This conceptual model will be formed from prior experiences and their interpretation of the visuals on the screen. You are looking for places where the model is incorrect or undeveloped. For example, people may not understand the meaning of labels and icons, what they are supposed to do, and how they are supposed to do it. Some of these problems are related (but not limited) to the lack of meaningful visual affordances, constraints, mapping, and so on.

All your subjects should begin with this step. Using this information as a baseline, you can see how a person's conceptual model develops (correctly and incorrectly) during system use merely by asking them to re-explain the display after major dialog steps (e.g., after reading documentation or after doing a transaction) or at the end of their session. Note that this means you are actively intervening in a person's session, for you are disrupting them in the middle of their task, and their act of explaining the screen to you may result in extra learning by them. Thus you should use this carefully, and at opportune moments.

Condition 1: The Silent Observer. In the first condition, the observer and Subject 1 are not allowed to speak to each other. The first subject should carry out one or two tasks on the system (remember, tasks were prepared ahead of time!), with the observer taking notes of the subject's behavior and where the system appears to break down (e.g., errors, problems, etc.).

Note: It is sometimes difficult for the observer to figure out what the subject is doing.

Condition 2: Think Aloud Method. This condition is similar to the above, except that Subject 2 is asked to say what they are doing as they are doing it, and they should elaborate on any problems they are having. For example, here is what a subject may say:

"I'm going to try to do this task ... OK, this is probably the menu item I should select. Hmmm ... It's not doing anything, what's wrong? Oh, I see, I have to double click it..."

As before, the observer must take notes of the subject's behavior and key comments (professional usability people often use a tape recorder or video setup as well). While the observer is allowed to encourage the subject to talk freely (e.g., "What is it you are doing now? Why did you do that?"), the observer should not interfere or help the subject in any way, no matter how tempting!

Note: Talking aloud is sometimes uncomfortable and unnatural for people to do. It may also interfere with the task the person is trying to accomplish.

Caveat: If subjects get stuck. While the experimenter should not help the subject with the task, there are a few exceptions to this rule.

If a subject has problems getting started, record the problems and give them a hint to get going. This is OK, because if they can't get started, they will not be able to do the tasks!

If a subject cannot complete a particular task after a reasonable amount of time, tell them to stop and start them on the next task. Or, give them a hint if they cannot overcome some conceptual problem necessary to trying out other parts of the system. Again, record all problems.

Getting stuck is discouraging for subjects. Try to give them an early success experience, and remind them that they can quit at any time for any reason if they wish.

Condition 3: Constructive Interaction. This condition involves Subjects 3 and 4 (or 1 and 2 if you can't find other people) working together on a new task, with the observer taking notes as before. The difference is that the natural communication between the two subjects will replace the unnatural talking aloud in condition 2. Also, the differences between the subjects' knowledge may lead to interesting questions, explorations, and answers between them. The best match of subjects is a semi-knowledgeable person matched with a fairly new user, with the latter being in charge of interacting with the system. Thus you hear the new user asking questions, and the knowledgeable one explaining how to do things (sometimes incorrectly!).

Note: If the now experienced Subjects 1 & 2 are performing this condition, make sure that they have tasks that differ considerably from their previous ones.

Condition 4: Questionnaire. Take the post-test questionnaire you created previously and have subjects fill it out after they do the tasks.

Condition 5: Interview. The observer should then interview the subjects about their beliefs on how they performed, where errors were made, where the system helped them, where the system was weak, etc. As before, the observer should be taking detailed notes. Use the things you saw in the previous conditions to guide your interview. You can also use the filled-in questionnaire as a discussion tool (e.g., "Why did you answer this way?").

At this point, you are encouraged to repeat the experiment with your friends, strangers, and so on. The more people you observe, the better! Note that you can also allow people to perform open-ended tasks where they set their own goals.

3. The write-up

Your write-up should be oriented towards a senior person in your company who will make the major decisions on the software changes. Your TA will describe the details and format of the write-up to you. Here is a template for you to follow.

Section 1: Scenario: Give a very brief reminder to the VP on what the system is, and then explain the role of your product evaluation team. Make sure you tell her the point of your work!

Section 2: Methodology: Explain what you did. Assume that the VP knows what the particular usability methods are (as described in this sheet) and their purpose. Include the number of subjects, the pre-test evaluation, task description, etc. You must provide a list of the tasks that you have developed, and why you included them. You must also provide the pre- and post-test questionnaires, and why you included each question.

Section 3: Observations: Summarize your observations. Where appropriate, use selected raw and collapsed data, paraphrasing, comments, questionnaire and interview results, etc. It is important to present as much information as possible with economy!
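One possible way to produce "collapsed" data is to tag each raw note with a problem category and then count the notes per category. The Python sketch below is illustrative only: the category names, the sample notes, and the `collapse` function are hypothetical, and your own categories should come from what you actually observed.

```python
from collections import Counter, defaultdict

# Hypothetical raw notes: (subject, condition, problem category, note).
observations = [
    ("S1", "silent observer", "labels", "hesitated over a menu label"),
    ("S2", "think aloud", "labels", "said the icon meaning was unclear"),
    ("S2", "think aloud", "feedback", "expected a single click to work"),
    ("S3", "constructive interaction", "feedback", "asked partner whether the action had worked"),
]


def collapse(observations):
    """Count notes per problem category and list the subjects affected."""
    counts = Counter()
    subjects = defaultdict(set)
    for subject, _condition, category, _note in observations:
        counts[category] += 1
        subjects[category].add(subject)
    # Most frequently observed categories come first.
    return [(category, n, sorted(subjects[category]))
            for category, n in counts.most_common()]


for category, n, who in collapse(observations):
    print(f"{category}: {n} observation(s), subjects {', '.join(who)}")
```

A table like this can sit in the body of the report, with the full notes left to the raw-data appendix.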

Section 4: Interpretation: System strengths and weaknesses: Identify common and important problems and strengths of the system. This should be more than a checklist of all the problems seen. Try to generalize problems when necessary, although you can use examples to highlight them.

Section 5: Suggested improvements: Describe five important changes that you would make to the design of the system, with explanation. Refer back to your observations and the discussion on design as covered in class. Note that you must stay within the style of interface presented: for example, your modification cannot turn (say) a form fill-in system into a graphical map.

Section 6: Conclusion: Summarize what you found and the recommendations.

Appendix 1: Comparison of different techniques: For future usability studies, you want to tell your product team what worked well and what didn't in this usability study. Briefly summarize your experiences with each method, contrasting them for ease of use, the richness of the information obtained, their advantages, etc. Then recommend the methods you wish your group to use in the future. Which was most useful? Which was least useful? What would you keep? What would you throw away?

Appendix 2: Raw data: All original observations/recordings, etc. should be attached here.

Usability studies are immensely practical: you can and should use them every time you design (or wish to select!) a user interface. Good luck, and have fun!