Interaction Design Evaluation

What is interaction design evaluation?

Nimantha Gayan
14 min read · Aug 2, 2021

Imagine that you designed an app for teenagers to share music, gossip, and photos. You prototyped your first design and implemented the core functionality. How would you find out whether it would appeal to them and whether they would use it? You would need to evaluate it — but how? This article presents an introduction to the main types of evaluation and the methods that you can use to evaluate design prototypes and design concepts.

Evaluation is integral to the design process. It involves collecting and analyzing data about users’ or potential users’ experiences when interacting with a design artifact such as a screen sketch, prototype, app, computer system, or component of a computer system.

Many evaluation methods involve interaction with, or direct observation of, users. This article introduces methods that are instead based on understanding users through one of the following:

• Knowledge codified in heuristics

• Data collected remotely

• Models that predict users’ performance

None of these methods requires users to be present during the evaluation. Inspection methods often involve a researcher, sometimes known as an expert, role-playing the users for whom the product is designed, analyzing aspects of an interface, and identifying potential usability problems. The most well-known methods are heuristic evaluation and walkthroughs. Analytics involves user interaction logging, and A/B testing is an experimental method. Both analytics and A/B testing are usually carried out remotely. Predictive modeling involves analyzing the various physical and mental operations that are needed to perform particular tasks at the interface and operationalizing them as quantitative measures. One of the most commonly used predictive models is Fitts’ law.
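To make the last of these concrete: Fitts’ law predicts how long it takes to move a pointer to a target based on the target’s distance and width. Here is a minimal sketch in Python using the common Shannon formulation; the constants a and b are illustrative placeholders, since in practice they are fitted by regression to observed data for a particular device.

```python
import math

def fitts_movement_time(distance: float, width: float,
                        a: float = 0.2, b: float = 0.1) -> float:
    """Predicted time (seconds) to acquire a target, Shannon formulation:
    MT = a + b * log2(distance / width + 1).
    a and b are device-specific constants found by regression;
    the defaults here are illustrative placeholders, not measured values."""
    index_of_difficulty = math.log2(distance / width + 1)  # in bits
    return a + b * index_of_difficulty

# A small, distant target is predicted to take longer than a large, nearby one.
print(fitts_movement_time(distance=800, width=20))  # ~0.74 s
print(fitts_movement_time(distance=100, width=80))  # ~0.32 s
```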

Heuristic Evaluation

In heuristic evaluation, researchers, guided by a set of usability principles known as heuristics, evaluate whether user-interface elements, such as dialog boxes, menus, navigation structure, online help, and so on, conform to tried-and-tested principles. Heuristic evaluation was developed by Jakob Nielsen and his colleagues and later modified by other researchers for evaluating the web and other types of systems.

These heuristics closely resemble high-level design principles such as making designs consistent, reducing memory load, and using terms that users understand. Jakob Nielsen’s heuristics are probably the most-used usability heuristics for user interface design.

Visibility of system status

The system should always keep users informed about what is going on, through appropriate feedback within a reasonable time.

Match between system and the real world

The system should speak the user’s language, with words, phrases, and concepts familiar to the user, rather than system-oriented terms.

User control and freedom

Users often choose system functions by mistake and will need a clearly marked “emergency exit” to leave the unwanted state without having to go through an extended dialogue. Support undo and redo.

Consistency and standards

Users should not have to wonder whether different words, situations, or actions mean the same thing.

Error prevention

Even better than good error messages is a careful design that prevents a problem from occurring in the first place. Either eliminate error-prone conditions or check for them and present users with a confirmation option before they commit to the action.

Recognition rather than recall

Minimize the user’s memory load by making objects, actions, and options visible. Instructions for use of the system should be visible or easily retrievable whenever appropriate.

Flexibility and efficiency of use

Accelerators — unseen by the novice user — may often speed up the interaction for the expert user such that the system can cater to both inexperienced and experienced users. Allow users to tailor frequent actions.

Aesthetic and minimalist design

Dialogues should not contain information that is irrelevant or rarely needed. Every extra unit of information in a dialogue competes with the relevant units of information and diminishes their relative visibility.

Help users recognize, diagnose, and recover from errors

Error messages should be expressed in plain language (no codes), precisely indicate the problem, and constructively suggest a solution.

Help and documentation

Even though it is better if the system can be used without documentation, it may be necessary to provide help and documentation. Any such information should be easy to search, focused on the user’s task, list concrete steps to be carried out, and not be too large.

Using these heuristics, designers and researchers evaluate aspects of the interface. Those doing the heuristic evaluation go through the interface several times, inspecting the various interactive elements and comparing them with the list of usability heuristics. During each iteration, usability problems are identified, and ways of fixing them may be suggested.
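There is no single required format for recording what these passes produce, but structuring the findings makes them easy to aggregate. Here is a minimal sketch in Python; the field names and example findings are my own, and the 0–4 severity scale follows Nielsen’s commonly used severity ratings.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Finding:
    heuristic: str  # which heuristic the element violates
    location: str   # where in the interface the problem appears
    severity: int   # 0 (not a problem) to 4 (usability catastrophe)
    note: str       # what the evaluator observed

findings = [
    Finding("Visibility of system status", "checkout page", 3,
            "No progress indicator while payment is processed"),
    Finding("Error prevention", "signup form", 2,
            "Password rules appear only after submission fails"),
    Finding("Visibility of system status", "file upload", 2,
            "No feedback while a large file uploads"),
]

# Tally problems per heuristic to see where the design needs the most work.
print(Counter(f.heuristic for f in findings).most_common())
```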

Walk-Throughs

Walk-throughs offer an alternative approach to heuristic evaluation for predicting user problems without doing user testing. As the name suggests, a walk-through involves walking through a task with the product and noting problematic usability features. There are two main walk-through methods: cognitive walk-throughs and pluralistic walk-throughs.

Cognitive Walkthroughs

What is a cognitive walkthrough?

A cognitive walkthrough is used to evaluate a product’s usability. It focuses on a new user’s perspective by narrowing the scope to the tasks needed to complete specific user goals. The method was created in the early 1990s by Cathleen Wharton, John Rieman, Clayton Lewis, and Peter Polson.

Cognitive walkthroughs are sometimes confused with heuristic evaluations, but, while both methods uncover usability problems and take the users’ point of view, heuristic evaluations typically focus on the product as a whole, not specific tasks.

How to conduct a cognitive walkthrough

At its core, a cognitive walkthrough has three parts:

1. Identify the user goal you want to examine

2. Identify the tasks you must complete to accomplish that goal

3. Document the experience while completing the tasks

Identifying the user goal

A user goal is a big, overarching objective and doesn’t include specific, step-by-step tasks. From a user’s point of view, it doesn’t necessarily matter how the goal is accomplished, as long as it gets completed.

For example, I host a dinner party every month. Beforehand, I ask everyone invited to send me 10 songs they love. Then, I use Spotify to create a playlist of those songs to play during the party. As a user, my goal here is to create a playlist with others to play at my dinner party.

Identifying the tasks

I’ll note here that Spotify offers a number of ways to accomplish these goals, and ideally you’d identify the optimal path and tasks for each interface. In this article, however, I’ll only walk through one possible path.

Goal: Create a Playlist

• Open Spotify web player

• Enter user name in user name field

• Enter password in password field

• Click the login button

• Click the your library section

• Click the new playlist button

• Type a name into the playlist name field

• Click the create button

Goal: Add a track to the playlist

• Click search icon

• Enter track name into the field

• Click tracks tab

• Find track in results

• Hover over track

• Click “…”

• Click “add to playlist”

• Select playlist

Documenting the experience

Since experience is subjective, it’s important to structure how an evaluator documents it so that all walkthroughs use the same criteria. Traditionally, the evaluator asks and answers four questions for each task.

• Will users understand how to start the task?

• Are the controls conspicuous?

• Will users know the control is the correct one?

• Was there feedback to indicate you completed (or did not complete) the task?

However, I like to add an additional question to specify task completion. I add this question because it allows anyone to easily find those tasks that stop users from completing their goal. Often, these tasks become the highest priority and need to be addressed first.

• Were you able to complete the task?

Template

I’ve created a walkthrough template for the Spotify example above. For each task, answer each question with a yes or no. The worksheet color-codes the answers so that a brief scan quickly reveals the problem areas.
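As a rough illustration of what such a worksheet captures, here is a minimal sketch in Python. The tasks, answers, and flagging rule are hypothetical; a real template would cover every task in the walkthrough.

```python
# The four traditional questions plus the added completion question.
QUESTIONS = [
    "Will users understand how to start the task?",
    "Are the controls conspicuous?",
    "Will users know the control is the correct one?",
    "Was there feedback to indicate task completion?",
    "Were you able to complete the task?",
]

# Hypothetical yes/no answers for two of the Spotify tasks above.
worksheet = {
    "Click the new playlist button": [True, True, True, True, True],
    "Click '...' and add to playlist": [True, False, False, True, True],
}

# Any task with at least one "no" is flagged as a problem area.
for task, answers in worksheet.items():
    if not all(answers):
        failed = [q for q, a in zip(QUESTIONS, answers) if not a]
        print(f"Problem task: {task}")
        for q in failed:
            print(f"  no -> {q}")
```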

Pluralistic Walkthroughs

This type of walkthrough involves multiple groups, including the users (thus the ‘pluralistic’ in the name). Representatives of at least three groups are present for the walkthrough:

• Users (at least two, hopefully more)

• User experience professionals (one or two; generally serve as moderator and recorder)

• Programmers (one or two)

Other relevant groups could also be present.

Pluralistic Walkthrough Process

1. A task is chosen for testing.

2. Storyboards are prepared for that task (e.g., registration, checkout) and the first storyboard is given to each person present.

3. Approaching the task as a user would, each participant writes on the first storyboard (or on a piece of paper) the actions they would take. For example:

• Press the down arrow key twice to scroll the page, click the empty text box, type text into it, then click the button next to the text box.

4. Once everyone is finished with the first storyboard, they compare notes and discussion begins.

• Users always give their input first.

5. At the end of the discussion, the facilitator shows the ‘correct’ sequence of actions (based on the specs/use cases), and then the next storyboard is distributed to participants.

6. This process of individual analysis, followed by group discussion and then a review of the ‘correct’ actions, is repeated for each new storyboard, one at a time.

7. A list of prioritized usability issues and their recommended solutions is the end result.

Compared with heuristic evaluation, walk-throughs focus more closely on identifying specific user problems at a detailed level.

Web Analytics

Web analytics is the collection, reporting, and analysis of website data. The focus is on identifying measures based on your organizational and user goals and using the website data to determine the success or failure of those goals, to drive strategy, and to improve the user’s experience. Critical to relevant and effective web analysis is creating objectives and calls to action from your organizational and site visitors’ goals, and identifying key performance indicators (KPIs) to measure the success or failure of those objectives and calls to action.

Web analytics processes are made up of four essential stages or steps:

Collection of data:

This stage is the collection of the basic, elementary data. Usually, these data are counts of things. The objective of this stage is to gather the data.

Processing of data into information:

This stage usually takes counts and turns them into ratios, although some counts may remain. The objective of this stage is to transform the data into information, specifically metrics.

Developing KPI:

This stage focuses on using the ratios (and counts) and infusing them with business strategies, referred to as key performance indicators (KPI). Many times, KPIs deal with conversion aspects, but not always. It depends on the organization.

Formulating online strategy:

This stage is concerned with the online goals, objectives, and standards for the organization or business. These strategies are usually related to making money, saving money, or increasing market share.
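To make the four stages concrete, here is a minimal sketch in Python. The counts, metric names, and 2% target are all made up for illustration.

```python
# Stage 1 -- collection: raw counts gathered from the site (made-up numbers).
counts = {"visits": 12_000, "signups": 540, "orders": 180}

# Stage 2 -- processing: turn counts into ratios (metrics).
signup_rate = counts["signups"] / counts["visits"]      # 4.5%
conversion_rate = counts["orders"] / counts["visits"]   # 1.5%

# Stage 3 -- KPI: attach a business target to a metric.
conversion_target = 0.02  # hypothetical goal: convert 2% of visits

# Stage 4 -- strategy: the gap between metric and target drives decisions.
gap = conversion_target - conversion_rate
print(f"Conversion {conversion_rate:.1%} vs. target "
      f"{conversion_target:.0%} (gap {gap:+.1%})")
```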

Generally, web analytics has been used to refer to on-site visitor measurement, though this meaning has become less clear of late. Many different vendors provide on-site web analytics software and services. There are two main technical ways of collecting the data. The first and traditional method, server log file analysis, reads the log files in which the web server records file requests by browsers. The second method, page tagging, uses JavaScript embedded in the webpage to make image requests to a third-party, analytics-dedicated server whenever a webpage is rendered by a web browser or, if desired, when a mouse click occurs. Both collect data that can be processed to produce web traffic reports.
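To illustrate the first method, here is a minimal sketch in Python of server log file analysis. It assumes logs in the Common Log Format; the regular expression and sample lines are illustrative only.

```python
import re
from collections import Counter

# A Common Log Format line: host, identity, user, [timestamp], "request", status, size.
LOG_LINE = re.compile(r'\S+ \S+ \S+ \[[^\]]+\] "(?:GET|POST) (\S+) [^"]*" (\d{3}) \S+')

def count_pageviews(lines):
    """Tally successful (2xx) requests per path from raw log lines."""
    views = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group(2).startswith("2"):
            views[match.group(1)] += 1
    return views

sample = [
    '203.0.113.7 - - [02/Aug/2021:10:00:00 +0000] "GET /pricing HTTP/1.1" 200 5120',
    '203.0.113.9 - - [02/Aug/2021:10:00:05 +0000] "GET /pricing HTTP/1.1" 200 4880',
    '203.0.113.9 - - [02/Aug/2021:10:00:09 +0000] "GET /missing HTTP/1.1" 404 310',
]
print(count_pageviews(sample))  # Counter({'/pricing': 2})
```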

A/B Testing

An A/B test, also known as a split test, is an experiment that determines which of two or more variations of an online experience performs better by presenting each version to users at random and analyzing the results. A/B testing demonstrates the efficacy of potential changes, enabling data-driven decisions and ensuring that changes have a positive impact.

Benefits of A/B Testing

Improved user engagement

Elements of a page, app, ad, or email that can be A/B tested include the headline or subject line, imagery, call-to-action (CTA) forms and language, layout, fonts, and colors, among others. Testing one change at a time will show which affected users’ behavior and which did not. Updating the experience with the “winning” changes will improve the user experience in aggregate, ultimately optimizing it for success.

Improved content

Testing ad copy, for instance, requires a list of potential improvements to show users. The very process of creating, considering, and evaluating these lists winnows out ineffective language and makes the final versions better for users.

Reduced bounce rates

A/B testing points to the combination of elements that helps keep visitors on a site or in an app longer. The more time visitors spend on the site, the likelier they are to discover the value of the content, ultimately leading to a conversion.

Increased conversion rates

A/B testing is the simplest and most effective means to determine the best content to convert visits into sign-ups and purchases. Knowing what works and what doesn’t helps convert more leads.

Higher conversion values

The learnings from A/B testing successfully applied to one experience can be applied to additional experiences, including pages for higher-priced products and services. Better engagement on these pages will demonstrate similar lifts in conversions.

Ease of analysis

Determining the winner and the loser of an A/B test is straightforward: whichever version’s metrics come closer to its goals (time spent, conversions, and so on) wins.

And while testing services have evolved to include statistical analysis for users of all levels of spreadsheet expertise, the statistics behind comparing two experiences are not complex. The clarity of these numbers also counters the highest-paid person’s opinion (HiPPO), which might otherwise be overvalued.

Quick results

Even a relatively small sample size in an A/B test can provide significant, actionable results as to which changes are most engaging for users. This allows for short-order optimization of new sites, new apps, and low-converting pages.

Everything is testable

Forms, images, and text are typical items for A/B testing and updating, but any element of a page or app can be tweaked and tested. Headline styling, CTA button colors, form length, etc., can all affect user engagement and conversion rates in ways that may never be known if they’re not tested. No idea need be rejected on a conference call; testing and metrics, not emotions, prove what works and what doesn’t.

Reduced risks

By A/B testing, you can avoid committing to costly, time-intensive changes that prove ineffective. Major decisions can be well informed, avoiding mistakes that could otherwise tie up resources for minimal or negative gain.

Reduced cart abandonment

For e-commerce, getting a user to follow through with checkout after clicking “buy” on an item is a significant challenge, as most potential customers abandon their carts before paying. A/B testing can help find the optimal combination of tweaks to the order pages that will get users to the finish.

Increased sales

Any and all of the above-mentioned A/B testing benefits serve to increase sales volume. Beyond the initial sales boost optimized changes produce, testing provides better user experiences which, in turn, breeds trust in the brand, creating loyal, repeat customers and, therefore, increased sales.

A/B testing process

The following is an A/B testing framework you can use to start running tests:

Collect data: Your analytics will often provide insight into where you can begin optimizing. It helps to begin with high-traffic areas of your site or app so that you can gather data faster. Look for pages with low conversion rates or high drop-off rates that can be improved.

Identify goals: Your conversion goals are the metrics that you are using to determine whether or not the variation is more successful than the original version. Goals can be anything from clicking a button or link to product purchases and e-mail signups.

Generate hypothesis: Once you’ve identified a goal you can begin generating A/B testing ideas and hypotheses for why you think they will be better than the current version. Once you have a list of ideas, prioritize them in terms of expected impact and difficulty of implementation.
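As a rough illustration of that prioritization step, the sketch below sorts a hypothetical backlog by an impact-to-effort ratio. The scoring scheme is my own simplification, not a standard.

```python
# A hypothetical test backlog, each idea scored 1-5 for expected impact
# and implementation effort (higher effort = harder to build).
ideas = [
    ("Shorten the signup form",       {"impact": 4, "effort": 2}),
    ("Rewrite the homepage headline", {"impact": 3, "effort": 1}),
    ("Redesign the checkout flow",    {"impact": 5, "effort": 5}),
]

# Test high-impact, low-effort ideas first.
ideas.sort(key=lambda item: item[1]["impact"] / item[1]["effort"], reverse=True)
for name, score in ideas:
    print(f"impact {score['impact']} / effort {score['effort']}: {name}")
```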

Create variations: Using your A/B testing software (like Optimizely), make the desired changes to an element of your website or mobile app experience. This might be changing the color of a button, swapping the order of elements on the page, hiding navigation elements, or something entirely custom. Many leading A/B testing tools have a visual editor that makes these changes easy. Be sure to QA your experiment to confirm that it works as expected.

Run experiment: Kick off your experiment and wait for visitors to participate! At this point, visitors to your site or app will be randomly assigned to either the control or the variation of your experience. Their interaction with each experience is measured, counted, and compared to determine how each performs.
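One common way to implement this random assignment is deterministic hashing, so a returning visitor always sees the same variant. Here is a minimal sketch in Python, with a hypothetical experiment name and a 50/50 split:

```python
import hashlib

def assign_variant(user_id: str, experiment: str = "checkout-cta") -> str:
    """Bucket a visitor deterministically: the same user always sees
    the same variant for the lifetime of the experiment."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100  # a number from 0 to 99
    return "variation" if bucket < 50 else "control"  # 50/50 split

print(assign_variant("user-1042"))  # identical on every call for this user
```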

Analyze results: Once your experiment is complete, it’s time to analyze the results. Your A/B testing software will present the data from the experiment and show you the difference between how the two versions of your page performed and whether there is a statistically significant difference.
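Your A/B testing software performs this analysis for you, but the underlying comparison can be as simple as a two-proportion z-test on the conversion rates. Here is a minimal sketch in Python with made-up numbers:

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """z statistic and two-sided p-value for a difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Made-up results: 120/2400 control conversions vs. 156/2400 for the variation.
z, p = two_proportion_z_test(120, 2400, 156, 2400)
print(f"z = {z:.2f}, p = {p:.3f}")  # p below 0.05 suggests a real difference
```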

Predictive Models

What Is Predictive Modeling?

Predictive modeling uses statistical techniques to predict future user behaviors. To understand predictive analytics, you must first understand what a predictive model is. A predictive model uses historical data from various sources. You must first normalize the raw data by cleansing it of anomalies and preprocess it into a format that facilitates analysis. Then, apply a statistical model to the data to draw inferences. Each predictive model comprises various indicators, that is, factors likely to impact future outcomes, which are called independent variables or predictor variables.
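As a rough illustration of that pipeline, the sketch below fits a simple statistical model (logistic regression, via scikit-learn) to made-up, already-cleansed historical data with two predictor variables:

```python
# Made-up historical data: two predictor (independent) variables
# (visits in the last month, average order value) and the known outcome
# (1 = the customer purchased again, 0 = they did not).
from sklearn.linear_model import LogisticRegression

X = [[12, 80.0], [3, 15.0], [8, 60.0], [1, 10.0], [15, 95.0], [2, 20.0]]
y = [1, 0, 1, 0, 1, 0]

model = LogisticRegression().fit(X, y)

# Predicted probability that a new customer will purchase again.
print(model.predict_proba([[10, 70.0]])[0][1])
```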


Applying a predictive-analytics algorithm to UX design does not result in changes to a user interface. Instead, the algorithm presents users with relevant information that they need. Here’s a simple illustration of this capability from the ecommerce domain: A user who has recently purchased an expensive mobile phone would likely need to purchase a cover to protect it from dust and scratches. Therefore, that user would receive a recommendation to buy a cover. The ecommerce site might also suggest other accessories such as headphones, memory cards, or antivirus software.

Here are some other examples of predictive modeling. Spam filters use predictive modeling to identify the probability that a given message is spam. The first mail-filtering program to use naive Bayes spam filtering was Jason Rennie’s ifile program, released in 1996; Bayes’ theorem was used to predict which email messages were spam and which were genuine. Facebook uses DeepText, a form of unsupervised machine learning, to interpret the meaning of users’ posts and comments. For example, if someone writes, “I like blackberries,” they might mean the fruit or the smartphone. In customer relationship management, predictive modeling targets messaging to those customers who are most likely to make a purchase.
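To give a flavor of the spam-filtering example, here is a minimal naive Bayes classifier in Python, trained on a tiny made-up corpus with add-one smoothing. Real filters use far larger corpora and more careful feature handling.

```python
import math
from collections import Counter

spam = ["win cash now", "free prize claim now"]
ham = ["meeting moved to noon", "see you at the party"]

def word_counts(messages):
    return Counter(word for m in messages for word in m.split())

spam_words, ham_words = word_counts(spam), word_counts(ham)
vocabulary = set(spam_words) | set(ham_words)

def log_score(message, words, class_size, total):
    # log P(class) + sum of log P(word | class), with add-one smoothing
    score = math.log(class_size / total)
    n = sum(words.values())
    for word in message.split():
        score += math.log((words[word] + 1) / (n + len(vocabulary)))
    return score

def is_spam(message):
    total = len(spam) + len(ham)
    return (log_score(message, spam_words, len(spam), total) >
            log_score(message, ham_words, len(ham), total))

print(is_spam("claim your free cash"))    # True
print(is_spam("see you at the meeting"))  # False
```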

Predictive User Experience

Envision coffee machines that start brewing just when you think it’s a good time for an espresso, office lights that dim when it’s sunny and workers don’t need them, your favorite music app playing a magical tune depending on your mood, or your car suggesting an alternative route when you hit a traffic jam.

Predictability is the essence of a sustainable business model. In a digital world, with millions of users across the globe, prediction definitely has the power to drive the future of interaction. Feeding a historical dataset into a system that uses machine-learning algorithms to predict outcomes makes prediction possible.

References

Sharp, H., Preece, J., and Rogers, Y. Interaction Design: Beyond Human-Computer Interaction. Wiley, 2019.

Wang, J., ed. Encyclopedia of Business Analytics and Optimization. Hershey, PA: Business Science Reference, 2014.

Johnson-Laird, P.N. “Mental Models and Cognitive Change.” Journal of Cognitive Psychology, March 20, 2013.

Kaushik, A. “Web Analytics 101: Definitions: Goals, Metrics, KPIs, Dimensions, Targets.” Kaushik.net, April 19, 2010. https://www.kaushik.net/avinash/web-analytics-101-definitions-goals-metrics-kpis-dimensions-targets/

Thank you for reading


Nimantha Gayan

Software Engineering Undergraduate, University of Kelaniya