When Do We Trust AI’s Recommendations More Than People’s?

More and more companies are leveraging technological advances in machine learning, natural language processing, and other forms of artificial intelligence to provide relevant and instant recommendations to consumers. From Amazon to Netflix to REX Real Estate, firms are using AI recommenders to enhance the customer experience. AI recommenders are also increasingly used in the public sector to guide people to essential services. For example, the New York City Department of Social Services uses AI to give citizens recommendations on disability benefits, food assistance, and health insurance.

However, simply offering AI assistance won’t necessarily lead to more successful transactions. In fact, there are cases when AI’s suggestions and recommendations are helpful and cases when they might be detrimental. When do consumers trust the word of a machine, and when do they resist it? Our research suggests that the key factor is whether consumers are focused on the functional and practical aspects of a product (its utilitarian value) or focused on the experiential and sensory aspects of a product (its hedonic value).

In an article in the Journal of Marketing, based on data from over 3,000 people who took part in 10 experiments, we provide evidence for what we call a word-of-machine effect: the circumstances in which people prefer AI recommenders to human ones.

The word-of-machine effect.

The word-of-machine effect stems from a widespread belief that AI systems are more competent than humans at dispensing advice when utilitarian qualities are desired and less competent when hedonic qualities are desired. Importantly, the word-of-machine effect is based on a lay belief that does not necessarily correspond to reality. In fact, humans are not necessarily less competent than AI at assessing and evaluating utilitarian attributes. Conversely, AI is not necessarily less competent than humans at assessing and evaluating hedonic attributes. Indeed, AI selects flower arrangements for 1-800-Flowers and creates new flavors for food companies such as McCormick.

Nevertheless, our experiments suggest that if someone is focused on utilitarian and functional qualities, then, from a marketer’s perspective, the word of a machine is more effective than the word of human recommenders. For someone focused on experiential and sensory qualities, human recommenders are more effective.

For instance, in one of our experiments we assessed the word-of-machine effect on people’s propensity to choose products and people’s consumption experiences. To do so, we asked over 200 passersby in (pre-Covid-19) Boston to participate in a blind market test for haircare products. Using leaflets to explain the test, we asked each person to select one of two hair product samples, one recommended by AI and the other by a human. As predicted, when passersby were asked to focus only on utilitarian and functional attributes such as practicality, objective performance, and chemical composition, more people chose the AI-recommended sample (67%) than the one recommended by a person. When passersby were asked to focus only on experiential and sensory attributes such as indulgence, scent, and a spa-like vibe, more people chose the human-recommended sample (58%) than the one recommended by AI.

The word-of-machine effect also emerged in a second field experiment that we conducted in the Italian resort town of Cortina. We first primed people to consider a real estate investment relying only on either its functional and practical qualities or its emotional and sensory-based qualities. Then, we asked people to choose one of two selections of properties: one curated by a human real estate agent and one by an AI algorithm. When presented with a pitch that focused on practicality, more people (60%) chose the list of AI-curated properties. But more participants (76%) chose the human-curated property list in response to a pitch that emphasized sensory appeal and enjoyment.

The word-of-machine effect even extended to product consumption and taste perception. We recruited 144 participants from the University of Virginia campus and informed them that we were testing chocolate-cake recipes for a local bakery. The participants were given two options: one cake created with ingredients selected by an AI chocolatier and one created with ingredients selected by a human chocolatier. Participants were then asked to eat one of the two cakes, which were identical in appearance and ingredients, and rate the cake for two experiential/sensory features (indulgent taste and aromas, pleasantness to the senses) and two utilitarian/functional attributes (beneficial chemical properties and healthiness). Participants rated the AI-recommended cake as less tasty but healthier than the cake recommended by the human chocolatier.

In cases where utilitarian features are most important, the word-of-machine effect is more pronounced. Using an online survey, we asked 303 respondents to imagine buying a winter coat and review a list of practical/functional (e.g., breathability) and experiential/sensory (e.g., fabric type) qualities of the coat, rating how much they cared about these features. The more participants cared about utilitarian features, the more they preferred an AI shopping assistant over a human, and the more they cared about hedonic features, the more they preferred a human shopping assistant over an AI.

Even though it is clear that consumer confidence in AI assistance is higher when searching for products that are utilitarian (e.g., computers and dishwashers), this does not mean that companies offering products that promise more hedonic experiences (e.g., fragrances, food, and wine) are out of luck when it comes to using AI recommenders. In fact, we found that people embrace AI’s recommendations as long as AI works in partnership with humans. For instance, in one experiment, we framed AI as augmented intelligence that enhances and supports human recommenders rather than replacing them. The AI-human hybrid recommender fared as well as the human-only recommender even when experiential and sensory considerations were important.

These findings are important because they represent the first empirical test of augmented intelligence that focuses on AI’s assistive role in advancing human capabilities, rather than on its role as an alternative to humans, which is how it is typically perceived. One company that is seeing success with this approach is Stitch Fix, which uses AI in partnership with human stylists to choose clothing for its customers.

Convincing customers to give AI the benefit of the doubt.

How might managers correct the incorrect lay belief about the competence of AI vs. humans to provide advice? Our research uncovered several interventions that could attenuate it.

In one experiment, we asked participants to consider the opposite of what they initially believed to be true with regard to AI competence. We did so by prompting them to consider the ways in which they could be wrong about what they expected a human or an AI recommender to be good at. We found that prompting people to consider a different viewpoint about the recommender’s ability reduced the effect we found in the previous experiments. In other words, the AI recommenders scored higher on a hedonic scale, suggesting that people were more open to AI recommenders even when focused on experiential/sensory qualities, and the human recommenders scored higher on a utilitarian scale, suggesting that people were more open to human recommenders even when seeking functional/practical qualities.

In the real world, trying to convince a consumer to think the opposite of their core belief is difficult, so we tested a more practical and easier intervention to determine the effect of removing bias. We invited 299 online respondents to read about an app called “Cucina” that would rely on AI to give recipe recommendations. Within the app, the participants were able to interact with a chat bot — an AI chef — that was programmed to assist them. The chat bot greeted each participant and introduced itself (“Hi, Mark! I am here to suggest a recipe for you to try!”). The AI chef then delivered the consider-the-opposite protocol using a fun, interactive nudge: “Some people might think that an artificial intelligence chef is not competent to give food suggestions, but this is a misjudgment. For a moment, set aside your expectations about me. When it comes to making food suggestions, could you consider the idea that I could be good at things you do not expect me to be good at?” That resulted in more favorable perceptions of the AI recommendation even when people were considering the experiential and sensory qualities of a recipe like taste and aromas.

Our findings are insightful for managers navigating the remarkable technology-enabled opportunities that are growing in today’s marketplace. Although there is a clear correlation between utilitarian/functional and experiential/sensory attributes and consumer trust (or distrust) in AI recommenders, there are ways in which organizations can design the customer experience to take advantage of the word-of-machine effect.

For instance, companies like Netflix and YouTube could emphasize AI-based recommendations when utilitarian attributes are relatively more important to people (e.g., when they are selecting a documentary to view) and human-based recommendations (“similar users”) when hedonic attributes are relatively more important (e.g., when selecting a horror movie to view). Similarly, a company in the hospitality industry such as TripAdvisor could emphasize AI-based recommendations for business travel services and de-emphasize AI-based recommendations for leisure travel services.

As firms navigate the challenges of attracting and retaining customers in a crowded digital marketplace, those with a good understanding of the conditions under which consumers do and do not trust the “word” of AI recommenders will have a competitive advantage.
