by Dhruv Grewal, Cinthia B. Satornino, Thomas H. Davenport and Abhijit Guha
Generative AI (gen AI) has sent shock waves of technological disruption across the marketplace ecosystem—particularly when it comes to marketing—leaving stakeholders to grapple with its implications, opportunities, and challenges. Because it produces various forms of content, marketers often view it as a powerful advancement in creating product copy, blog posts, video and web ads, personalized offers to customers, and market research (e.g., in some cases gen AI can be used to predict responses by prospects, customers, and other market participants). Indeed, the ninth edition of Salesforce’s “State of Marketing” report, a survey of 5,000 global marketers, found that “implementing or leveraging AI” was their number one priority. Some organizations have used gen AI tools to achieve significantly better marketing outcomes. For example, Vanguard has used gen AI to increase LinkedIn ad conversion rates by 15%. Similarly, Unilever’s customer service agents rely on gen AI in ways that reduce their time-to-respond by 90%.
However, the Salesforce survey showed that although 96% of the marketers had generative AI in place or were planning to have it within 18 months, only 32% had fully implemented it within marketing operations. This may be because implementing AI-driven marketing initiatives is not without risks. For example, Coca-Cola used AI to create a remake of its 1995 holiday commercial, “Holidays Are Coming.” The ad initially received a highly positive response from consumers but later met with considerable criticism for the “lack of warmth” that is a common critique of gen AI-sourced imagery. The challenge, therefore, is not whether firms should adopt gen AI tools but how they implement them across various marketing applications to maximize the benefits while mitigating the risks.
Despite the high stakes, many Chief Data and Analytics Officers (CDAOs), who are often responsible for introducing AI across the organization, have not formalized their gen AI strategies and tactics for marketing or other functions to use. Decisions about gen AI tool selection (of which there is a dizzying array) often take place at the individual level, reflecting a patchwork system of on-the-fly experimentation, sometimes unbeknownst to senior management. In our discussions with more than 20 company leaders, we learned that companies’ journeys to a successful gen AI strategy for marketing often include three key decisions: 1) whether to use gen AI or traditional analytical AI; 2) what, if anything, must be added to a gen AI model to generate the desired output; and 3) what level of human augmentation, such as prompts and output review, is necessary.
To make these decisions, companies need to answer a host of questions, including:
- What tasks do we want to accomplish with this gen AI tool (e.g., are we trying to make business predictions or generate content)?
- Do we have structured or unstructured data for this use case?
- What resource constraints do we have?
- How much productivity improvement do we need to achieve?
- How quickly do we need to deliver the output to end users?
- How damaging are errors or inaccuracies in gen AI outputs?
- How closely do accuracy, privacy, and risk mitigation relate to our reputation and value proposition?
- How much control do we need over the process and output?
- How much legal and regulatory risk are we willing to take on?
- How intense are privacy concerns for us? For our end users?
In this article, we lay out how marketers can navigate these questions, examine the trade-offs involved in training and creating prompt-based inputs for gen AI tools and processes, and offer a framework to help marketers weigh those trade-offs strategically.
Is the use case appropriate for gen AI or analytical AI?
The first key decision for marketers is to determine whether the use case needs gen AI or if analytical AI would suffice. Many marketers don’t fully understand the differences between generative AI and analytical AI. To clarify, the purpose of analytical AI is to analyze existing data and use it to predict or classify data that doesn’t yet exist. It is trained on structured data (data that can be organized into rows and columns of numbers) and in turn produces structured outputs. In marketing, it has been used to predict the product or service a customer is likely to buy, the price a customer is likely to pay, the promotion a customer will respond to, or the ad a customer will click through.
Traditional analytical machine learning has been widely and effectively used by marketers for several decades. For example, about 10 years ago, Kia used IBM Watson to identify influencers who embodied traits associated with the Kia brand (these influencers were subsequently used to buttress Kia’s 2016 Super Bowl presence), an approach that remains useful today. Analytical AI is quite good at these types of predictions and thus continues to be extremely valuable to marketers. But given the recent hype around generative AI, many business leaders appear not to be considering analytical AI when appropriate and, fearing missing out on a source of competitive advantage or being left behind, instead are simply chasing the latest technology.
Like analytical AI, gen AI also uses analytical methods and machine learning. However, the objective of gen AI is less about making predictions and more about creating new content from patterns discerned in existing content. Gen AI is trained on relatively unstructured data like words, sounds, and images in a sequence, and it can produce unstructured data outputs. For example, in marketing and customer-facing functions, it can be used to generate offers, ads, new product images, blog posts, messages for customers, and product descriptions, as well as analyze customer sentiment, propose solutions to service issues, and, in some cases, predict future customer behavior.
Marketers will need both generative and analytical AI to accomplish their objectives. If, for example, a company wants to send “next best offers” to customers, analytical AI can predict which offer a particular customer is most likely to purchase based on past purchase data. If the marketer then wants to present a personalized message and product description for the predicted purchase, gen AI is the better tool for that task. By combining the two, marketers can send the next best offer identified by analytical AI tools, wrapped in a gen AI–generated, personalized marketing message designed to further entice the purchase.
Do we need custom or general inputs for gen AI?
Second, assuming that gen AI is appropriate for the identified use case, the next key decision is whether customized or general inputs are necessary for customer-facing applications. The inputs used to train gen AI tools can be thought of as existing on a continuum. On one end are general, vendor-provided inputs in so-called “foundation models” (trained on very large amounts of publicly available data, such as Wikipedia, GitHub, data scraped from social media sites, and similar sources). On the other end is customized input: proprietary, firm-specific data. Between the two endpoints are many applications that draw on hybrid inputs (both general and customized data).
In many cases, the gen AI application benefits from training on and accessing a relatively broad data set, as is true of vast large language models (LLMs) and large image models (LIMs). For example, when interacting with individual users as a service agent or virtual companion, or when summarizing conversations with customers, gen AI tools like ChatGPT benefit from accessing broad data across multiple domains and viewpoints, which enables them to generate a broad range of responses depending on the context and content of the interaction. This training content can be thought of as a broad general education for AI models.
Conversely, if firm-specific content is needed to generate the desired output (critical for generating product copy, blog posts, ads, or customer service replies, for example), models may need to be trained on or augmented with a narrow, carefully constructed set of proprietary data. Some custom models are trained from scratch on these data sources. These models are usually domain specific, such as BloombergGPT and FinGPT for finance, KL3M and ChatLaw for legal applications, and BioNeMo and MedLM for life sciences applications. (Note that none of these examples are currently used in marketing, sales, or customer service.) Other models can be “custom trained” on specific content that then modifies the general foundation models, though this method is also relatively difficult and rare. One example is Harvey, a legal LLM that has been fine-tuned on legal content in collaboration with OpenAI, using GPT-4. At least one vendor, Jasper, now offers versions of OpenAI models custom trained on marketing-oriented content.
A far more common approach in marketing and customer service applications is to augment general-purpose models with specialized and proprietary content through a set of prompts that do not change the underlying model but greatly influence its outputs. This “retrieval-augmented generation” (RAG) approach has been adopted by many companies to tailor model outputs without retraining. The method reduces the frequency of hallucinations and inappropriate outputs, a source of significant risk for marketing firms. Colgate-Palmolive, for example, worked with the vendor Market Logic to use this approach for capturing consumer and market knowledge so that marketers around the firm could easily access it. Similarly, Jasper AI allows customers to input their own proprietary content related to “specific brand guidelines and content requirements” in a program called “Custom Apps.”
Because custom inputs can modify models permanently or temporarily, companies requiring custom inputs often work with open-source models hosted on their own premises or with proprietary versions of foundation models from cloud vendors. By exercising control over the proprietary or foundation models, they can also avoid leakage of intellectual property or confidential data into publicly available models, another source of significant risk for firms. If companies restrict prompts to those involving the custom input content (typically through RAG and prompt-filtering techniques), hallucinations are usually substantially reduced, and the system can provide citations to the relevant custom content.
The trade-off between general and custom inputs thus entails both cost and risk. With general inputs, firms do not incur the expense of creating their own data sets to train the models and support output generation. Firms that only require general inputs can partner with OpenAI or Google Gemini, for example, and avoid the expense altogether. However, these models do not necessarily provide the most accurate and specific outputs in narrow content domains. They also create a relatively high risk of hallucinations or undesirable output from these “predict-the-next-word” LLMs. Employing publicly available models also raises privacy and confidentiality risks that leave firms open to ethical and regulatory vulnerabilities. Firms should carefully consider their risk tolerance, privacy and confidentiality needs, and resource constraints when they select tools based on general versus customized inputs.
How much human augmentation do we need?
The third key decision in implementing gen AI relates to the level of human augmentation before the output is delivered to the end consumer. At one extreme, firms may choose to let gen AI output flow directly to the end consumer, which may be appropriate when there are minimal risks related to inaccuracy, bias, and/or offensive content. For example, given that the risks resulting from errors or inaccuracies are relatively low, the task of summarizing product reviews using a gen AI tool can be executed and directly uploaded onto a website without a human needing to review the output. Conversely, when these risks are high, such as if gen AI were used to create legally binding promotions or offers, firms may choose substantial levels of review and editing by a human agent, thereby augmenting the output prior to delivery to the final end user.
In our interviews, some company representatives asserted that there are few or no productivity gains from gen AI if the content must be subsequently reviewed and edited. Yet forgoing human augmentation can result in material losses for firms when the risk of errors is high. For example, a chatbot from Air Canada offered a customer a bereavement discount that the airline later denied; a court subsequently ruled that the airline had to honor the chatbot’s offer. The trade-offs with human augmentation include the increased cost of a human agent and the decreased speed of delivery introduced by additional review steps, but the benefits include increased accuracy and appropriateness of the output.
A Framework for Effective Use of Gen AI in Marketing
Our framework (below) can help decision-makers categorize these trade-offs among different approaches so that they can glean the benefits of gen AI, select tools that best match their firms’ strategic and tactical objectives, and hedge their bets when navigating the risk–reward trade-off. Each of the four quadrants represents a specific combination of benefits and costs, based on the trade-offs in inputs (training and access) and in output delivery to end users.
Quadrant 1 (Q1): No custom input, no need for output review.
Some marketing applications have little need for augmented input data and involve relatively low risk from errors or inaccuracies (e.g., summarizing product reviews). In such cases, gen AI implementations that involve little or no human augmentation prior to delivering the output to end users may be suitable. These general-input, low-augmentation processes typically offer high delivery speed at low cost. Privacy and accuracy risks remain, but marketers deploying these processes should have deemed those risks acceptable trade-offs for the speed and cost advantages prior to implementation. This category could include internal summaries of market research documents or customer conversations, internal meeting summaries, or any other content that is unlikely to be consumed by customers or to contain contractually enforceable promises.
Quadrant 2 (Q2): No custom input, but output review needed.
For firms that demand more accuracy but still rely on generic, public inputs, employing a public LLM and assigning a human agent to review and edit the output prior to delivery may be fitting. These firms incur the cost of human review, which is likely to slow the delivery of output and lower productivity gains. Yet this approach mitigates potentially costly risks of error or inaccuracy. Examples in this quadrant could include blog posts, AI-created podcasts, or product copy for well-understood products.
Quadrant 3 (Q3): Custom content, but no output review needed.
Some firms produce output derived from proprietary data but face minimal risk from inaccuracy or error. These implementations carry the cost of developing and maintaining a proprietary data set for training and access—a nontrivial expense. However, they offer more relevant output and mitigate privacy risks. The lack of human augmentation may lead to inaccuracy, but in these use cases, the inaccuracy-related risks are considered low. Examples may include gen AI applications that advise customers on in-store product locations, customer service chatbots with proprietary content about products and services, or internal marketing knowledge-management systems.
Quadrant 4 (Q4): Custom content with human review.
When proprietary data, risk mitigation, and output accuracy are all necessary, a high-end approach is called for. These gen AI implementations are the costliest because firms incur the costs of both generating and maintaining a proprietary data set and of performing human reviews. Further, delivery of output to the end user is slowed by the review and editing phases. However, this is the quadrant in which substantial risk mitigation justifies the added cost and time. Relevant examples include applications with high levels of regulatory or contractual sensitivity, such as an enforceable offer to a customer or a product description for a drug or medical device.
It’s important for marketers to remember that both analytical and generative AI can provide value and that customer interactions will, in many cases, require both types of AI. Companies that are committed to gen AI usage on a broad scale are likely to eventually encounter use cases that span all four quadrants we have described. The figure and discussion above provide guidance for weighing the various trade-offs involved in a gen AI implementation. Over time, changes in the technology may make it easier to incorporate custom content and may reduce the prevalence of errors and inaccuracies in publicly available systems. Today and for the foreseeable future, however, all of these considerations require significant attention and effort to address.