Gartner recently published its “Hype Cycle for Artificial Intelligence, 2024.” After spending 2022 and 2023 in the cycle’s peak of inflated expectations, generative AI (GenAI) has entered the trough of disillusionment for 2024.
The Gartner AI update signals a welcome shift in the GenAI conversation. It’s time for organizations to take a closer look at their GenAI initiatives and aspirations, especially concerning GenAI’s impact on broader, enterprise-wide productivity; the experience and expectations of their user base; and GenAI technology itself.
Accept the gift of disillusionment
In the world of Gartner’s five-phase Hype Cycle, the trough of disillusionment (phase three) is a necessary step every technology innovation takes on the road to mainstream adoption (aka the plateau of productivity, phase five).
The trough of disillusionment is where the naive optimism fostered in phase two, the peak of inflated expectations, evolves into realism informed by experience. As Gartner puts it in its Hype Cycle explainer, in phase two “product usage increases, but there’s still more hype than proof that the innovation can deliver what you need. [In phase three], the original excitement wears off and early adopters report performance issues and low ROI.”
Despite its negative connotations, disillusionment in the context of the Hype Cycle is a positive process. It literally frees us from illusions. And the viral success of ChatGPT, GitHub Copilot, and similar GenAI apps has created a lot of illusions in the enterprise.
Goldman Sachs estimated that GenAI could raise global GDP by 7%—which translates to nearly $7 trillion—over 10 years. McKinsey estimated GenAI’s contribution to the global economy could reach $4.4 trillion annually. At the task level, developers coded twice as fast and business users slashed their writing times by 40% when assisted by GenAI. And IDC predicted GenAI spending worldwide will hit $143 billion in 2027, up from $16 billion in 2023.
“Investment in AI has reached a new high with a focus on generative AI, which, in most cases, has yet to deliver its anticipated business value,” wrote Gartner senior director analysts Afraz Jaffri and Haritha Khandabattu. “Generative AI has passed the peak of inflated expectations, although hype about it continues.”
Don’t miss the forest for the trees
Jaffri and Khandabattu aren’t the only ones taking a second look at the state of GenAI. In the Harvard Business Review, Ben Waber and Nathanael J. Fast ask, “Is GenAI’s impact on productivity overblown?” The point the authors drive home is that research on GenAI productivity—especially the productivity enabled by large language models (LLMs)—tends to miss the forest for the trees.
The research and its resulting hype to date have almost exclusively focused on discrete, employee-level benefits: the productivity gains realized when individual employees use LLMs to perform specific tasks. In most cases, the holistic, enterprise-wide impact has yet to be felt, much less examined.
For instance, one study of an LLM-based call center app found that it increased the productivity of the customer service representatives who used it by 14% on average. Novice and lower-skilled reps increased their productivity by 34%. The gains, however, were made at the expense of top-performing reps, who basically trained the LLM with their best practices. Top-performing reps saw a “small but statistically significant decrease” in their productivity.
All reps, however, are paid bonuses that are based on how they perform relative to the other reps. That means top reps could actually take a pay cut as their relative performance decreases. And the pay cut could get deeper over time as the performance gap between reps continues to close. In other words, the top reps could be penalized rather than rewarded for their positive contributions to the app—if compensation policies don’t change.
The app’s 14% overall task-level improvement is obvious. However, its impact on employee compensation, retention, engagement, and related, enterprise-level issues remains to be seen.
“Amid all the hype, there is reason to question whether these [LLM] tools will have the transformative effects on company-wide productivity that some predict,” wrote Waber and Fast. “The consequences of introducing new products, including the possibility of turnover among experts whose output is used to train these systems, haven’t been examined. […] In the absence of a more comprehensive, long-term analysis, looking at task-specific data reveals little about the true effect of a new technology like LLMs on overall firm performance.”
Respect the user base
Take a second look at your base of users and potential users. In the enterprise, it seems everybody knows about GenAI, but few people really know GenAI. As we’ll see, most GenAI users are nontechnical. They’re business users. And many feel like impostors.
“No one wants to sound clueless about AI. Especially your boss,” wrote Wall Street Journal reporter Ray A. Smith. “Chances are your boss is standing at the base of the [generative AI] learning curve, too—right next to you. Rarely has such a transformative, new technology spread and evolved so quickly, even before business leaders have grasped its basics. […] Many chief executives and other senior managers are talking a visionary game about AI’s promise to their staff—while trying to learn exactly what it can do.”
Smith drove his point home with data from recent surveys: In one, 61% of 2,000 C-suite executives “said AI would be a ‘game-changer.’ Yet nearly the same share said they lacked confidence in their leadership teams’ AI skills or knowledge.” In another, “10,000 workers and executives […] cited AI as a reason 71% of CEOs and two-thirds of other senior leaders said they had ‘impostor syndrome’ in their positions.”
Elsewhere, McKinsey research on GenAI in the workplace reveals that about 70% of the employees surveyed do not use GenAI in any capacity at work. Of the remaining 30%, only 12% are technical users, either “creators” who build GenAI models and apps or “heavy users” whose core work involves GenAI.
The other 88%, nearly nine of every 10 GenAI users, are nontechnical users who use GenAI to perform routine tasks in their positions as salespeople, marketers, human resources associates, customer service and support reps, etc. While the GenAI user base is expected to explode going forward, the ratio of technical users to nontechnical users is likely to remain the same.
Overall, you can’t take your users’ skillset, or lack thereof, for granted. The preponderance of nontechnical users suggests that educating users and setting reasonable expectations will be priorities for most IT teams.
(Re)establish trust
Of all the challenges posed by GenAI, the capacity to hallucinate may be the biggest. It’s arguably the best known, thanks to hands with too many fingers, advice to eat rocks, and similar viral gaffes created by ChatGPT, Google AI Overviews, DALL-E, and other publicly accessible apps. The notoriety, in turn, has left many users wondering if they can trust GenAI, especially in the workplace, where hallucinations (illogical, nonsensical, incorrect, or misleading outputs) could do real damage to careers and companies.
To reduce hallucinations and restore trust, make sure your GenAI model is trained on high-quality data that is relevant to the tasks it will perform. Likewise, make sure the model has access to relevant, high-quality input data. Set limits on what the model can generate, confining it to relevant outputs.
As far as your users are concerned, provide them with data templates, which not only help reduce hallucinations but also help improve the consistency of the output. Tell them explicitly what the GenAI app is designed to do and about any limitations within its design parameters. Encourage them to provide feedback to the app as they use it, refining their prompts to get the results they want and avoid those they don’t. Finally, keep humans in the loop to vet the individual prompt results as well as to ensure the ongoing integrity of the GenAI app and its output.
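For teams that want to see how these recommendations might look in practice, here is a minimal Python sketch that combines a data template, a limit on allowed outputs, and a human-review flag. Everything in it is an assumption made for illustration: the ticket-classification task, the names TICKET_TEMPLATE, ALLOWED_CATEGORIES, and classify_ticket, and the model argument, which stands in for whatever GenAI service an organization actually uses. It shows one possible policy, not a definitive implementation.

```python
from dataclasses import dataclass
from string import Template
from typing import Callable, Optional

# Hypothetical data template: users supply named fields, not free-form prompts.
TICKET_TEMPLATE = Template(
    "Classify the support ticket below into exactly one category.\n"
    "Allowed categories: $categories\n"
    "Ticket: $ticket_text\n"
    "Answer with the category name only."
)

# Hypothetical limit on what the app is allowed to return.
ALLOWED_CATEGORIES = ("billing", "shipping", "technical", "other")


@dataclass
class Result:
    answer: Optional[str]  # accepted category, or None if the output was rejected
    needs_review: bool     # True when a human should vet the output
    raw_output: str        # exactly what the model returned


def classify_ticket(ticket_text: str, model: Callable[[str], str]) -> Result:
    """Fill the template, call the model, and accept only in-bounds answers."""
    prompt = TICKET_TEMPLATE.substitute(
        categories=", ".join(ALLOWED_CATEGORIES),
        ticket_text=ticket_text,
    )
    raw = model(prompt)
    answer = raw.strip().lower()
    if answer in ALLOWED_CATEGORIES:
        return Result(answer=answer, needs_review=False, raw_output=raw)
    # Anything outside the allowed set is routed to a person, not passed along.
    return Result(answer=None, needs_review=True, raw_output=raw)


if __name__ == "__main__":
    # Stub standing in for a real GenAI call (assumption for this sketch).
    print(classify_ticket("I was charged twice.", lambda prompt: " Billing\n"))
```

Note that an in-bounds answer can still be wrong, so a check like this complements periodic human spot checks of accepted results rather than replacing them.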
Technologists are working on algorithms to eliminate GenAI’s tendency to hallucinate, but their ultimate success remains in question.
“In the short to medium term, I think it is unlikely that hallucination will be eliminated. It is, I think, to some extent intrinsic to the way that LLMs function,” Arvind Narayanan, a professor of computer science at Princeton University, told Time correspondent Billy Perrigo. “There’s always going to be a boundary between what people want to use them for, and what they can work reliably at. […] That is as much a sociological problem as it is a technical problem. And I don’t think it has a clean technical solution.”
Prepare for enlightenment
On the other side of Gartner’s trough of disillusionment lies the slope of enlightenment, phase four of the Hype Cycle, where “early adopters see initial benefits and others start to understand how to adapt the innovation to their organizations.” Taking a closer look at your GenAI efforts will give everyone in the enterprise a reality check and an opportunity to course correct. It will also move you one step closer to GenAI enlightenment.