Opinion

Generative AI: The Bloom Is Off the Rose

Artificial Intelligence is not living up to the hype. Hallucinations are an intractable part of the problem.

By Girish Mhatre

What’s at stake:
Funny things happened on the speedway to the promised land: Vendors of the most hyped technology since the internet are easing up on the gas pedal; cloud providers are quietly ratcheting down customer expectations on what AI can deliver and at what cost.

Also, according to reporting by The Information, “Several executives, product managers and salespeople at the major cloud providers, such as Microsoft, Amazon Web Services and Google, privately said most of their customers are being cautious or ‘deliberate’ about increasing spending on new AI services, given the high price of running the software, its shortcomings in terms of accuracy and the difficulty of determining how much value they’ll get out of it.”

Even OpenAI CEO Sam Altman appears to be downplaying GPT-4, his company’s latest and greatest AI model. “I think it kind of sucks,” Altman said, when asked about GPT-4 and its most impressive capabilities. 

What gives? Isn’t generative AI supposed to be the best thing since sliced bread? In just the past year, the promise of AI’s metaphysical power to change society in fundamental ways has stratospherically inflated big tech valuations. In a frenzy reminiscent of the dot-com era, every tech company, big and small, is jumping on the bandwagon, scrambling for a piece of the action.

Much of it is transparently a sham; the carpetbaggers are practicing the art of “AI washing.” They’re simply tarting up ordinary tech with the usual buzzwords – “machine learning”, “neural networks”, “deep learning”, “natural language processing” – to convey a sheen of innovation.

AI washing may work to pump up valuations for a while. But where the rubber meets the road, it’s turning out to be a different story. Early corporate users aren’t buying it. They simply don’t trust AI. They’re not convinced of the benefits.

Two recently released studies bear that out.

PagerDuty’s survey of 100 Fortune 1000 IT leaders reveals that 100 percent of respondents are concerned about the risks of the technology and that 98 percent have paused Gen-AI projects as a result.

Elastic’s study is more optimistic but reports the same pattern: nearly all respondents said adoption has slowed, with security fears, data privacy, regulatory issues, and a lack of implementation skills the reasons most often cited.

A report from Sequoia Capital, a blue-chip Silicon Valley venture capital firm, is even more worrisome: “…a whisper began to spread within Silicon Valley that generative AI was not actually useful. The products were falling far short of expectations, as evidenced by terrible user retention. End user demand began to plateau for many applications. Was this just another vaporware cycle?”

Sequoia estimates that the AI industry spent $50 billion on the Nvidia chips used to train advanced AI models last year but brought in only $3 billion in revenue. According to the firm’s report, “so far, few startups have been able to show how they might recoup the steep costs associated with developing generative AI products.”

The same goes for corporate customers, who are spending money in hopes of turbocharging business productivity; so far there is scant evidence that generative AI is lowering costs or raising top lines.

Suddenly everything is being questioned. What exactly is the market and how lucrative is it? And the all-important question: When will we see those long-promised returns from AI?

“In short, generative AI’s biggest problem is not finding use cases or demand or distribution, it is proving value. As our colleague David Cahn writes, ‘the $200B question is: What are you going to use all this infrastructure to do? How is it going to change people’s lives?’ The path to building enduring businesses will require fixing the retention problem and generating deep enough value for customers that they stick and become daily active users,” concludes Sequoia.

Not surprisingly, the faithful are undaunted. They’ve seen it before. The history of new technology absorption, modeled in the Gartner Hype Cycle, predicts that a “trough of disillusionment” inevitably follows a “peak of inflated expectations” before a technology takes off. AI is simply running true to form.

Still, something seems rotten in the state of Denmark; there’s a fundamental problem.

That lack of trust expressed by corporate users stems from the fact that generative AI is unpredictable. It often goes completely off the rails, coming up with outputs that can range from wrong to bizarre. Worse, it can be deliberately manipulated to go rogue.

The tendency of generative AI to “hallucinate” – to randomly spout far-out, often outrageous nonsense – doesn’t surface often, and sometimes a hallucination is innocuous: little harm is done if a large language model (LLM, the text-generating subset of gen-AI models) botches a high school essay. But a hallucinated medical diagnosis could be fatal. “The risk of hallucinations and other unpredictable behavior has become one of the primary impediments to their broader use, particularly in customer-facing scenarios,” according to some experts.

Behind the scenes, developers are furiously attempting to stem the rot. No one has yet found a way to stop Gen-AI from hallucinating, but there are ways to constrain it. One is to “fine-tune” models by training them with domain-specific data, thus imbuing them with appropriate, specialized knowledge in particular fields: medical, legal, HR and such.
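To make the fine-tuning idea concrete, here is a minimal sketch in PyTorch. A toy next-token model stands in for a pretrained base: its embeddings are frozen and only the small output head is trained on a domain example, a freeze-the-base, adapt-a-slice pattern common in practice. The model, vocabulary, and data are all hypothetical, not any vendor’s actual pipeline.

```python
# A toy illustration of fine-tuning: freeze the "base" of a model and
# continue training a small part of it on domain-specific data.
# Everything here (model, vocabulary, data) is hypothetical.
import torch
import torch.nn as nn

vocab = {"<pad>": 0, "patient": 1, "presents": 2, "with": 3, "fever": 4}

class ToyLM(nn.Module):
    def __init__(self, vocab_size: int = 5, dim: int = 16):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)  # stands in for the pretrained base
        self.head = nn.Linear(dim, vocab_size)    # small output layer we will adapt

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        return self.head(self.emb(tokens))        # (batch, seq, vocab) logits

model = ToyLM()
for p in model.emb.parameters():
    p.requires_grad = False                       # freeze the base weights

opt = torch.optim.Adam(model.head.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# One domain-specific example: after "patient presents with", predict "fever".
inputs = torch.tensor([[vocab["patient"], vocab["presents"], vocab["with"]]])
target = torch.tensor([[vocab["presents"], vocab["with"], vocab["fever"]]])

for _ in range(100):
    logits = model(inputs)
    loss = loss_fn(logits.view(-1, len(vocab)), target.view(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()

print(model(inputs).argmax(-1))  # last predicted token should now be 4 ("fever")
```

The result is a static specialization: the adapted weights bake the domain knowledge in, which is why updating that knowledge later means training again.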

Another is RAG (retrieval-augmented generation), which adds a retrieval step to the AI’s generative function. The answer to a user query is first “retrieved” from an external data source; then both the query and the retrieved material are fed into the generative function to help keep it from going too far off course. RAG allows developers to add the latest information (from today’s New York Times, say) to the generative model without retraining, whereas fine-tuning requires fresh training and produces static, specialized models.
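The retrieve-then-generate flow is simple enough to sketch in a few lines of Python. In this sketch, embed() and generate() are hypothetical stand-ins for a real embedding model and a real LLM API; only the shape of the pipeline (retrieve relevant passages, then prepend them to the prompt) mirrors the description above.

```python
# A minimal sketch of the RAG pattern: retrieve relevant passages from an
# external source, then feed query + passages to the generator together.
# embed() and generate() are hypothetical stand-ins for real model APIs.
import math

def embed(text: str) -> list[float]:
    # Crude letter-count "embedding" that keeps the sketch self-contained;
    # a real system would call an embedding model here.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Rank the external documents by similarity to the query; keep the top k.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def generate(prompt: str) -> str:
    # Hypothetical LLM call; a real system would hit a chat-completion API.
    return f"[answer grounded in a {len(prompt)}-character prompt]"

docs = [
    "Fine-tuning bakes domain data into a model's weights.",
    "RAG retrieves external passages and feeds them to the generator.",
    "Today's headlines can be added to the source documents at any time.",
]
query = "How does RAG keep a model on course?"
context = "\n".join(retrieve(query, docs))
# The retrieved text anchors the generator to external, up-to-date sources
# instead of leaving it to rely on its (static) training data alone.
print(generate(f"Context:\n{context}\n\nQuestion: {query}"))
```

Because the documents live outside the model, they can be swapped or updated at any time without retraining; that is the practical difference from fine-tuning described above.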

But these are mere Band-Aids, not cures. The problem is that neural nets are black boxes. We designed them, and we know how they are supposed to work, but at any given moment we don’t know what’s going on inside; we can’t follow the computation. They work, and we know why they work, but we don’t know exactly how they arrive at any specific output.

Worse, it’s quite possible that we will never solve the hallucinations problem; it may be inherent in how Gen-AI works.

In a recent paper, “Hallucination is Inevitable: An Innate Limitation of Large Language Models,” researchers at the National University of Singapore come to a stark conclusion: “We present a fundamental result that hallucination is inevitable for any computable LLM, regardless of model architecture, learning algorithms, prompting techniques, or training data.”

Tech optimists are adamant that the hallucinations problem will be solved in the next 12 to 18 months. But if we can’t look inside the black box, we can’t figure out what’s going wrong. We can’t fix this without a fundamental rethink.

The obvious rethink is to admit that there are upper bounds to Gen-AI’s capabilities. That implies limits on uses; without human oversight Gen-AI cannot be used automatically in any safety-critical decision-making. There is, therefore, a crying need to impose regulations on its use to avert catastrophic societal risk.

The rethink that social activist and author Naomi Klein suggests is to recognize that we’ve been trying to solve the wrong problem.

“Warped hallucinations are indeed afoot in the world of AI, however – but it’s not the bots that are having them; it’s the tech CEOs who unleashed them, along with a phalanx of their fans, who are in the grips of wild hallucinations, both individually and collectively … Generative AI will end poverty, they tell us. It will cure all disease. It will solve climate change. It will make our jobs more meaningful and exciting. It will unleash lives of leisure and contemplation, helping us reclaim the humanity we have lost to late capitalist mechanization. It will end loneliness. It will make our governments rational and responsive. These, I fear, are the real AI hallucinations and we have all been hearing them on a loop ever since ChatGPT launched at the end of last year.

“There is a world in which generative AI, as a powerful predictive research tool and a performer of tedious tasks, could indeed be marshalled to benefit humanity, other species and our shared home. But for that to happen, these technologies would need to be deployed inside a vastly different economic and social order than our own, one that had as its purpose the meeting of human needs and the protection of the planetary systems that support all life.”

Bottom line:
Let’s be leery of pipe dreams. The utopia of tech bro fantasies remains out of reach.

Girish Mhatre is the former editor-in-chief and publisher of EE Times. The views expressed in this article are those of the author alone and do not necessarily represent the views of the Ojo-Yoshida Report.
