Welcome to AI Decoded, Fast Company’s weekly LinkedIn newsletter that breaks down the most important news in the world of AI. I’m Mark Sullivan, a senior writer at Fast Company, covering emerging tech, AI, and tech policy.

This week, I’m looking at the growing competitive pressures facing OpenAI. The pace of AI R&D at competitors like Apple, Google, and Meta appears to be picking up. Also affecting all of these players is an increasingly worrisome shortage of the computing power needed to run large AI models.

If a friend or colleague shared this newsletter with you, you can sign up to receive it every week here. And if you have comments on this issue and/or ideas for future ones, drop me a line at sullivan@fastcompany.com, and follow me on Twitter @thesullivan.

OpenAI’s worrisome week 

We’re in the eighth month of the generative AI arms race, a flurry of activity that was kicked off by the release of OpenAI’s ChatGPT in late 2022. While other big tech players like Meta and Google had for years been working on the large language models that power new AI chatbots, OpenAI was the first to understand that chatbots could achieve almost humanlike communication skills when given huge amounts of training data (scraped from the web) and massive computing power. OpenAI showed those results not by publishing research papers, but by letting people see for themselves with ChatGPT, and the effect was thunderous. OpenAI’s models came to be regarded as the state of the art, and rightly so.

But after eight months in the limelight, OpenAI doesn’t look nearly as unassailable as it did just three months ago. A growing number of developers who rely on OpenAI’s models have in recent weeks observed a decrease in the speed and accuracy of the models’ output (OpenAI has denied that performance is degrading). This is almost certainly related to a dearth of available computing power for running the company’s models. OpenAI, whose models run on Microsoft Azure servers, no longer enjoys the access to computing power that gave it its initial lead in the LLM race.

A well-placed source tells me that Microsoft executives (including Satya Nadella) now meet weekly with OpenAI to manage the server resources allocated to running OpenAI’s LLMs. OpenAI is likely asking Microsoft for more GPU power, while Microsoft is no doubt asking OpenAI to find ways to economize. Microsoft is so concerned about the compute shortage that it has begun signing deals with smaller cloud startups to access more servers suited to AI. Meanwhile, Meta, Google, and Apple have just as much money, as well as their own chip designs, for AI work.

And that’s not the only problem now rearing up against OpenAI. Meta just released a new open-source (free and available) LLM called Llama 2 that may rival OpenAI’s GPT-4 model. Apple is also now reportedly developing its own ChatGPT rival in hopes of catching up to OpenAI. (More on that below.) More significantly, both Google and Meta have figured out how to let LLMs take in and output both images and words; OpenAI said its latest model, GPT-4, would be multimodal, but so far its currency is words and computer code.

To top things off, the FTC has become very curious about OpenAI’s model development practices and business model. The agency, headed by Lina Khan, sent OpenAI a letter last week containing 20 pages’ worth of questions.

Apple has quietly got its GPT game on

As we predicted back in March, Apple has been pulled into the AI arms race, along with all the other FAANG companies.

Bloomberg’s Mark Gurman reported Wednesday that Apple has been quietly developing its own generative AI models, and may try to compete with OpenAI and Google in the chatbot wars. Gurman’s (unnamed) sources say Apple management is still deciding how the company might publicly release the technology, which has a tendency to invent facts and, at times, invade privacy.

Apple has reportedly developed a new framework, referred to internally as “Ajax,” to develop LLMs. Gurman reports that the LLM chatbot project (referred to by some in the company as “Apple GPT”) has become “a major effort” within Apple, already involving collaboration between several teams.

Apple’s stock bounced up by 2.3% (that’s a quick gain of $60 billion in market cap) after the Bloomberg story appeared, serving as a reminder to all that big tech companies are beholden to the beliefs and whims of investors—and the investment community is all in on generative AI.

Why Google adding images to Bard is important

Some have likened the development of large language models to the development of human babies. Babies learn a lot about the world on their own (through their senses) and through their parents. The major difference, of course, is that while babies have all five senses to absorb information, LLMs have only words taken from the internet through which to learn about the world, along with some human feedback on their output.

That’s why Google’s announcement last week that it has given its Bard LLM image support is important. Bard is certainly not the most performant LLM out there—in fact, it lags its peers in a number of ways—but it’s become the first publicly available LLM chatbot with the gift of sight, if you will. Users can now input an image as a prompt, and Bard can analyze the image and provide more information about it. Imagine taking a photo of your lunch and asking Bard for an ingredient and caloric breakdown (hat tip: @BenBajarin). Or turning your handwritten meeting notes into organized text. Yes, it makes mistakes, glaring ones, but further training refinements may correct those.

Today it’s just still images, but tomorrow, Bard might be able to process real-time, full-motion video. It may be able to continually learn from an array of photo sensors placed within a nature preserve, for example. Or it might learn by digesting the entire corpus of YouTube videos that Google owns. If you were an extraterrestrial dropped to Earth, you’d find few faster ways of learning about your new environment.

Meta has also been putting its research muscle behind multimodal AI. Last week the company announced a new text-to-image model called CM3leon that generates high-quality images from text prompts, as well as writes captions for existing images. So what of OpenAI? The company may have hesitated to allow its GPT-4 model to process images, The New York Times reports, because it’s afraid the model might recognize the faces of real people and say things about them.
