From the article:
That Google memo about having “no moat” in AI was real — and Google’s AI boss disagrees with it
Just a couple of months ago, a leaked memo said to be from a Google researcher cast doubt on the company’s future in AI, stating that it has “no moat” in the industry — and now, we seemingly have confirmation that it was real. In an interview with Decoder, Demis Hassabis, the CEO of Google’s DeepMind, told The Verge that although he believes the memo was legitimate, he disagrees with its conclusions.
“I think that memo was real. I think engineers at Google often write various documents, and sometimes they get leaked and go viral,” Hassabis said. “I think it’s interesting to listen to them, and then you’ve got to chart your own course. And I haven’t read that specific memo in detail, but I disagree with the conclusions from that.”
The memo, which was obtained by SemiAnalysis from a public Discord server, says that neither Google nor OpenAI have what they need to succeed in the AI industry. Instead, the researcher claims “a third faction has been quietly eating our lunch”: open-source AI models that the researcher says are “faster, more customizable, more private, and pound-for-pound more capable.”
But Hassabis is less pessimistic about Google’s future in the AI industry. He believes that the competitive nature of the company’s researchers will help push Google to the forefront of AI, adding that the newly merged Google Brain and Google DeepMind teams, which Hassabis was asked to lead, will likely result in more breakthroughs.
“Look at the history of what Google and DeepMind have done in terms of coming up with new innovations and breakthroughs,” Hassabis said. “I would bet on us, and I’m certainly very confident that that will continue and actually be even more true over the next decade in terms of us producing the next key breakthroughs just like we did in the past.”
And I disagree with it too, but not because of how good the models are in technical terms; the corporate juggernauts are only just ahead of OSS on that front. The moat is server space and the money to acquire it.
An average internet user will not install the Vicunas and the Pygmalions and the LLaMAs of the LLM space. Why?
For one, the field is too complicated to get into, but, more importantly, a lot of people can’t run these models at all.
Even the lowest complexity models require a PC + graphics card with a fairly beefy amount of VRAM (6GB at bare minimum), and the ones that can go toe-to-toe with ChatGPT are barely runnable on even the most monstrous of cards. No one is gonna shell out 1500 bucks for the 4090 just so they can run Vicuna-30B.
They are gonna use online, free-to-use, no BS, no technical jargon LLM services. All the major companies know that.
ChatGPT and services like it have set the expectation: “just type it in, get an amazing response in seconds, no matter where”.
OSS can’t beat that, at least not right now. And until it can, the 99% will be in Silicon Valley’s jaws.
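For a rough sense of the VRAM numbers above, here is a back-of-the-envelope sketch. It assumes memory is dominated by the weights (parameter count times bytes per parameter) and ignores activations, context cache, and runtime overhead, so real usage will be higher. The function name and the choice of bit widths are just for illustration.

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate memory for the model weights only, in GB (1 GB = 1e9 bytes).

    Assumption: ignores activations, context cache, and framework overhead.
    """
    bytes_total = params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 1e9


if __name__ == "__main__":
    # Compare a 30B-parameter model at common precisions.
    for bits in (16, 8, 4):
        print(f"30B params at {bits}-bit: ~{weight_memory_gb(30, bits):.0f} GB")
```

By this estimate a 30B model needs roughly 60 GB at 16-bit and about 15 GB at 4-bit, which is why it only fits on a 24 GB card like the 4090 once it has been heavily quantized.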
Have you looked into AIHorde?
It’s clearly harder to use than the commercial alternatives, but at first glance it doesn’t seem too bad.
It looks about as complicated as setting up any of the other volunteer compute projects (like SETI@home).
I didn’t know about it, but it looks really neat. Gonna give it a spin to help me summarize documentation.
Edit: I meant to reply to the article, sorry 😅
Given the pace of OSS optimisation, I fully expect the requirements for a model with GPT-3.5-equivalent performance to be much lower in the coming year. The biggest issues right now are around training and fine-tuning. Inference is cheaper, resource-wise. For truly large models, the moat is most definitely GPU compute and power constraints. Those who own their own GPU farms will be at an advantage until there is a significant increase in cloud GPU capacity. Right now, cloud GPU time is at a premium and can also involve a wait for access. I don’t expect this to change in the next year or two.
TL;DR: the moat is real, but it’s GPU and power constraints.
I hope to god you are right. What will truly be a revolution is if somehow these models can be transitioned to CPU-bound rather than GPU without completely tanking performance. Then we can start talking about running it on phones and laptops.
But I don’t know how much more you can squeeze out of the LLM stone. I’m surprised that what we got was essentially a brute-forcing of concepts with massive catalogs of data, rather than something more hand-crafted or built from scratch. Maybe there is another way to go about it? God I hope so, so OSS can use it before the big guys convince governments to drop the hammer.
I can see most individuals and SMBs going with specialist “good enough” models which they can run on-prem or locally, leaving the truly huge systems to those with compute to spare. The security model for these MAAS systems is pretty much “trust me bro”. A lot of companies will not want to, or be able to, trust such a system. PI/CID cannot be left in the hands of an AI-as-a-service company. They will have to either go on-prem or stand up their own models in their private cloud. Again, this limits model size, available compute, and so on for orgs, which points towards using available, optimised models. OSS FTW (I hope)