Soundwave said:
sc94597 said:

A hawk is good at seeing and reacting to the world. That doesn't mean it is a good surgeon. Its brain is highly specialized for hunting.

Likewise, machine learning models have architectures. An architecture designed for learning how to drive isn't necessarily one that is good at surgery. And sure, you might be able to design an AI that does both, but it will almost certainly be worse at either than an AI trained on driving or surgery alone (given the same architecture).

Heck, even within the same architecture we see the advantages of specialization vs. generalization. The LLMs used in industry and fine-tuned on domain-specific data surpass GPT-4 at tasks related to that specific data. You can even train LLMs that surpass GPT-4 at specific tasks (say, coding) without surpassing it generally.
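
To make the fine-tuning claim concrete, here is a minimal sketch using the Hugging Face transformers Trainer. The base model ("gpt2") and the corpus file are placeholder assumptions for illustration, not a description of how any particular industry model was actually trained:

```python
# Minimal fine-tuning sketch with Hugging Face transformers.
# "gpt2" and "domain_corpus.txt" are placeholder assumptions.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

base = "gpt2"  # stand-in for whatever base model is being specialized
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token
model = AutoModelForCausalLM.from_pretrained(base)

# Hypothetical domain corpus: a plain-text file of, say, legal filings.
data = load_dataset("text", data_files={"train": "domain_corpus.txt"})
tokenized = data.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="domain-model",
                           num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # the result trades broad generality for domain skill
```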

And of course this all makes sense. Learning one thing has an opportunity cost: while you spend time learning it, you aren't learning everything else, but you do learn that one thing very well. And then there is the issue of information asymmetry and the knowledge problems that arise in centralized systems.

Being superhuman ≠ being unconstrained or having no opportunity costs.

A hawk has no incentive to want to be a surgeon though, lol. The owner of an AI model is highly incentivized to expand it into as many things as possible, and we already see this today. Was text prompting and static image generation enough for OpenAI and ChatGPT? No, they weren't content with just that: a year later they're moving into video with Sora, not content to stick to image prompts. This time next year they'll probably be touting something else it can do.

And even within the Sora presentation, they showed off procedurally generated video games, like a Minecraft demo. So obviously they didn't get the memo that their AI models are supposed to specialize in one thing and one thing only.

What do you think is going to be more popular: an AI that can only spit out images, or one that does several different things very well? It's not terribly difficult to predict which one will rapidly attract more attention and more market share (and thus funding, creating a snowball effect). Google isn't the No. 1 search engine because it searches one type of thing better than everyone else; it's No. 1 because, on average, it searches everything better than any competing engine, and it has held that lead in internet search for years now.

The thing limiting the hawk from becoming a surgeon isn't its lack of incentive, lol. 

The owner of the AI model would be even more incentivized to create a sub-model that performs the task better at a lower cost (training time and resources).

Sora isn't doing anything special when it generates video games. A video game is essentially a video with controllable inputs, after all.
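
As a toy illustration of that framing, here is a sketch of next-frame prediction conditioned on a controller input. The tiny model and the action encoding are invented for illustration; Sora's internals aren't public beyond OpenAI's technical report:

```python
# Toy next-frame predictor conditioned on a controller action.
# The architecture and action encoding are invented for illustration.
import torch
import torch.nn as nn

class ActionConditionedPredictor(nn.Module):
    def __init__(self, n_actions: int = 4):
        super().__init__()
        self.frame_enc = nn.Conv2d(3, 8, 3, padding=1)  # encode current frame
        self.action_emb = nn.Embedding(n_actions, 8)    # encode controller input
        self.decode = nn.Conv2d(8, 3, 3, padding=1)     # predict next frame

    def forward(self, frame, action):
        h = self.frame_enc(frame)
        h = h + self.action_emb(action).view(-1, 8, 1, 1)  # inject the control signal
        return self.decode(h)  # next frame, conditioned on the input

model = ActionConditionedPredictor()
frame = torch.zeros(1, 3, 64, 64)             # current frame
next_frame = model(frame, torch.tensor([2]))  # e.g. action 2 = "move right"
```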

You wouldn't use Sora to generate text, for example. It is an entirely different category of model from an LLM (a diffusion model that uses CNNs to noise/denoise vs. a transformer-based text-to-text generator trained with RLHF).
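
To show how mechanically different the two categories are, here is a toy version of the diffusion side: the forward noising step and a single denoising prediction by a small CNN. The shapes and schedule are made up for the sketch, and a real denoiser (e.g. a U-Net) is far larger and also conditions on the timestep:

```python
# Toy DDPM-style forward noising plus one denoising prediction.
# Shapes and schedule are made up; real denoisers also condition on t.
import torch
import torch.nn as nn
import torch.nn.functional as F

T = 1000
betas = torch.linspace(1e-4, 0.02, T)           # noise schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)  # cumulative signal kept

def add_noise(x0, t):
    """q(x_t | x_0): mix the clean image with Gaussian noise at step t."""
    eps = torch.randn_like(x0)
    a = alphas_bar[t].sqrt().view(-1, 1, 1, 1)
    b = (1.0 - alphas_bar[t]).sqrt().view(-1, 1, 1, 1)
    return a * x0 + b * eps, eps

# Trivially small CNN standing in for the denoiser mentioned above.
denoiser = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 3, 3, padding=1),
)

x0 = torch.randn(1, 3, 32, 32)                # pretend "clean image"
xt, eps = add_noise(x0, torch.tensor([500]))  # corrupt it at step 500
pred_eps = denoiser(xt)                       # trained to predict eps
loss = F.mse_loss(pred_eps, eps)              # the usual training loss
```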

That is what I mean when I say architecture matters. For example, when you ask ChatGPT to produce an image, what it really does is call a diffusion model (DALL-E). You don't train GPT to produce images; ChatGPT feeds a text prompt into DALL-E, DALL-E produces the image, and ChatGPT returns the image to you. Why? Because rather than training one model to do everything, we get better results by training multiple models that are good at different things and then querying them.
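
Here is a minimal sketch of that routing pattern; both functions are hypothetical stand-ins, not OpenAI's actual interface:

```python
# Sketch of the routing pattern: the text model writes a prompt, a
# separate image model renders it. Both functions are hypothetical
# stand-ins, not OpenAI's actual interface.
from dataclasses import dataclass

@dataclass
class Image:
    data: bytes

def text_model(user_message: str) -> str:
    # Stand-in for the LLM: rewrite the request into an image prompt.
    return f"high-detail render of: {user_message}"

def image_model(prompt: str) -> Image:
    # Stand-in for the diffusion model (the DALL-E role described above).
    return Image(data=prompt.encode())

def chat_with_images(user_message: str) -> Image:
    """Route: the LLM crafts the prompt, the image model does the painting."""
    return image_model(text_model(user_message))

print(chat_with_images("a hawk performing surgery"))
```

Each model stays inside the architecture it's suited to; the composite system only looks general from the outside.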

These are basic things that anyone talking about the topic should know.

Last edited by sc94597 - on 04 March 2024