Elon Musk’s new AI bot, Grok, causes stir by citing OpenAI usage policy


Grok, the AI language model created by Elon Musk's xAI, went into wide release last week, and people have begun spotting glitches. On Friday, security tester Jax Winterbourne tweeted a screenshot of Grok denying a query with the statement, "I'm afraid I cannot fulfill that request, as it goes against OpenAI's use case policy." That made ears perk up online, since Grok is not made by OpenAI, the company responsible for ChatGPT, which Grok is positioned to compete with.

Interestingly, xAI representatives did not deny that this behavior occurs with its AI model. In reply, xAI employee Igor Babuschkin wrote, "The issue here is that the web is full of ChatGPT outputs, so we accidentally picked up some of them when we trained Grok on a large amount of web data. This was a huge surprise to us when we first noticed it. For what it's worth, the issue is very rare and now that we're aware of it we'll make sure that future versions of Grok don't have this problem. Don't worry, no OpenAI code was used to make Grok."

In reply to Babuschkin, Winterbourne wrote, "Thanks for the response. I will say it's not very rare, and occurs quite frequently when involving code creation. Nonetheless, I'll let people who specialize in LLM and AI weigh in on this further. I'm merely an observer."

A screenshot of Jax Winterbourne's X post about Grok talking like it's an OpenAI product.

Jason Winterbourne

However, Babuschkin's explanation seems unlikely to some experts, because large language models typically do not spit out their training data verbatim, which is what might be expected if Grok had merely picked up some stray mentions of OpenAI policies here or there on the web. Instead, the concept of denying a request based on OpenAI policies would probably need to be trained into it specifically. And there's a very plausible reason why this might have happened: Grok was fine-tuned on output data from OpenAI language models.

"I'm a bit suspicious of the claim that Grok picked this up just because the Internet is full of ChatGPT content," said AI researcher Simon Willison in an interview with Ars Technica. "I've seen plenty of open weights models on Hugging Face that exhibit the same behavior (behave as if they were ChatGPT), but inevitably, these have been fine-tuned on datasets that were generated using the OpenAI APIs, or scraped from ChatGPT itself. I think it's more likely that Grok was instruction-tuned on datasets that included ChatGPT output than it was a complete accident based on web data."

As large language models (LLMs) from OpenAI have become more capable, it has become increasingly common for AI projects (especially open source ones) to fine-tune an AI model's output using synthetic data, meaning training data generated by other language models. Fine-tuning adjusts the behavior of an AI model toward a specific purpose, such as getting better at coding, after an initial training run. For example, in March, a group of researchers from Stanford University made waves with Alpaca, a version of Meta's LLaMA 7B model that was fine-tuned for instruction-following using outputs from OpenAI's GPT-3 model called text-davinci-003.

On the web, you can easily find several open source datasets collected by researchers from ChatGPT outputs, and it's possible that xAI used one of these to fine-tune Grok for some specific purpose, such as improving instruction-following ability. The practice is so common that there's even a WikiHow article titled, "How to Use ChatGPT to Create a Dataset."
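To make the idea concrete, here is a minimal sketch of what packaging model outputs into an instruction-tuning dataset can look like. The field names follow the format popularized by Stanford's Alpaca release; the function name and example strings are invented for illustration, not taken from any actual xAI or OpenAI pipeline.

```python
import json

def make_instruction_record(instruction, model_output, input_text=""):
    """Bundle a prompt and a model-generated answer into one
    Alpaca-style fine-tuning example."""
    return {
        "instruction": instruction,
        "input": input_text,
        "output": model_output,
    }

# A tiny synthetic dataset: each "output" would normally come from
# calling another model's API, such as ChatGPT.
records = [
    make_instruction_record(
        "Explain what fine-tuning does.",
        "Fine-tuning adjusts a pretrained model toward a specific task.",
    )
]

# Datasets like these are typically serialized as JSON files that
# training scripts then consume.
dataset_json = json.dumps(records, indent=2)
print(dataset_json)
```

A fine-tuned model trained on enough records like these will tend to imitate the source model's phrasing, which is exactly how ChatGPT-style refusals can leak into a different model.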

It's one of the ways AI tools can be used to build more complex AI tools in the future, much like how people began using microcomputers to design more complex microprocessors than pen-and-paper drafting would allow. In the future, though, xAI might be able to avoid this kind of scenario by more carefully filtering its training data.
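One simple form such filtering could take is scanning training documents for telltale ChatGPT phrases and dropping the matches. The phrase list and function names below are hypothetical illustrations, not xAI's actual pipeline, and real contamination filtering is considerably more involved.

```python
# Telltale strings that suggest a document is ChatGPT output
# (an invented, deliberately short list for illustration).
TELLTALE_PHRASES = [
    "as an ai language model",
    "openai's use case policy",
    "i cannot fulfill that request",
]

def looks_like_chatgpt_output(text: str) -> bool:
    """Return True if the text contains an obvious ChatGPT artifact."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in TELLTALE_PHRASES)

def filter_training_docs(docs):
    """Keep only documents with no obvious ChatGPT artifacts."""
    return [d for d in docs if not looks_like_chatgpt_output(d)]

docs = [
    "Grok is a chatbot from xAI.",
    "I cannot fulfill that request, as it goes against "
    "OpenAI's use case policy.",
]
print(filter_training_docs(docs))  # only the first document survives
```

A keyword filter like this only catches verbatim artifacts; it would do nothing about a model that was deliberately instruction-tuned on ChatGPT outputs, which is the scenario Willison describes above.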

Even though borrowing outputs from others might be common in the machine-learning community (despite it usually being against terms of service), the episode notably fanned the flames of the rivalry between OpenAI and X that extends back to Elon Musk's criticism of OpenAI in the past. As news spread of Grok possibly borrowing from OpenAI, the official ChatGPT account wrote, "we have a lot in common" and quoted Winterbourne's X post. As a comeback, Musk wrote, "Well, son, since you scraped all the data from this platform for your training, you ought to know."


