
Mysterious “gpt2-chatbot” AI model appears suddenly, confuses experts

April 30, 2024

On Sunday, word began to spread on social media about a new mystery chatbot named “gpt2-chatbot” that appeared in the LMSYS Chatbot Arena. Some people speculate that it may be a secret test version of OpenAI’s upcoming GPT-4.5 or GPT-5 large language model (LLM). The paid version of ChatGPT is currently powered by GPT-4 Turbo.

Currently, the new model is only available for use through the Chatbot Arena website, although in a limited way. In the site’s “side-by-side” arena mode, where users can purposely select the model, gpt2-chatbot has a rate limit of eight queries per day, dramatically limiting people’s ability to test it in detail.

So far, gpt2-chatbot has inspired plenty of rumors online, including that it could be the stealth launch of a test version of GPT-4.5 or even GPT-5, or perhaps a new version of 2019’s GPT-2 that has been trained using new techniques. We reached out to OpenAI for comment but did not receive a response by press time. On Monday evening, OpenAI CEO Sam Altman seemingly dropped a hint by tweeting, “i do have a soft spot for gpt2.”

A screenshot of the LMSYS Chatbot Arena “side-by-side” page showing “gpt2-chatbot” listed among the models for testing. (Red highlight added by Ars Technica.)

Benj Edwards

Early reports of the model first appeared on 4chan, then spread to social media platforms like X, with hype following not far behind. “Not only does it seem to show incredible reasoning, but it also gets notoriously challenging AI questions right with a much more impressive tone,” wrote AI developer Pietro Schirano on X. Soon, threads on Reddit popped up claiming that the new model had amazing abilities that beat every other LLM on the Arena.

Intrigued by the rumors, we decided to try out the new model for ourselves but did not come away impressed. When asked about “Benj Edwards,” the model made several errors and used some awkward language compared to GPT-4 Turbo’s output. A request for five original dad jokes fell short. And gpt2-chatbot did not decisively pass our “magenta” test. (“Would the color be called ‘magenta’ if the town of Magenta didn’t exist?”)

A gpt2-chatbot result for “Who’s Benj Edwards?” on LMSYS Chatbot Arena. Errors and oddities highlighted in red.

Benj Edwards
A gpt2-chatbot result for “Write 5 original dad jokes” on LMSYS Chatbot Arena.

Benj Edwards
A gpt2-chatbot result for “Would the color be called ‘magenta’ if the town of Magenta didn’t exist?” on LMSYS Chatbot Arena.

Benj Edwards

So, whatever it is, it’s probably not GPT-5. We’ve seen other people reach the same conclusion after further testing, saying that the new mystery chatbot doesn’t seem to represent a large capability leap beyond GPT-4. “Gpt2-chatbot is good. really good,” wrote HyperWrite CEO Matt Shumer on X. “But if this is gpt-4.5, I’m disappointed.”

Still, OpenAI’s fingerprints seem to be all over the new bot. “I think it could be an OpenAI stealth preview of something,” AI researcher Simon Willison told Ars Technica. But what “gpt2” is exactly, he doesn’t know. Judging from the online speculation, no one other than its creator seems to know precisely what the model is, either.

Willison has uncovered the system prompt for the AI model, which claims it is based on GPT-4 and made by OpenAI. But as Willison noted in a tweet, that’s no guarantee of provenance because “the purpose of a system prompt is to influence the model to behave in certain ways, not to give it truthful information about itself.”
