OpenAI unveils easy voice assistant creation at 2024 developer event

0
26


Benj Edwards

On Monday, OpenAI kicked off its annual DevDay occasion in San Francisco, unveiling 4 main API updates for builders that combine the corporate’s AI fashions into their merchandise. Not like final yr’s single-location event that includes a keynote by CEO Sam Altman, DevDay 2024 is extra than simply in the future, adopting a world strategy with further occasions deliberate for London on October 30 and Singapore on November 21.

The San Francisco occasion, which was invitation-only and closed to press, featured on-stage speakers going by way of technical shows. Maybe essentially the most notable new API function is the Realtime API, now in public beta, which helps speech-to-speech conversations utilizing six preset voices and allows builders to construct options similar to ChatGPT’s Advanced Voice Mode (AVM) into their purposes.

OpenAI says that the Realtime API streamlines the method of making voice assistants. Beforehand, builders had to make use of a number of fashions for speech recognition, textual content processing, and text-to-speech conversion. Now, they will deal with the complete course of with a single API name.

The corporate plans so as to add audio enter and output capabilities to its Chat Completions API within the subsequent few weeks, permitting builders to enter textual content or audio and obtain responses in both format.

Two new choices for cheaper inference

OpenAI additionally introduced two options which will assist builders steadiness efficiency and value when making AI purposes. “Mannequin distillation” gives a method for builders to fine-tune (customise) smaller, cheaper fashions like GPT-4o mini utilizing outputs from extra superior fashions corresponding to GPT-4o and o1-preview. This doubtlessly permits builders to get extra related and correct outputs whereas operating the cheaper mannequin.

Additionally, OpenAI introduced “immediate caching,” a function much like one introduced by Anthropic for its Claude API in August. It hurries up inference (the AI mannequin producing outputs) by remembering often used prompts (enter tokens). Alongside the way in which, the function gives a 50 p.c low cost on enter tokens and sooner processing occasions by reusing not too long ago seen enter tokens.

And final however not least, the corporate expanded its fine-tuning capabilities to incorporate photos (what it calls “imaginative and prescient fine-tuning”), permitting builders to customise GPT-4o by feeding it each customized photos and textual content. Principally, builders can educate the multimodal model of GPT-4o to visually acknowledge sure issues. OpenAI says the brand new function opens up prospects for improved visible search performance, extra correct object detection for autonomous autos, and presumably enhanced medical picture evaluation.

The place’s the Sam Altman keynote?

OpenAI CEO Sam Altman speaks during the OpenAI DevDay event on November 6, 2023, in San Francisco.
Enlarge / OpenAI CEO Sam Altman speaks through the OpenAI DevDay occasion on November 6, 2023, in San Francisco.

Getty Photos

Not like final yr, DevDay is not being streamed dwell, although OpenAI plans to submit content material in a while its YouTube channel. The occasion’s programming consists of breakout periods, neighborhood spotlights, and demos. However the largest change since final yr is the dearth of a keynote look from the corporate’s CEO. This yr, the keynote was dealt with by the OpenAI product workforce.

On final yr’s inaugural DevDay, November 6, 2023, OpenAI CEO Sam Altman delivered a Steve Jobs-style live keynote to assembled builders, OpenAI workers, and the press. Throughout his presentation, Microsoft CEO Satya Nadella made a shock look, speaking up the partnership between the businesses.

Eleven days later, the OpenAI board fired Altman, triggering every week of turmoil that resulted in Altman’s return as CEO and a new board of directors. Simply after the firing, Kara Swisher relayed insider sources that stated Altman’s DevDay keynote and the introduction of the GPT store had been a precipitating issue within the firing (although not the key factor) as a result of some inside disagreements over the corporate’s extra consumer-like path for the reason that launch of ChatGPT.

With that historical past in thoughts—and the concentrate on builders above all else for this occasion—maybe the corporate determined it was finest to let Altman step away from the keynote and let OpenAI’s expertise grow to be the important thing focus of the occasion as an alternative of him. We’re purely speculating on that time, however OpenAI has definitely skilled its share of drama over the previous month, so it could have been a prudent determination.

Regardless of the dearth of a keynote, Altman is current at Dev Day San Francisco right now and is scheduled to do a closing “fireplace chat” on the finish (which has not but occurred as of this writing). Additionally, Altman made a statement about DevDay on X, noting that since final yr’s DevDay, OpenAI had seen some dramatic adjustments (actually):

From final devday to this one:

*98% lower in value per token from GPT-4 to 4o mini
*50x improve in token quantity throughout our methods
*glorious mannequin intelligence progress
*(and a little bit little bit of drama alongside the way in which)

In a follow-up tweet delivered in his trademark lowercase, Altman shared a forward-looking message that referenced the corporate’s quest for human-level AI, typically referred to as AGI: “excited to make much more progress from this devday to the following one,” he wrote. “the trail to agi has by no means felt extra clear.”



Source link