Google has announced the launch of the latest model in the Gemini family, as well as a new coding agent for developers called Jules.
Gemini comes in two model variants, with Flash balancing performance with speed and Pro optimizing for performance. The newest model, Gemini 2.0 Flash, is twice as fast as Gemini 1.5 Pro (first previewed in February 2024) while also achieving stronger performance.
Specifically, it shows improved multimodal, text, code, video, spatial understanding, and reasoning performance across several benchmarks.
Gemini 2.0 Flash will also feature new output modalities of text, images, and audio, whereas Gemini 1.5 Flash could only output text. Image and audio output is currently still listed as “coming soon” on the Gemini website, but Google says that rollout is expected next year.
Audio output is multilingual and can be spoken in eight different voices, with control over the language and accent. Image output lets users build on previous outputs to refine generated images exactly as envisioned. In a demo Google shared, a user takes advantage of this by asking Gemini to take a picture of a car and transform the image to make the car a convertible.
Gemini 2.0 Flash can also use tools, such as Google Search, and can call third-party functions. “Multiple searches can be run in parallel leading to improved information retrieval by finding more relevant facts from multiple sources simultaneously and combining them for accuracy,” Shrestha Basu Mallick, group product manager for the Gemini API, and Kathy Korevec, director of product for Google Labs, wrote in a blog post.
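As a rough illustration of the idea in that quote, the sketch below runs several searches concurrently and merges the results, dropping duplicate facts. It does not use the Gemini API or Google Search: `FAKE_INDEX`, `fake_search`, and `parallel_search` are hypothetical names invented for this example, and the in-memory index stands in for a real search backend.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "index" standing in for a real search backend.
FAKE_INDEX = {
    "gemini 2.0 flash speed": ["Flash is twice as fast as 1.5 Pro"],
    "gemini 2.0 flash output": [
        "Outputs text, images, and audio",
        "Flash is twice as fast as 1.5 Pro",  # duplicate across sources
    ],
}


def fake_search(query):
    """Hypothetical search tool: return the list of facts found for a query."""
    return FAKE_INDEX.get(query, [])


def parallel_search(queries):
    """Run all queries concurrently, then merge results without duplicates."""
    with ThreadPoolExecutor(max_workers=max(1, len(queries))) as pool:
        # pool.map returns results in the same order as the input queries.
        result_lists = list(pool.map(fake_search, queries))

    merged = []
    for facts in result_lists:
        for fact in facts:
            if fact not in merged:
                merged.append(fact)
    return merged


if __name__ == "__main__":
    for fact in parallel_search(["gemini 2.0 flash speed", "gemini 2.0 flash output"]):
        print(fact)
```

Threads suit this kind of work because each search mostly waits on I/O. A real tool-calling setup would presumably also need to handle timeouts and conflicting sources, which this sketch leaves out.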
Finally, the model can also take in streaming audio and video inputs, enabling the development of real-time, multimodal applications.
To help developers get started with Gemini 2.0 Flash, Google is releasing three starter app experiences in Google AI Studio for spatial understanding, video analysis, and Google Maps exploration.
Gemini 2.0 Flash is currently in an experimental state, with general availability expected early in 2025.
Introducing Jules, an AI-powered coding agent
The company also unveiled a new coding agent, Jules, that can handle Python and JavaScript coding tasks, such as fixing bugs.
Jules creates multi-step plans for addressing issues, can modify multiple files at once, and can prepare pull requests.
Jules is available to a select group of testers now and will be rolled out more broadly early next year.