A model and an agent, working in one conversation — Gemini's Interactions API, with a real app

I gave a little app one sentence: BBQ and live music in Austin this afternoon, I've got my dog, keep it under a hundred bucks.
It pulled real Austin spots onto a live map, then handed the whole plan to an agent in the same conversation, and the agent wrote me a finished itinerary and saved it to a file. A model and an agent, splitting the work, connected by a single ID.
This week Google made the Interactions API generally available and called it the primary way to build with Gemini now, for both models and agents. If you build on Gemini, this changes how you do it. Here's the short version.
TL;DR
- What it is: The Interactions API. One call, interactions.create, for both models and agents. GA this week.
- The old way: generateContent. One model, one answer, stateless. Memory, tools, agents: you wired all of it up yourself.
- The trick: Google holds the conversation now. The next call inherits everything by passing one ID. No resending history.
- The catch: it's two turns, not one magic call. Maps and Search can't share a request, and the agent can't use Maps or your own functions yet.
Why it matters
Normally, every time your app talks to Gemini, it starts over — no memory of the last thing you said. So to hold a real conversation, your code resends the entire history on every message. It's hidden well enough that it feels like the model remembers, but behind the scenes you're hauling the whole transcript back and forth. Slower, pricier, and a pain to track once it gets long.
Think of the old way like calling support and re-explaining your whole problem to every new person. With the new way, they already have your file open.
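To put rough numbers on that hauling, here is a minimal Python sketch that counts how many messages a client uploads over a conversation in each style. It is a toy model, not a benchmark: the function names are made up for illustration, it counts messages rather than tokens, and it says nothing about how either API is actually billed.

```python
def messages_sent_stateless(turns: int) -> int:
    """Total messages uploaded when the client resends the history every turn."""
    total = 0
    history = 0  # messages accumulated so far (user + model)
    for _ in range(turns):
        history += 1      # the new user message joins the history
        total += history  # the whole history goes up with this request
        history += 1      # the model's reply is appended locally
    return total


def messages_sent_stateful(turns: int) -> int:
    """Total messages uploaded when the server keeps the thread.

    Each turn sends only the new message plus an ID pointing at the thread.
    """
    return turns


if __name__ == "__main__":
    for n in (1, 5, 10):
        print(n, messages_sent_stateless(n), messages_sent_stateful(n))
```

At ten turns the resend-everything client has uploaded 100 messages against 10, and the gap grows with the square of the conversation length.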
How it works
One surface, interactions.create, that does models, agents, tools, and memory. Gemini holds the conversation on its side. Your next call doesn't resend anything — it passes the previous interaction's ID and inherits the whole thread.
Picture a relay race. The first runner runs their leg, then passes the baton to the next. Same race, nothing restarts. That baton is the ID. In my app, the first leg is a model with the Google Maps tool — it finds the real places, the hours, the dog-friendly patios.
And here's the part I keep coming back to. The second leg isn't another model. It's an agent, Antigravity, in its own sandbox. Same interactions.create, you just pass an agent instead of a model. It picks up everything the model found and writes the file. No glue code. One ID.
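Here is the baton pass as a toy you can run offline. This is a sketch, not the real SDK: FakeInteractions is a made-up in-memory stand-in, and the model and agent names are placeholders. Only the shape comes from the description above: one create call, either a model or an agent, and a previous_interaction_id that links the turns.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Interaction:
    id: str
    runner: str  # "model:<name>" or "agent:<name>"
    input: str
    previous_interaction_id: Optional[str] = None


class FakeInteractions:
    """Toy server-side store. NOT the real Gemini client."""

    def __init__(self) -> None:
        self._store: Dict[str, Interaction] = {}

    def create(self, *, input: str, model: Optional[str] = None,
               agent: Optional[str] = None,
               previous_interaction_id: Optional[str] = None) -> Interaction:
        # Same call for both legs: exactly one of model or agent.
        if (model is None) == (agent is None):
            raise ValueError("pass exactly one of model or agent")
        if (previous_interaction_id is not None
                and previous_interaction_id not in self._store):
            raise KeyError("unknown previous_interaction_id")
        runner = f"model:{model}" if model else f"agent:{agent}"
        item = Interaction(uuid.uuid4().hex, runner, input,
                           previous_interaction_id)
        self._store[item.id] = item
        return item

    def thread(self, interaction_id: str) -> List[Interaction]:
        """Follow the ID chain back to the first turn; oldest turn first."""
        chain = []
        current = interaction_id
        while current is not None:
            item = self._store[current]
            chain.append(item)
            current = item.previous_interaction_id
        return list(reversed(chain))


interactions = FakeInteractions()

# First leg: a model gathers the places (placeholder model name).
first = interactions.create(
    model="hypothetical-model",
    input="BBQ and live music in Austin, dog friendly, under $100",
)

# Second leg: an agent picks up the thread by ID. Nothing is resent.
second = interactions.create(
    agent="hypothetical-agent",
    input="Write the itinerary to a file",
    previous_interaction_id=first.id,
)

print([turn.runner for turn in interactions.thread(second.id)])
```

The client only ever holds an ID. Everything the second leg needs is looked up on the side that stores the thread, which is the part you used to write yourself.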

Demo
I built Surya's Concierge to make this concrete. Tell it the afternoon you want, and watch a model and an agent split the work in one conversation, with a panel showing the real API calls underneath.
Here's the demo: [embedded YouTube video]
Open the hood and you see the first call hit a model with the Maps tool, the previous_interaction_id light up as it hands off, then the agent spin up its sandbox and write the file. All live, nothing faked.
Honest take
Let me be straight with you. It looks like one magic call. It isn't.
It's two turns. The model grounds in Maps, the agent writes the file — stitched into one conversation, which is the good part, but you can't put every tool in one request. Maps and Search can't run together; I wanted both and had to drop one. The agent has limits too: right now it can't use Maps or call your own functions, which is exactly why the model grounds first and hands off. And it isn't instant or free — each agent run spins up a sandbox, takes one to three minutes, and costs a little.

So the real question isn't whether this does everything in one shot. It doesn't. It's whether the shape is right: one call, one conversation, models and agents in the same place instead of three systems you glue together. And it is.
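The limits above can be written down as a small pre-flight check. A sketch only: the tool labels and the check_request helper are hypothetical, and the rules encode the limits described in this post, not an official compatibility table, so expect them to change.

```python
from typing import Iterable, List, Optional

# Hypothetical labels for this sketch, not official tool identifiers.
MAPS = "maps"
SEARCH = "search"
CUSTOM_FUNCTION = "custom_function"


def check_request(*, tools: Iterable[str],
                  agent: Optional[str] = None) -> List[str]:
    """Return the problems a planned request would hit, per this post."""
    problems = []
    requested = set(tools)
    if {MAPS, SEARCH} <= requested:
        problems.append("Maps and Search cannot share a request")
    if agent is not None:
        if MAPS in requested:
            problems.append("the agent cannot use Maps yet")
        if CUSTOM_FUNCTION in requested:
            problems.append("the agent cannot call your own functions yet")
    return problems


if __name__ == "__main__":
    print(check_request(tools=[MAPS]))                      # model leg, Maps only
    print(check_request(tools=[MAPS, SEARCH]))              # both grounding tools
    print(check_request(tools=[MAPS], agent="some-agent"))  # agent leg with Maps
```

A check like this makes the two-turn split visible in code: the Maps request goes to the model leg, and the agent leg gets the thread ID instead of the tool.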
Close
I'll be building on this. A fast model that gathers, an agent that executes, one thread — that covers a lot of real jobs, and not wiring the plumbing myself is what saves the time. generateContent is fine for simple one-shot calls. But for anything with a conversation, tools, or an agent in it, this is the way now.
All opinions are my own and do not belong to my employer.
Full demo above on YouTube. Want me to push the agent harder — real code, a longer job? Drop a comment on the video.