How to build a custom vision agent
Join Gemini Enterprise Agent Ready (GEAR) for the latest agent resources → https://goo.gle/4xpCT5W
John's GitHub repo → https://goo.gle/4uuvsYi
Ever wanted to turn your webcam into a generative media engine? Google Developer Expert John Capobianco shares how he built a custom vision agent using the Google Gemini ecosystem and the Model Context Protocol (MCP), capturing live photos with Nano Banana Pro and animating them with Veo 3. The agent also handles natural language prompts and real-time ASL conversations.
Speakers: John Capobianco
Products Mentioned: Gemini, Model Context Protocol, Nano Banana, Veo 3
Google Cloud Tech