Deer in the Living Room: A Gemini Home Story

Summary
– Google’s Gemini processes and summarizes only event clips rather than analyzing full video streams, in order to conserve computing resources.
– As used here, the Gemini model is vision-only: it analyzes the visual content of recordings and excludes audio, which keeps conversations out of its analysis.
– The Ask Home chatbot feature allows users to ask questions about home events, retrieve video clips, and create automations based on device status and footage.
– Ask Home excels at creating automations from natural language requests and finding past event clips, though Gemini’s video understanding has some limitations.
– Video footage is retained for 60 days on the Advanced plan and is not used for training unless you opt in to “lend” it to Google; your interactions with Gemini, however, help refine the model.

When you subscribe to Google’s Gemini Home service, you’re not uploading every second of your security footage for analysis. Google states that sending complete video streams would consume excessive processing power, so the system focuses only on event clips. These brief video segments are analyzed and summarized, then compiled into a Daily Brief. This report typically outlines routine activities like family members moving between rooms or delivery personnel leaving packages.
A key point to understand is that the underlying Gemini model is vision-only: it interprets the visual content of your videos but completely ignores audio. Unusual noises or conversations picked up by your cameras remain outside its processing scope. This appears to be an intentional privacy measure, preventing the AI from repeating or analyzing private discussions.
Subscribers to the AI-enhanced plan also gain access to a feature called Ask Home. This conversational chatbot allows you to pose questions about occurrences in your house, drawing from both your smart device statuses and your video history. You can inquire about specific events, request to view related video clips, and even set up automated routines through dialogue.
While Gemini’s video comprehension isn’t flawless, the Ask Home feature demonstrates particular strength in building automations. Previous versions of the Home app allowed for manual automation setup, but the new AI can construct these routines based on plain English requests. The system’s high success rate in creating accurate automations likely stems from the constrained number of available smart home actions. Ask Home also proves reliable at locating past event recordings, provided your search criteria are precise and detailed.
Under the Advanced Gemini Home plan, your video history remains accessible for 60 days, setting the boundary for how far back you can question the assistant. Google explicitly notes that it does not hold onto this footage for training its AI models, with one exception. You may voluntarily “lend” your videos to Google through a little-known setting within the Home app. If you enable this option, Google may retain the footage for up to 18 months, or until you choose to withdraw permission. It’s important to recognize that your direct engagements with Gemini, including the prompts you type and your feedback on its responses, are utilized to enhance the model’s performance.
(Source: Ars Technica)
