Google’s Gemini Live: A Hands-On Exploration of Real-Time Visual AI
Experiencing Google’s Gemini Live firsthand reveals a significant leap in AI-powered visual recognition. In a personal demonstration, I navigated my apartment, recording a video while interacting with Gemini Live, Google’s latest innovation in real-time object identification. During this interactive tour, the artificial intelligence adeptly named flowers (chamomile and dianthus) and then pinpointed the location of hidden objects. When challenged to locate a pair of scissors, Gemini Live accurately responded, “I just spotted your scissors on the table, right next to the green package of pistachios. Do you see them?” This initial encounter showcased the impressive capabilities of Gemini Live, going beyond simple image recognition.
This technology promises to identify far more than just everyday household items. Google asserts that Gemini Live is engineered to assist users in complex situations, such as navigating busy transit hubs or discerning the contents of unfamiliar foods. Furthermore, it aims to provide detailed insights into works of art, including their origins and edition status.
Gemini Live distinguishes itself from standard visual search tools. It facilitates a conversational exchange, moving beyond the limitations of tools like Google Lens. The interaction felt natural and fluid, a marked improvement over previous iterations of Google Assistant, which the tech company is actively phasing out.
A glimpse into a conversation with Gemini Live, showcasing its object recognition within a home environment.
Google and Samsung are initiating the rollout of Gemini Live to Pixel 9 and Galaxy S25 devices. The feature will be complimentary on these models, while other Pixel users can gain access through a Google AI Premium subscription. A recent YouTube release accompanied the April 2025 Pixel Drop, demonstrating the feature, and a dedicated page on the Google Store is now live.
Utilizing Gemini Live is straightforward: activate Gemini, enable the camera, and begin speaking.
Gemini Live is built upon the foundation of Project Astra, unveiled by Google as a forward-looking advancement in generative AI. It represents an evolution from the text- or voice-based prompts used in chatbots like ChatGPT, Claude, or Gemini. This development coincides with the rapid enhancement of AI capabilities across various domains, from video generation to computational power. Apple’s Visual Intelligence, released in beta last year, shares conceptual similarities with Gemini Live.
The primary observation is that Gemini Live possesses the transformative potential to redefine human-computer interaction, seamlessly integrating the digital and physical realms through mobile camera technology.
Practical Assessment of Gemini Live
Early access to Gemini Live on a Pixel 9 Pro XL provided an opportunity for hands-on testing.
Initial trials revealed Gemini’s remarkable precision in identifying a distinctive gaming collectible, a stuffed rabbit. Subsequently, during an art gallery visit, Gemini not only recognized a tortoise on a cross but also instantly translated adjacent kanji script, eliciting a sense of astonishment.
The inaugural test item: Gemini Live accurately identified the collectible and its game origin (American McGee’s Alice), a feat not consistently replicable in subsequent attempts.
The apartment tour mirrored Google’s demonstrations from the previous summer, showcasing Live video AI functionality. Commonplace objects (fruit, books, lip balm) were readily identified.
Attempts to screen-record Gemini Live in operation, part of an effort to evaluate the system’s limits, proved unsuccessful. Further testing explored Gemini’s performance with niche subjects, specifically horror-themed memorabilia. The question arose: how effectively would Gemini Live handle obscure collectibles?
The first attempts were more accurate than later ones, even when the later ones included hints. Gemini eventually identified the game as Silent Hill: The Short Message but misidentified the character figure, labeling it “Cherry Blossom Monster” instead of “Sakurahead,” the name it had correctly given earlier.
Inconsistencies and Strengths in Performance
Gemini Live’s performance exhibited both impressive accuracy and perplexing inconsistencies within the same interaction. During tests involving approximately eleven objects, accuracy sometimes diminished as the session progressed, necessitating shorter, single or dual-object sessions. This behavior suggests Gemini may be attempting to leverage contextual data from prior identifications, a potentially flawed approach in practice.
At times, Gemini demonstrated pinpoint accuracy, swiftly and correctly identifying objects, particularly contemporary or widely recognized items. The prompt identification of a limited-edition Destiny 2 item from a seasonal event was particularly notable.
Conversely, Gemini occasionally missed the mark substantially, requiring prompts to steer it towards the correct answer. Furthermore, there were indications of contextual carryover from preceding sessions, with multiple items incorrectly identified as related to Silent Hill. Given the presence of a dedicated display case for this game series, such contextual associations were understandable, yet inaccurate.
The most challenging test: identifying the game (Silent Hill 2) and the iconic quote from the figure atop the stairs. Gemini correctly identified the game, characters, and half the quote initially, requiring two additional prompts to complete: “You see it, too? For me, it’s always like this.”
Encountering Anomalies and Workarounds
Gemini occasionally exhibited anomalous behavior. On multiple occasions, misidentification resulted in fabricated characters from the unreleased Silent Hill f, seemingly blending elements from disparate titles. Another recurring issue involved Gemini repeating incorrect answers even after correction, as if presenting them as fresh attempts. Resolving this often required ending the session and restarting, a remedy of limited efficacy.
A discovered workaround involved revisiting successful prior conversations. By accessing a previous chat history where an item was correctly identified and initiating a new Gemini Live session from that point, subsequent object recognition improved. While the precise mechanism remains unclear, this suggests conversational history influences performance, even with identical prompts.
Google has not yet responded to inquiries regarding the inner workings of Gemini Live.
Pushing the Boundaries of Object Recognition
Driven by a desire to ascertain Gemini’s limits, highly specific queries were employed, accompanied by progressive hints. These prompts proved variably effective. The following are examples of objects presented to Gemini for identification and information retrieval.
Initial query: “What do you see?”. Gemini’s response: “OK, I see a black and white cat that’s basking in the sun on a hardwood floor. The cat is stretched out in a funny position. There is a green rug with ‘Home is where the..’ written on it.” Subsequent guesses ranged from “home is where the horror is” to “honor,” before ultimately landing on “horror”.
Gemini initially suggested four incorrect characters from the correct game before accurately identifying the BioShock Infinite character, Songbird.
Gemini successfully identified this figure on the first attempt: Twin Victim from Silent Hill 4: The Room.
Effortless identification: Gemini correctly recognized Mira from Silent Hill 2, identifying her role as the true controller of the town.
Impressive detail: Gemini not only identified a Silent Hill map but also its specific origin as a limited-edition print from a recent ARG event.
A more complex process for identifying this Silent Hill 2 jacket. Gemini posed 24 targeted questions after the initial hint of “video game,” yet by the 19th question, the queries indicated pre-existing knowledge of the specific game.
Relatively quick identification, though Gemini initially suggested a portrait of John Ashbery. Correct identification as the Log Lady from Twin Peaks holding her log followed after moving the camera closer and specifying “TV show.”
A simple identification task for Gemini, immediately recognizing a limited-edition tarot deck earned during a seasonal event in Destiny 2.