I spent my morning exploring Gemini’s new integration with Chrome, an AI-powered assistant embedded directly into the browser. Instead of navigating to the chatbot’s web app, you can now start a conversation by clicking a button in the top-right corner of Chrome. A notable part of the integration is Gemini’s ability to “see” the content on your screen as you browse.
The launch of Gemini in Chrome appears to be a preliminary step in Google’s broader vision of making AI more interactive. During my testing, I often found myself wishing the assistant could do more than it currently can. For now, access to this early version of Gemini in Chrome is limited to AI Pro or AI Ultra subscribers, or to users running the Beta, Dev, or Canary versions of Chrome.
I began testing Gemini by having it summarize articles from Technology News and gather gaming news from its homepage. It highlighted new Game Boy titles added to Nintendo’s Switch Online service, an upcoming film adaptation of Elden Ring, and a significant update for Valve’s Steam Deck. However, Gemini can only analyze content that’s visible on the page, so certain sections, like comments, need to be expanded before it can summarize them.
Gemini tracks your activity across tabs but can only provide insights from one tab at a time. For users who prefer not to type, the assistant includes a “Live” feature, accessible via a button in the dialog box, that lets you ask questions aloud and hear Gemini’s spoken answers.
This feature proved particularly useful while watching YouTube videos. When I asked Gemini about the tool used in a bathroom remodeling tutorial, for instance, it promptly identified a nail gun. In another video, it accurately recognized a capacitor on a motherboard, along with the tools the YouTuber used. Gemini can also summarize video content, but its accuracy varies, especially when a video lacks labeled chapters.
One of the standout applications for Gemini is extracting recipes from YouTube videos, eliminating the need to jot them down manually or dig through video descriptions. I also successfully used it to pick out waterproof bags on Amazon’s search results page.
Nonetheless, Gemini’s performance wasn’t flawless. When I asked where MrBeast was in a video about ancient Mayan sites, it said it lacked the real-time information to pinpoint his exact whereabouts. When I repeated the question, it gave an answer based on the video’s description: Mexico. Similarly, when I asked for a purchase link for a specific tool featured in a video, it again cited a real-time data limitation, yet it managed to suggest alternative products when prompted.
At times, I found Gemini’s responses too lengthy for the small dialog window in Chrome. The window can be resized, but screen space on my 13-inch MacBook Air was limited. Since one of the key selling points of AI is saving time with succinct answers, it’s notable that Gemini doesn’t always deliver that without specific prompting. Its follow-up questions could also become repetitive.
Despite these shortcomings, I can envision Google expanding Gemini in Chrome well beyond Q&A. The company’s ambition to build an “agentic” AI, capable of performing tasks autonomously, hints at what’s coming. After asking Gemini to summarize a restaurant menu, for instance, I found myself wanting to ask it to place a pickup order, a task it can’t yet execute. Looking ahead, I’d like to see features that let Gemini bookmark pages from my travel research or save YouTube cooking videos to my Watch Later playlist. With Project Mariner’s forthcoming “Agent Mode” headed for the Gemini app, Google appears to be moving closer to bringing such capabilities to Chrome as well.