
Gemini enters agent mode: Google's announcements at I/O 2026

At the I/O 2026 conference, Google presented a new vision of Gemini as a full-fledged agent capable of independently handling complex tasks in Workspace, the br

Source: Google AI Blog. Collage: Hamidun News.

At Google's annual I/O 2026 conference, Sundar Pichai announced the start of the era of Gemini agents: AI assistants that will independently perform complex tasks in browsers, applications, and Google services.

What are Gemini Agents

The new generation of Gemini is no longer just a chatbot that answers questions. It's a full-fledged agent capable of seeing the screen, making decisions, and performing multi-step operations. When you give it a task like "book me a flight ticket for next Tuesday," the agent navigates the browser on its own, checks schedules, compares prices, and completes the purchase.

This functionality extends beyond the web. Agentic versions of Gemini are integrated into Android, Google Workspace (Gmail, Docs, Sheets, Slides), and even Search. In Gmail, the agent can sort emails, summarize conversations, and reply to routine requests. In Sheets, it can build charts from data, fill cells automatically, and find patterns. In Docs, it can restructure text, flag contradictions, and improve readability.

Under the hood, this works thanks to significant improvements in Gemini's ability to interpret pixels on screen and generate meaningful actions. The model has become more precise in its logic and less prone to random errors.
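The see, decide, act cycle described above can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not Google's implementation: the names `Action`, `AgentRun`, `run_agent`, `observe`, `decide`, and `act` are hypothetical, and the toy "browser" is a dictionary standing in for real screen capture and model calls.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    """One step the agent wants to take (hypothetical structure)."""
    kind: str         # e.g. "click" or "done"
    target: str = ""  # human-readable description of the UI element


@dataclass
class AgentRun:
    """Record of what happened during one task."""
    steps: List[Action] = field(default_factory=list)
    finished: bool = False


def run_agent(
    observe: Callable[[], str],
    decide: Callable[[str, List[Action]], Action],
    act: Callable[[Action], None],
    max_steps: int = 10,
) -> AgentRun:
    """Repeat observe -> decide -> act until the policy says "done"
    or the step budget runs out."""
    run = AgentRun()
    for _ in range(max_steps):
        screen = observe()                # 1. look at the current state
        action = decide(screen, run.steps)  # 2. choose the next step
        if action.kind == "done":
            run.finished = True
            break
        act(action)                       # 3. perform it
        run.steps.append(action)
    return run


def demo() -> AgentRun:
    """Toy flight-booking flow: search page -> checkout -> confirmed."""
    state = {"page": "search"}
    transitions = {"search": "checkout", "checkout": "confirmed"}

    def observe() -> str:
        return state["page"]

    def decide(screen: str, history: List[Action]) -> Action:
        # A scripted stand-in for the model's decision step.
        if screen == "search":
            return Action("click", "cheapest flight")
        if screen == "checkout":
            return Action("click", "confirm")
        return Action("done")

    def act(action: Action) -> None:
        state["page"] = transitions[state["page"]]

    return run_agent(observe, decide, act)


if __name__ == "__main__":
    result = demo()
    print(result.finished, [a.target for a in result.steps])
```

The `max_steps` budget is a deliberate design choice in this sketch: an agent that meets an unfamiliar interface stops after a bounded number of attempts rather than looping indefinitely.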

Usage Examples

For companies, the possibilities within Workspace are particularly interesting. Suppose you need to prepare a quarterly report: the agent pulls data from analytics, inserts it into a spreadsheet, draws charts, and writes conclusions. A task that would take a person two hours will be completed in minutes.

Let's consider several specific scenarios:

  • Automatic CRM population based on incoming emails
  • Building presentations by template with your data
  • Analyzing large datasets and identifying trends
  • Formatting documents with the required structure and templates
  • Transforming content between formats

For example, the marketing department can use the agent to collect metrics from different systems, analyze them, and compile statistics. HR will be able to automate the processing of time-off requests and quickly update employee records. Engineers will be able to have the agent write unit tests based on source code.

Where There May Be Pitfalls

Of course, Google doesn't hide the fact that these are the first versions. The agent can make a mistake if it encounters a non-standard interface or a page requiring a CAPTCHA. Sometimes the result differs slightly from what the user envisioned. So in critical operations — for example, when processing large monetary transactions — human oversight is still needed.
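One way to keep a human in the loop for critical operations is an approval gate in front of the agent's actions. The sketch below is an assumption for illustration only: `ProposedAction`, `needs_approval`, `execute_with_oversight`, and the 500.0 threshold are hypothetical and do not describe any Gemini or Workspace feature.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical cutoff: actions at or above this amount need sign-off.
APPROVAL_THRESHOLD = 500.0


@dataclass(frozen=True)
class ProposedAction:
    """Something the agent wants to do (hypothetical structure)."""
    description: str
    amount: float = 0.0  # monetary value involved; 0 for non-financial steps


def needs_approval(action: ProposedAction,
                   threshold: float = APPROVAL_THRESHOLD) -> bool:
    """Return True when the action is large enough to require a human."""
    return action.amount >= threshold


def execute_with_oversight(
    action: ProposedAction,
    execute: Callable[[ProposedAction], None],
    ask_human: Callable[[ProposedAction], bool],
    threshold: float = APPROVAL_THRESHOLD,
) -> str:
    """Run small actions directly; ask a person before large ones."""
    if needs_approval(action, threshold) and not ask_human(action):
        return "rejected"
    execute(action)
    return "executed"
```

In this sketch, routine steps such as sorting an inbox pass straight through, while a large payment is executed only if the `ask_human` callback returns `True`.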

Additionally, there's the question of confidentiality. The agent sees the entire screen, including private data. Google assures users that everything is encrypted and processed in accordance with its privacy policy, but corporate IT departments still need to verify this. There's also the question of compatibility: not all websites and applications expose their structure to the browser agent. Some may interpret the agent's activity as unsafe and block it.

What This Means

This is a watershed moment for AI. Previously, neural networks assisted humans (suggestions, editing, search); now they can act relatively independently. This opens enormous potential for automating routine work, but it will also require companies to reconsider their processes and strengthen data control.

Hamidun News
AI news without noise. Daily editorial selection from 400+ sources. A product by Zhemal Khamidun, Head of AI at Alpina Digital.