
Google Gemini 3.5 Flash can now operate a computer on the user's behalf


AI-processed from 3DNews AI; edited by Hamidun News
Source: 3DNews AI. Collage: Hamidun News.

Google Gemini 3.5 Flash has received the Computer Use feature: the ability to control a computer autonomously by pressing buttons, filling out forms, switching between applications, and performing multi-step tasks without human involvement. Google positions it as a corporate tool for automating operational processes, available through the Vertex AI cloud platform.

How Computer Control Works

The principle resembles an operator working at a screen: the model receives a screenshot, analyzes the interface, determines the next action (a click, text input, or page scroll), and repeats the cycle until the task is completed. Gemini 3.5 Flash sees the screen the same way a human does, but acts faster and without fatigue.

The choice of the Flash version is deliberate: it is the fastest model in the Gemini 3.5 lineup. For agentic tasks with long chains of sequential actions, response speed is critical. Every step adds the model's response time to the total, so a slow agent accumulates delay, and scenarios like automating dozens of forms turn into multi-hour processes. Flash addresses this problem through low latency.
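The screenshot, analyze, act cycle described above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not Google's implementation or the Vertex AI API: the names `run_agent`, `Action`, `FakeForm`, and `scripted_policy` are hypothetical, the "screenshot" is a plain dictionary rather than pixels, and a hard-coded policy stands in for the model call.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Action:
    kind: str          # "type", "click", or "done"
    target: str = ""   # which field or button the action applies to
    text: str = ""     # text to enter, used only by "type"


Screen = Dict[str, Any]  # hypothetical stand-in for a screenshot


def run_agent(
    observe: Callable[[], Screen],
    choose_action: Callable[[Screen, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 20,
) -> List[Action]:
    """Observe, decide, act, and repeat until done or the step cap is hit."""
    history: List[Action] = []
    for _ in range(max_steps):
        screen = observe()                       # 1. capture the screen
        action = choose_action(screen, history)  # 2. decide the next action
        if action.kind == "done":
            break
        execute(action)                          # 3. perform it
        history.append(action)
    return history


class FakeForm:
    """Toy stand-in for a GUI: two text fields and a submit button."""

    def __init__(self) -> None:
        self.fields = {"name": "", "email": ""}
        self.submitted = False

    def observe(self) -> Screen:
        # A real agent would receive pixels; this toy returns a state summary.
        return {"fields": dict(self.fields), "submitted": self.submitted}

    def execute(self, action: Action) -> None:
        if action.kind == "type":
            self.fields[action.target] = action.text
        elif action.kind == "click" and action.target == "submit":
            self.submitted = True


def scripted_policy(screen: Screen, history: List[Action]) -> Action:
    """Placeholder for the model call: fill unfilled fields, then submit."""
    wanted = {"name": "Ada", "email": "ada@example.com"}
    for field, value in wanted.items():
        if screen["fields"][field] != value:
            return Action("type", target=field, text=value)
    if not screen["submitted"]:
        return Action("click", target="submit")
    return Action("done")


form = FakeForm()
steps = run_agent(form.observe, scripted_policy, form.execute)
print([f"{a.kind}:{a.target}" for a in steps])
```

The `max_steps` cap is a design choice of this sketch: an agent that never reaches "done" would otherwise loop forever, and each extra iteration costs one more round of model latency.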

Among the stated capabilities:

  • Browser navigation and web form interaction
  • Desktop application control through GUI
  • Multi-step task execution without user intervention
  • Operation through Vertex AI with corporate access control
  • Action logging for audit and security compliance
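The article does not say how the action log is structured, so the following is only a sketch of what action logging for audit could look like in principle. Every name here (`with_audit_log`, `fake_execute`, the record fields) is an assumption made for illustration and is not part of Vertex AI. The idea is to wrap whatever function performs an action so that each attempt, successful or not, leaves one JSON line behind.

```python
import io
import json
from datetime import datetime, timezone
from typing import Callable, Dict, TextIO

ActionDict = Dict[str, str]  # hypothetical action format


def with_audit_log(
    execute: Callable[[ActionDict], None],
    log: TextIO,
    agent_id: str,
) -> Callable[[ActionDict], None]:
    """Wrap an executor so every attempted action is written as one JSON line."""
    step = 0

    def logged_execute(action: ActionDict) -> None:
        nonlocal step
        step += 1
        record = {
            "ts": datetime.now(timezone.utc).isoformat(),
            "agent": agent_id,
            "step": step,
            "action": action,
            "status": "ok",
        }
        try:
            execute(action)
        except Exception as exc:
            record["status"] = f"error: {exc}"
            raise
        finally:
            # Written on success and on failure, so the trail has no gaps.
            log.write(json.dumps(record) + "\n")

    return logged_execute


def fake_execute(action: ActionDict) -> None:
    """Toy executor that fails on a button that does not exist."""
    if action.get("target") == "missing_button":
        raise RuntimeError("element not found")


audit = io.StringIO()  # stands in for a log file or logging service
run = with_audit_log(fake_execute, audit, agent_id="demo-agent")
run({"kind": "type", "target": "name", "text": "Ada"})
try:
    run({"kind": "click", "target": "missing_button"})
except RuntimeError:
    pass

records = [json.loads(line) for line in audit.getvalue().splitlines()]
print([(r["step"], r["status"]) for r in records])
```

Logging in the `finally` branch means failed actions are recorded too, which is usually what an auditor wants to see first.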

For Whom and How to Get Access

Google divides Gemini development into two directions. The first is deep integration with Workspace: an intelligent assistant in Gmail, Docs, Sheets, and Slides that helps a broad audience without special technical knowledge. The second is agentic capabilities for the corporate sector, to which Computer Use belongs.

The feature is available through Vertex AI, Google's corporate cloud platform. Companies will be able to embed agents in their own processes: automate work with legacy systems that have no APIs, hand routine browser operations in finance or HR teams over to agents, and build internal tools based on Gemini with centralized management and logging.

It is important to understand that Computer Use is not simply an "automatic clicker." It is a full-fledged agentic scenario in which the model independently plans a chain of steps and adapts to results: if a page loads with a delay or an unexpected pop-up appears, the agent sees this and reacts.
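The difference from a fixed click script can be shown with a toy control flow. In the real feature the model makes this decision from the screenshot; the explicit rules below only illustrate the idea of checking the screen before every planned step. The names `next_step` and `run_plan` and the `loading` and `popup` flags are hypothetical.

```python
from typing import Dict, List


def next_step(screen: Dict[str, bool], remaining: List[str]) -> str:
    """Pick the next move from what is on screen, not only from the plan."""
    if screen.get("loading"):
        return "wait"            # page not ready yet: do not click blindly
    if screen.get("popup"):
        return "dismiss_popup"   # unexpected dialog: clear it first
    return remaining[0] if remaining else "done"


def run_plan(screens: List[Dict[str, bool]], plan: List[str]) -> List[str]:
    """Replay a sequence of observed screens against a plan of steps."""
    remaining = list(plan)
    taken: List[str] = []
    for screen in screens:
        step = next_step(screen, remaining)
        if step == "done":
            break
        taken.append(step)
        # A planned step is consumed only when it was actually carried out.
        if remaining and step == remaining[0]:
            remaining.pop(0)
    return taken


observed = [
    {"loading": True},   # page still loading
    {"popup": True},     # cookie banner or similar dialog appears
    {},                  # normal screen
    {},                  # normal screen
]
print(run_plan(observed, ["open_form", "submit"]))
```

A fixed script would have issued "open_form" on the first screen regardless of its state; here the planned steps are deferred until the screen is ready for them.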

Competition for Screen Control

The market for AI agents working with computer interfaces has become significantly more competitive over the past year. Anthropic released Claude Computer Use in October 2024, OpenAI launched Operator in early 2025, and Microsoft integrated agentic scenarios into Copilot Studio for Azure. Now Google joins them with its implementation based on one of the fastest models. Competition unfolds along several axes: interface recognition accuracy, speed of executing action chains, security, and corporate audit capabilities.

Google has a structural advantage that competitors lack: Gemini operates in an ecosystem where Gmail, Drive, and Calendar are already deployed. An agent that simultaneously sees the screen and has native access to corporate data through API gains a fundamentally different level of context, without that data having to be loaded into the prompt separately.

"We are building AI that not only answers questions, but performs work," —such is

Google's overall position regarding Gemini's strategy as an agentic platform.

What This Means

Computer control is transitioning from experimental capabilities to a standard product feature across all major AI providers. For business, this means real operational task automation right now, with no need to rewrite legacy systems, develop API integrations for each scenario, or involve developers for basic automation. The question is no longer "will this work," but rather "who will implement it faster."

Hamidun News
AI news without noise. Daily editorial selection from 400+ sources. A product by Zhemal Khamidun, Head of AI at Alpina Digital.
