Topic: computer use
Google's New AI Browses the Web Like a Human
Google has launched Gemini 2.5 Computer Use, an AI model that mimics human web browsing to automate interactions with websites lacking API access, such as completing online forms. This technology excels in user interface testing and digital navigation, building on prior agent-driven projects like...
Google Restructures Chrome Team as OpenClaw Gains Popularity
Google is restructuring its Project Mariner team, signaling a strategic shift away from standalone browser-based AI agents toward more versatile systems that integrate with computer operating systems. Browser automation agents have seen low user adoption and failed to meet commercial expectations...
GPT-5.4 Shatters Professional Benchmark Records
OpenAI has launched GPT-5.4, a powerful frontier model for professional work, available in standard, specialized "Thinking," and high-performance "Pro" configurations. The model shows strong performance, matching or surpassing experts in 83% of professional knowledge evaluations and achieving a 7...
OpenAI Launches GPT-5.4: Supercharged for Knowledge Work
OpenAI has launched GPT-5.4, a major update featuring specialized variants like GPT-5.4 Thinking and GPT-5.4 Pro to address complex tasks and compete in a crowded AI market. A key advancement is enabling agentic workflows for computer interaction, allowing the model to automate digital tasks ...
Claude Sonnet 4.5 Launches to Power Next-Gen AI Agents
Anthropic has launched Claude Sonnet 4.5, an AI model capable of 30 hours of autonomous operation, demonstrated by independently coding a functional chat app with 11,000 lines. The model is positioned as the world's leading AI for real-world agents and coding, excelling in sectors like cybersecur...
UiPath Partners with OpenAI to Automate Workflows
UiPath and OpenAI have partnered to integrate advanced AI models into enterprise workflows, aiming to accelerate the return on investment from agentic AI initiatives by simplifying development and deployment. The collaboration includes the introduction of a performance benchmark for computer-use ...
Scale Document Analysis with Vision Language Models
Vision Language Models (VLMs) merge visual and textual interpretation, enabling advanced document analysis by understanding the interplay between text placement and imagery. VLMs excel in tasks requiring visual context, such as identifying checked documents or interpreting screen contents, where ...
Claude 4.5 Boosts AI Agents Amid Cybersecurity Concerns
Anthropic has released Claude Opus 4.5, a new AI model that excels in coding, AI agent development, and computer interaction, with enhanced capabilities for research and software integration. The model faces persistent cybersecurity vulnerabilities, including susceptibility to sophisticated promp...