OpenAI has launched a research preview of Operator, a general-purpose AI agent that performs tasks independently by taking control of a web browser. It is available first to U.S. users on ChatGPT’s $200-per-month Pro subscription plan, and OpenAI plans to expand it to additional user tiers in the future.
OpenAI Operator explained
Operator can automate various tasks, including booking travel accommodations, making restaurant reservations, and online shopping. Users can select from categories such as shopping, delivery, dining, and travel within the Operator interface. When activated, a dedicated web browser window pops up, showing users the actions Operator performs alongside explanations. Users can maintain control of their screens while Operator operates in its own browser environment.
The AI agent is powered by a Computer-Using Agent (CUA) model, which combines the vision capabilities of the GPT-4o model with advanced reasoning. CUA interacts with the front end of websites without requiring developer-focused APIs, so it can use buttons, navigate menus, and fill out forms as a human would. OpenAI is collaborating with companies including DoorDash, eBay, Instacart, and Priceline to ensure Operator adheres to their terms of service agreements.
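To make the idea of an agent that works through a site's front end more concrete, here is a minimal Python sketch of an observe-decide-act loop. It is not OpenAI's implementation: `FakePage`, `Action`, `scripted_policy`, and `run_agent` are hypothetical names invented for illustration, the "screen" is a small dictionary rather than pixels, and a hard-coded rule set stands in for the CUA model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Action:
    kind: str         # "type", "click", or "done"
    target: str = ""  # label of the on-screen element
    text: str = ""    # text to enter for "type" actions


class FakePage:
    """Toy stand-in for a web page: two form fields and a Search button."""

    def __init__(self) -> None:
        self.fields: Dict[str, str] = {"City": "", "Guests": ""}
        self.submitted = False

    def observe(self) -> dict:
        # A real agent would look at a screenshot; this returns a snapshot dict.
        return {"fields": dict(self.fields), "submitted": self.submitted}

    def apply(self, action: Action) -> None:
        if action.kind == "type" and action.target in self.fields:
            self.fields[action.target] = action.text
        elif action.kind == "click" and action.target == "Search":
            self.submitted = True


def scripted_policy(goal: Dict[str, str]) -> Callable[[dict], Action]:
    """Hypothetical policy: fill each field, click Search, then stop."""

    def choose(observation: dict) -> Action:
        for name, value in goal.items():
            if observation["fields"].get(name) != value:
                return Action("type", name, value)
        if not observation["submitted"]:
            return Action("click", "Search")
        return Action("done")

    return choose


def run_agent(page: FakePage, choose: Callable[[dict], Action],
              max_steps: int = 10) -> List[Action]:
    """Repeat observe -> decide -> act until the policy says it is done."""
    history: List[Action] = []
    for _ in range(max_steps):
        action = choose(page.observe())
        if action.kind == "done":
            break
        page.apply(action)
        history.append(action)
    return history
```

Calling `run_agent(FakePage(), scripted_policy({"City": "Lisbon", "Guests": "2"}))` fills both fields and then clicks the button. The `max_steps` cap is a design choice of this sketch, so that a policy that never finishes cannot loop forever.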
OpenAI states that the CUA model is designed to ask for user confirmation before finalizing tasks that have external effects, such as submitting an order or sending an email. Despite its capabilities, OpenAI cautions that CUA may not perform reliably in all scenarios and struggles with complex tasks like creating detailed slideshows, managing intricate calendars, or navigating non-standard web interfaces.
For sensitive tasks, such as banking transactions, user supervision is required. When the user takes over to enter sensitive information, Operator does not collect or screenshot what they type, and it mandates direct oversight on particularly sensitive sites like email and financial services, so users can catch and correct errors promptly.
Operator has certain limitations. OpenAI enforces rate limits, both daily and task-dependent, and specifies that certain tasks, like sending emails or deleting calendar events, will be refused for security reasons. OpenAI plans to revise these restrictions in the future, although no specific timeline is provided.
https://www.youtube.com/watch?v=m0Cjiq8P6iU
Operator may also struggle with complex web interfaces, password fields, and CAPTCHA checks, and it prompts the user to take over when it reaches them. OpenAI acknowledges the safety risks of AI systems that can take actions on the web and stresses the need to guard against exploits by malicious actors.
OpenAI has implemented several safety measures. The agent hands control to the user during sensitive transactions and asks for confirmation before significant actions. Operator rejects specific high-risk tasks and requires direct supervision on sensitive platforms. Further measures include cautious navigation to resist prompt injections, a monitoring system that pauses operation when it detects suspicious activity, and an automated detection pipeline for updating safeguards.
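The layered safeguards described here can be pictured as a single decision made before each step. The Python sketch below is purely illustrative and does not reflect how OpenAI builds these checks: the `Decision` enum, the `decide` function, the action names, and the order in which the checks run are all assumptions made for the example.

```python
from enum import Enum


class Decision(Enum):
    REFUSE = "refuse"      # task is declined outright
    PAUSE = "pause"        # monitor flagged suspicious activity
    TAKEOVER = "takeover"  # user enters sensitive data themselves
    CONFIRM = "confirm"    # ask the user before an external effect
    ALLOW = "allow"        # proceed without interruption


# Hypothetical action labels, grouped by the safeguard the article describes.
REFUSED_TASKS = {"send_email", "delete_calendar_event"}
SENSITIVE_INPUTS = {"enter_password", "enter_payment_details"}
EXTERNAL_EFFECTS = {"submit_order", "book_reservation"}


def decide(action: str, suspicious: bool = False) -> Decision:
    """Return the safeguard that applies to a proposed action.

    Checks run from most to least restrictive (an assumption of this sketch).
    """
    if action in REFUSED_TASKS:
        return Decision.REFUSE
    if suspicious:
        return Decision.PAUSE
    if action in SENSITIVE_INPUTS:
        return Decision.TAKEOVER
    if action in EXTERNAL_EFFECTS:
        return Decision.CONFIRM
    return Decision.ALLOW
```

In this sketch, `decide("submit_order")` yields `Decision.CONFIRM`, while the same action with `suspicious=True` yields `Decision.PAUSE`, reflecting the idea that the monitor can interrupt a task that would otherwise only need a confirmation.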
What is OpenAI’s Operator, and how does it work?
Operator is a general-purpose AI agent that can autonomously perform tasks on the web using a dedicated browser. It interacts with websites by clicking buttons, navigating menus, and filling forms.
How is Operator different from other AI tools like Siri, Alexa, or Google Assistant?
Unlike traditional assistants, Operator doesn’t just process information; it can perform actions on the web, like booking accommodations or ordering groceries, by interacting with websites directly.
What tasks can Operator perform autonomously?
It can handle repetitive tasks like booking travel, ordering food, making reservations, and shopping online.
Why is Operator being launched as a research preview first?
The research preview allows OpenAI to gather feedback, improve safety, and refine the tool before wider deployment.
What is the Computer-Using Agent (CUA) model, and how does it enable Operator to interact with websites?
CUA combines GPT-4o’s vision capabilities with advanced reasoning, enabling Operator to see and interact with graphical user interfaces like buttons and forms.
Can Operator perform complex tasks like creating slideshows or managing calendars?
Not yet. Operator struggles with complex interfaces and specialized workflows.
What are the rate limits or task limitations for using Operator?
Operator has dynamic daily and task-specific usage limits, and it cannot perform tasks like sending emails or handling CAPTCHAs.
How does Operator handle sensitive tasks like banking or entering payment details?
It requires user supervision for sensitive actions, like inputting payment or login details, and does not store such data.
How does OpenAI ensure the safety and reliability of Operator?
Operator is designed with safeguards, including user confirmations, takeover mode for sensitive inputs, and monitoring for malicious activity.
What safeguards are in place to prevent Operator from making mistakes or being misused?
It asks for user confirmation before completing significant actions and employs monitoring systems to pause tasks if suspicious activity is detected.
How does Operator handle privacy concerns, and can users opt out of data collection?
Users can opt out of data collection, delete browsing data, and control privacy settings through Operator’s interface.
What happens if Operator encounters phishing attempts or malicious websites?
It’s trained to detect and ignore malicious inputs, and a monitoring system can pause tasks if something suspicious occurs.
Who can use Operator, and how much does it cost?
Currently, Operator is available to U.S. users on ChatGPT’s $200 Pro subscription plan.
When will Operator be available outside the U.S., especially in Europe?
OpenAI plans to roll it out globally, but Europe may take longer due to regional considerations.
Will Operator eventually be included in all ChatGPT subscription tiers?
Yes, OpenAI plans to expand access to Plus, Team, and Enterprise tiers.
Will developers be able to build custom tools using the CUA model in the future?
Yes, OpenAI plans to release the CUA model in the API for developers to create their own agents.
Which companies is OpenAI collaborating with for Operator, and how does this benefit users?
OpenAI is partnering with companies like DoorDash, Instacart, and Uber to optimize Operator’s functionality while respecting terms of service.
Featured image credit: OpenAI