Orca Explores the Human-Guided Future of AI Agents

2025-06-09

Researchers at the University of California, San Diego have introduced Orca, an open-source system that demonstrates how large language models (LLMs) can assist users on the web, not by taking control of the browser but by guiding the user's interactions. In a peer-reviewed white paper, the research team reported significant improvements in task speed and accuracy during evaluations, offering early evidence of the potential for human-AI collaborative agents in real-world workflows.

Orca is designed to help users extract meaningful insights from the web, acting as a "co-pilot" or assistant for decision-making rather than an autonomous browser agent.

The system provides a range of features, including summarizing lengthy web pages, extracting structured data from unstructured content, tracking changes during browsing sessions, and comparing statements from multiple sources. It can search, scroll, click, and interact with websites on the user's command, enabling users to delegate repetitive or context-heavy tasks while maintaining control over the process.

In a lab study involving eight participants, researchers found that Orca accelerated web exploration, encouraged broader information gathering, and increased user trust in the results.

Participants appreciated the ability to visually organize pages, selectively delegate tasks to AI, and maintain control over information sources. For example, one participant used Orca to compare Yelp options side by side, while another preferred filtering Reddit posts for product research. The spatial layout and bulk interactions were particularly praised for reducing context-switching costs and making complex workflows more manageable.

Notably, the researchers emphasized shared control as a core design principle: users initiate actions and retain command throughout. The researchers consider the resulting transparency crucial for building user trust, encouraging adoption, and preserving user autonomy in AI-assisted workflows.

The Orca system is implemented as an Electron application, with its front end built on React. Each webpage is loaded into its own isolated webview, while the "Web Canvas" interface, used for organizing and interacting with multiple pages, was constructed using the open-source tldraw library.

All language-based functions, such as summarization, extraction, and automation, are powered by the Claude 3.7 Sonnet model. Behind the scenes, Orca uses a custom HTML refinement and proxy pipeline that transforms raw webpage content into structured representations an LLM can use. These pipelines are shared across various functions and are designed to allow user intervention during execution.
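
To make the idea of an HTML refinement pipeline with a user intervention point more concrete, here is a minimal sketch in Python. It is not Orca's code, which the article does not show: the names `Refiner`, `refine_html`, `run_pipeline`, and the `review` callback are hypothetical, and the choice of which tags to keep or skip is an assumption made for illustration. It uses only the standard library's `html.parser` module.

```python
from html.parser import HTMLParser

# Assumption: which tags to drop or keep is chosen here for illustration only.
SKIP_TAGS = {"script", "style", "noscript"}  # content an LLM rarely needs
BLOCK_TAGS = {"h1", "h2", "h3", "p", "li"}   # blocks kept in the output


class Refiner(HTMLParser):
    """Collects visible text from a few block-level tags.

    Simplification: nested block tags (e.g. <p> inside <li>) are not handled.
    """

    def __init__(self):
        super().__init__()
        self.blocks = []      # list of {"tag": ..., "text": ...} dicts
        self._skip_depth = 0  # greater than 0 while inside a skipped tag
        self._current = None  # tag of the block currently being collected
        self._parts = []      # text fragments of the current block

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag in BLOCK_TAGS and self._skip_depth == 0:
            self._current, self._parts = tag, []

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == self._current:
            # Join fragments and collapse runs of whitespace.
            text = " ".join("".join(self._parts).split())
            if text:
                self.blocks.append({"tag": tag, "text": text})
            self._current = None

    def handle_data(self, data):
        if self._current and self._skip_depth == 0:
            self._parts.append(data)


def refine_html(raw_html):
    """Turn raw HTML into a list of structured text blocks."""
    parser = Refiner()
    parser.feed(raw_html)
    parser.close()
    return parser.blocks


def run_pipeline(raw_html, review=None):
    """Refine HTML, let the user adjust the blocks, then build LLM input.

    `review` is a hypothetical hook standing in for user intervention:
    it receives the block list and returns the (possibly edited) list.
    """
    blocks = refine_html(raw_html)
    if review is not None:
        blocks = review(blocks)
    return "\n".join(f"[{b['tag']}] {b['text']}" for b in blocks)
```

A caller could, for instance, pass `review=lambda blocks: [b for b in blocks if not b["text"].startswith("Ad:")]` to drop unwanted blocks before anything reaches the model, which mirrors the shared-control idea of letting the user decide what the AI sees.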

The open-source release is positioned as a research prototype rather than a production-ready tool, aiming to help developers explore future collaborative agent workflows. Despite its promise, the researchers noted performance limitations under heavier workloads: "A MacBook Pro with an M4 Max and 36GB of unified memory can handle approximately 80 webpages before freezing."

Orca's positive results offer a glimpse of what future interactions between users and assisting agents might look like: AI agents that support, but do not replace, users in high-context, decision-intensive workflows.

At the time of writing, Orca is not alone in this space. Other emerging tools include OpenAI's Operator and the redesigned Opera Neon browser.