Mininglamp Open-Sources Octo: Designing the Collaboration Layer for Multi-Agent Teams

wpnews.pro

Over the past year, something changed in the AI agent space. Claude Code can independently handle full code engineering workflows. Codex can batch-execute development tasks in the background. Our own Mano-P on-device model hit a 58.2% success rate on the OSWorld benchmark, ranking first among purpose-built GUI agent models. Individual agent capability is no longer the bottleneck.

But when companies actually try to scale agent usage across teams, a different problem surfaces: these agents don't talk to each other. The developer's agent writes code. The product manager's agent organizes requirements. The ops team's agent crunches data. That is three agents producing output that none of the others can see, and syncing information still means copy-pasting by hand. The more agents a team runs, the more coordination burden falls on humans.

The problem isn't that agents aren't smart enough. It's that there's no infrastructure connecting them. Before the internet existed, every computer had compute power. What changed the world was the network protocol and communication layer that linked them together.

Octo, now open-sourced by Mininglamp Technology, targets exactly this gap. It's an open-source work platform built for human-AI agent collaboration. Its core value isn't building yet another powerful AI assistant—it's getting existing agents into the same collaboration network, forming organization-level coordination capability.

GitHub: [https://github.com/Mininglamp-OSS](https://github.com/Mininglamp-OSS)

For teams using AI agents daily, Octo addresses three specific pain points.

Deployment and distribution cost is high. The traditional agent workflow: everyone installs their own CLI, configures their environment, obtains tokens, learns commands. A 10-person team means 10 independent deployments. New hires need extra onboarding. Octo plugs agents into the IM collaboration layer—an admin adds a digital avatar to a channel and deployment is done. Team members use agents immediately, no extra installation, no configuration. Distribution efficiency jumps from individual-level to organization-level.

Agent work processes are invisible to the team. When an agent runs on someone's personal terminal, what it did, what it produced, where it stands—only that person knows. Other team members can't review the agent's execution process or intervene at critical points. Octo moves agent execution into shared team channels and threads, where all participants can see what the agent is doing and provide feedback or make decisions in real time.

Multi-agent coordination is missing. A moderately complex business process might involve research, writing, review, and execution—requiring different agents to hand off work to each other. Octo's three-level structure (spaces, channels, threads) lets multiple digital avatars be assigned, scheduled, and chained within the same workflow, forming a complete work pipeline.
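The handoff described above can be pictured as a chain of stages that each append to a shared thread log. The following is only a minimal sketch under stated assumptions: `run_pipeline`, the stage functions, and the avatar names are hypothetical illustrations and are not taken from Octo's codebase or API.

```python
from typing import Callable

# Hypothetical stage type: (avatar name, function that transforms the work item).
Stage = tuple[str, Callable[[str], str]]


def research(task: str) -> str:
    return task + " -> researched"


def write(task: str) -> str:
    return task + " -> drafted"


def review(task: str) -> str:
    return task + " -> reviewed"


def run_pipeline(task: str, stages: list[Stage]) -> tuple[str, list[str]]:
    """Pass the work item through each avatar in order and record every handoff."""
    thread_log: list[str] = []
    current = task
    for avatar_name, handle in stages:
        current = handle(current)
        # Every step lands in the shared log, so the whole team can follow it.
        thread_log.append(f"{avatar_name}: {current}")
    return current, thread_log


stages: list[Stage] = [
    ("research-avatar", research),
    ("writing-avatar", write),
    ("review-avatar", review),
]
result, thread_log = run_pipeline("launch brief", stages)
print(result)
for entry in thread_log:
    print(entry)
```

The point of the sketch is the shared log: each handoff is recorded in one place rather than in three private terminals.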

Octo's current version uses IM as its primary interface. There's a clear engineering rationale behind this.

IM is the one tool every person in an enterprise uses at high frequency daily. Plugging agent collaboration into this channel means users don't open a new app or learn new interaction patterns. Agent capability appears directly in the daily workflow. When colleagues see a digital avatar working in a channel, they naturally understand the capability is available and how to use it. Adoption spreads along the conversation flow, no separate training or rollout campaign needed.

To be clear: IM is the entry point into the collaboration network, not the entirety of Octo. Octo is fundamentally a connection layer. It connects people, digital avatars, execution agents, and external tools. As the product evolves, Octo will further integrate digital avatars with execution agents like Claude Code and Codex, making agent-to-agent collaboration visible, trackable, and manageable.

Octo doesn't position agents as public assistants or generic AI chatbots. They're digital avatars that belong to individual users. Each avatar is created by a user, trained by that user, and owned by that user. It learns the user's instructions and work style, remembers the work context assigned to it, and executes tasks in ways the user approves.

This design choice delivers two immediate benefits. Permission management stays clean: a digital avatar's permissions equal the intersection of its bot configuration and its owner's role permissions in the space. No separate permission system needed for agents. Accountability is fully traceable: every operation by a digital avatar traces back to its creator, meeting enterprise audit and compliance requirements.
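The intersection rule can be illustrated with plain sets. This is a sketch only: the `Avatar` class, the `effective_permissions` function, and the permission names are assumptions made for illustration, not Octo's actual data model.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Avatar:
    """Hypothetical digital avatar record; field names are illustrative."""
    name: str
    owner: str
    bot_permissions: frozenset[str]


def effective_permissions(
    avatar: Avatar, owner_role_permissions: frozenset[str]
) -> frozenset[str]:
    """An avatar may do only what BOTH its bot config and its owner's role allow."""
    return avatar.bot_permissions & owner_role_permissions


avatar = Avatar(
    name="research-avatar",
    owner="alice",
    bot_permissions=frozenset({"read_channel", "post_message", "delete_message"}),
)
# Hypothetical permissions granted by Alice's role in the space.
alice_role = frozenset({"read_channel", "post_message", "invite_member"})

allowed = effective_permissions(avatar, alice_role)
print(sorted(allowed))  # ['post_message', 'read_channel']
```

In this example the avatar cannot delete messages because its owner cannot, and it cannot invite members because its bot configuration does not grant that, so the avatar never exceeds its owner's authority.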

More importantly, digital avatars are always extensions of their users, never replacements. The user's judgment, taste, and tacit knowledge belong to the user—they aren't extracted or distilled. The avatar handles execution and coordination. The user handles judgment and decisions. The division of labor between human and machine stays clear.

Octo organizes collaboration through a three-level topology: spaces, channels, and threads. Spaces are fully data-isolated collaboration domains, suitable for project or department boundaries. Channels are collaboration nodes around specific topics or workflows. Threads are task execution units within channels. People and digital avatars use the same collaboration primitives in these structures, so agents need no separate scheduling system.
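One way to picture this nesting, with spaces holding channels and channels holding threads, is as a small data model. The class and field names below are hypothetical and chosen for illustration; they are not drawn from Octo's source.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Participant:
    """People and avatars share one type, mirroring the 'same primitives' idea."""
    name: str
    is_avatar: bool = False


@dataclass
class Thread:
    title: str
    participants: list[Participant] = field(default_factory=list)


@dataclass
class Channel:
    topic: str
    threads: list[Thread] = field(default_factory=list)


@dataclass
class Space:
    name: str
    channels: list[Channel] = field(default_factory=list)

    def find_thread(self, title: str) -> Thread | None:
        """Search only inside this space, reflecting its data isolation."""
        for channel in self.channels:
            for thread in channel.threads:
                if thread.title == title:
                    return thread
        return None


kickoff = Thread(
    title="kickoff",
    participants=[Participant("alice"), Participant("alice-avatar", is_avatar=True)],
)
space = Space(name="project-alpha", channels=[Channel("requirements", [kickoff])])

found = space.find_thread("kickoff")
print([p.name for p in found.participants])
```

Because a human and an avatar are both `Participant` values, assigning work to either one goes through the same thread structure.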

As agent processing capability grows, the speed at which humans can give agents information becomes the bottleneck. Octo's voice input comes with context-aware auto-correction, letting users quickly assign tasks, add background, and provide feedback by speaking. Voice editing supports natural language commands that modify existing text, such as "make the first paragraph more formal" or "delete the last sentence", reducing friction in human-agent interaction.

Octo doesn't replace users' existing tools. It provides collaboration capability alongside them. Through the browser extension, pressing Cmd+K on any webpage sends the current page context (URL, title, selected text) to your digital avatar or references it into the current collaboration thread. Feishu docs, Notion notes, GitHub Issues, Jira boards—any web-based tool can plug into the agent collaboration network this way, without the tool itself needing any adaptation.
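The page context mentioned above (URL, title, selected text) could be packaged as a small structured message. The sketch below is an assumption about shape only: the `PageContext` class, the `build_message` function, the `thread_id` field, and the message layout are hypothetical and do not describe the extension's real wire format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class PageContext:
    """The three fields the article names; everything else here is illustrative."""
    url: str
    title: str
    selected_text: str


def build_message(context: PageContext, thread_id: str) -> str:
    """Serialize the page context for delivery to a (hypothetical) thread."""
    message = {
        "type": "page_context",
        "thread_id": thread_id,
        "context": asdict(context),
    }
    return json.dumps(message)


context = PageContext(
    url="https://example.com/issues/1",
    title="Example issue",
    selected_text="Login fails on retry",
)
payload = build_message(context, thread_id="thread-001")
print(payload)
```

Since only generic page data is captured, the source tool itself would not need to change, which matches the "no adaptation" claim in the text.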

Each channel or thread can be linked to a group.md document, collaboratively filled in with AI agent guidance. In project kickoffs, requirement alignment sessions, or retrospectives, the agent doesn't just passively answer questions—it actively organizes structured discussions, guides participants through structured input, converges on consensus, and outputs a document everyone has aligned on.

Octo is open-sourced by Mininglamp Technology under the Apache 2.0 license with support for private deployment. All data stays on the user's own infrastructure, never passing through third-party servers. This matters for enterprise scenarios where data security is non-negotiable.

The reasoning behind going open source: in the AI era, software code itself becomes increasingly easy to replicate. An organization's real moat is its unique work context and judgment. Octo opens the platform capability completely, so users can focus on accumulating their own work context and developing their own digital avatars—instead of being locked into a platform.

Mininglamp Technology is a technology company focused on cognitive intelligence and data intelligence, with multiple open-source projects in the AI agent space. Mano-P, the on-device GUI agent model, achieved a 58.2% success rate on the OSWorld benchmark with its 72B version, ranking first among all purpose-built GUI agent models. The 4B quantized version runs locally on a Mac with an Apple M4 chip and 32GB of memory, paired with the Cider inference acceleration SDK for efficient on-device inference. Octo solves agent collaboration and distribution within organizations. Together, these two projects form a complete technical path from "single agent capability" to "organization-level agent collaboration."

Source: dev.to (original article)
