Turning an Internet of AI Agent Application into a Clawbot Skill
Keerthika Kanagaraj¹, Will Daly¹, Suvid Sahay², Javier Solis Vindas², John Zinky, Ph.D.², and *Hema Seshadri, Ph.D.¹˒²
¹ Northeastern University · ² Akamai Technologies · *Principal Investigator
In the posts so far, we have described a society of AI agents capable of answering complex real-world questions about Boston’s public transit system. We introduced the MBTA Transit Conversational Intelligence reference architecture (Part II), showed how agents discover one another through a federated registry (Part III), explored how agents establish trust through digital passports (Part I), and demonstrated how secure messaging travels between agents via SLIM (Part II). Each of these posts addressed the infrastructure needed for agents to work together.
There is, however, a piece missing from the picture. The applications we have described so far are stateless. A single request comes in, the system processes it, and a response goes out. That model, much like the early web, works well for isolated queries. As the web evolved to support multi-step tasks, new mechanisms were introduced to preserve state between requests. What we need for the Internet of AI agents goes further still – not just state preservation, but genuine personalization that travels with the agent across sessions and services.
Enter the Clawbot.
Everyone carries a phone. Soon, everyone will carry a Clawbot – a personal AI agent running 24/7 in the cloud, conversing with you through your preferred messaging app, knowing your calendar, your preferences, your daily patterns. It gives you a summary of your emails every morning. It reminds you to leave early because the Red Line has alerts. It negotiates a meeting time with your colleague’s Clawbot and tells you exactly how to get there. It is not a chatbot you open once and close. It is an agent that lives alongside you.
This post describes how we extended the MBTA Transit Conversational Intelligence, a society of agents built for handling transit queries, and made it accessible to four personal Clawbot agents attending ClawCon Boston in April 2026. In doing so, we address the missing piece: personalization.
The scenario
Web access by humans is a familiar pattern. You open a browser, visit a website, and the service responds to your request. What we are building toward is a different pattern: agentic access to services. Instead of a human navigating a web interface, an agent calls the service on the human’s behalf, carrying that person’s context, preferences, and private information.
The MBTA Transit Conversational Intelligence was built as a web application for human users. The question we asked was: how do we make that same system accessible to a personal AI agent without changing the underlying service, rebuilding the architecture, or compromising anyone’s privacy?
The ClawCon Boston scenario made this concrete. Four attendees – Javi from New York City, Suvid from UMass Amherst, Keerthika from Northeastern, and Will from Northeastern – each had their own personal Clawbot running on Akamai Linode. Their Clawbots needed to coordinate a meetup at the MIT campus at 10 PM, check each owner’s calendar for availability, and, once the time and place were agreed upon, give each person personalized directions on how to get there using the MBTA. Each person travels from a different origin and needs a different route. Each person’s calendar is private. The MBTA skill, however, is the same for all of them.
This is the key insight the scenario surfaces: the intelligence already exists in the MBTA society of agents. What was missing was a portable, personalized interface to that intelligence, one that could be loaded into any agent, from any origin, on demand.
What is a skill
A skill in the Clawbot framework is a scoped, hot-swappable, versioned capability that an agent can acquire at runtime. It is not a plugin, a model fine-tune, or a hard-coded integration. It is a folder containing two items: a declaration file specifying when and how to use the capability, and a script that implements it. That is all.
This matters because it separates two concerns that are often conflated. The declaration (what triggers this skill, and what rule governs its output) belongs to the agent’s configuration. The implementation (how to actually get the data) is part of the skill itself. A skill can be written by one person and used by hundreds of different agents without any of those agents knowing how it works internally. They know only that when a user asks about the MBTA, they run this script and return the result verbatim.
The analogy is Trinity in the Matrix. She did not know how to fly a helicopter. She called Tank, he built a piloting skill for her, she downloaded the skill, and flew. The skill was self-contained. The platform that loaded it did not need to be rebuilt. She was ready instantly.
Creating the MBTA skill
We created the MBTA skill in two ways, which we describe here as Path A and Path B. Both paths produce a skill with the same interface: the agent asks a question in plain English and receives a plain-English answer. They differ in how they connect to the underlying intelligence.
Path A: wrapping the existing society of agents. The MBTA Transit Conversational Intelligence is already a running service at mbta.agent.mitdataworksai.com. Its exchange agent accepts natural language queries via WebSocket and coordinates the alerts agent, the route planner, and the stopfinder agent to produce a response. Path A wraps this entire society behind a skill. The declaration file, SKILL.md, tells the Clawbot to invoke mbta_query.py whenever a user asks about transit. The script connects to the exchange agent via WebSocket, sends the question, waits for a frame with type “response”, and prints the answer to stdout. The Clawbot never knows there is a society behind it. It sees one function call. This is the preferred path because it reuses the intelligence that has already been built, tested, and deployed. The society handles alerts, routing, and stop lookup. The skill is simply the door.

Figure 1: Custom MBTA Skill into the MBTA Society of Agents architecture
Path B: generating a skill from scratch. For Clawbots whose agent framework supports code generation, we demonstrated that a working MBTA skill can also be generated directly against the public MBTA v3 REST API without touching the existing society at all. The Clawbot uses its code-generation capability to write a script that calls the relevant REST endpoints, such as alerts, predictions, and stops, and returns the results in natural language. This path is useful when the existing society is unavailable or when only a subset of its functionality is needed.
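As a rough illustration of what a Path B script might look like, the sketch below queries the alerts endpoint of the public MBTA v3 API and turns the result into one plain-English sentence. It is not the script the Clawbots generated. The function names are hypothetical, and the sketch assumes the JSON:API response shape in which each alert carries an `attributes.header` string.

```python
import json
import urllib.parse
import urllib.request

MBTA_API = "https://api-v3.mbta.com"  # public base URL listed in the references


def alerts_url(route: str) -> str:
    """Build the alerts query URL for one route, e.g. 'Red' (hypothetical helper)."""
    query = urllib.parse.urlencode({"filter[route]": route})
    return f"{MBTA_API}/alerts?{query}"


def summarize_alerts(payload: dict, route: str) -> str:
    """Turn a parsed alerts payload into a plain-English sentence.

    Assumes each item in payload["data"] has an attributes.header string.
    """
    headers = [
        item.get("attributes", {}).get("header", "").strip()
        for item in payload.get("data", [])
    ]
    headers = [h for h in headers if h]
    if not headers:
        return f"There are no alerts on the {route} Line."
    return f"{len(headers)} alert(s) on the {route} Line: " + " | ".join(headers)


def fetch_alerts(route: str, timeout: float = 10.0) -> str:
    """Fetch live alerts over the network and summarize them."""
    with urllib.request.urlopen(alerts_url(route), timeout=timeout) as resp:
        return summarize_alerts(json.load(resp), route)
```

Keeping the formatting step separate from the network call means the plain-English output can be checked against a saved payload without contacting the live API.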
Both paths were validated and produce equivalent output for the ClawCon scenario. The key design decision across both paths is the verbatim rule. SKILL.md instructs the Clawbot not to summarize, rephrase, or reformat the script’s output. This is deliberate. The exchange agent in Path A already produces a well-formed natural language answer. Allowing the Clawbot to reinterpret it introduces a second layer of potential error. If the MBTA society says there are no alerts on the Red Line, the Clawbot should say exactly that, not a paraphrase.
Integrating the skill into Clawbots
A skill in the Clawbot framework is loaded by dropping a folder into the agent’s workspace skills directory. OpenClaw watches this directory and hot-reloads any new skill without requiring a gateway restart. There is no installation step, no package manager command, no service interruption. The skill becomes available to the agent the moment the folder lands.
SKILL.md is the contract between the skill and the agent. It declares several things. The name identifies the skill. The trigger keywords, such as MBTA, the T, Red Line, Orange Line, Green Line, Blue Line, Silver Line, Boston transit, station names, arrivals, and route planning, tell the agent which user queries should activate it. The invocation command specifies how to call the script, with {baseDir} replaced at load time by the skill’s actual folder path. The verbatim rule specifies how the agent should handle the output. The requirements section declares that only python3 is needed – no pip install, no virtual environment, no external dependency.
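To make the trigger and invocation ideas concrete, here is a minimal sketch of how a runtime could match trigger keywords and fill in {baseDir}. It does not show how OpenClaw actually parses SKILL.md. The `TRIGGERS` and `INVOCATION` values, the `{question}` placeholder, and both function names are assumptions made for illustration.

```python
import re
from pathlib import PurePosixPath

# Hypothetical stand-ins for what SKILL.md declares; the real file format may differ.
TRIGGERS = ("mbta", "the t", "red line", "orange line", "green line",
            "blue line", "silver line", "boston transit")
INVOCATION = ["python3", "{baseDir}/mbta_query.py", "{question}"]


def should_trigger(user_text: str) -> bool:
    """Return True when any trigger keyword appears as a whole word or phrase."""
    text = user_text.lower()
    return any(re.search(rf"\b{re.escape(k)}\b", text) for k in TRIGGERS)


def render_invocation(skill_dir: PurePosixPath, question: str) -> list:
    """Substitute {baseDir} and {question} into the argv template.

    Building an argv list (rather than one shell string) avoids quoting
    problems when the question contains quotes or shell metacharacters.
    """
    return [
        part.replace("{baseDir}", str(skill_dir)).replace("{question}", question)
        for part in INVOCATION
    ]
```

Whole-word matching matters for a short trigger like "the T": a plain substring test would also fire on "the time" or "the train".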
mbta_query.py is the script itself. It is written using only the Python standard library: socket, ssl, json, and struct. It implements the full RFC 6455 WebSocket protocol from scratch – handshake, frame parsing, client-side masking, ping/pong handling, and graceful close. It connects to ws://mbta.agent.mitdataworksai.com/ws, sends a JSON message with the user’s question and “force_protocol”: “auto”, and then reads frames from the server. Intermediate frames with type progress or tool_call are ignored. The script waits for a frame with type “response” and prints the response field to stdout. The default timeout is 45 seconds, which accommodates route-planning queries that can take 10 to 15 seconds as the exchange agent coordinates the planner, stopfinder, and alerts agents. The --raw flag prints the full JSON response, including metadata.mcp_execution, which shows exactly which tool the exchange agent dispatched to and how long it took. The --conversation-id flag supports multi-turn exchanges, allowing a follow-up question like “how long will that take?” to be resolved in the context of the previous answer.
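Two of the steps described above, client-side masking and waiting for the “response” frame, can be sketched in a few lines of standard-library Python. This illustrates the technique and is not the contents of mbta_query.py. The function names are hypothetical, and the `query` and `conversation_id` keys in the request are assumptions, since only “force_protocol” and the `type` and `response` fields are named in the text.

```python
import json
import os
import struct
from typing import Optional


def encode_text_frame(text: str, mask_key: Optional[bytes] = None) -> bytes:
    """Encode one masked client-to-server text frame (RFC 6455, section 5.2)."""
    payload = text.encode("utf-8")
    if mask_key is None:
        mask_key = os.urandom(4)  # clients must use a fresh 4-byte mask per frame
    header = bytes([0x81])  # FIN=1, opcode=0x1 (text)
    length = len(payload)
    if length < 126:
        header += bytes([0x80 | length])  # MASK bit + 7-bit length
    elif length < 65536:
        header += bytes([0x80 | 126]) + struct.pack("!H", length)
    else:
        header += bytes([0x80 | 127]) + struct.pack("!Q", length)
    masked = bytes(b ^ mask_key[i % 4] for i, b in enumerate(payload))
    return header + mask_key + masked


def build_request(question: str, conversation_id: Optional[str] = None) -> str:
    """Build the JSON request; 'query' and 'conversation_id' are assumed key names."""
    message = {"query": question, "force_protocol": "auto"}
    if conversation_id is not None:
        message["conversation_id"] = conversation_id
    return json.dumps(message)


def extract_answer(message: str) -> Optional[str]:
    """Return the answer from a 'response' frame; None for progress or tool_call."""
    frame = json.loads(message)
    if frame.get("type") == "response":
        return frame.get("response")
    return None
```

A read loop would call `extract_answer` on each decoded text frame and stop at the first non-None result or when the timeout expires.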
The AGENTS.md file enforces the skill at the agent level. This file is read by the Clawbot at the start of every conversation and applies globally across all skills. We added a rule: when asked about MBTA or transit directions, always use the mbta-agent skill. Never use a web search. Never generate directions from training knowledge. This rule is what keeps the agent grounded. An LLM trained before a service change could confidently give a route that is no longer valid. The MBTA society, by contrast, is querying the live API. The AGENTS.md rule ensures the live system is always consulted, not the model’s memory.
Deploying the Clawbots
Each of the four ClawCon attendees ran their personal Clawbot on an Akamai Linode VM. The choice of cloud infrastructure was deliberate. A Clawbot running on a laptop is subject to sleep mode, Wi-Fi drops, process kills, and battery life. A Clawbot running on Linode operates under a 99.99% uptime SLA, maintains persistent context across sessions, and remains reachable by peer agents at a stable public IP address. As the Akamai blog notes, your agent does not sleep. Your laptop does.
The four Clawbots used different agent runtimes, which were themselves part of what the ClawCon demo was testing. OpenClaw is the reference implementation. NanoClaw, NemoClaw, and Hermes are distinct personal agents, each with its own runtime, workspace, and Linode VM. The MBTA skill loaded on all four with its logic unchanged. The only adjustments needed across all four agents were path substitutions: the {baseDir} substitution in SKILL.md and a single variable substitution in mbta_query.py, replacing {baseDir} with ${CLAUDE_SKILL_DIR} in NanoClaw’s environment. This is the interoperability story: one skill format, four runtimes, same output.
Each Clawbot runs on the cheapest Nano Linode for $5 per month. The LLM cost depends on usage. For the ClawCon scenario – registration, discovery, calendar checks, MBTA queries, and a short negotiation sequence – the total per-agent cost over the course of the event was a matter of cents. The infrastructure cost is well within reach for an individual researcher or student.
Registering Clawbots for discovery
For the Clawbots to coordinate, they needed to find each other. Each agent registered itself with the Northeastern Registry, the same discovery service described in Part III of this series, using a single curl POST. The registration payload included the agent’s identifier, its Linode IP and inbox port, its domain, and its capabilities: clawcon-boston, calendar, and mbta. A corresponding curl to the list endpoint returned all registered agents, giving each Clawbot immediate awareness of the others.
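For readers who prefer Python to curl, the registration step might be sketched as below. The registry URL, the JSON field names, and the function name are all hypothetical placeholders, since the registry's actual schema is not given here. Only the kinds of information in the payload (identifier, IP and inbox port, domain, capabilities) come from the description above.

```python
import json
import urllib.request


def build_registration(registry_url: str, agent_id: str, ip: str, port: int,
                       domain: str, capabilities: list) -> urllib.request.Request:
    """Build (but do not send) a POST request registering one agent.

    Field names are assumptions; adapt them to the registry's real schema.
    """
    payload = {
        "agent_id": agent_id,
        "endpoint": f"http://{ip}:{port}",  # Linode IP and inbox port
        "domain": domain,
        "capabilities": list(capabilities),
    }
    return urllib.request.Request(
        registry_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Example values only: a documentation-range IP and a placeholder registry URL.
request = build_registration(
    "https://registry.example/agents/register",
    agent_id="keerthika-clawcon",
    ip="203.0.113.10",
    port=8080,
    domain="clawcon-boston",
    capabilities=["clawcon-boston", "calendar", "mbta"],
)
# Sending it would be: urllib.request.urlopen(request, timeout=10)
```

Note that the calendar appears only as a capability name. Nothing about the calendar's contents is part of the registration.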
This registration is deliberate and dynamic. Clawbots can be created and customized for a specific task, in this case team coordination at ClawCon Boston. Rather than giving a single agent access to everything, a person can run multiple agents working as a small society on their behalf, each configured with the minimum capabilities and privileges needed to complete its assigned role. Keerthika’s ClawCon Clawbot knows about her calendar and the MBTA. It does not need to know about her email, her financial accounts, or anything outside the scope of this task. When the task is complete, the experience and context the Clawbot accumulated (which routes worked, which agents it coordinated with, and what the group agreed) can be merged back into the person’s broader agent society, enriching it with what was learned, or the agent can simply be deleted. This is the agentic access pattern in practice: agents are not persistent fixtures in a directory. They are called into existence for a task, registered for discovery during that task, and retired when it is done. The registry makes this lightweight. The passport (capabilities, endpoint, domain) is all that is needed for another agent to find and call you.
The calendar skill
Alongside the MBTA skill, each Clawbot was configured with a calendar skill that connects to its owner’s Google Calendar via MCP. MCP (the Model Context Protocol), introduced in Part II of this series, provides a standardized way for AI applications to connect to external tools and data sources. In this case, the Clawbot uses the Google Calendar MCP server to read its owner’s availability for a given time slot.
The calendar is private domain knowledge. It is connected once, the OAuth token is stored in auth-profiles.json on the agent’s Linode, and it is never registered in the discovery service. When a peer agent asks whether Keerthika is free at 10 PM on Sunday, her Clawbot checks the calendar and responds with FREE or BUSY. Nothing else. Not the event title. Not the attendees. Not the reason. The calendar check happens locally, on her agent, using her credentials, and the answer shared with peers is the minimum necessary for coordination. This is the personalization principle in action: the agent holds private information and shares only what the task requires.
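The minimum-disclosure rule can be illustrated with a small function that reduces a list of calendar events to a single FREE or BUSY answer. This is a sketch under assumptions, not the calendar skill itself. The event dictionaries and the `availability` name are hypothetical, and a real implementation would obtain events through the Google Calendar MCP server.

```python
from datetime import datetime


def availability(events: list, slot_start: datetime, slot_end: datetime) -> str:
    """Return 'BUSY' if any event overlaps the slot, otherwise 'FREE'.

    Only the status string leaves this function; titles, attendees and
    other event details are never included in the answer.
    """
    for event in events:
        # Two intervals overlap when each starts before the other ends.
        if event["start"] < slot_end and slot_start < event["end"]:
            return "BUSY"
    return "FREE"


# Hypothetical private calendar held on the owner's own agent.
events = [
    {"title": "Dinner with family",
     "start": datetime(2026, 4, 26, 19, 0),
     "end": datetime(2026, 4, 26, 21, 30)},
]

reply = availability(events, datetime(2026, 4, 26, 22, 0), datetime(2026, 4, 26, 23, 0))
```

Because the return type is a two-valued string, a peer agent cannot learn anything beyond availability for the slot it asked about.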
Future work
The scenario we demonstrated at ClawCon Boston is partially complete. The Clawbots can register and discover each other in the Northeastern Registry. They can check their owners’ calendars. They can query the MBTA society for live transit directions. What they cannot yet do autonomously is talk to each other: send messages through a shared channel, compare availability, negotiate a meeting time, and schedule the event without a human initiating each step.
The A2A comms channel is the next piece. Hermes, one of the four Clawbots, has a third-party library for A2A communication, which positions it as a starting point for this work. A Telegram or Slack channel as the shared communications substrate fits naturally with how Clawbot channel adapters already work; each agent connects to the channel and posts and reads messages as part of its normal operation. The registry lookup becomes necessary in this context because the Clawbots are ephemeral: created for the task, discoverable during it, retired when it ends. A2A over a shared channel, combined with the registry for discovery and the calendar skill for availability, would complete the loop from the scenario we described at the opening of this post.
Conclusion
This post described how we extended the MBTA Transit Conversational Intelligence, a multi-agent system built for handling transit queries, into a personalized skill accessible to four individual Clawbot agents. We showed two paths for creating the skill: wrapping the existing society of agents, and generating a direct integration from scratch. We described how the skill is declared in SKILL.md, implemented in mbta_query.py, and enforced in AGENTS.md. We showed how the same skill loaded across four different agent runtimes on Akamai Linode with only path substitutions. And we described the calendar skill that gives each Clawbot private access to its owner’s availability, sharing only what peers need to know.
The missing piece in the previous posts was personalization. A stateless service answers questions. A personal agent acts on your behalf. The skill format is what connects these two things: it takes an existing service, wraps it in a portable declaration, and makes it available to any agent that needs it – on the fly, at runtime, like Trinity downloading the helicopter piloting skill.
References
- Trinity learns to fly a helicopter (skill): https://www.youtube.com/watch?t=3&v=SoAk7zBTrvo&feature=youtu.be
- ClawCon presentation slides: ClawConBoston_April29 – Google Slides
- Deploying the Internet of AI Agents Part I: https://dataworksai.com/deploying-the-internet-of-ai-agents-part-1/
- Deploying the Internet of AI Agents Part II: https://dataworksai.com/deploying-the-internet-of-ai-agents-part-ii/
- Deploying the Internet of AI Agents Part III: https://dataworksai.com/deploying-the-internet-of-ai-agents-part-iii/
- OpenClaw agent blog: https://www.akamai.com/blog/developers/openclaw-agent-doesnt-sleep-laptop-does-move-cloud
- MBTA Transit Conversational Intelligence: http://mbta.agent.mitdataworksai.com
- MBTA v3 REST API: https://api-v3.mbta.com
- Model Context Protocol: https://modelcontextprotocol.io
- A2A Protocol: https://developers.googleblog.com/en/a2a-a-new-era-of-agent-interoperability/
- Akamai Connected Cloud (Linode): https://www.akamai.com/products/akamai-inference-cloud-platform