Agent Service is the FastAPI-based hosting layer that turns AgentScope agents into a multi-tenant, multi-session HTTP service. It owns everything around the agent — request routing, per-user resource lifecycle, session state, persistence, scheduling, and tool offloading — so that the agent code you wrote againstDocumentation Index
Fetch the complete documentation index at: https://docs.agentscope.io/llms.txt
Use this file to discover all available pages before exploring further.
Agent can serve production traffic without being rewritten.
What sets it apart:
- Production backbone for live agents — agent runs, background tasks, schedules, and the tool/MCP/skill/workspace lifecycle are managed end-to-end, with session streams that fan out to multiple subscribers and replay buffered history on reconnect.
- Schema-driven frontend — credentials publish JSON schemas and models expose declarative cards (input/output types, context size, parameter schemas), so the UI can render forms and capability badges without coupling to provider-specific code.
- Multi-tenant by construction — credentials, agents, sessions, schedules, and messages are all owned by the request’s
user_id, and ownership is enforced at the routing layer — one deployment serves many users with no per-tenant code paths. - Modular and extensible — authentication, chat protocols, workspace isolation strategy, storage backend, and the set of model providers and credential types are all open at the boundary, swappable without touching framework code.
Capabilities
| Capability | Description |
|---|---|
| Streaming chat | SSE-based streaming of AgentEvent objects in real time |
| Session management | Persistent sessions with state serialization across requests |
| Session replay | Late-joining clients receive full event history via buffered replay |
| Background task offloading | Long-running tool calls move to background with automatic result injection |
| Cron scheduling | Time-based agent execution with stateful or stateless sessions |
| Credential management | Secure storage and retrieval of model provider API keys |
| Protocol adaptation | Middleware-based conversion to external protocols (AG-UI, A2A, etc.) |
| Workspace management | Pluggable workspace isolation (built-in: per-agent; extensible to per-session or per-user) |
The service does not include a built-in user authentication system. It provides a placeholder
X-User-ID header dependency that you replace with your own auth middleware (JWT, OAuth, session tokens, etc.).Resource Model
Every operation in Agent Service is scoped to auser_id resolved from the request. Below that boundary, the service manages six resource types whose relationships define both the REST surface and the runtime behavior:
Understanding this graph upfront makes the REST API self-explanatory — most endpoints just create, list, or mutate one of these resources for the authenticated user.
User
Agent Service models no user system of its own.user_id is an opaque tenant identifier resolved through the get_current_user_id dependency — replace it with a one-function adapter to whatever identity system you use (JWT, OAuth2, session cookies, SSO, API tokens). See Extend Authentication.
Credential
A credential is the connection configuration for a model provider (DashScope, OpenAI, Anthropic, Ollama, …) — an API key plus any provider-specific settings. A user can register multiple credentials per provider — separate keys for LLM, TTS, or Realtime services, or a personal key alongside a team key — and reuse one credential across many agents and sessions so key rotation happens in one place. New providers are added by subclassingCredentialBase and registering through create_app(extra_credentials=[...]); they become usable to clients with no further work.
Agent
An agent record captures who the agent is — a display name, the system prompt that defines its persona, and runtime configuration such as how it manages context and runs its ReAct loop. The same agent can be driven by different LLMs across many sessions: identity belongs to the agent, model and runtime state belong to the session.Workspace
A workspace is the agent’s runtime environment — a file-system-like working area, the unified set of tools, MCPs, and skills available in it, and a place to offload compressed context for agentic search. Implementations range from the local filesystem (LocalWorkspace) to sandboxes (DockerWorkspace, E2BWorkspace); the WorkspaceManager decides how workspaces map to users, agents, and sessions — per-user, per-agent (the built-in default), or per-session isolation are all valid policies.
Session
A session is one ongoing exchange between a user and an agent. It carries:- Agent state — working memory, in-flight reply, permission context; persisted after every turn.
- Message transcript — the persisted user/assistant exchange the frontend renders; distinct from the context window the model sees, which is reconstructed per turn from agent state.
- LLM configuration — provider, model, parameters, and the credential to call it with. Because it lives on the session, the same agent can run under different models in different sessions.
- Permission level — bounds which tools the agent may invoke.
Schedule
A schedule fires an agent on a cron expression. Each fire runs inside a session — either fresh per execution (stateless) or reused so context accumulates across fires (stateful). Schedules persist and survive server restarts.Quickstart
The minimum to get a service running is a storage backend and a workspace manager. The examples below boot a service on port 8000 backed by Redis — pick the workspace backend that matches where you want the agent’s tools to execute.create_app parameters
The storage backend for persisting agents, sessions, credentials, messages, and schedules. Its lifecycle (
__aenter__ / __aexit__) is managed by the app lifespan.Manages workspaces (file storage, MCP servers, skills) with TTL-based caching. The built-in
LocalWorkspaceManager isolates per agent; see Workspace implementation and isolation for other strategies.Additional credential types to register. Each class is registered with
CredentialFactory before the app starts.Additional ASGI middlewares (e.g., protocol adapters, CORS, auth).
OpenAPI title shown in the docs UI.
API version shown in the docs UI.
Typical operation flow
Once the server is running, drive it through the resources defined in the resource model. The flow below is the path a chat session usually takes — each step is one or two REST calls.Create an agent
Register the agent’s identity — display name, system prompt, and runtime configuration. The same agent can drive many sessions under different models.
Create and configure a credential
Discover each provider’s form fields with
GET /credential/schemas, then save the API key. One credential can be reused across many sessions and agents.Create a session and select a model
Create a session bound to the agent and attach a model configuration — provider, model name, parameters, and the credential to call it with. The session owns the runtime state from here on.
Configure MCPs and skills (optional)
Attach MCP clients and skills to the session’s workspace if the agent needs tools beyond its built-ins.
X-User-ID header and a single user message:
/chat call is needed; the agent runs autonomously when the cron fires.
API Overview
The service exposes the resources from the resource model as REST endpoints, plus the streaming chat endpoint. The table below groups them by category; full request and response shapes are documented in the service’s OpenAPI specification.| Category | Endpoints | Description |
|---|---|---|
| Chat | POST /chat | Stream agent events for a session over SSE; supports replay and multi-subscriber fan-out. |
| Sessions | GET/POST/PATCH/DELETE /sessions | Create and manage chat sessions, including model binding and permission level. |
| Messages | GET /sessions/{id}/messages | Paginated message transcript for a session. |
| Agents | GET/POST/PATCH/DELETE /agent | Manage agent records — display name, system prompt, runtime config. |
| Credentials | GET/POST/PATCH/DELETE /credential | CRUD for per-provider API keys and connection configs. |
| Credential schemas | GET /credential/schemas | Discover all registered credential types and their JSON parameter schemas for form rendering. |
| Models | GET /model?provider=<name> | List candidate models for a provider, with their declarative ModelCard (capabilities and parameter schemas). |
| Schedules | GET/POST/PATCH/DELETE /schedule | Manage cron-based agent execution, stateful or stateless. |
| Background tasks | GET /background-tasks | Inspect offloaded tool executions. |
| Workspace MCPs | GET/POST /workspace/mcp, DELETE /workspace/mcp/{mcp_name} | Manage MCP clients attached to the session’s workspace. |
| Workspace skills | GET/POST /workspace/skill, DELETE /workspace/skill/{skill_name} | Manage skills available in the session’s workspace. |
Customization
The service is open at every infrastructure boundary. The sections below describe what is built in and how to plug in your own.Agent chat protocol
The chat endpoint emits AgentScope’s nativeAgentEvent stream over SSE. To serve the same agent under a different frontend protocol, install a protocol middleware that intercepts the SSE stream and rewrites each frame.
AgentScope ships with AGUIProtocolMiddleware for the AG-UI protocol. Install it via extra_middlewares:
ProtocolMiddlewareBase and implement _convert_to_protocol:
StreamingResponse objects from the chat endpoint, deserializes each SSE frame back into an AgentEvent, calls _convert_to_protocol() to produce the target format, and re-serializes the converted frame.
User authentication
The built-inget_current_user_id dependency extracts the caller identity from the X-User-ID request header — a placeholder, not authentication. Override it with your own dependency to integrate any identity system.
JWT bearer token:
Workspace implementation and isolation
Two independent axes are configurable:- Workspace backend — what runtime environment the agent runs in. Built-in implementations include
LocalWorkspace,DockerWorkspace, andE2BWorkspace. New backends implement the workspace interface and can wrap container images, sandboxes, or remote VMs. - Isolation strategy — how workspaces map to users, agents, and sessions. The built-in
LocalWorkspaceManagerkeys workspaces byagent_id: all sessions of the same agent share one workspace. To switch to per-user or per-session isolation, subclassWorkspaceManagerBaseand overrideget_workspacewith your own keying strategy.
API credentials
A new credential type is a pair of classes: aCredentialBase subclass that captures the connection config (and publishes its JSON schema for form rendering), and a ChatModelBase subclass that implements the actual streaming chat protocol against the provider’s API. The credential class is the entry point — it tells the service which chat model class to instantiate.
GET /credential/schemas, and GET /model?provider=<name> routes to the chat model class returned by get_chat_model_class().
Provider models
The model list returned byGET /model?provider=<name> is built from ModelCard instances — declarative metadata records that tell the frontend how to display each model and what request parameters are valid. Each chat model exposes its catalog through list_models(), which by default loads ModelCard entries from YAML files in the provider’s model directory; ModelCard.from_yaml() parses each YAML and merges its overrides into the base parameter schema supplied by the chat model’s parameters class.
A model card carries the following fields:
| Field | Description |
|---|---|
name | Provider-side model identifier. |
label | Display name shown in the UI. |
status | One of active, deprecated, sunset. |
deprecated_at | Deprecation timestamp, if any. |
input_types | MIME types the model accepts (e.g., text/plain, image/png, video/mp4). |
output_types | MIME types the model emits (e.g., text/plain, application/x-thinking). |
context_size | Maximum context window in tokens. |
output_size | Maximum output tokens. |
parameter_schema | JSON schema for the request parameters, auto-merged with per-model overrides. |
parameters_overrides | Per-model deltas applied on top of the base parameter schema. |
qwen3.6-plus.yaml
GET /model?provider=<name>.
Storage backend
TheStorageBase abstract class defines the persistence contract for agents, sessions, credentials, messages, and schedules. AgentScope ships with RedisStorage as the built-in implementation:
| Record | Description |
|---|---|
AgentRecord | Agent configuration (name, system prompt, context config, react config). |
SessionRecord | Session state including AgentState, model config, and workspace binding. |
CredentialRecord | Encrypted model provider API keys. |
ScheduleRecord | Cron schedule definitions with execution history. |
Msg | Persisted messages per session with pagination support. |
Service Internals
For developers who need to extend or embed the actual implementation of Agent Service in AgentScope, this section describes how the FastAPI app is wired together — what runs at startup, which managers hold runtime state, where middlewares sit in the request path, and how routers get hold of those resources.Lifespan
The lifespan context manager runs once per process. On startup it opens the storage connection pool, instantiates the in-memory managers, and restores persisted schedules so they survive restarts. On shutdown it cancels in-flight sessions and background tasks, waits for the scheduler to drain, and closes the storage pool.Managers
Four managers are bound to the FastAPI app state during the lifespan and shared across all requests:| Manager | Responsibility |
|---|---|
SessionManager | Per-session serialization (one active run per session_id), buffered SSE replay, and multi-subscriber fan-out. |
BackgroundTaskManager | Registry for tool calls offloaded by ToolOffloadMiddleware; injects results back into agent context when they finish. |
SchedulerManager | APScheduler-backed cron execution; resolves the target session (stateful or stateless) and drives the run through ChatService. |
WorkspaceManager | Workspace lifecycle and TTL-based caching; the isolation key (per-agent, per-user, per-session) is decided by the subclass. |
Middlewares
ASGI middlewares wrap every request. The two categories used in practice are protocol middlewares (e.g.,AGUIProtocolMiddleware), which intercept SSE responses from the chat endpoint and rewrite each frame into the target protocol, and observability middlewares (e.g., OpenTelemetry tracing), which run unmodified through extra_middlewares.
Dependencies
Routers receive application state through FastAPI’sDepends(). The standard injectables are:
| Dependency | Returns |
|---|---|
get_current_user_id | The caller’s user id — overridable to integrate any auth system. |
get_storage | The StorageBase instance bound to the app. |
get_session_manager | The lifespan-bound SessionManager. |
get_workspace_manager | The lifespan-bound WorkspaceManager. |
get_background_task_manager | The lifespan-bound BackgroundTaskManager. |
get_scheduler_manager | The lifespan-bound SchedulerManager. |
Further Reading
Agent
Core agent abstraction and the ReAct loop
Message & Event
Event streaming and message reconstruction
Tool
Built-in and custom tools including external execution
Context
Context compression and workspace offloading