# MCP Bridge for Virtuoso (Community Edition)

A custom MCP server that lets OpenClaw (or any LLM agent) access Virtuoso Community Edition as a semantic backend without running raw SPARQL from the agent. The MCP layer exposes structured tools that orchestrate queries and later aggregate data across additional stores (PostgreSQL, CouchDB, Qdrant).

## Vision

- LLMs never issue SQL/SPARQL directly; they call MCP tools.
- The MCP server handles orchestration, sanitization, rate limiting, and multi-source composition.
- Start with Virtuoso (SPARQL) and progressively add new connectors.

## Architecture

```
LLM Agent (OpenClaw)
        ↓
    MCP Server
    ├── Virtuoso (SPARQL)
    ├── PostgreSQL
    └── Vector DBs (e.g., Qdrant)
```

## Guardrails (current)

- `sparql_query` is **SELECT-only** and always uses a LIMIT (default `SPARQL_DEFAULT_LIMIT`).
- Any LIMIT above `SPARQL_MAX_LIMIT` is clamped.
- Example data loads are disabled unless `MCP_ALLOW_EXAMPLE_LOAD=true` is set.

## Configuration (env)

`run.sh` and `test.sh` will source a local `.env` file if present. Use `.env.example` as a template.

- `VIRTUOSO_ENDPOINT` (default `http://localhost:8891/sparql`; can be `.../sparql-auth` for digest auth)
- `VIRTUOSO_USER` / `VIRTUOSO_PASS` (optional; enables HTTP Digest auth)
- `GRAPH_URI` (used for prefix `:`)
- `SPARQL_TIMEOUT` (seconds)
- `SPARQL_UPDATE_TIMEOUT` (seconds)
- `SPARQL_DEFAULT_LIMIT`
- `SPARQL_MAX_LIMIT`
- `MCP_ALLOW_EXAMPLE_LOAD` (`true`/`false`)
- `EXAMPLE_GRAPH` (graph URI for `load_examples`, default `http://example.org/catalog#test`)

## Design Principles

1. Tool-based abstraction: Provide helpers such as `sparql_query`, `get_entities_by_type`, `list_graphs` instead of exposing raw SPARQL.
2. Gradual complexity: Ship a minimal working setup, then layer on helper tooling, schema introspection, and connectors.
3. Separation of concerns: Virtuoso stores RDF, MCP runs tool interfaces, and LLMs focus on reasoning/tool selection.
4. Guardrails: Raw queries are SELECT-only, bounded by a default LIMIT, and clamped to a maximum size.

## Success Criteria

- Phase 1: MCP tool (`sparql_query`) returns valid SPARQL JSON results.
- Phase 2: LLM relies on helper tools instead of free-form queries (Stage 2 helpers are now present).
- Phase 3: Multiple data sources accessible through a unified MCP interface.

## Example loading (test instances)

Set `MCP_ALLOW_EXAMPLE_LOAD=true` to enable the `load_examples` tool. It loads Turtle fixtures (e.g., `examples/catalog_fixture.ttl`) into the `EXAMPLE_GRAPH` (default `http://example.org/catalog#test`). This is meant for test instances only and uses harmless sample data.

**Note:** the example files are Turtle (`.ttl`) and the loader sends them as SPARQL Update with Turtle prefixes preserved.

## Current helper tools

### Core query/navigation

- `sparql_query` (SELECT-only, LIMIT enforced)
- `list_graphs`
- `search_label`
- `get_entities_by_type`
- `get_predicates_for_subject`
- `get_labels_for_subject`
- `traverse_property` (follow any property link, incoming or outgoing, and get labels/descriptions)

### Ontology discovery (generic, reusable across domain layers)

- `list_classes` (list ontology classes, optional term filter)
- `list_properties` (list ontology properties, optional term/domain/range filters)
- `describe_class` (class label/comment + properties declaring it as domain)
- `describe_property` (property label/comment/domain/range/type + usage samples)

### Relationship helpers

- `describe_subject` (see all predicates/objects for a subject with optional labels)
- `path_traverse` (walk a configured property path from a subject and return each step)
- `property_usage_statistics` (count property usage and sample subjects/objects)
- `batch_insert` (send TTL or multiple triples in a single guarded update; useful for staging domain changes)

### Update/test helpers

- `insert_triple` (single-triple update helper)
- `load_examples` (optional; requires `MCP_ALLOW_EXAMPLE_LOAD=true`; loads fixtures such as `examples/catalog_fixture.ttl`)

## MCP JSON-RPC compatibility (v0.1, minimal)

The `/mcp` endpoint supports:

1) **Legacy tool router** (unchanged): `{ "tool": "...", "input": {...} }`
2) **Minimal JSON-RPC 2.0** messages:
   - `initialize`
   - `tools/list`
   - `tools/call`

This is intended to work with OpenClaw’s MCP bridge and allow tool discovery + calling with request ids.

## Layering recommendation

Keep ontology discovery in `virtuoso_mcp` so any specialized layer (garden, inventory, analytics, etc.) can reuse it. Domain modules should call these generic tools instead of re-implementing ontology probing logic.

## Domain plugin layers

To expose domain-specific helpers automatically, set the `DOMAIN_LAYERS` environment variable to a comma-separated list of Python modules (the default is `garden_layer.plugin`). Each module must expose a `register_layer(tools)` function that receives the MCP `TOOLS` dictionary and adds prefixed entries (e.g., `garden_add_seedling`). `virtuoso_mcp` calls those hooks at startup, so simply `pip install --upgrade git+https://repo.home.world.eu.org/lucky/garden_layer.git` and update `DOMAIN_LAYERS` to include `garden_layer.plugin`.

The workspace already contains the `garden_layer` source tree (`workspace/garden_layer`), so during local iteration you can also install it in editable form: `pip install -e /home/lucky/.openclaw/workspace/garden_layer`. The new tools appear in the `/mcp` tool list (`curl -sS http://127.0.0.1:8501/ | jq .tools`) without changing the single `/mcp` endpoint surface.
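
As a rough illustration of the `register_layer(tools)` hook described above, a domain layer module could look something like the sketch below. The shape of the `TOOLS` values is not specified in this README, so the sketch assumes each entry is a plain callable that takes an input dict and returns a result dict. The handler body and the `PREFIX` constant are hypothetical placeholders, not the actual `garden_layer` implementation.

```python
"""Hypothetical domain layer module (for example, a plugin.py in your own package)."""
from typing import Any, Callable, Dict

# Assumption: a tool is a callable mapping an input dict to a result dict.
# The real TOOLS entries in virtuoso_mcp may carry more metadata.
ToolFn = Callable[[Dict[str, Any]], Dict[str, Any]]

# Prefix for every tool this layer registers, mirroring the
# garden_add_seedling naming example.
PREFIX = "garden_"


def add_seedling(params: Dict[str, Any]) -> Dict[str, Any]:
    """Placeholder handler: validates input and echoes it back.

    A real handler would presumably call the generic helpers
    (for example insert_triple) rather than return a stub.
    """
    name = params.get("name")
    if not name:
        return {"ok": False, "error": "missing 'name'"}
    return {"ok": True, "seedling": name}


def register_layer(tools: Dict[str, ToolFn]) -> None:
    """Add this layer's prefixed tools to the shared TOOLS dictionary."""
    tools[f"{PREFIX}add_seedling"] = add_seedling


if __name__ == "__main__":
    # Stand-in for the server's TOOLS dictionary, for local experimentation.
    tools: Dict[str, ToolFn] = {}
    register_layer(tools)
    print(sorted(tools))
```

Keeping every registered name behind one prefix makes it easy to see which layer owns a tool in the `/mcp` tool list and avoids collisions with the generic helpers.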