# MCP Bridge for Virtuoso (Community Edition)

A custom MCP server that lets OpenClaw (or any LLM agent) use Virtuoso Community Edition as a semantic backend without the agent running raw SPARQL itself. The MCP layer exposes structured tools that orchestrate queries; the plan is for it to later aggregate data across additional stores (PostgreSQL, CouchDB, Qdrant).

## Vision

- LLMs never issue SQL/SPARQL directly—they call MCP tools.
- The MCP server handles orchestration, sanitization, rate limiting, and multi-source composition.
- Start with Virtuoso (SPARQL) and progressively add new connectors.

## Architecture

```
LLM Agent (OpenClaw)
↓
MCP Server
├── Virtuoso (SPARQL)
├── PostgreSQL
└── Vector DBs (e.g., Qdrant)
```

## Guardrails (current)

- `sparql_query` is **SELECT-only** and always runs with a LIMIT; if the query does not specify one, `SPARQL_DEFAULT_LIMIT` is applied.
- Any LIMIT above `SPARQL_MAX_LIMIT` is clamped.
- Example data loads are disabled unless `MCP_ALLOW_EXAMPLE_LOAD=true` is set.
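
The first two guardrails can be sketched roughly as follows. This is a minimal illustration, not the server's actual code: the function name `enforce_guardrails`, the regular expressions, and the numeric limits are assumptions made for the example. The real values come from `SPARQL_DEFAULT_LIMIT` and `SPARQL_MAX_LIMIT`.

```python
import re

# Hypothetical values; the server reads SPARQL_DEFAULT_LIMIT / SPARQL_MAX_LIMIT.
DEFAULT_LIMIT = 100
MAX_LIMIT = 1000

# Optional PREFIX/BASE declarations followed by the SELECT keyword.
_SELECT_QUERY = re.compile(
    r"^\s*(?:(?:PREFIX\s+[^\s:]*:\s*<[^>]*>|BASE\s*<[^>]*>)\s*)*SELECT\b",
    re.IGNORECASE,
)
_LIMIT = re.compile(r"\bLIMIT\s+(\d+)", re.IGNORECASE)


def enforce_guardrails(query: str,
                       default_limit: int = DEFAULT_LIMIT,
                       max_limit: int = MAX_LIMIT) -> str:
    """Reject non-SELECT queries and make sure a bounded LIMIT is present."""
    if not _SELECT_QUERY.match(query):
        raise ValueError("only SELECT queries are allowed")

    def clamp(match: re.Match) -> str:
        # Keep the requested LIMIT unless it exceeds the maximum.
        return f"LIMIT {min(int(match.group(1)), max_limit)}"

    if _LIMIT.search(query):
        # Simplification: every LIMIT in the text is clamped, including ones
        # in subqueries; string literals are not parsed.
        return _LIMIT.sub(clamp, query)
    return f"{query.rstrip()}\nLIMIT {default_limit}"
```

A regex-based check like this is deliberately simple; a production guard would more likely parse the query so that subquery limits and literals are handled correctly.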

## Configuration (env)

`run.sh` and `test.sh` will source a local `.env` file if present. Use `.env.example` as a template.

- `VIRTUOSO_ENDPOINT` (default `http://localhost:8891/sparql`; can be `.../sparql-auth` for digest auth)
- `VIRTUOSO_USER` / `VIRTUOSO_PASS` (optional; enables HTTP Digest auth)
- `GRAPH_URI` (graph URI bound to the default `:` prefix in queries)
- `SPARQL_TIMEOUT` (seconds)
- `SPARQL_UPDATE_TIMEOUT` (seconds)
- `SPARQL_DEFAULT_LIMIT`
- `SPARQL_MAX_LIMIT`
- `MCP_ALLOW_EXAMPLE_LOAD` (`true`/`false`)
- `EXAMPLE_GRAPH` (graph URI for `load_examples`, default `http://example.org/catalog#test`)
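
As a rough sketch of how these variables might be read at startup: the `Settings` class and `load_settings` function are hypothetical names, and the timeout and limit defaults are placeholders chosen for the example (only the `VIRTUOSO_ENDPOINT` and `EXAMPLE_GRAPH` defaults come from the list above). Treating `MCP_ALLOW_EXAMPLE_LOAD` as case-insensitive is also an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    endpoint: str
    user: Optional[str]
    password: Optional[str]
    graph_uri: Optional[str]
    timeout: float
    update_timeout: float
    default_limit: int
    max_limit: int
    allow_example_load: bool
    example_graph: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build a Settings object from environment-style key/value pairs."""
    return Settings(
        endpoint=env.get("VIRTUOSO_ENDPOINT", "http://localhost:8891/sparql"),
        user=env.get("VIRTUOSO_USER") or None,
        password=env.get("VIRTUOSO_PASS") or None,
        graph_uri=env.get("GRAPH_URI") or None,
        # The four numeric defaults below are placeholders, not documented values.
        timeout=float(env.get("SPARQL_TIMEOUT", "30")),
        update_timeout=float(env.get("SPARQL_UPDATE_TIMEOUT", "60")),
        default_limit=int(env.get("SPARQL_DEFAULT_LIMIT", "100")),
        max_limit=int(env.get("SPARQL_MAX_LIMIT", "1000")),
        allow_example_load=env.get("MCP_ALLOW_EXAMPLE_LOAD", "false").strip().lower() == "true",
        example_graph=env.get("EXAMPLE_GRAPH", "http://example.org/catalog#test"),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests without touching the real environment.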

## Design Principles

1. Tool-based abstraction: Provide helpers such as `sparql_query`, `get_entities_by_type`, `list_graphs` instead of exposing raw SPARQL.
2. Gradual complexity: Ship a minimal working setup, then layer on helper tooling, schema introspection, and connectors.
3. Separation of concerns: Virtuoso stores RDF, MCP runs tool interfaces, and LLMs focus on reasoning/tool selection.
4. Guardrails: Raw queries are SELECT-only, bounded by a default LIMIT, and clamped to a maximum size.
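
To illustrate the tool-based abstraction in principle 1, here is one way a helper such as `get_entities_by_type` could turn a structured argument into a bounded query. The function name `build_entities_by_type_query`, the IRI check, and the query shape are assumptions for illustration; the actual helper may build its query differently.

```python
import re

# Conservative check: reject whitespace and characters that are not allowed
# inside a SPARQL IRI reference, so the value cannot break out of <...>.
_SAFE_IRI = re.compile(r"^[^\s<>\"{}|\\^`]+$")


def build_entities_by_type_query(type_iri: str, limit: int = 50) -> str:
    """Return a SELECT query listing entities of the given rdf:type."""
    if not _SAFE_IRI.match(type_iri):
        raise ValueError(f"unsafe type IRI: {type_iri!r}")
    return (
        "SELECT DISTINCT ?entity WHERE { "
        f"?entity a <{type_iri}> "
        f"}} LIMIT {int(limit)}"
    )
```

The agent only supplies a type IRI and an optional limit; the query text itself stays under the server's control.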

## Success Criteria

- Phase 1: MCP tool (`sparql_query`) returns valid SPARQL JSON results.
- Phase 2: LLM relies on helper tools instead of free-form queries (Phase 2 helpers are now present).
- Phase 3: Multiple data sources accessible through a unified MCP interface.

## Example loading (test instances)

Set `MCP_ALLOW_EXAMPLE_LOAD=true` to enable the `load_examples` tool. It loads Turtle fixtures (e.g., `examples/catalog_fixture.ttl`) into the `EXAMPLE_GRAPH` (default `http://example.org/catalog#test`). This is meant for test instances only and uses harmless sample data.

**Note:** the example files are Turtle (`.ttl`) and the loader sends them as SPARQL Update with Turtle prefixes preserved.
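
One plausible reading of that note, sketched below, is that Turtle `@prefix` declarations are rewritten as SPARQL `PREFIX` lines and the remaining triples are wrapped in an `INSERT DATA` block targeting the example graph. The function `turtle_to_insert_data` is hypothetical and only handles simple fixtures with one `@prefix` per line; the actual loader may work differently.

```python
import re

# Matches a Turtle prefix declaration such as: @prefix ex: <http://example.org/> .
_PREFIX_LINE = re.compile(r"^\s*@prefix\s+([^\s:]*:)\s*(<[^>]*>)\s*\.\s*$", re.IGNORECASE)


def turtle_to_insert_data(turtle: str, graph_uri: str) -> str:
    """Convert a simple Turtle document into a SPARQL INSERT DATA update."""
    prefixes = []
    body = []
    for line in turtle.splitlines():
        match = _PREFIX_LINE.match(line)
        if match:
            # Turtle '@prefix p: <iri> .' becomes SPARQL 'PREFIX p: <iri>'.
            prefixes.append(f"PREFIX {match.group(1)} {match.group(2)}")
        elif line.strip():
            body.append("    " + line.strip())
    opening = f"INSERT DATA {{ GRAPH <{graph_uri}> {{"
    return "\n".join(prefixes + [opening] + body + ["} }"])
```

For example, a fixture containing `@prefix ex: <http://example.org/catalog#> .` and `ex:item1 a ex:Item .` would become a `PREFIX ex: ...` line followed by an `INSERT DATA { GRAPH <...> { ... } }` block.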

## Current helper tools

### Core query/navigation
- `sparql_query` (SELECT-only, LIMIT enforced)
- `list_graphs`
- `search_label`
- `get_entities_by_type`
- `get_predicates_for_subject`
- `get_labels_for_subject`
- `traverse_property` (follow any property link, incoming or outgoing, and get labels/descriptions)

### Ontology discovery (generic, reusable across domain layers)
- `list_classes` (list ontology classes, optional term filter)
- `list_properties` (list ontology properties, optional term/domain/range filters)
- `describe_class` (class label/comment + properties declaring it as domain)
- `describe_property` (property label/comment/domain/range/type + usage samples)

### Relationship helpers
- `describe_subject` (see all predicates/objects for a subject with optional labels)
- `path_traverse` (walk a configured property path from a subject and return each step)
- `property_usage_statistics` (count property usage and sample subjects/objects)

### Update/test helpers
- `insert_triple` (single-triple update helper)
- `batch_insert` (send TTL or multiple triples in a single guarded update; useful for staging domain changes)
- `load_examples` (optional; requires `MCP_ALLOW_EXAMPLE_LOAD=true`; loads fixtures such as `examples/catalog_fixture.ttl`)

## MCP JSON-RPC compatibility (v0.1, minimal)

The `/mcp` endpoint supports:

1) **Legacy tool router** (unchanged): `{ "tool": "...", "input": {...} }`
2) **Minimal JSON-RPC 2.0** messages:
   - `initialize`
   - `tools/list`
   - `tools/call`

This is intended to work with OpenClaw’s MCP bridge and allow tool discovery + calling with request ids.
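
The two request shapes can be built as shown in this sketch. The helper names are hypothetical, and the `{"query": ...}` input for `sparql_query` is an assumed argument shape. The `tools/call` parameters (`name`, `arguments`) follow the general MCP convention; since this server describes its JSON-RPC support as minimal, check its `tools/list` output for the exact schemas.

```python
import json
from itertools import count
from typing import Any, Dict, Optional

_request_ids = count(1)  # simple increasing request ids


def legacy_request(tool: str, tool_input: Dict[str, Any]) -> Dict[str, Any]:
    """Payload for the legacy tool router."""
    return {"tool": tool, "input": tool_input}


def jsonrpc_request(method: str, params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Payload for a JSON-RPC 2.0 request with a fresh id."""
    message: Dict[str, Any] = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


def tools_call(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Payload for a tools/call request."""
    return jsonrpc_request("tools/call", {"name": name, "arguments": arguments})


if __name__ == "__main__":
    query = {"query": "SELECT ?s WHERE { ?s ?p ?o }"}  # assumed input shape
    print(json.dumps(legacy_request("sparql_query", query)))
    print(json.dumps(jsonrpc_request("tools/list")))
    print(json.dumps(tools_call("sparql_query", query)))
```

Either payload would be sent as the JSON body of a POST to the `/mcp` endpoint.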

## Layering recommendation

Keep ontology discovery in `virtuoso_mcp` so any specialized layer (garden, inventory, analytics, etc.) can reuse it. Domain modules should call these generic tools instead of re-implementing ontology probing logic.

## Domain plugin layers

To expose domain-specific helpers automatically, set the `DOMAIN_LAYERS` environment variable to a comma-separated list of Python modules (default: `garden_layer.plugin`). Each module must expose a `register_layer(tools)` function that receives the MCP `TOOLS` dictionary and adds prefixed entries (e.g., `garden_add_seedling`). `virtuoso_mcp` calls those hooks at startup.

To add a layer, install its package and make sure its module is listed in `DOMAIN_LAYERS`. For the garden layer, run `pip install --upgrade git+https://repo.home.world.eu.org/lucky/garden_layer.git`. The workspace already contains the `garden_layer` source tree (`workspace/garden_layer`), so during local iteration you can instead install it in editable form: `pip install -e /home/lucky/.openclaw/workspace/garden_layer`.

The new tools appear in the `/mcp` tool list (`curl -sS http://127.0.0.1:8501/ | jq .tools`) without changing the single `/mcp` endpoint surface.
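
The plugin contract can be sketched as below. Only `register_layer(tools)`, `DOMAIN_LAYERS`, and the `garden_add_seedling` name come from this README; the handler signature (a dict in, a dict out), the handler body, and the `parse_layers` / `load_domain_layers` helpers are assumptions about how the startup hook might look.

```python
import importlib
from typing import Any, Callable, Dict, Iterable, List

# Assumed handler shape: takes the tool input dict, returns a result dict.
ToolFn = Callable[[Dict[str, Any]], Dict[str, Any]]


def garden_add_seedling(tool_input: Dict[str, Any]) -> Dict[str, Any]:
    """Placeholder handler; a real layer would write triples via the core tools."""
    return {"status": "ok", "species": tool_input.get("species")}


def register_layer(tools: Dict[str, ToolFn]) -> None:
    """Entry point a domain module exposes: add prefixed tools to the registry."""
    tools["garden_add_seedling"] = garden_add_seedling


def parse_layers(value: str) -> List[str]:
    """Split a DOMAIN_LAYERS value into module names, ignoring blanks."""
    return [name.strip() for name in value.split(",") if name.strip()]


def load_domain_layers(tools: Dict[str, ToolFn], module_names: Iterable[str]) -> None:
    """Import each configured module and let it register its tools."""
    for name in module_names:
        module = importlib.import_module(name)
        module.register_layer(tools)
```

At startup the server would then do something along the lines of `load_domain_layers(TOOLS, parse_layers(os.environ.get("DOMAIN_LAYERS", "garden_layer.plugin")))`.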