Connect Local Models
By the end of this page, you'll have a locally-running model (served by Ollama, vLLM, LM Studio, or llama.cpp) connected to KruxOS.
KruxOS v0.0.1 ships local-provider support through the same OpenAI-compatible function-calling adapter: point any server that speaks that format at the appliance.
Prerequisites

- A running KruxOS instance (Install)
- An agent token (64-char hex) from `kruxos agent create` or the wizard
- One of: Ollama, vLLM, LM Studio, or llama.cpp's `server` running and reachable from the appliance
- The bundled Python SDK at `/opt/kruxos/sdk/python/` (auto-importable on the appliance; see the sketch after this list if it is not on your import path). The external `pip install kruxos` distribution ships in v0.0.3.
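If you run the examples outside the appliance's default environment, the bundled SDK may not be on Python's import path. A minimal sketch of wiring it up by hand, assuming only the path listed above:

```python
import sys

# Only needed when the bundled SDK is not already importable; in the
# appliance's default environment it is picked up automatically.
sys.path.insert(0, "/opt/kruxos/sdk/python")

from kruxos import KruxOS  # imported after the path tweak
```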
Pull a model
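The quickstart below uses `llama3.1`, so make sure Ollama has it locally first. A minimal sketch using the Ollama Python client (`pip install ollama` is an assumption here; running `ollama pull llama3.1` from a shell does the same thing):

```python
import ollama

# Download the model used in the quickstart below; returns a status
# once the pull completes (equivalent to `ollama pull llama3.1`).
status = ollama.pull("llama3.1")
print(status)
```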
Expected output: a success status once the model has finished downloading.
Connect and use capabilities
import asyncio

import ollama

from kruxos import KruxOS
from kruxos.connectors.local import LocalAdapter


async def main():
    # Connect to KruxOS
    os = await KruxOS.connect_async(
        endpoint="ws://localhost:7700",
        agent_name="my-agent",
        api_key="7f3a8c1d2e9b5a4f8e6c1d3b7a9f2e5c8d1b4a7f3c9e6d8b1a4c7f2e5d9b8a3c",
        purpose="Local model quickstart",
    )
    try:
        # Create the adapter; it exposes capabilities as OpenAI-compatible tool definitions
        adapter = LocalAdapter(os)
        tools = adapter.as_tools()
        print(f"Registered {len(tools)} tools")

        # Use with Ollama
        response = ollama.chat(
            model="llama3.1",
            messages=[{"role": "user", "content": "List files in /workspace"}],
            tools=tools,
        )

        # Execute any tool calls the model requested
        for tool_call in response["message"].get("tool_calls") or []:
            result = await adapter.execute(
                tool_call["function"]["name"],
                tool_call["function"]["arguments"],
            )
            print(f"Tool: {tool_call['function']['name']}")
            print(f"Result: {result}")
    finally:
        await os.close_async()


asyncio.run(main())
Expected output:
Registered 89 tools
Tool: filesystem__list
Result: {"entries": [{"name": "hello.txt", "type": "file", "size": 21}]}
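The example above executes the tool calls but never shows the model the results. To get a natural-language answer, feed each result back as a `tool` message and run a second chat turn. A minimal sketch under that assumption (the helper name and result formatting are illustrative, not part of the SDK); it could replace the execution loop inside `main()` above:

```python
import json

import ollama


async def answer_with_tool_results(adapter, response, prompt="List files in /workspace"):
    # Start from the original prompt plus the model's reply that requested the tools.
    messages = [{"role": "user", "content": prompt}, response["message"]]

    # Execute each requested tool and append its result as a "tool" message.
    for tool_call in response["message"].get("tool_calls") or []:
        result = await adapter.execute(
            tool_call["function"]["name"],
            tool_call["function"]["arguments"],
        )
        messages.append({"role": "tool", "content": json.dumps(result)})

    # Second turn: the model now sees the tool output and can answer in prose.
    followup = ollama.chat(model="llama3.1", messages=messages)
    return followup["message"]["content"]
```

Calling `print(await answer_with_tool_results(adapter, response))` inside the `try` block would print the model's final answer.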
Other OpenAI-compatible servers
The local adapter works with any server that speaks the OpenAI tool-calling format:

- vLLM: `pip install vllm` and point to its endpoint
- llama.cpp server: use the `--api-like-oai` flag
- LM Studio: enable the server and use the same adapter
The adapter produces standard OpenAI function-calling tool definitions, compatible with any server that accepts that format.
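To drive those servers directly, pass the same tool definitions to any OpenAI-compatible client. A minimal sketch using the `openai` Python package; the base URL, API key, and model name are placeholders for whatever your vLLM, LM Studio, or llama.cpp server exposes, and `adapter`/`tools` come from the quickstart above:

```python
import json

from openai import OpenAI

# Placeholder endpoint; local servers usually ignore the API key.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")


async def run_openai_compatible(adapter, tools):
    # The tool definitions from adapter.as_tools() are standard OpenAI
    # function-calling schemas, so they can be passed through unchanged.
    response = client.chat.completions.create(
        model="local-model",  # placeholder: use the name your server reports
        messages=[{"role": "user", "content": "List files in /workspace"}],
        tools=tools,
    )

    # OpenAI-style responses return tool arguments as a JSON string,
    # so decode them before handing them to the adapter.
    for tool_call in response.choices[0].message.tool_calls or []:
        result = await adapter.execute(
            tool_call.function.name,
            json.loads(tool_call.function.arguments),
        )
        print(f"Tool: {tool_call.function.name}")
        print(f"Result: {result}")
```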
Next steps
- Web Dashboard — monitor agent activity
- Managing Agents — create dedicated agents per model
- Policies — configure what each agent can do