Connect Local Models

The local model adapter connects KruxOS to Ollama and any OpenAI-compatible inference server.

Ollama

Setup

from ollama import Client
from kruxos import KruxOS
from kruxos.connectors.local import LocalAdapter

ollama = Client()
agent = KruxOS.connect("localhost:7700", api_key="<64-char hex>")
adapter = LocalAdapter(agent)

Usage

tools = adapter.as_tools()

response = ollama.chat(
    model="llama3.1:70b",
    messages=[{"role": "user", "content": "Read /workspace/README.md"}],
    tools=tools,
)

if response.message.tool_calls:
    for tool_call in response.message.tool_calls:
        result = adapter.handle_tool_call(tool_call)

Agentic loop

messages = [{"role": "user", "content": "List all Python files in /workspace"}]

while True:
    response = ollama.chat(
        model="llama3.1:70b",
        messages=messages,
        tools=tools,
    )

    messages.append(response.message)

    if not response.message.tool_calls:
        print(response.message.content)
        break

    for tool_call in response.message.tool_calls:
        result = adapter.handle_tool_call(tool_call)
        messages.append({
            "role": "tool",
            "tool_name": tool_call.function.name,
            "content": str(result),
        })

OpenAI-compatible servers

Any server that implements the OpenAI chat completions API with function calling works with the OpenAI adapter:
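With these servers, adapter.as_tools() returns tool definitions in the standard OpenAI function-calling schema. A sketch of what one entry might look like follows; the read_file tool name, description, and parameters are assumptions for illustration, not actual KruxOS tool definitions.

```python
# Hypothetical example of a single entry in the list returned by as_tools().
# The "read_file" tool and its parameters are assumptions for illustration;
# the outer structure follows the OpenAI function-calling schema.
read_file_tool = {
    "type": "function",
    "function": {
        "name": "read_file",
        "description": "Read a file from the agent's workspace.",
        "parameters": {
            "type": "object",
            "properties": {
                "path": {
                    "type": "string",
                    "description": "Absolute path to the file.",
                },
            },
            "required": ["path"],
        },
    },
}
```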

vLLM

from openai import OpenAI
from kruxos.connectors.openai import OpenAIAdapter

client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")
adapter = OpenAIAdapter(agent)  # agent from the KruxOS setup above
tools = adapter.as_tools()

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-70B-Instruct",
    messages=[{"role": "user", "content": "..."}],
    tools=tools,
)

LM Studio

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")
# Same adapter usage as above

llama.cpp server

client = OpenAI(base_url="http://localhost:8080/v1", api_key="unused")
# Same adapter usage as above

Model recommendations

Model           Size   Tool-calling quality   Notes
Llama 3.1 70B   70B    Good                   Best open-source option for tool use
Llama 3.1 8B    8B     Moderate               Acceptable for simple tasks
Mistral Large   123B   Good                   Strong function calling
Qwen2.5 72B     72B    Good                   Strong reasoning
Command R+      104B   Good                   Built for tool use

Structured errors help smaller models

KruxOS's structured error responses include explicit recovery suggestions, which significantly improve smaller models' ability to recover from failures: they don't have to guess what went wrong.
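To illustrate why this helps, a structured error might look like the sketch below. The field names here are assumptions for illustration, not the actual KruxOS error schema.

```python
# Hypothetical shape of a structured error response; field names are
# assumptions, not the real KruxOS schema.
error = {
    "error": "file_not_found",
    "message": "/workspace/README.md does not exist",
    "recovery": "Call list_files on /workspace to discover available files.",
}

# A small model (or the calling code) can branch on the machine-readable
# code instead of parsing free-form error prose:
if error["error"] == "file_not_found":
    next_action = error["recovery"]
```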

Verifying connection

kruxos status

Local model agents appear the same as any other agent in the status output.