Go AI Client Library

License: MIT

Migrating? MIGRATION.md covers two transitions: v0.18.x → v0.1.0 (single module split into ~50 per-vendor modules) and v0.1.x → v0.2.0 (memory and session lifted out of agent/ to top-level modules).

A multi-provider Go library for AI: LLMs, embeddings, image generation, TTS, STT, rerankers, and fill-in-the-middle. Each capability is a modality module and each vendor implementation is its own sub-module — you import only the SDKs you actually use.

Documentation

Features

  • Per-vendor modules — Pull only the SDKs you need; no transitive bloat
  • LLM — Chat, streaming, tool calling, structured output, reasoning
  • Agent framework — Sub-agents, handoffs, fan-out, sessions, persistent memory, context strategies
  • Voice agent — Low-latency streaming STT → LLM → TTS pipeline with barge-in, filler audio, tool-call sounds, sessions, hooks, handoffs, toolsets, and memory
  • Embeddings — Text, multimodal, and contextualized
  • Image generation — OpenAI, Gemini, xAI
  • Audio — TTS (ElevenLabs, OpenAI, Google Cloud, Azure Speech) and STT (OpenAI Whisper, ElevenLabs Scribe, Deepgram, AssemblyAI, Google Cloud)
  • Rerankers — Voyage AI, Cohere
  • Fill-in-the-middle — Mistral, DeepSeek
  • Batch processing — Native batch APIs (OpenAI, Anthropic, Gemini) or bounded concurrency for any provider
  • MCP integration — Model Context Protocol tooling
  • OpenTelemetry tracing — GenAI semantic conventions across every provider call
  • Cost tracking — Token / character usage with cost calculation

Module structure

The library is published as ~50 independent Go modules organised by tier:

  • Tier 0 leaves: model, message, tool, schema, tracing, prompt, types (no vendor SDKs)
  • Tier 1 modality interfaces: llm, embeddings, tts, stt, image, rerankers, fim (no vendor SDKs)
  • Tier 2 vendor implementations: llm/openai, llm/anthropic, embeddings/voyage, tts/elevenlabs, etc. (carry the vendor SDK)
  • Tier 3 utilities: tokens/{sliding,truncate,summarize}, batch/{openai,anthropic,gemini,concurrent}
  • Tier 4 agent runtime: agent, agent/team, session, memory, voice
  • Tier 5 persistence: memory/{pgvector,postgres,sqlite}

See the full module list for every package, its purpose, and the vendor SDK it carries.

Installation

You install only the modules you use. For an OpenAI chat client:

go get github.com/joakimcarlsson/ai/llm
go get github.com/joakimcarlsson/ai/llm/openai
go get github.com/joakimcarlsson/ai/message
go get github.com/joakimcarlsson/ai/model

Quick start

package main

import (
    "context"
    "fmt"
    "log"
    "os"

    llmopenai "github.com/joakimcarlsson/ai/llm/openai"
    "github.com/joakimcarlsson/ai/message"
    "github.com/joakimcarlsson/ai/model"
)

func main() {
    client := llmopenai.NewLLM(
        llmopenai.WithAPIKey(os.Getenv("OPENAI_API_KEY")),
        llmopenai.WithModel(model.OpenAIModels[model.GPT4o]),
    )

    response, err := client.SendMessages(context.Background(), []message.Message{
        message.NewUserMessage("Hello, how are you?"),
    }, nil)
    if err != nil {
        log.Fatal(err)
    }

    fmt.Println(response.Content)
}

Supported providers

Provider LLM Embeddings Images TTS STT Rerankers FIM
OpenAI
Anthropic
Google Gemini
Google Cloud
AWS Bedrock
Azure OpenAI
Azure Speech
Vertex AI
Groq
OpenRouter
xAI
Voyage AI
Cohere
Mistral
DeepSeek
ElevenLabs
Deepgram
AssemblyAI

Plus any OpenAI-compatible endpoint via BYOM (bring your own model).

Agent framework

import (
    "github.com/joakimcarlsson/ai/agent"
    "github.com/joakimcarlsson/ai/session"
)

myAgent := agent.New(llmClient,
    agent.WithSystemPrompt("You are a helpful assistant."),
    agent.WithTools(&weatherTool{}),
    agent.WithSession("user-123", session.FileStore("./sessions")),
)

response, _ := myAgent.Chat(ctx, "What's the weather in Tokyo?")

The agent framework supports sub-agents, handoffs, fan-out, team coordination, continue/resume, context strategies, persistent memory, and instruction templates.

Voice agent

voice/ ships a streaming STT → LLM → TTS pipeline for building low-latency, voice-first conversational agents. Providers are pluggable: bring any stt.SpeechToText, llm.LLM, and tts.Generation implementation.

import (
    "github.com/joakimcarlsson/ai/session"
    "github.com/joakimcarlsson/ai/voice"
)

agent := voice.New(llmClient, sttClient, ttsClient,
    voice.WithSystemPrompt("You are a concise voice assistant."),
    voice.WithTools(myTool),
    voice.WithBargeIn(voice.BargeInInterrupt),
    voice.WithSession("user-42", session.MemoryStore()),
)

conv, _ := agent.StartConversation(ctx, audioTransport)
for evt := range conv.Events() {
    // observe transcripts, tool calls, deltas, etc.
}

The voice agent supports barge-in, filler audio and tool-call sounds for slow first tokens / tool execution, sessions, context strategies, hooks, handoffs, toolsets, and memory. Four runnable end-to-end examples live under examples/voice/: web (kitchen-sink), handoff, toolsets, and memory.

Batch processing

Each batch backend is its own module. Native batch APIs submit a single async job; the concurrent runner wraps an existing client with bounded concurrency.

import (
    "fmt"

    "github.com/joakimcarlsson/ai/batch"
    batchopenai "github.com/joakimcarlsson/ai/batch/openai"
    "github.com/joakimcarlsson/ai/model"
)

proc := batchopenai.NewProcessor(
    batchopenai.WithAPIKey("your-api-key"),
    batchopenai.WithModel(model.OpenAIModels[model.GPT4o]),
)

requests := []batch.Request{
    {ID: "q1", Type: batch.RequestTypeChat, Messages: msgs1},
    {ID: "q2", Type: batch.RequestTypeChat, Messages: msgs2},
}

resp, _ := proc.Process(ctx, requests)
for _, r := range resp.Results {
    fmt.Printf("[%s] %s\n", r.ID, r.ChatResponse.Content)
}

Per-item error handling, progress callbacks, and async channel-based tracking are all supported. See the batch processing docs.

Workspace setup

The repo is a pure Go workspace with no root module. To work locally:

git clone https://github.com/joakimcarlsson/ai
cd ai
cp go.work.example go.work   # go.work is gitignored
go build ./...

go.work.example is the canonical workspace file checked into git; contributors copy or symlink it to go.work.

Versioning

Each module is versioned independently using path-prefixed git tags. The tag prefix must match the subdirectory path exactly — this is how the Go module system resolves versions.

| Module | Tag format | Example |
| --- | --- | --- |
| llm/openai | llm/openai/vX.Y.Z | llm/openai/v0.1.0 |
| embeddings/voyage | embeddings/voyage/vX.Y.Z | embeddings/voyage/v0.1.0 |
| agent | agent/vX.Y.Z | agent/v0.2.0 |
| memory/pgvector | memory/pgvector/vX.Y.Z | memory/pgvector/v0.1.0 |

All modules follow semantic versioning.
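The tag format is mechanical enough to express in a few lines. The helpers below (moduleTag, splitTag) are hypothetical and not part of the repo's tooling; they are a sketch of how a tag is composed from a module directory and a version, and how it can be taken apart again.

```go
package main

import (
	"fmt"
	"strings"
)

// moduleTag builds the git tag for a module directory and version,
// e.g. ("llm/openai", "v0.1.0") -> "llm/openai/v0.1.0".
func moduleTag(dir, version string) string {
	return strings.Trim(dir, "/") + "/" + version
}

// splitTag reverses moduleTag: the version is the last path element and
// everything before it is the module directory. It only checks for the
// leading "v"; it does not validate the full semantic version.
func splitTag(tag string) (dir, version string, err error) {
	i := strings.LastIndex(tag, "/")
	if i <= 0 || !strings.HasPrefix(tag[i+1:], "v") {
		return "", "", fmt.Errorf("tag %q is not of the form <dir>/vX.Y.Z", tag)
	}
	return tag[:i], tag[i+1:], nil
}

func main() {
	tag := moduleTag("memory/pgvector", "v0.1.0")
	dir, version, err := splitTag(tag)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(tag, dir, version)
}
```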

Release process

Releases follow the AWS SDK v2 pattern: CI on main is the safety net, git tags drive go get resolution, and dated GitHub Releases provide changelogs.

1. Ensure main is green

CI must pass on the latest commit before tagging.

2. Tag modules that changed

# List every module
scripts/release.sh modules

# Tag a single module (dry-run — creates local tag only)
scripts/release.sh tag -m llm/openai -v v0.1.0

# Tag and push
make release-tag MODULE=llm/openai VERSION=v0.1.0

The script verifies the module's go.mod exists and the tag prefix matches the directory path.

3. Cascade to direct consumers (when bumping a shared module)

When the changed module is a shared dependency that surfaces user-facing symbols (model, message, memory, etc.), open a branch and bump the require line in every module that a typical user would go get to access the new symbols. For example, a model change adding new Gemini constants cascades to llm/gemini, image/gemini, embeddings/gemini, batch/gemini, and llm/vertexai, but not to unrelated providers (llm/openai, llm/anthropic) or to umbrella modules that take a user-built client (agent, voice).

cd llm/gemini && go mod edit -require=github.com/joakimcarlsson/ai/model@v0.2.0 && go mod tidy
# ...repeat for each direct consumer, then commit + PR

After the PR merges, tag each cascaded module with a patch bump. Without the cascade, users running go get llm/gemini@latest get the old model version via MVS and the new constants don't resolve in their build.
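Why the cascade matters follows from how minimal version selection (MVS) works: for each module, the build uses the highest version that some go.mod in the build actually requires, and ignores newer published versions that nothing requires. The sketch below is a deliberately simplified, single-module illustration of that rule, not the Go toolchain's implementation; the function names and the version numbers are illustrative assumptions.

```go
package main

import "fmt"

// parse reads a "vMAJOR.MINOR.PATCH" string. Pre-release and build
// suffixes are not handled in this sketch.
func parse(v string) (p [3]int, err error) {
	_, err = fmt.Sscanf(v, "v%d.%d.%d", &p[0], &p[1], &p[2])
	return p, err
}

// less reports whether version a sorts before version b.
func less(a, b [3]int) bool {
	for i := range a {
		if a[i] != b[i] {
			return a[i] < b[i]
		}
	}
	return false
}

// selected returns the version MVS would pick for one module: the highest
// of the versions required anywhere in the build.
func selected(required []string) (string, error) {
	if len(required) == 0 {
		return "", fmt.Errorf("no requirements given")
	}
	best := ""
	var bestP [3]int
	for _, v := range required {
		p, err := parse(v)
		if err != nil {
			return "", fmt.Errorf("bad version %q: %w", v, err)
		}
		if best == "" || less(bestP, p) {
			best, bestP = v, p
		}
	}
	return best, nil
}

func main() {
	// Before the cascade: the only require line for model says v0.1.0,
	// so a published v0.2.0 is never selected.
	before, err := selected([]string{"v0.1.0"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// After the cascade: a re-tagged consumer requires model v0.2.0.
	after, err := selected([]string{"v0.1.0", "v0.2.0"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("before:", before, "after:", after)
}
```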

Skip this step when the changed module isn't a shared dep (e.g. a fix internal to llm/anthropic), or when only the module's own go.mod changed (indirect dep bumps, tidy cleanup). Those have no consumer-visible effect and don't need tags at all.

4. Warm the Go module proxy

scripts/release.sh warm -t llm/openai/v0.1.0

This ensures the tagged version is immediately available via go get. Run it for every tag created in steps 2 and 3.
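What scripts/release.sh warm does internally is not shown here. A common way to warm a proxy is to request the version's .info endpoint defined by the GOPROXY protocol, so the sketch below only builds that URL. It assumes the public proxy at proxy.golang.org and the repo root github.com/joakimcarlsson/ai taken from the go get lines above; infoURL and escapePath are hypothetical helper names.

```go
package main

import (
	"fmt"
	"strings"
)

// escapePath applies the proxy protocol's case encoding: each ASCII
// uppercase letter becomes '!' followed by its lowercase form.
func escapePath(p string) string {
	var b strings.Builder
	for _, r := range p {
		if r >= 'A' && r <= 'Z' {
			b.WriteByte('!')
			r += 'a' - 'A'
		}
		b.WriteRune(r)
	}
	return b.String()
}

// infoURL maps a repo root and a path-prefixed tag such as
// "llm/openai/v0.1.0" to the proxy's version-info endpoint.
func infoURL(proxy, repoRoot, tag string) (string, error) {
	i := strings.LastIndex(tag, "/")
	if i <= 0 || i == len(tag)-1 {
		return "", fmt.Errorf("tag %q is not of the form <dir>/<version>", tag)
	}
	module := repoRoot + "/" + tag[:i]
	return fmt.Sprintf("%s/%s/@v/%s.info",
		strings.TrimRight(proxy, "/"), escapePath(module), escapePath(tag[i+1:])), nil
}

func main() {
	u, err := infoURL("https://proxy.golang.org", "github.com/joakimcarlsson/ai", "llm/openai/v0.1.0")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(u)
}
```

Fetching the printed URL once (for example with an HTTP GET) asks the proxy to look up and cache that version.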

5. Create a dated GitHub Release

# Dry-run (shows what would be published)
scripts/release.sh release

# Publish
make release-publish

This creates a release-YYYY-MM-DD tag and a GitHub Release listing all module tags created since the previous release.

License

See LICENSE file.
