Migrating? MIGRATION.md covers two transitions:
v0.18.x → v0.1.0 (single module split into ~50 per-vendor modules) and v0.1.x → v0.2.0 (memory and session lifted out of agent/ to top-level modules).
A multi-provider Go library for AI: LLMs, embeddings, image generation, TTS, STT, rerankers, and fill-in-the-middle. Each capability is a modality module and each vendor implementation is its own sub-module — you import only the SDKs you actually use.
- Per-vendor modules — Pull only the SDKs you need; no transitive bloat
- LLM — Chat, streaming, tool calling, structured output, reasoning
- Agent framework — Sub-agents, handoffs, fan-out, sessions, persistent memory, context strategies
- Voice agent — Low-latency streaming STT → LLM → TTS pipeline with barge-in, filler audio, tool-call sounds, sessions, hooks, handoffs, toolsets, and memory
- Embeddings — Text, multimodal, and contextualized
- Image generation — OpenAI, Gemini, xAI
- Audio — TTS (ElevenLabs, OpenAI, Google Cloud, Azure Speech) and STT (OpenAI Whisper, ElevenLabs Scribe, Deepgram, AssemblyAI, Google Cloud)
- Rerankers — Voyage AI, Cohere
- Fill-in-the-middle — Mistral, DeepSeek
- Batch processing — Native batch APIs (OpenAI, Anthropic, Gemini) or bounded concurrency for any provider
- MCP integration — Model Context Protocol tooling
- OpenTelemetry tracing — GenAI semantic conventions across every provider call
- Cost tracking — Token / character usage with cost calculation
The library is published as ~50 independent Go modules organised by tier:
- Tier 0 leaves: model, message, tool, schema, tracing, prompt, types (no vendor SDKs)
- Tier 1 modality interfaces: llm, embeddings, tts, stt, image, rerankers, fim (no vendor SDKs)
- Tier 2 vendor implementations: llm/openai, llm/anthropic, embeddings/voyage, tts/elevenlabs, etc. (carry the vendor SDK)
- Tier 3 utilities: tokens/{sliding,truncate,summarize}, batch/{openai,anthropic,gemini,concurrent}
- Tier 4 agent runtime: agent, agent/team, session, memory, voice
- Tier 5 persistence: memory/{pgvector,postgres,sqlite}
See the full module list for every package, its purpose, and the vendor SDK it carries.
You install only the modules you use. For an OpenAI chat client:
```bash
go get github.com/joakimcarlsson/ai/llm
go get github.com/joakimcarlsson/ai/llm/openai
go get github.com/joakimcarlsson/ai/message
go get github.com/joakimcarlsson/ai/model
```

```go
package main

import (
	"context"
	"fmt"
	"log"
	"os"

	llmopenai "github.com/joakimcarlsson/ai/llm/openai"
	"github.com/joakimcarlsson/ai/message"
	"github.com/joakimcarlsson/ai/model"
)

func main() {
	// Read the API key from the environment rather than hard-coding it.
	client := llmopenai.NewLLM(
		llmopenai.WithAPIKey(os.Getenv("OPENAI_API_KEY")),
		llmopenai.WithModel(model.OpenAIModels[model.GPT4o]),
	)

	// Send a single user message and wait for the complete response.
	response, err := client.SendMessages(context.Background(), []message.Message{
		message.NewUserMessage("Hello, how are you?"),
	}, nil)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(response.Content)
}
```

| Provider | LLM | Embeddings | Images | TTS | STT | Rerankers | FIM |
|---|---|---|---|---|---|---|---|
| OpenAI | ✅ | ✅ | ✅ | ✅ | ✅ | | |
| Anthropic | ✅ | | | | | | |
| Google Gemini | ✅ | ✅ | ✅ | | | | |
| Google Cloud | | | | ✅ | ✅ | | |
| AWS Bedrock | ✅ | ✅ | | | | | |
| Azure OpenAI | ✅ | | | | | | |
| Azure Speech | | | | ✅ | | | |
| Vertex AI | ✅ | | | | | | |
| Groq | ✅ | | | | | | |
| OpenRouter | ✅ | | | | | | |
| xAI | ✅ | | ✅ | | | | |
| Voyage AI | | ✅ | | | | ✅ | |
| Cohere | ✅ | ✅ | | | | ✅ | |
| Mistral | ✅ | ✅ | | | | | ✅ |
| DeepSeek | ✅ | | | | | | ✅ |
| ElevenLabs | | | | ✅ | ✅ | | |
| Deepgram | | | | | ✅ | | |
| AssemblyAI | | | | | ✅ | | |

Plus any OpenAI-compatible endpoint via BYOM (bring your own model).
```go
import (
	"log"

	"github.com/joakimcarlsson/ai/agent"
	"github.com/joakimcarlsson/ai/session"
)

// llmClient, ctx, and weatherTool are assumed to be defined elsewhere.
myAgent := agent.New(llmClient,
	agent.WithSystemPrompt("You are a helpful assistant."),
	agent.WithTools(&weatherTool{}),
	agent.WithSession("user-123", session.FileStore("./sessions")),
)

response, err := myAgent.Chat(ctx, "What's the weather in Tokyo?")
if err != nil {
	log.Fatal(err)
}
```

The agent framework supports sub-agents, handoffs, fan-out, team coordination, continue/resume, context strategies, persistent memory, and instruction templates.
voice/ ships a streaming STT → LLM → TTS pipeline for building low-latency, voice-first conversational agents. Pluggable providers — bring any stt.SpeechToText, llm.LLM, and tts.Generation implementation.
```go
import (
	"log"

	"github.com/joakimcarlsson/ai/session"
	"github.com/joakimcarlsson/ai/voice"
)

// llmClient, sttClient, ttsClient, myTool, ctx, and audioTransport are
// assumed to be defined elsewhere.
agent := voice.New(llmClient, sttClient, ttsClient,
	voice.WithSystemPrompt("You are a concise voice assistant."),
	voice.WithTools(myTool),
	voice.WithBargeIn(voice.BargeInInterrupt),
	voice.WithSession("user-42", session.MemoryStore()),
)

conv, err := agent.StartConversation(ctx, audioTransport)
if err != nil {
	log.Fatal(err)
}
for evt := range conv.Events() {
	// observe transcripts, tool calls, deltas, etc.
	_ = evt // placeholder so the loop variable counts as used
}
```

The voice agent supports barge-in, filler audio and tool-call sounds (to cover slow first tokens and tool execution), sessions, context strategies, hooks, handoffs, toolsets, and memory. Four runnable end-to-end examples live under examples/voice/: web (kitchen-sink), handoff, toolsets, and memory.
Each batch backend is its own module. Native batch APIs submit a single async job; the concurrent runner wraps an existing client with bounded concurrency.
```go
import (
	"fmt"
	"log"

	"github.com/joakimcarlsson/ai/batch"
	batchopenai "github.com/joakimcarlsson/ai/batch/openai"
	"github.com/joakimcarlsson/ai/model"
)

// ctx, msgs1, and msgs2 are assumed to be defined elsewhere.
proc := batchopenai.NewProcessor(
	batchopenai.WithAPIKey("your-api-key"),
	batchopenai.WithModel(model.OpenAIModels[model.GPT4o]),
)

requests := []batch.Request{
	{ID: "q1", Type: batch.RequestTypeChat, Messages: msgs1},
	{ID: "q2", Type: batch.RequestTypeChat, Messages: msgs2},
}

resp, err := proc.Process(ctx, requests)
if err != nil {
	log.Fatal(err)
}
for _, r := range resp.Results {
	fmt.Printf("[%s] %s\n", r.ID, r.ChatResponse.Content)
}
```

Per-item error handling, progress callbacks, and async channel-based tracking are all supported. See the batch processing docs.
The repo is a pure Go workspace with no root module. To work locally:
```bash
git clone https://github.com/joakimcarlsson/ai
cd ai
cp go.work.example go.work   # go.work is gitignored
go build ./...
```

go.work.example is the canonical workspace file checked into git; contributors copy or symlink it to go.work.
Each module is versioned independently using path-prefixed git tags. The tag prefix must match the subdirectory path exactly — this is how the Go module system resolves versions.
| Module | Tag format | Example |
|---|---|---|
| llm/openai | llm/openai/vX.Y.Z | llm/openai/v0.1.0 |
| embeddings/voyage | embeddings/voyage/vX.Y.Z | embeddings/voyage/v0.1.0 |
| agent | agent/vX.Y.Z | agent/v0.2.0 |
| memory/pgvector | memory/pgvector/vX.Y.Z | memory/pgvector/v0.1.0 |

All modules follow semantic versioning.
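The tag format is the module's directory path followed by the version, so a tag can be built or split mechanically. The sketch below only illustrates that naming rule; `tagFor` and `splitTag` are hypothetical helpers and are not part of scripts/release.sh.

```go
package main

import (
	"fmt"
	"strings"
)

// tagFor builds the git tag for a module directory and a version,
// e.g. ("llm/openai", "v0.1.0") -> "llm/openai/v0.1.0".
func tagFor(moduleDir, version string) string {
	return moduleDir + "/" + version
}

// splitTag is the inverse: it splits a tag at its last slash into the module
// directory and the version. ok is false when the tag has no directory
// prefix or the last segment does not start with "v".
func splitTag(tag string) (moduleDir, version string, ok bool) {
	i := strings.LastIndex(tag, "/")
	if i <= 0 || !strings.HasPrefix(tag[i+1:], "v") {
		return "", "", false
	}
	return tag[:i], tag[i+1:], true
}

func main() {
	fmt.Println(tagFor("llm/openai", "v0.1.0")) // llm/openai/v0.1.0

	dir, ver, ok := splitTag("memory/pgvector/v0.1.0")
	fmt.Println(dir, ver, ok) // memory/pgvector v0.1.0 true
}
```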
Releases follow the AWS SDK v2 pattern: CI on main is the safety net, git tags
drive go get resolution, and dated GitHub Releases provide changelogs.
CI must pass on the latest commit before tagging.
```bash
# List every module
scripts/release.sh modules

# Tag a single module (dry run: creates a local tag only)
scripts/release.sh tag -m llm/openai -v v0.1.0

# Tag and push
make release-tag MODULE=llm/openai VERSION=v0.1.0
```

The script verifies that the module's go.mod exists and that the tag prefix matches the directory path.
When the changed module is a shared dependency that exposes user-facing symbols
(model, message, memory, etc.), open a branch and bump its require
line in every module a typical user would go get in order to reach the new symbols.
For example, a model change that adds new Gemini constants cascades to
llm/gemini, image/gemini, embeddings/gemini, batch/gemini, and
llm/vertexai, but not to unrelated providers (llm/openai,
llm/anthropic) or to umbrella modules that take a user-built client
(agent, voice).
```bash
cd llm/gemini && go mod edit -require=github.com/joakimcarlsson/ai/model@v0.2.0 && go mod tidy
# ...repeat for each direct consumer, then commit + PR
```

After the PR merges, tag each cascaded module with a patch bump. Without
the cascade, users running go get llm/gemini@latest get the old model
version via MVS (minimal version selection), and the new constants don't resolve in their build.
Skip this step when the changed module isn't a shared dep (e.g. a fix
internal to llm/anthropic), or when only the module's own go.mod
changed (indirect dep bumps, tidy cleanup). Those have no
consumer-visible effect and don't need tags at all.
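The reason the cascade matters can be sketched with a simplified model of minimal version selection: for each dependency, Go selects the highest version that any module in the build requires, and never anything newer. The code below is an illustration under that simplification, not the Go toolchain's algorithm. It handles only plain vMAJOR.MINOR.PATCH versions, and the "old" model version v0.1.0 is an assumed value.

```go
package main

import "fmt"

// parse turns "vMAJOR.MINOR.PATCH" into three comparable integers.
// Pre-release and build suffixes are not handled in this sketch.
func parse(v string) (n [3]int) {
	fmt.Sscanf(v, "v%d.%d.%d", &n[0], &n[1], &n[2])
	return n
}

// newer reports whether version a is strictly greater than version b.
func newer(a, b string) bool {
	pa, pb := parse(a), parse(b)
	for i := range pa {
		if pa[i] != pb[i] {
			return pa[i] > pb[i]
		}
	}
	return false
}

// selectVersion returns the highest version among the requirements that the
// modules in a build place on one dependency, which is the version MVS
// selects for that dependency.
func selectVersion(requirements []string) string {
	selected := ""
	for _, v := range requirements {
		if selected == "" || newer(v, selected) {
			selected = v
		}
	}
	return selected
}

func main() {
	// Before the cascade: the only requirement on model in the user's build
	// comes from llm/gemini, which still asks for the old version (assumed
	// here to be v0.1.0).
	fmt.Println(selectVersion([]string{"v0.1.0"})) // v0.1.0

	// After the cascade: the newly tagged llm/gemini requires model v0.2.0,
	// so that version is selected even if another module still asks for v0.1.0.
	fmt.Println(selectVersion([]string{"v0.1.0", "v0.2.0"})) // v0.2.0
}
```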
```bash
scripts/release.sh warm -t llm/openai/v0.1.0
```

This ensures the tagged version is immediately available via go get.
Run it for every tag created in steps 2 and 3.
```bash
# Dry-run (shows what would be published)
scripts/release.sh release

# Publish
make release-publish
```

This creates a release-YYYY-MM-DD tag and a GitHub Release listing all
module tags created since the previous release.
See the LICENSE file.