# chromem-go

[![Go Reference](https://pkg.go.dev/badge/github.com/philippgille/chromem-go.svg)](https://pkg.go.dev/github.com/philippgille/chromem-go)
[![Build status](https://github.com/philippgille/chromem-go/actions/workflows/go.yml/badge.svg)](https://github.com/philippgille/chromem-go/actions/workflows/go.yml)
[![Go Report Card](https://goreportcard.com/badge/github.com/philippgille/chromem-go)](https://goreportcard.com/report/github.com/philippgille/chromem-go)
[![GitHub Releases](https://img.shields.io/github/release/philippgille/chromem-go.svg)](https://github.com/philippgille/chromem-go/releases)

Embeddable vector database for Go with Chroma-like interface and zero third-party dependencies. In-memory with optional persistence.

Because `chromem-go` is embeddable it enables you to add retrieval augmented generation (RAG) and similar embeddings-based features into your Go app *without having to run a separate database*. Like when using SQLite instead of PostgreSQL/MySQL/etc.

It's *not* a library to connect to Chroma and also not a reimplementation of it in Go. It's a database on its own.

The focus is not scale (millions of documents) or number of features, but simplicity and performance for the most common use cases. On a mid-range 2020 Intel laptop CPU you can query 1,000 documents in about 0.5 ms and 100,000 documents in about 40 ms, with very few and small memory allocations. See [Benchmarks](#benchmarks) for details.

> ⚠️ The project is in beta, under heavy construction, and may introduce breaking changes in releases before `v1.0.0`. All changes are documented in the [`CHANGELOG`](./CHANGELOG.md).

## Contents

1. [Use cases](#use-cases)
2. [Interface](#interface)
3. [Features + Roadmap](#features)
4. [Installation](#installation)
5. [Usage](#usage)
6. [Benchmarks](#benchmarks)
7. [Development](#development)
8. [Motivation](#motivation)
9. [Related projects](#related-projects)

## Use cases

With a vector database you can do various things:

- Retrieval augmented generation (RAG), question answering (Q&A)
- Text and code search
- Recommendation systems
- Classification
- Clustering

Let's look at the RAG use case in more detail:

### RAG

The knowledge of large language models (LLMs) - even the ones with 30 billion, 70 billion parameters and more - is limited. They don't know anything about what happened after their training ended, they don't know anything about data they were not trained on (like your company's intranet, Jira / bug tracker, wiki or other kinds of knowledge bases), and even data they *do* know they often can't reproduce *exactly*, but start to *hallucinate* instead.

Fine-tuning an LLM can help a bit, but it's meant more to improve the LLM's reasoning about specific topics, or to reproduce the style of written text or code. Fine-tuning does *not* add knowledge *1:1* into the model. Details are lost or mixed up. And the knowledge cutoff (about anything that happened after the fine-tuning) isn't solved either.

=> A vector database can act as the up-to-date, precise knowledge for LLMs:

1. You store relevant documents that you want the LLM to know in the database.
2. The database stores the *embeddings* alongside the documents. You can either provide them yourself or have them created by specific "embedding models" like OpenAI's `text-embedding-3-small`.
   - `chromem-go` can do this for you and supports multiple embedding providers and models out-of-the-box.
3. Later, when you want to talk to the LLM, you first send the question to the vector DB to find *similar*/*related* content. This is called "nearest neighbor search".
4. In the question to the LLM, you provide this content alongside your question.
5. The LLM can take this up-to-date precise content into account when answering.

Check out the [example code](examples) to see it in action!

## Interface

Our original inspiration was the [Chroma](https://www.trychroma.com/) interface, whose core API is the following (taken from their [README](https://github.com/chroma-core/chroma/blob/0.4.21/README.md)):

<details><summary>Chroma core interface</summary>

```python
import chromadb
# setup Chroma in-memory, for easy prototyping. Can add persistence easily!
client = chromadb.Client()

# Create collection. get_collection, get_or_create_collection, delete_collection also available!
collection = client.create_collection("all-my-documents")

# Add docs to the collection. Can also update and delete. Row-based API coming soon!
collection.add(
    documents=["This is document1", "This is document2"], # we handle tokenization, embedding, and indexing automatically. You can skip that and add your own embeddings as well
    metadatas=[{"source": "notion"}, {"source": "google-docs"}], # filter on these!
    ids=["doc1", "doc2"], # unique for each doc
)

# Query/search 2 most similar results. You can also .get by id
results = collection.query(
    query_texts=["This is a query document"],
    n_results=2,
    # where={"metadata_field": "is_equal_to_this"}, # optional filter
    # where_document={"$contains":"search_string"}  # optional filter
)
```

</details>

Our Go library exposes the same interface:

<details><summary>chromem-go equivalent</summary>

```go
package main

import (
	"context"

	"github.com/philippgille/chromem-go"
)

func main() {
	// Errors are ignored in this example for brevity.
	ctx := context.Background()

	// Set up chromem-go in-memory, for easy prototyping. Can add persistence easily!
	// We call it DB instead of client because there's no client-server separation. The DB is embedded.
	db := chromem.NewDB()

	// Create collection. GetCollection, GetOrCreateCollection, DeleteCollection also available!
	collection, _ := db.CreateCollection("all-my-documents", nil, nil)

	// Add docs to the collection. Update and delete will be added in the future.
	// Can be multi-threaded with AddConcurrently()!
	// We're showing the Chroma-like method here, but more Go-idiomatic methods are also available!
	_ = collection.Add(ctx,
		[]string{"doc1", "doc2"}, // unique ID for each doc
		nil, // We handle embedding automatically. You can skip that and add your own embeddings as well.
		[]map[string]string{{"source": "notion"}, {"source": "google-docs"}}, // Filter on these!
		[]string{"This is document1", "This is document2"},
	)

	// Query/search 2 most similar results. Getting by ID will be added in the future.
	results, _ := collection.Query(ctx,
		"This is a query document",
		2,
		map[string]string{"metadata_field": "is_equal_to_this"}, // optional filter
		map[string]string{"$contains": "search_string"},         // optional filter
	)
	_ = results // Use the results here. Go rejects unused variables.
}
```

</details>

Initially `chromem-go` started with just the four core methods, but we added more over time. We intentionally don't want to cover 100% of Chroma's API surface though.
We're providing some alternative methods that are more Go-idiomatic instead.

For the full interface see the Godoc: <https://pkg.go.dev/github.com/philippgille/chromem-go>
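The two optional filter maps passed to `Query` above can be pictured with a small stand-alone sketch. It follows the semantics described in this README (exact matches on metadata, `$contains` and `$not_contains` on the document text) but it is not the library's code. The `doc` type, the `matches` function, the parameter names (borrowed from Chroma's `where` / `where_document`) and the case-sensitive comparison are assumptions for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// doc is a hypothetical, simplified document for this sketch.
type doc struct {
	Metadata map[string]string
	Content  string
}

// matches reports whether d passes both filters. Nil filters match everything.
func matches(d doc, where, whereDocument map[string]string) bool {
	// Metadata filter: every key must exist and match exactly.
	for k, v := range where {
		if got, ok := d.Metadata[k]; !ok || got != v {
			return false
		}
	}
	// Document filter: substring checks on the content (case-sensitive here).
	for op, s := range whereDocument {
		switch op {
		case "$contains":
			if !strings.Contains(d.Content, s) {
				return false
			}
		case "$not_contains":
			if strings.Contains(d.Content, s) {
				return false
			}
		default:
			return false // Unknown operators are rejected in this sketch.
		}
	}
	return true
}

func main() {
	d := doc{
		Metadata: map[string]string{"source": "notion"},
		Content:  "This is document1",
	}
	fmt.Println(matches(d, map[string]string{"source": "notion"}, map[string]string{"$contains": "document1"}))
	fmt.Println(matches(d, map[string]string{"source": "google-docs"}, nil))
}
```

Filtering like this narrows the candidate set; the similarity search then only has to rank the documents that remain.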

## Features

- [X] Zero dependencies on third-party libraries
- [X] Embeddable (like SQLite, i.e. no client-server model, no separate DB to maintain)
- [X] Multithreaded processing (when adding and querying documents), making use of Go's native concurrency features
- [X] Experimental WebAssembly binding
- Embedding creators:
  - Hosted:
    - [X] [OpenAI](https://platform.openai.com/docs/guides/embeddings/embedding-models) (default)
    - [X] [Cohere](https://cohere.com/models/embed)
    - [X] [Mistral](https://docs.mistral.ai/platform/endpoints/#embedding-models)
    - [X] [Jina](https://jina.ai/embeddings)
    - [X] [mixedbread.ai](https://www.mixedbread.ai/)
  - Local:
    - [X] [Ollama](https://github.com/ollama/ollama)
    - [X] [LocalAI](https://github.com/mudler/LocalAI)
  - Bring your own (implement [`chromem.EmbeddingFunc`](https://pkg.go.dev/github.com/philippgille/chromem-go#EmbeddingFunc))
  - You can also pass existing embeddings when adding documents to a collection, instead of letting `chromem-go` create them
- Similarity search:
  - [X] Exhaustive nearest neighbor search using cosine similarity (sometimes also called exact search or brute-force search or FLAT index)
- Filters:
  - [X] Document filters: `$contains`, `$not_contains`
  - [X] Metadata filters: Exact matches
- Storage:
  - [X] In-memory
  - [X] Optional immediate persistence (writes one file for each added collection and document, encoded as [gob](https://go.dev/blog/gob), optionally gzip-compressed)
  - [X] Backups: Export and import of the entire DB to/from a single file (encoded as [gob](https://go.dev/blog/gob), optionally gzip-compressed and AES-GCM encrypted)
    - Includes methods for generic `io.Writer`/`io.Reader` so you can plug S3 buckets and other blob storage, see [examples/s3-export-import](examples/s3-export-import) for example code
- Data types:
  - [X] Documents (text)

### Roadmap

- Performance:
  - Use SIMD for dot product calculation on supported CPUs (draft PR: [#48](https://github.com/philippgille/chromem-go/pull/48))
  - Add [roaring bitmaps](https://github.com/RoaringBitmap/roaring) to speed up full text filtering
- Embedding creators:
  - Add an `EmbeddingFunc` that downloads and shells out to [llamafile](https://github.com/Mozilla-Ocho/llamafile)
- Similarity search:
  - Approximate nearest neighbor search with index (ANN)
    - Hierarchical Navigable Small World (HNSW)
    - Inverted file flat (IVFFlat)
- Filters:
  - Operators (`$and`, `$or` etc.)
- Storage:
  - JSON as second encoding format
  - Write-ahead log (WAL) as second file format
  - Optional remote storage (S3, PostgreSQL, ...)
- Data types:
  - Images
  - Videos

## Installation

`go get github.com/philippgille/chromem-go@latest`

## Usage

See the Godoc for a reference: <https://pkg.go.dev/github.com/philippgille/chromem-go>

For full, working examples of retrieval augmented generation (RAG) and semantic search, using either OpenAI or a locally running embedding model and LLM (in Ollama), see the [example code](examples).

### Quickstart

This is taken from the ["minimal" example](examples/minimal):

```go
package main

import (
	"context"
	"fmt"
	"runtime"

	"github.com/philippgille/chromem-go"
)

func main() {
	ctx := context.Background()

	db := chromem.NewDB()

	c, err := db.CreateCollection("knowledge-base", nil, nil)
	if err != nil {
		panic(err)
	}

	err = c.AddDocuments(ctx, []chromem.Document{
		{
			ID:      "1",
			Content: "The sky is blue because of Rayleigh scattering.",
		},
		{
			ID:      "2",
			Content: "Leaves are green because chlorophyll absorbs red and blue light.",
		},
	}, runtime.NumCPU())
	if err != nil {
		panic(err)
	}

	res, err := c.Query(ctx, "Why is the sky blue?", 1, nil, nil)
	if err != nil {
		panic(err)
	}

	fmt.Printf("ID: %v\nSimilarity: %v\nContent: %v\n", res[0].ID, res[0].Similarity, res[0].Content)
}
```

Output:

```text
ID: 1
Similarity: 0.6833369
Content: The sky is blue because of Rayleigh scattering.
```

## Benchmarks

Benchmarked on 2024-03-17 with:

- Computer: Framework Laptop 13 (first generation, 2021)
- CPU: 11th Gen Intel Core i5-1135G7 (2020)
- Memory: 32 GB
- OS: Fedora Linux 39
  - Kernel: 6.7

```console
$ go test -benchmem -run=^$ -bench .
goos: linux
goarch: amd64
pkg: github.com/philippgille/chromem-go
cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
BenchmarkCollection_Query_NoContent_100-8        13164      90276 ns/op     5176 B/op     95 allocs/op
BenchmarkCollection_Query_NoContent_1000-8        2142     520261 ns/op    13558 B/op    141 allocs/op
BenchmarkCollection_Query_NoContent_5000-8         561    2150354 ns/op    47096 B/op    173 allocs/op
BenchmarkCollection_Query_NoContent_25000-8        120    9890177 ns/op   211783 B/op    208 allocs/op
BenchmarkCollection_Query_NoContent_100000-8        30   39574238 ns/op   810370 B/op    232 allocs/op
BenchmarkCollection_Query_100-8                  13225      91058 ns/op     5177 B/op     95 allocs/op
BenchmarkCollection_Query_1000-8                  2226     519693 ns/op    13552 B/op    140 allocs/op
BenchmarkCollection_Query_5000-8                   550    2128121 ns/op    47108 B/op    173 allocs/op
BenchmarkCollection_Query_25000-8                  100   10063260 ns/op   211705 B/op    205 allocs/op
BenchmarkCollection_Query_100000-8                  30   39404005 ns/op   810295 B/op    229 allocs/op
PASS
ok      github.com/philippgille/chromem-go      28.402s
```

## Development

- Build: `go build ./...`
- Test: `go test -v -race -count 1 ./...`
- Benchmark:
  - `go test -benchmem -run=^$ -bench .` (add `> bench.out` or similar to write to a file)
  - With profiling: `go test -benchmem -run ^$ -cpuprofile cpu.out -bench .`
    - (profiles: `-cpuprofile`, `-memprofile`, `-blockprofile`, `-mutexprofile`)
- Compare benchmarks:
  1. Install `benchstat`: `go install golang.org/x/perf/cmd/benchstat@latest`
  2. Compare two benchmark results: `benchstat before.out after.out`

## Motivation

In December 2023, when I wanted to play around with retrieval augmented generation (RAG) in a Go program, I looked for a vector database that could be embedded in the Go program, just like you would embed SQLite, so that no separate DB setup and maintenance is required. I was surprised when I didn't find any, given the abundance of embedded key-value stores in the Go ecosystem.

At the time most of the popular vector databases like Pinecone, Qdrant, Milvus, Chroma, Weaviate and others were not embeddable at all or only in Python or JavaScript/TypeScript.

Then I found [@eliben](https://github.com/eliben)'s [blog post](https://eli.thegreenplace.net/2023/retrieval-augmented-generation-in-go/) and [example code](https://github.com/eliben/code-for-blog/tree/eda87b87dad9ed8bd45d1c8d6395efba3741ed39/2023/go-rag-openai) which showed that with very little Go code you could create a very basic PoC of a vector database.

That's when I decided to build my own vector database, embeddable in Go, inspired by the ChromaDB interface. ChromaDB stood out for being embeddable (in Python), and for showing its core API in 4 commands in its README and on the landing page of its website.

## Related projects

- Shoutout to [@eliben](https://github.com/eliben) whose [blog post](https://eli.thegreenplace.net/2023/retrieval-augmented-generation-in-go/) and [example code](https://github.com/eliben/code-for-blog/tree/eda87b87dad9ed8bd45d1c8d6395efba3741ed39/2023/go-rag-openai) inspired me to start this project!
- [Chroma](https://github.com/chroma-core/chroma): Looking at Pinecone, Qdrant, Milvus, Weaviate and others, Chroma stood out by showing its core API in 4 commands on their README and on the landing page of their website. It was also putting the most emphasis on its embeddability (in Python).
- The big, full-fledged client-server-based vector databases for maximum scale and performance:
  - [Pinecone](https://www.pinecone.io/): Closed source
  - [Qdrant](https://github.com/qdrant/qdrant): Written in Rust, not embeddable in Go
  - [Milvus](https://github.com/milvus-io/milvus): Written in Go and C++, but not embeddable as of December 2023
  - [Weaviate](https://github.com/weaviate/weaviate): Written in Go, but not embeddable in Go as of March 2024 (only in Python and JavaScript/TypeScript and that's experimental)
- Some non-specialized SQL, NoSQL and Key-Value databases added support for storing vectors and (some of them) querying based on similarity:
  - [pgvector](https://github.com/pgvector/pgvector) extension for [PostgreSQL](https://www.postgresql.org/): Client-server model
  - [Redis](https://github.com/redis/redis) ([1](https://redis.io/docs/interact/search-and-query/query/vector-search/), [2](https://redis.io/docs/interact/search-and-query/advanced-concepts/vectors/)): Client-server model
  - [sqlite-vss](https://github.com/asg017/sqlite-vss) extension for [SQLite](https://www.sqlite.org/): Embedded, but the [Go bindings](https://github.com/asg017/sqlite-vss/tree/8fc44301843029a13a474d1f292378485e1fdd62/bindings/go) require CGO. There's a [CGO-free Go library](https://gitlab.com/cznic/sqlite) for SQLite, but then it's without the vector search extension.
  - [DuckDB](https://github.com/duckdb/duckdb) has a function to calculate cosine similarity ([1](https://duckdb.org/docs/sql/functions/nested)): Embedded, but the Go bindings use CGO
  - [MongoDB](https://github.com/mongodb/mongo)'s cloud platform offers a vector search product ([1](https://www.mongodb.com/products/platform/atlas-vector-search)): Client-server model
- Some libraries for vector similarity search:
  - [Faiss](https://github.com/facebookresearch/faiss): Written in C++; 3rd party Go bindings use CGO
  - [Annoy](https://github.com/spotify/annoy): Written in C++; Go bindings use CGO ([1](https://github.com/spotify/annoy/blob/2be37c9e015544be2cf60c431f0cccc076151a2d/README_GO.rst))
  - [USearch](https://github.com/unum-cloud/usearch): Written in C++; Go bindings use CGO
- Some orchestration libraries, inspired by the Python library [LangChain](https://github.com/langchain-ai/langchain), but with no or only rudimentary embedded vector DB:
  - [LangChain Go](https://github.com/tmc/langchaingo)
  - [LinGoose](https://github.com/henomis/lingoose)
  - [GoLC](https://github.com/hupe1980/golc)