Selecting an embedded database

Part of series: Making a naive go backend

A while back I read about Pocketbase, a Firebase alternative that can be deployed as one executable and one database file. The idea stuck with me, and I kept theory-crafting use cases for it. Imagine shipping a single executable and backing up a single database file.

Pocketbase claims 10k+ concurrent realtime data users on a cheap VPS, a very reasonable number for a deployment that simple. I think many apps and products could get by with those numbers, at least to the point of making money, especially if getting there takes little enough development time that some is left over to rework performance bottlenecks later.

SQLite

Pocketbase uses SQLite, “the most used database engine in the world”, in WAL mode. TL;DR: writes go to a separate log that is later merged into the main database file, which buys speed and lets readers keep reading while a write is in progress.

SQLite is great: it is a full-featured SQL database with additional luxuries such as WAL. But if the target use case is Firebase-ish, a document-oriented store with flexible schemas, why SQL? Also, Pocketbase and Firebase are both oriented around configuring your collections in the platform's UI, not as code. This works in some cases, but it ties your application quite strongly to the specific platform.

Pocketbase's claim of 10k+ concurrent realtime data users mostly revolves around maintaining connections and distributing data to interested users, so the database is not the core feature there. This got me digging through awesome-go, hunting for an alternative more aligned with simple keys and documents. I quickly found one written entirely in Go:

bolt/bbolt - An embedded key/value database for Go.

  • Initially developed by Ben Johnson as bolt, now forked and maintained by the etcd project as bbolt.
  • B+ tree
  • Single writer, multiple reader
  • Single file memory-mapped for speed
  • Super simple API
  • Buckets that you can nest, so data can be split into separate bins per customer/tenant/user.

One great thing here is the upfront statement: “Single writer, multiple reader”. That is really the big limitation you need to understand to evaluate how well it will fit: only one read-write transaction runs at a time, while read-only transactions can run concurrently. The etcd bbolt repo lists more caveats for those interested. Being B+ tree based, it also gives very fast ordered scans, so keeping some time-series data in there will be just fine.
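
Ordered scans only help with time-series data if the byte order of the keys matches chronological order. The sketch below uses only the standard library to show one way to get that: big-endian timestamps as keys. `timeKey` and `rangeScan` are hypothetical helpers, and the sorted slice merely stands in for the ordering a B+ tree would maintain on disk; with bbolt the same walk would be done with a cursor.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"sort"
	"time"
)

// timeKey encodes a Unix timestamp in nanoseconds as 8 big-endian bytes, so
// that byte-wise key order equals chronological order. This holds for
// non-negative timestamps only (negative values would sort last).
func timeKey(unixNano int64) []byte {
	k := make([]byte, 8)
	binary.BigEndian.PutUint64(k, uint64(unixNano))
	return k
}

// rangeScan returns the keys in [from, to) from a slice that is already sorted
// byte-wise. It mimics seeking to a start key and walking forward in order.
func rangeScan(sorted [][]byte, from, to []byte) [][]byte {
	start := sort.Search(len(sorted), func(i int) bool {
		return bytes.Compare(sorted[i], from) >= 0
	})
	var out [][]byte
	for _, k := range sorted[start:] {
		if bytes.Compare(k, to) >= 0 {
			break
		}
		out = append(out, k)
	}
	return out
}

func main() {
	base := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	at := func(h int) []byte {
		return timeKey(base.Add(time.Duration(h) * time.Hour).UnixNano())
	}

	var keys [][]byte
	for _, h := range []int{5, 1, 3, 2, 4} { // samples arrive out of order
		keys = append(keys, at(h))
	}
	// A B+ tree keeps keys sorted for us; here we sort explicitly.
	sort.Slice(keys, func(i, j int) bool {
		return bytes.Compare(keys[i], keys[j]) < 0
	})

	hits := rangeScan(keys, at(2), at(4))
	fmt.Println(len(hits)) // samples in the window [02:00, 04:00): prints 2
}
```

Little-endian or decimal-string timestamps would not sort this way, which is why the encoding matters more than the storage engine here.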

The list of projects using bolt includes:

  • NATS
  • InfluxDB
  • Consul
  • etcd

If you read this far, thanks. The next post will be about authorization.

An experiment in implementing this can be found at my sourcehut.