The single-writer Database Architecture

Klaas van Schelven; March 24 - 9 min read

A conveyor belt with lady-bugs — Conveyor belt with lady-bugs. Representing the idea of processing events in a predictable, sequential manner.

TL;DR: Bugsink uses a single-writer architecture to keep database state consistent and predictable. This model is built around SQLite’s concurrency model: for writes, a global lock; for reads, snapshot isolation. Bugsink’s specific use case (event processing) makes this model a perfect fit.

Introduction & Design Goals

In this article, we’ll take a look at Bugsink’s transaction model, and how it uses SQLite’s unique properties to create a simple, stable, and efficient system.

Bugsink is a self-hosted error tracker. This means that it’s built to be simple to operate and safe to run: everybody should be able to set it up and keep it running without too much hassle.

It’s also designed to be simple to maintain: I’m a single developer taking on a market in which VC-backed unicorns operate. This means that I need to be able to reason about the system’s behavior quickly and confidently, even if I haven’t touched the code in months.

Here’s how the broader design goals translate into the transaction model:

Stability: No weird edge cases, no locking errors, no surprises under load.
Data Integrity: The transaction model should ensure that data is never left in an inconsistent state, even if something goes wrong. (this is a bit of an open door, but it’s worth stating explicitly).
Simplicity: The transaction model should be simple to understand and implement. There’s one way to do it, it’s used everywhere, and it’s easy to reason about later. Think once, apply everywhere.
Alignment with SQLite: Lean into what SQLite is good at. Treat its limits as “features in disguise”, shaping forces that guide the design. Don’t fight the datbase.
Fit with Bugsink’s Architecture: Bugsink makes strong, opinionated choices, and the transaction model leans into them. Also: the way we interact with the database might just influence the way we structure the application.

All of these goals point in the same direction: I want a system that works correctly, but without adding extra overhead to my development process.

Contrast this with the typical “just don’t want to think about transactions at all” approach, where you just do CRUD operations throughout the code. That may just work, until it doesn’t. And that’s not acceptible to me.

Still, the underlying objective of not having to think about transactions all the time is very valuable. The transaction model should be a given, not something that needs to be rethought for every new feature.

SQLite in Production: the Basics

When Bugsink was first developed (i.e. when this architecture materialized), we supported only a single database: SQLite. This means this story starts with a production-ready configuration of SQLite.

By the way, that somewhat unusual choice was itself made to fit with the design goals of being easy to install and operate.

As recommended by basically everyone, the first step is to enable WAL mode. Without it, any write blocks all readers until it completes. That’s fine for scripts or low-traffic apps, but not acceptable for a production system. WAL mode avoids this by allowing readers and writers to operate concurrently. It’s the default in Bugsink, and the foundation for everything that follows.

The second key concept to understand is how reads behave under WAL. SQLite provides snapshot isolation: a read transaction sees the database exactly as it was when the read began, even if concurrent writes are happening. Readers never see any writes that started after their transaction began. It also ensures the UI and API reflect a consistent view of the world, and it matches how most developers (humans, even) intuitively expect things to work.

Global write lock

In the above, I mentioned that WAL mode allows readers and writers to operate concurrently. That’s true, in the sense that readers don’t block writers and vice versa.

Regarding multiple writers operating at the same time, however, the story is different. SQLite allows only one writer at a time. Even in WAL mode, concurrent write transactions are not possible – any attempt to write while another write is in progress will block or fail with a SQLITE_BUSY error. This limitation applies across all tables and connections to the database: the write lock is effectively global.

For many applications, that’s a dealbreaker. But it’s also one of the clearest boundaries SQLite enforces, and it gives you a strong guarantee: if a write is in progress, nothing else is modifying the database.

Turning a Weakness into Strength

The global write lock might look like a limitation — but in Bugsink, it became part of the architecture. To turn this limitation into a strength, we embraced it fully.

The key insight was this: Bugsink is, at its core, an error-event processing machine, and it’s this event processing that puts the overwhelming majority of writes into the database. The other 1% or so of writes, in systems under load, are user-triggered actions like marking an issue as resolved or changing settings.

That makes it natural to split the event-processing into ingestion and digestion phases: receive events, queue them, and process them in some separate pipeline.

This is exactly how Bugsink works in practice. Incoming events are accepted by a lightweight ingestion process, which does not write to the database, but instead enqueues a digestion task to be picked up “later”.

A separate process reads from this queue and does the actual work of processing the event and writing it to the datbase, the “digestion”. For the purposes of this article, feel free to think of this digestion process as a single-threaded, fully serialized process. (In actuality, the process is multi-threaded and we just rely on SQLite to serialize writes, but that’s an implementation detail).

By leaning into SQLite’s locking behavior rather than fighting it, the architecture avoids contention and becomes simpler overall.

One major advantage of a single write architecture is that it removes the mental overhead of coordinating concurrent modifications. With only one writer at a time, there’s no need to think about what might be changing in parallel.

For example, consider what happens when a new event comes in: the event itself needs to be stored, the corresponding issue’s event count must be incremented, and the project-wide counters need to be updated as well. In a concurrent system, each of these updates might require separate locks, retries, or careful coordination to avoid subtle bugs. But in Bugsink’s model, all of this happens in one transaction, with no other writes taking place at the same time.

Of course, this does put the burden on the digestion process to be very fast: since each digest operation is serialized, it must be able to keep up with the incoming event rate. You can’t just “add more workers” to speed things up. But that’s a focused problem: one place to optimize, one code path to understand. Also: the fact that SQLite is so fast at writes (it’s a local database) helps a lot.

Some of the associated optimazations are in fact made possible by the simple model itself: various denormalizations and caches are trivial to compute if you know you’re alone in the database. Consider counting things: when digesting, this becomes a simple plus-1 operation, and those counts themselves being available in the database means they can be read without doing a full table scan.

Technical Details

Let’s see how the above principles are implemented in practice.

Enabling WAL Mode

Bugsink ensures that the database operates in Write-Ahead Logging (WAL) mode by executing:

PRAGMA journal_mode=WAL;

This configuration is applied during the initial setup of the database schema: a mechanism called migrations is already in place to ensure that the database schema matches Bugsink’s requirements, and the WAL mode simply ties into that by being the first migration.

This approach simplifies the setup process for users, as the system is ready for optimal performance out of the box.

All writes use `BEGIN IMMEDIATE`

To ensure that write operations never needlessly fail, Bugsink uses BEGIN IMMEDIATE for all write transactions. To understand why, let’s take a step back and look at how SQLite handles transactions.

In SQLite, the default transaction mode (called “deferred”) doesn’t begin a write transaction until the first write operation is executed. If a transaction begins with a read, it starts as a read transaction; if a write is attempted later, SQLite tries to upgrade it to a write transaction. But if another connection is already in the middle of a write, that upgrade fails, and the operation returns SQLITE_BUSY.

The reason is simple: read transactions operate on a stable snapshot of the database – as it existed when the read began. If a deferred transaction starts as a read and later attempts to write, SQLite faces a problem: the database may have changed in the meantime. SQLite would either have to break the promise of snapshot isolation to continue, or have the upgraded write overwrite the changes made by the concurrent write. Neither of those options are acceptible, so SQLite just says “no” – it errors out instead.

There is a way to prevent this: by starting a transaction with BEGIN IMMEDIATE, it becomes a write transaction right away. That way, if the transaction starts at all, it’s guaranteed to be able to write later. There’s no mid-flight change of state, and the system stays predictable. The trade-off is that the transaction is now a write transaction from the start, and will thus have to wait for concurrent write transactions to finish before it can begin: it will have to wait for the global write lock, even for the operations in the transactions that are reads.

That “trade-off” is a perfect match for the kind of writes Bugsink does: in the event digestion process the writes are sequential (one event is written after the other), and in the UI/API layer, writes are infrequent and can afford to wait a bit (under heavy load this might increase to half a second or so, but that’s still acceptable).

Read or Write: The Handler Decides

As per the above we take some care to tell SQLite about the kind of transaction we need: for operations that will (ultimately) involve writes, we start with BEGIN IMMEDIATE. For read-only operations, we start with BEGIN.

To tie this back to Python code in the HTTP request/response cycle, each view in Bugsink is wrapped in a transaction of the correct type. For most views, we determine which type of transaction to use based on the HTTP method: GET requests are treated as reads, and get a standard read transaction (BEGIN). Anything else—POST, PUT, PATCH, DELETE—is treated as a write, and gets a BEGIN IMMEDIATE transaction.

This is handled by a decorator called @atomic_for_request_method, which wraps the whole request. The main reason for doing this at the outermost level is that important reads (like checking project membership) often happen before the actual view function is called. Wrapping early ensures those reads are also included in the same transaction – and therefore later writes don’t error out.

The case of inspecting the request method is just the default. In some places, we take a slightly more custom approach. The most common example of doing that is when expensive read operations are followed by write operations: in that case a read transaction is started for the expensive reads first. Then a separate write transaction is started (no upgrading). This ensures that the time spent by the expensive read operations does contribute to the time the write transaction is open, and hence to the time the global write lock is held.

Of course, doing a single logical operation in multiple database transactions does require extra work, because the results of the read transaction need to be somehow passed to the write transaction, but without assuming that the data corresponds to a single point in time. In other words: this is the kind of optimization where the design goal of simplicity is at odds with the design goal of performance.

MySQL and Postgres

As mentioned earlier, Bugsink started with SQLite as the only supported database. This is the database the transaction model was designed around. However, the system has since expanded to support other databases as well.

In addition to SQLite, Bugsink currently supports MySQL and has experimental support for Postgres. Despite differences between these systems and SQLite, the core transactional model has been preserved. Thanks to the use of an ORM, most of the database logic remains portable, with only minimal backend-specific handling.

Unlike SQLite, neither MySQL nor Postgres offer an exact equivalent of BEGIN IMMEDIATE. However, similar behavior can be achieved by acquiring an explicit lock at the start of the transaction. We simply issue a SELECT ... FOR UPDATE on a known, singleton, row. This ensures that only one transaction proceeds at a time, effectively emulating SQLite’s “one writer at a time” behavior.

The other property that we rely on, namely “snapshot isolation”, is also present (and is the default) in MySQL and Postgres.

The fact that these databases are in the client/server model may also mean that they’re more slow to write to than SQLite, which could make the global write lock more of a bottleneck. This is something that will need to be monitored as the system grows.

That said, full global locking is likely unnecessary for most real-world workloads. The database tables in Bugsink that are involved in the event ingestion process are already partitioned by project, and most other tables are as well. This allows for a project-level global write-lock strategy for these databases, which would allow for more parallelism than SQLite’s global write lock.

Conclusion

Bugsink’s transaction model is designed to be simple, stable, and efficient. By embracing SQLite’s limitations and building around them, the system avoids many of the pitfalls that can come with more complex transaction models.

The model is built around the idea of a single writer at a time, which simplifies the system and ensures that writes are always consistent. This approach is particularly well-suited to the event-processing nature of Bugsink, where most writes are sequential and can be handled in a single transaction.

The model has been successfully extended to support MySQL and Postgres, with only minimal changes required. The core principles of the model remain the same, and the system continues to operate efficiently and predictably.

If you’re considering using SQLite in a production system, or if you’re looking for a simple, reliable transaction model for your application, Bugsink’s approach is a good place to start.