Django Deployment, Simplified

Klaas van Schelven; August 30, 2024 - 12 min read

Bugsink, our error tracking tool, is offered exclusively as a self-hosted piece of software. In a world dominated by SaaS solutions, this raises a critical two-part question:

  • can we still persuade people to install their own software at all?
  • can we ensure they won’t be disappointed if they try?

Time will tell the answer to the first part of that question (at the time of writing Bugsink was still so young as to be in private beta).

The second part is the subject of this article: I’ll discuss the steps we took to make self-hosting Bugsink as easy as possible.

Join me on a journey of unconventional choices and home-grown solutions for completely solved problems. You’ll either come away inspired to simplify your own projects or have a good laugh at my expense.

Python and Django: the untouchables

Before we dive into all the components we got rid of, let’s start with the ones we didn’t touch: Python and Django. Bugsink is built as a Python/Django application for two reasons:

  • Django is an excellent framework for building web applications: it’s well-documented, has a large community, and offers a wide range of built-in tools that streamline development, allowing you to focus on building features rather than reinventing the wheel.

  • I personally know Python and Django very well. After over a decade of working with these technologies, I can get things done quickly and efficiently, which is essential when you’re a solo founder building a product from scratch. The best language or framework for a project is the one you already know.

Using a language that compiles into a single binary (e.g. Go, Rust) would certainly be a good starting point from the perspective of simplicity of installation. But it would not be a good first step for me, and I doubt they would beat Django for ease of building a web application. So, we’re sticking with Python and Django, and we’re not going to rewrite the whole thing in Go or Rust.

What we removed

If you’re reading this, you’re probably familiar with the typical components of a modern Python web application. Rather than list them all out, and only then tell you which ones we kept and removed, let’s just start with the punchline:

We removed (or simply never picked up) the following components:

  • A “real” database (Postgres, MySQL)
  • Celery, a distributed task queue
  • Message queues (Redis, RabbitMQ) to distribute tasks

This leaves us with:

  • Another also-actually-real database (SQLite)
  • Snappea, a home-grown solution for asynchronous tasks
  • Gunicorn, our trusted Python WSGI server
  • Nginx, a web server

Docker, Kubernetes, and other container orchestration tools are notably absent from this list. The short of it is: they’re supported, but not necessary.

Let’s dive in!

Database: the usual suspects

When we were setting up Bugsink, one of the first major decisions was picking the right database. MySQL and Postgres are the go-to choices for Django applications: Django’s own documentation implicitly recommends them by putting them at the top of the list, and just about everybody on the internet parrots that advice.

They are also widely available and relatively simple to set up and keep running. However, they are certainly not zero-effort, as you still have to:

  • Install and configure them: doing the actual installation, creating user accounts, managing passwords, and passing those to your application as environment variables.

  • Do maintenance: regular upgrades, performance tuning, planning for downtime during upgrades and ensuring that your server handles interruptions gracefully.

You might get someone else to do some of that work for you, but that’s somewhat antithetical to the idea of self-hosting and introduces new complexities.

In local development, we use SQLite, a serverless database that operates as a single file and has none of the installation and maintenance overhead mentioned above. Could we use SQLite in production as well?

SQLite according to “the internet”

Django’s documentation for SQLite says that:

SQLite provides an excellent development alternative for applications that are predominantly read-only or require a smaller installation footprint.

Bugsink is neither predominantly read-only nor small, and we’re interested in production rather than development. However, the above was written in 2009. Could it be outdated advice by now? Some people seem to think it is.

SQLite itself gives 100k requests/day as a “conservative estimate” on its when-to-use page. Since we’re aiming for at least an order of magnitude more than that, it seems the answer is to go see for ourselves.

SQLite howto

The details of what made it work for us are long enough for their own article. The short of it is:

  • Turn on Write-Ahead Logging (WAL) mode. This provides more concurrency as readers and writers do not block each other.

  • Serialize (most) writes: SQLite locks the whole database for writes, which can be a problem if you have many processes trying to write. In our case, dealing with API events from SDKs constitutes approximately 99% of our writes under load. This is highly suggestive of the architecture we adopted: let a single background process write the events to the database, and reduce the highly parallel server processes to simple enqueue operations for that background process.

  • Use BEGIN IMMEDIATE when you know a transaction will eventually become a write-transaction. This tells SQLite to lock the database for writes immediately, rather than waiting until the first write operation and erroring out if another process is writing.
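
The WAL and BEGIN IMMEDIATE points above can be sketched with Python’s standard sqlite3 module. This is a minimal illustration rather than Bugsink’s actual code: the events table, the record_event helper and the 5-second timeout are all assumptions made up for the example.

```python
import os
import sqlite3
import tempfile


def connect(path):
    # isolation_level=None puts the sqlite3 module in autocommit mode, so the
    # explicit BEGIN/COMMIT statements below are the only transaction control.
    # timeout is how long a statement waits for a lock held by another process.
    conn = sqlite3.connect(path, timeout=5.0, isolation_level=None)
    # WAL is stored in the database file itself; the PRAGMA returns the
    # journal mode that is actually in effect after the call.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    return conn, mode


def record_event(conn, payload):
    # Hypothetical read-then-write transaction. BEGIN IMMEDIATE takes the
    # write lock up front instead of upgrading from a read lock halfway through.
    conn.execute("BEGIN IMMEDIATE")
    try:
        count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
        conn.execute(
            "INSERT INTO events (seq, payload) VALUES (?, ?)",
            (count + 1, payload),
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return count + 1


db_path = os.path.join(tempfile.mkdtemp(), "demo.sqlite3")
conn, journal_mode = connect(db_path)
conn.execute(
    "CREATE TABLE IF NOT EXISTS events (seq INTEGER PRIMARY KEY, payload TEXT)"
)
first = record_event(conn, "hello")
second = record_event(conn, "world")
```

Without the IMMEDIATE keyword, the transaction in record_event would start as a read transaction at the SELECT and only try to obtain the write lock at the INSERT, which is the point where a concurrent writer can make it fail.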

SQLite “free benefits”

The driving reason to pick SQLite was to simplify the installation process. However, we also got some free benefits:

  • Harmonization of dev and prod environments: when you run the same database in development and production, you avoid many types of unwelcome surprises. This is the exact reason many people recommend using a “real database” in development. We decided to go the other way: use the development database in production.

  • Low latency for your queries: SQLite is just local files, and the queries are run on those. This means there’s no network latency, and the queries are as fast as they can be.

  • Serialization rocks: SQLite writes are serialized by default. This eliminates a whole class of bugs that can happen when you have multiple processes writing to the same database. Moreover: it eliminates thinking about such bugs.

Background work in Python

Most web applications have some class of work that is too long-running to be done inside the main HTTP request-response loop, and on which the request-response loop does not depend. Typical names are “background jobs” or “asynchronous tasks”.

In our case we have both the single process for serialized event-handling (our SQLite-inspired architecture) and various emails being sent out (for which we depend on possibly-slow external servers).

Unfortunately, the available application servers for Python (we use Gunicorn) don’t really have a built-in answer for such work: the processes that Gunicorn manages are there to serve requests and nothing else, and trying to escape from that context would not be recommended.

The default solution: Celery

The go-to solution for this problem in the Python community is Celery. Celery relies on a message queue system like Redis or RabbitMQ to know what tasks to run and when to run them.

The added complexity of Celery-plus-queue is similar (identical even) to that of a database running as a separate service:

  • Install and configure the message queue: doing the actual installation, creating user accounts, managing passwords, and passing those to your application as environment variables.

  • Do maintenance: regular upgrades, performance tuning, planning for downtime during upgrades and ensuring that your server handles interruptions gracefully.

On top of that, Celery itself is a highly complex solution. It has support for distributing work over servers, passing results back to the calling process, monitoring and much more. All while sometimes failing at its core mission. Remember that all we wanted is a way to “do slow stuff in the background”?

Enter Snappea

Enter Snappea, our “Not Invented Here” approach to background tasks. A full description of how we did it and what we learned along the way is, again, long enough for its own article. The short of it is:

Snappea leverages SQLite as an ad hoc message queue, a decision that aligns with our philosophy of minimizing dependencies. The core idea is simple: the Foreman (our main orchestrator) scans the database for tasks, picks them up, and executes them in worker threads.

Here are the key takeaways from Snappea’s design and implementation:

  • SQLite scales to our needs: insertion/deletion (which each fully lock the DB) takes in the order of 3ms, meaning we can do hundreds of tasks per second. This leaves us with approximately an order of magnitude before the queue becomes the bottleneck.

  • Snappea is in the order of 100 to 200 lines of code. While it is obviously a “not invented here” solution, such a small one is easy to maintain and understand. With such small numbers, you’re at the point that understanding the full system is easier than understanding the documentation of a third-party solution.

  • No failure-handling and retries, other than to log the error and never try that task again. Snappea is simply a mechanism to run tasks in the background. No need to make the background tasks more robust than the main process.
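
To make the idea concrete, here is a rough sketch of a SQLite-backed task queue with a foreman that hands tasks to worker threads. It is not Snappea’s actual code: the tasks table, the task decorator, enqueue, foreman_pass and the example tasks are all hypothetical names invented for this illustration, and a real foreman would run such a pass in a loop rather than once.

```python
import json
import logging
import os
import sqlite3
import tempfile
import threading

logger = logging.getLogger("queue_sketch")
REGISTRY = {}  # task name -> callable


def task(func):
    # Register a function so the foreman can look it up by name.
    REGISTRY[func.__name__] = func
    return func


def connect(path):
    conn = sqlite3.connect(path, timeout=5.0, isolation_level=None)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks "
        "(id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT, args TEXT)"
    )
    return conn


def enqueue(path, name, *args):
    # What a web process would do: one short INSERT, then carry on.
    conn = connect(path)
    try:
        conn.execute(
            "INSERT INTO tasks (name, args) VALUES (?, ?)",
            (name, json.dumps(args)),
        )
    finally:
        conn.close()


def run_task(name, args):
    try:
        REGISTRY[name](*args)
    except Exception:
        # Log the error and move on: the row is already deleted, so no retry.
        logger.exception("task %s failed; it will not be retried", name)


def foreman_pass(path):
    """Pick up every queued task once; return how many were started."""
    workers = []
    conn = connect(path)
    try:
        rows = conn.execute("SELECT id, name, args FROM tasks ORDER BY id").fetchall()
        for task_id, name, args in rows:
            # Delete the row before running, so a task is never run twice.
            conn.execute("DELETE FROM tasks WHERE id = ?", (task_id,))
            worker = threading.Thread(target=run_task, args=(name, json.loads(args)))
            worker.start()
            workers.append(worker)
    finally:
        conn.close()
    for worker in workers:
        worker.join()
    return len(workers)


results = []


@task
def send_email(to):
    results.append(to)  # stand-in for talking to a slow mail server


@task
def broken():
    raise RuntimeError("boom")


path = os.path.join(tempfile.mkdtemp(), "queue.sqlite3")
enqueue(path, "send_email", "alice@example.com")
enqueue(path, "broken")
started = foreman_pass(path)
```

The failing task is logged and dropped while the other one completes, which mirrors the “no retries” choice described above. Only the foreman’s thread touches the database here; the worker threads just run the task function.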

Cloud-last strategy

As we worked on Bugsink, we knew we wanted to avoid the complexity often associated with modern container orchestration systems like Docker and Kubernetes. Our goal was to reduce the number of components to the point where we didn’t need to orchestrate or containerize them.

Docker and similar tools are just that—tools. However, they can sometimes introduce unnecessary complexity. The mindset of “We’ll just put it in a container” or even “the more parts the better” can lead to a situation where complexity spirals out of control, as more and more components are added without careful consideration.

That said, we do offer a ready-made Docker container for Bugsink. But we built this container only after minimizing the number of components and with a strong emphasis on keeping the container itself as self-contained as possible.

No Nginx?

As it stands, the documentation for Gunicorn (which we use as our WSGI server) recommends using Nginx as a reverse proxy in front of Gunicorn. We parrot that advice in our own documentation. This is a common setup for Python web applications, as Nginx is known for its performance and robustness.

However, we started to wonder: do we really need Nginx? Could we simplify the installation process even further by removing it? The answer is “maybe” – in any case I’ve opened the discussion on the Gunicorn issue tracker.

For now, we’ll keep Nginx in the installation process. It’s a well-known and well-documented solution that provides additional security and performance benefits. But we’re always looking for ways to simplify, so we’ll continue to evaluate whether Nginx is truly necessary for Bugsink.

Conclusion

With Bugsink, we’ve focused on reducing complexity while keeping the setup reliable and efficient. By sticking with familiar tools, simplifying our dependencies, and building only what we needed, we hope to have made self-hosting straightforward.

Looking ahead, we’ll keep questioning what’s truly necessary, exploring ways to simplify even further. Our aim is to ensure that as Bugsink evolves, it remains easy to deploy, manage, and use—showing that powerful tools don’t need to be complicated.

Now, let’s hope the people that try it out won’t be disappointed. Both in the installation process and in the product itself.