Handled Errors in Sentry
In the Sentry ecosystem, handled errors are exceptions that your application catches and recovers from. These errors don’t crash your app but are logged for review, giving you a chance to understand and address unusual conditions without disrupting users.
As someone who builds a Sentry-compatible error tracking service, I’ve thought a lot about what these errors are for, and it took me a while to truly understand. (It doesn’t help that there doesn’t seem to be an actual documentation page on the subject.)
I used to think, “If the error is handled, doesn’t that mean there’s no problem?” And if there’s no problem, why bother collecting information about it? It seemed like a waste of both event quota and mental energy.
But then I came across someone using handled errors in a way that made perfect sense: as a tool to learn about the real world. Once that clicked, practical examples started coming to mind, and I began to see handled errors as part of a spectrum between full-on crashes and general logging.
What is a Handled Error?
What is a handled error, exactly? It’s an exception that your application catches and recovers from. In very practical terms, that means an error that doesn’t crash your app or cause the user to notice anything is wrong.
Don’t be distracted by the fact that technically all errors are handled at some point before the application crashes. If they weren’t, the Sentry SDK wouldn’t be able to intercept them and record them as events.
What’s relevant for practical purposes is whether they were handled before the SDK was invoked. If they were, the SDK will mark them as handled in the event data, and they will show up as such in your error tracker.
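For illustration, here’s roughly how you could inspect that flag from a before_send hook in the Python SDK. The exact payload structure varies a bit per SDK and version, so treat this as a sketch:

import sentry_sdk

def before_send(event, hint):
    # exception events carry a "mechanism" per exception value; its
    # "handled" flag is False for crashes, and True (or absent) for
    # errors that were caught before the SDK saw them
    for exc in event.get("exception", {}).get("values", []):
        mechanism = exc.get("mechanism") or {}
        print("handled:", mechanism.get("handled"))
    return event

sentry_sdk.init(dsn="...", before_send=before_send)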
The use case for handled errors
I always wondered why you would want to log an error that you’ve already handled. When I finally ran into someone who used this feature, I couldn’t wait to ask them about it. Here’s what they told me:
These are error conditions that I have built the app to handle transparently and gracefully for the user, but I still want to know about them, and log them, so as to gather as much context as possible to improve on how they’re handled, or to see if there are contextual conditions on the device that the error may have been a side effect of.
They were kind enough to provide a real-world example:
A very simple example is failed network requests for user data backups/syncing to a remote API: the user doesn’t know or care that the backup even happened, but if it failed I want to know why. I want to know what the conditions of the device were at the time, e.g. which network interface they were connected via (wifi/LTE), was my app nearing its RAM limit, was it in the background or foreground, etc.
That’s just a basic example, and network errors are easy to handle, but the can of worms gets opened when you integrate any of Apple’s iOS features/services, which typically have poor documentation; your only real way of discovering error boundaries is via your own handling and logging in production.
In short: this person used handled errors as a way to learn about the real world in production. They used them to gather context about the conditions leading up to the error, to identify patterns or edge cases that might otherwise go unnoticed, and to improve the app’s ability to deal with similar scenarios in the future.
Logging ERROR conditions
Reading it like this made me realize I had actually been doing something similar for a long time, but I had never thought of it as handled errors.
When writing code defensively, I often find myself adding if statements to catch conditions that “shouldn’t happen” but that I still want to know about if they do. For example, I might add a log statement to monitor when and where a deprecated code path is still called, capturing full stack traces to trace the origin of the calls.
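A minimal sketch of that pattern, using Python’s standard logging; the function name is made up, and stack_info=True is what attaches the call stack:

import logging

logger = logging.getLogger(__name__)

def generate_report_v1():  # hypothetical deprecated code path
    # stack_info=True attaches the current call stack to the log record,
    # revealing where the deprecated path is still being called from
    logger.warning("deprecated code path generate_report_v1 called", stack_info=True)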
Another example: if an external API is unreachable, returns a non-200 status code, or returns a response that doesn’t match the expected format, I’ll log the full context of the error. This helps me pinpoint the origin of the bad data and refine how my application handles it.
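With the Sentry Python SDK, that could look something like the sketch below; the endpoint, context name, and message are all illustrative:

import requests
import sentry_sdk

resp = requests.get("https://api.example.com/v1/items")  # hypothetical external API
if resp.status_code != 200:
    # attach what the API actually returned, so the captured event
    # carries enough context to pinpoint the origin of the bad data
    sentry_sdk.set_context("api_response", {
        "url": resp.url,
        "status_code": resp.status_code,
        "body": resp.text[:500],  # truncated; bodies can be large
    })
    sentry_sdk.capture_message("Unexpected response from items API", level="error")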
As with “handled errors”, I was (ab)using my Error Tracker to learn about the real world in production.
Two patterns for one problem
Thinking about the two approaches side by side, I realized they’re two sides of the same coin. The only difference is the syntactic pattern used in your code. Compare the following bit of Python:
import requests
import sentry_sdk

try:
    result = requests.get(url)  # url assumed to be defined elsewhere
    result.raise_for_status()
except requests.exceptions.RequestException as e:
    sentry_sdk.capture_exception(e)
with this one:
result = requests.get(url)
if result.status_code != 200:
    capture_stacktrace("Non-200 result from foobar API")
(Note: capture_stacktrace is not part of the Sentry SDK by default, but it can easily be built from parts.)
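One way to build it, assuming the Python SDK: initializing with attach_stacktrace=True makes capture_message include the call stack of the capture site. A sketch, not necessarily the only way:

import sentry_sdk

# attach_stacktrace=True attaches the current call stack to message
# events, so a captured message carries a traceback-like stack
sentry_sdk.init(dsn="...", attach_stacktrace=True)

def capture_stacktrace(message):
    sentry_sdk.capture_message(message, level="error")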
Seen like this, the difference is just a matter of syntax and opportunity. In some cases the two approaches are entirely interchangeable; in others, one is more appropriate than the other:
- When a large number of related exceptional cases are expected, and you want to handle them all in the same way, a try/except block is more appropriate. APIs are a good example of this, because any number of things can go wrong when calling an external service. All of these can be put in a single except case, which can then be logged.
- When the exceptional case does not have a natural Exception to go with it, or when you want to log the error without interrupting the flow of the program, a conditional check is more appropriate.
Spectrum of brokenness
Back to my initial surprise at the concept of handled errors:
Personally, I’ve always felt that Error Tracking is vastly more valuable than a full Application Performance Monitoring solution, which is why I’m betting on it with my startup.
In particular: when you’re doing error tracking, you’re betting that “as much information about the moment something really went wrong” is more valuable than the “thinly spread information about everything” that you get with APM.
For unhandled errors, this is clear: they represent a moment where something went catastrophically wrong, and you want to know as much as possible about that moment to fix it.
Handled errors occupy a middle ground. They don’t cause crashes, but they represent situations that are more significant than everyday logging noise. These are the moments where something unexpected happened, even if the app gracefully moved on.
Both approaches – catching handled errors and logging weird conditions – help bridge the gap between catastrophic failures and everyday logging. They focus your attention on places where the app didn’t break outright, but something still went meaningfully wrong.
In conclusion: handled errors are a powerful tool for learning about the real world in production. They help you gather context about unexpected conditions, identify patterns or edge cases that might otherwise go unnoticed, and improve your app’s ability to deal with similar scenarios in the future.
Oh… and what does this mean for Bugsink? It means I’ll figure out a good way to support interacting with handled errors in the near future. What that will look like exactly, I’m not sure yet. One thing I do know: it won’t be by using a skull and crossbones to indicate unhandled errors: those are and will be the default, so they don’t need a special icon.