Maximizing Your Observability
--
Do you know if your code is working in production?
If you can’t answer that question with a confident “Yes, it is” or “No, it’s not,” your software system isn’t complete.
In today’s complex software landscape, we need to know whether our code is working correctly, and we need to know quickly if we want to sustain three or four nines (or more!) of availability.
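To put numbers on those "nines," here is a minimal sketch of the arithmetic. It assumes a 365-day year and counts any unavailability against the budget; the function name `downtime_budget_minutes` is made up for illustration and does not come from any particular tool.

```python
# Rough downtime budget per year for a given number of "nines".
# Assumption: a 365-day year, with every minute of unavailability counted.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_budget_minutes(nines: int) -> float:
    """Return the minutes of downtime allowed per year at N nines of availability."""
    unavailability = 10 ** -nines  # e.g. 3 nines (99.9%) -> 0.001
    return MINUTES_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(f"{n} nines: about {downtime_budget_minutes(n):.1f} minutes/year")
```

At four nines, the budget for the whole year is under an hour, so the time it takes to notice a problem eats directly into it.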
Observability patterns, tools, and techniques are how you get there. Observability has become the industry’s umbrella term for the logs, captured errors, metrics, alerting, and other reporting we use to observe what our systems are doing.
But observability doesn’t stop at being a grab bag of tools that trigger PagerDuty when something is wrong.
Instead, we can harness these techniques to build better software, not just scramble to put out fires.
When used correctly, your observability dashboards and systems can tell you where to spend your resources for the next iteration, teach you how your system is actually operating, and even be a place to test ideas quickly to influence the product roadmap.
Here are some ways observability can help you and your team, and how to maximize its value.
Show You Where to Optimize
Have you heard the quote from Donald Knuth about premature optimization?
“Premature optimization is the root of all evil.”
The idea is that we as engineers often spend too much time optimizing parts of our systems that either don’t matter or have too small an impact on overall performance to be worth the investment.
But how many times have you, as an engineer, sat in a room or read a doc about a new feature that “has to be super optimized”? I have, several times.
And what usually happens is the engineers (who are intelligent, capable, etc.) start doing a lot of thinking. They’ll begin working their way up and down the system, attempting to spot its bottlenecks and design solutions around them as they go.
Nothing is wrong with this exercise except that we don’t know if those bottlenecks are…