Overview
Observability (o11y) has recently become a buzzword in the world of DevOps. Consequently, many vendors have rebranded their solutions as observability tools without regard for best practices. According to Splunk’s The State of Observability 2023, 80% of respondents had noticed this phenomenon, which is beginning to be termed o11y-washing, among vendors. This highlights the importance of understanding what observability actually involves.
In the previous blog post, we mentioned that groundcover fulfilled the 3 pillars of observability:
- Logs as a timestamped record of events, warnings, and error messages.
- Metrics as numerical values relating to the service, such as CPU usage, memory usage, and network latency.
- Traces as a record of the action a user performs on the application or service.
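To make the three pillars concrete, here is a minimal sketch of what each kind of data might look like and how they relate to one another. The class names, fields, and sample values are hypothetical and chosen for illustration only; they are not groundcover's data model.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class LogRecord:
    """A timestamped event, warning, or error message."""
    timestamp: datetime
    level: str
    message: str


@dataclass
class MetricSample:
    """A numerical value describing the service at a point in time."""
    timestamp: datetime
    name: str
    value: float
    unit: str


@dataclass
class Span:
    """One step of a user action; a trace is the set of spans sharing a trace_id."""
    trace_id: str
    name: str
    start: datetime
    duration_ms: float


# Hypothetical sample data for a single checkout request.
now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
log = LogRecord(now, "ERROR", "payment declined: card expired")
metric = MetricSample(now, "cpu_usage", 0.82, "ratio")
trace = [
    Span("trace-1", "POST /checkout", now, 120.0),
    Span("trace-1", "charge-card", now, 95.0),
]

# Correlating the pillars: find the slowest span and report it next to the log and metric.
slowest = max(trace, key=lambda span: span.duration_ms)
print(f"{log.level} '{log.message}' during {slowest.name}; {metric.name}={metric.value}")
```

The point of the sketch is that each pillar answers a different question (what happened, how much, and along which path), and the value comes from being able to line them up for the same request.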
On top of fulfilling the basic requirements of an observability tool, groundcover sets itself apart with a seamless experience and features made possible by its architecture and its thoughtful approach to observability issues, as we’ll describe below.
Installation
During the installation process, we were pleasantly surprised that groundcover required no code changes, which is rare amongst observability tools. All the data we needed was available from the get-go. This is only possible because groundcover utilises eBPF, a technology that runs monitoring programs efficiently inside the kernel, so there is no need to modify application source code.
Installing groundcover on the cluster is straightforward. There are 4 installation options, which gives groundcover the flexibility to fit company policy; for instance, a company may require that all tools be deployed via GitOps, or entirely through Helm charts. In our experience, it took less than 10 minutes from starting the installation to seeing the full suite of data from our cluster, in stark contrast to other solutions we’ve tried.
Another unique aspect of groundcover is that the observability data it collects is stored locally, with a separation between control plane and data plane. This allows the data to reside completely in the cluster, which helps with compliance with data privacy laws. It is also why, unlike other observability tools, groundcover doesn’t have any hidden storage or data egress costs. With that, there’s no more “surprise charge” from the observability vendor that we normally fear.
Feature highlights
Network Diagram Map
The network map gives us visibility over network traffic between services, workloads and external resources in the infrastructure. This makes it possible to easily determine network bottlenecks on top of the traces and logs, and filter by protocol, namespace, workload and connection status.
Logs and related traces
Through the aforementioned network map, it’s possible to drill down into traces that are relevant to the debugging process. Here, eBPF also helps us to gain complete end-to-end visibility over the trace, with the payload, response, and path traversed clearly presented.
groundcover also shows the context of the trace, which provides more insight into the utilisation of node and container resources. This helps to identify the components involved in a request and, often, its purpose.
This stands in contrast to other observability platforms, where multiple add-ons and the installation of a separate agent are required to achieve such a unified perspective.
Dashboard
In the previous blog post, we discussed the importance of the golden signals of monitoring. However, these signals have traditionally been difficult to collect because they rely on what’s termed “white-box” instrumentation, which requires deliberate code changes for everything you expose. There are easier ways to approach this kind of instrumentation, but they still require pre-planning and figuring out how to extract data from containers.
By utilising eBPF, it’s possible to replace this approach with “black-box” instrumentation, removing the need to worry about the logic of collecting all the information. In groundcover, this is shown through workload and resource aggregation.
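To illustrate what “deliberate code changes” means in practice, here is a minimal sketch of white-box instrumentation in Python. It assumes the golden signals in question are the commonly cited four (latency, traffic, errors, saturation) and covers the first three. The `instrumented` decorator, the in-memory dictionaries, and the `checkout` handler are all hypothetical names invented for this example; a real service would export to a metrics backend instead.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical in-process stores, keyed by endpoint name.
request_count = defaultdict(int)      # traffic
error_count = defaultdict(int)        # errors
latency_seconds = defaultdict(list)   # latency


def instrumented(endpoint):
    """Record traffic, errors, and latency for the wrapped handler."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(*args, **kwargs):
            request_count[endpoint] += 1
            start = time.perf_counter()
            try:
                return handler(*args, **kwargs)
            except Exception:
                error_count[endpoint] += 1
                raise
            finally:
                # Runs on both success and failure, so every call gets a latency sample.
                latency_seconds[endpoint].append(time.perf_counter() - start)
        return wrapper
    return decorator


@instrumented("/checkout")  # every handler you want visibility into needs this change
def checkout(cart_total):
    if cart_total <= 0:
        raise ValueError("empty cart")
    return {"status": "ok", "charged": cart_total}


checkout(42.0)
try:
    checkout(0)
except ValueError:
    pass

error_rate = error_count["/checkout"] / request_count["/checkout"]
print(f"requests={request_count['/checkout']} error_rate={error_rate:.0%}")
```

Even in this small form, the handler has to be touched, the endpoint has to be named, and someone has to decide where the numbers go. That per-service effort is what kernel-level collection avoids.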
Workload aggregation
Resource aggregation
Monitoring and Alerting (anomaly detection)
Finally, a new feature in groundcover, called Monitors, is a powerful alerting mechanism that proactively monitors and raises alerts about important issues in Kubernetes environments. It covers a variety of issues that might affect your infrastructure, resources or API, and it follows industry best practices out of the box. To unlock even greater flexibility in your alerting arsenal, groundcover provides built-in integration with Grafana.
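As a rough idea of what an anomaly-detection rule does in general, the sketch below flags a new sample when it deviates from a recent baseline by more than a chosen number of standard deviations. This is a generic technique shown for illustration; it is not groundcover's Monitors implementation, and the function name, threshold, and sample values are assumptions.

```python
from statistics import mean, stdev


def is_anomalous(history, latest, threshold=3.0):
    """Return True when `latest` is more than `threshold` standard deviations from the mean of `history`."""
    if len(history) < 2:
        return False  # not enough data to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest != mu  # a perfectly flat baseline: any change counts as a deviation
    return abs(latest - mu) / sigma > threshold


# Hypothetical recent request latencies in milliseconds.
baseline_latency_ms = [101, 98, 103, 99, 100, 102, 97, 100]

for sample in (104, 250):
    status = "ALERT" if is_anomalous(baseline_latency_ms, sample) else "ok"
    print(f"latency={sample}ms -> {status}")
```

A fixed threshold like this is easy to reason about but needs tuning per workload, which is one reason sensible out-of-the-box defaults are valuable.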
Conclusion
Observability is not merely a set of tools that show metrics, traces, and logs. At the end of the day, it’s really a matter of how you utilise the data that you collect to act proactively. As Cindy Sridharan wrote in her book, Distributed Systems Observability, “The value of the observability of a system primarily stems from the business and organizational value derived from it. Being able to debug and diagnose production issues quickly not only makes for a great end-user experience, but also paves the way toward the humane and sustainable operability of a service.” We think an observability tool becomes great when it gives insight into, and a deeper understanding of, our system. And that’s what groundcover has provided for us, in a friendly and approachable manner.
Written by Theodore Fabian, Associate Cloud Solutions Architect, MegazoneCloud Hong Kong