IT pros mull observability tools, devx and generative AI
Observability as a common language for both developers and operations teams still has plenty of room for improvement in the era of platform engineering, according to experts.
As platform engineering teams increasingly take on enterprise performance management tasks in production, there have been missed opportunities to give developers insights into their applications, experts say.
The instrumentation of application code is one area where platform engineers and SREs have stepped in on behalf of application developers, who aren't as steeped in the complexities of distributed cloud infrastructure systems such as Kubernetes. Analysts have also seen an increase in observability teams -- specifically within the platform engineering discipline -- that connect developers' application performance insights with underlying infrastructure data.
"[There's] a move toward centralizing observability teams and centers of excellence," said Nancy Gohring, an analyst at IDC. "One driver for doing that is to try to control costs -- and one way those teams are trying to control costs is setting up data [storage] quotas for teams."
Such teams don't replace the need for developers to instrument their own application code but have helped ease the burden of managing the ongoing operational costs associated with collecting observability data, Gohring said.
There are some aspects of infrastructure monitoring, too, that developers don't need to concern themselves with, said Gregg Siegfried, an analyst at Gartner. Still, there remains a divide between the interests of platform teams in production observability and the interests of application developers, Siegfried said.
"I see an emergence of tools trying to give developers closer access to that data to give them more insight -- maybe allow them to put better instrumentation into the software," he said. "But none of them have really set the world on fire yet."
Rethinking devx in observability tools
It's a commonly understood best practice in observability that developers instrument their own code before it's deployed to production, the better to manage its performance in the "you build it, you run it" mode of DevOps.
"I'm part of the OpenTelemetry End User Working Group. And recently we had somebody come in and talk to our user community about how they work in a company that really fosters an observability culture," said Adriana Villela, developer advocate at observability vendor LightStep, in a presentation at the recent Monitorama Conference. "The wonderful thing about it is that there is a directive from the executive saying, 'Thou shalt do observability and also developers are the ones instrumenting their own code,' which means that if you've got some disgruntled development team saying, 'I don't have time to instrument my code,' tough [s---]."
But some newer entrants to the market and their early customers question whether the devx, or developer experience, with observability needs to be quite so tough.
"Developers being able to add custom metrics to their code or spans or use observability tools is really critical to help developers take ownership of what they run in production," said Joseph Ruscio, a general partner at Heavybit, an early-stage investor in cloud infrastructure startups, in a Monitorama presentation.
However, to a new engineer, the overwhelming number of tools available for observability is "inscrutable and not at all welcoming to someone new to the craft," Ruscio said.
A production engineering team at a market research company is trying to make this task less onerous for developers using Groundcover's new Kubernetes-based APM tool. Groundcover uses eBPF to automatically gather data from Kubernetes clusters and associate it with specific applications, which could eventually replace the language-specific SDKs the company's developers have used to instrument applications with incumbent vendor Datadog.
"For what we are calling custom metrics that monitor a specific application's behavior, these will continue to be the responsibility of the developers," said Eli Yaacov, a production engineer at SimilarWeb, based in New York. "But we, the production engineers, can provide the developers the [rest of] the ecosystem. For example, if they are running Kubernetes, they don't need to worry about [instrumenting for] the default CPU or memory. … Groundcover collects all this data in Kubernetes … without requiring the developers to integrate with anything into their services."
Other emerging vendors, including Lightrun and Rookout, also offer debugging tools that automatically instrument developers' apps without requiring code changes.
Generative AI to the rescue?
Amid this year's general hype about generative AI, observability vendors have been quick to roll out natural language interfaces for their tools, mostly to add a user-friendly veneer over their relatively complex, often proprietary, data query languages. Such vendors include Honeycomb, Splunk, and most recently, Dynatrace and Datadog.
However, generative AI interfaces are not necessarily an obvious option to improve the developer experience of using observability tools, Siegfried said, as most developers are comfortable working in code.
"They have better things to do with their time than learn how to use an [application performance management] solution," he said.
Long term, generative AI and artificial general intelligence may have a significant effect, Ruscio said. But in the short term, Siegfried said he is skeptical that large language models such as ChatGPT will make a major impact on observability, particularly the developer experience.
Instead, Ruscio said during his presentation, developers would be best served by shifting observability left in the development lifecycle -- something that, unlike security and production-level systems monitoring, has yet to happen to any great degree. New and emerging vendors -- some of them among Heavybit's portfolio companies -- are working in this area, termed observability-driven development.
"There's this missing mode where, wouldn't it be nice if you had some input when you are actually writing code as to what does this code look like in production?" Ruscio said. "It's cool that when I ship it, I'll get a graph. But why shouldn't I just know now, in my IDE, [how it will perform?]"
Beth Pariseau, senior news writer at TechTarget, is an award-winning veteran of IT journalism. She can be reached at [email protected] or on Twitter @PariseauTT.