This is part of a bigger consolidation trend, AI hype or not: which general-purpose data vendor gets to store and query all of your observability and business data?
Snowflake acquired Observe last week, AWS made it easy in December to put logs from CloudWatch into their managed Iceberg catalog, and Azure is doing a bunch of interesting stuff with Fabric.
The line between your data lake/analytics vendor and observability vendor is getting blurry.
This is an emerging pattern that’s surprisingly powerful: thick clients that embed wasm query engines (pglite, duckdb) and do powerful analytics (with or without AI agents writing the queries).
Below are two examples using duckdb under the hood for similar purposes. Like the author, I'm excited about this type of architecture making semi-advanced analytics more attainable if you're not a data engineer.
Agree with the author, will add: duckdb is an extremely compelling choice if you’re a developer and want to embed analytics in your app (which can also run in a web browser with wasm!)
Think this opens up a lot of interesting possibilities like more powerful analytics notebooks like marimo (https://marimo.io/) … and that’s just one example of many.
We recently created a survey website for the Node-RED community survey results, making it completely dynamic and segmentable. Creates lots of value and allows everyone to look at the data through their own lens. It uses DuckDB with WASM under the hood. Awesome technologies.
I'd really love a minimalist version, I'm not sure how small it's feasible for them to shrink it. As long as it doesn't get bigger and devices keep getting faster, I suppose?
Kudos to Ben for speaking to one of the elephants in the room in observability: data waste and the impact it has on your bill.
All major vendors have a nice dashboard and sometimes alerts to understand usage (broken down by signal type or tags) ... but there's clearly a need for more advanced analysis which Tero seems to be going after.
Speaking of the elephant in the room in observability: why does storing data on a vendor cost so much in the first place? With most new observability startups choosing to store data in columnar formats on cheap object storage, I think this is also getting challenged in 2026. The combination of cheap storage with meaningful data could breathe some new life into the space.
Thank you! And you're right, it shouldn't cost that much. Financials are public for many of these vendors: 80%+ margins. The cost to value ratio has gotten way out of whack.
But even if storage were free, there's still a signal problem. Junk has a cost beyond the bill: infrastructure works harder, pipelines work harder, network egress adds up. And then there's noise. Engineers are inundated with it, which makes it harder to debug, understand their systems, and iterate on production. And if engineers struggle with noise and data quality, so does AI.
It's all related. Cheap storage is part of the solution, but understanding has to come first.
The problem has never been the storage. It's running those queries and getting results back in milliseconds, whether it's for a dashboard, an alert, or your new AI agent trying to make sense of it.
This is great. If you're skeptical, vibe coding on the go is great because of how async the agentic coding workflows can be. Nothing like fixing a bug in the dentist's office.
Lots of different technical solutions for how to do this, including the Claude and ChatGPT mobile apps nowadays. I use Tailscale. Choose what works best for you and enjoy.
Been experimenting with OpenTelemetry->Parquet conversion lately for logs, metrics, and traces. Lots of related projects popping up in this area. It's powerful and cheap.
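As a hedged sketch of what that conversion looks like, here's the flattening step for OTLP-style JSON logs, one flat row dict per log record, ready to hand to a columnar writer. The payload shape is a simplified assumption based on the OTLP JSON encoding, not a full implementation:

```python
# Simplified OTLP-style log payload (assumption: trimmed-down shape
# of the OTLP JSON encoding, values/field names illustrative).
otlp = {
    "resourceLogs": [{
        "resource": {"attributes": [
            {"key": "service.name", "value": {"stringValue": "checkout"}}
        ]},
        "scopeLogs": [{
            "logRecords": [
                {"timeUnixNano": "1700000000000000000",
                 "severityText": "ERROR",
                 "body": {"stringValue": "payment failed"}},
            ]
        }]
    }]
}

def flatten_logs(payload):
    """Turn nested OTLP log records into flat rows (one dict per record)."""
    rows = []
    for rl in payload["resourceLogs"]:
        res = {a["key"]: a["value"]["stringValue"]
               for a in rl["resource"]["attributes"]}
        for sl in rl["scopeLogs"]:
            for rec in sl["logRecords"]:
                rows.append({
                    "time_unix_nano": int(rec["timeUnixNano"]),
                    "severity": rec.get("severityText", ""),
                    "body": rec["body"]["stringValue"],
                    # Prefix resource attributes so columns don't collide
                    **{f"resource.{k}": v for k, v in res.items()},
                })
    return rows

rows = flatten_logs(otlp)
# From here a columnar writer (e.g. pyarrow) takes over:
# import pyarrow as pa, pyarrow.parquet as pq
# pq.write_table(pa.Table.from_pylist(rows), "logs.parquet")
```

Once it's Parquet on object storage, any engine (DuckDB included) can query it cheaply.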
My favorite "malleable" example this past week that I recommend: using Gen AI to rewrite, customize, or create dotfiles for your laptop/desktop.
Defining a bespoke system config was always something I never had the time, desire, or deep familiarity with the various config formats to do. With Claude Code or ChatGPT, I can now generate Wezterm/tmux/neovim/etc. config files and get exactly what I want without knowing a single thing about the specific config file formats.
Hey, it's not well known but Anthropic actually publishes their own devcontainer feature layer for this (1). Many different CLI wrappers and tools are starting to embed it (2), but it's a nice DIY/open-source way to sandbox.