
Trust is the hardest part to scale here.

We're building something similar and found that no matter how good the agent loop is, you still need "canonical metrics" that are human-curated. Otherwise non-technical users (marketing, product managers) are playing a guessing game with high-stakes decisions, and they can't verify the SQL themselves.

Our approach:

1. We control the data pipeline and work with a discrete set of data sources where schemas are consistent across customers.

2. We benchmark extensively so the agent uses a verified metric when one exists, falls back to raw SQL when it doesn't, and captures those gaps as "opportunities" for human review.

Over time, most queries hit canonical metrics. The agent becomes less of a SQL generator and more of a smart router from user intent -> verified metric.
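
A rough sketch of that routing layer (hypothetical names, not our actual code): the agent first tries to match intent against the curated metrics, and only falls back to generating SQL when nothing matches, logging the miss for review.

    interface CanonicalMetric {
      name: string;        // e.g. "email_open_rate"
      description: string; // used for intent matching
      sql: string;         // human-reviewed query
    }

    interface RoutedQuery {
      sql: string;
      verified: boolean;
    }

    function routeIntent(
      intent: string,
      metrics: CanonicalMetric[],
      generateSql: (intent: string) => string, // LLM fallback
      recordGap: (intent: string) => void,     // "opportunity" queue
    ): RoutedQuery {
      // Naive matching for illustration; a real system would use
      // embeddings or the LLM itself to pick the metric.
      const match = metrics.find((m) =>
        intent.toLowerCase().includes(m.name.replace(/_/g, " ")),
      );
      if (match) {
        return { sql: match.sql, verified: true };
      }
      // No canonical metric exists: generate raw SQL and flag the
      // gap so a human can curate a verified metric later.
      recordGap(intent);
      return { sql: generateSql(intent), verified: false };
    }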

The "Moving fast without breaking trust" section resonates, their eval system with golden SQL is essentially the same insight: you need ground truth to catch drift.

Wrote about the tradeoffs here: https://www.graphed.com/blog/update-2


Yes, I’ve been working on this and you need a clear semantic layer.

If there are multiple paths (or perceived paths) to an answer, you’ll get two different answers. Plus, LLMs like to invent pointless “xyz_index” metrics that are not standard, clear, or useful. Yet I see users just go “that sounds right” and run with it.


Absolutely. We make it obvious to the user when a query/chart is using a non-standard metric, and we have a fast SLA on finding/building the right metric.

It only works because all of the data looks the same between customers (we manage ad platform, email, funnel data).

So if we make an “email open rate” metric, that’ll amortize across other customers.
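
To make that concrete, here’s a hypothetical sketch (table and column names invented): because every customer’s email events land in the same managed schema, one reviewed definition serves everyone.

    // Hypothetical: same managed schema for every customer, so one
    // reviewed definition of "email open rate" is reusable as-is.
    const emailOpenRate = {
      name: "email_open_rate",
      description: "Unique messages opened divided by messages delivered",
      sql: `
        SELECT count(DISTINCT o.message_id)::float
             / nullif(count(DISTINCT d.message_id), 0) AS email_open_rate
        FROM email_deliveries d
        LEFT JOIN email_opens o ON o.message_id = d.message_id
      `,
    };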


How do you handle schema drift?

The data archive serializes the schema of the deleted object, capturing the schema as it was at that point in time.

But fast-forward past a few schema changes: now your system has to migrate the archived objects to the current schema?


In my experience, archived objects are almost never accessed, and if they are, it's within a few hours or days of deletion, which leaves a fairly small chance that schema changes will have a significant impact on restoring any archived object. If you pair that with "best-effort" tooling that restores objects by calling standard "create" APIs, perhaps it's fairly safe to _not_ deal with schema changes.
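
A minimal sketch of that best-effort approach (hypothetical shapes, and assuming the public "create" API validates against the current schema): restore just replays the deletion-time snapshot through the normal create path and fails gracefully if the schema has drifted too far.

    interface ArchivedObject {
      type: string;                      // e.g. "invoice"
      snapshot: Record<string, unknown>; // serialized at deletion time
      deletedAt: string;
    }

    type CreateFn = (
      type: string,
      payload: Record<string, unknown>,
    ) => Promise<void>;

    // Replay the snapshot through the standard create API. The current
    // schema's validation decides whether the old shape is still valid;
    // no explicit migration of archived rows is attempted.
    async function bestEffortRestore(
      obj: ArchivedObject,
      create: CreateFn,
    ): Promise<boolean> {
      try {
        await create(obj.type, obj.snapshot);
        return true;
      } catch (err) {
        // Schema drifted too far for this snapshot; leave the archived
        // record untouched for manual handling.
        console.warn(`Restore failed for ${obj.type} (${obj.deletedAt})`, err);
        return false;
      }
    }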

Of course, as always, it depends on the system and how the archive is used. That's just my experience. I can imagine that if there are more tools or features built around the archive, the situation might be different.

I think maintaining schema changes and migrations on archived objects can be tricky in its own ways, even when they're kept in the live tables with an 'archived_at' column, especially when objects span multiple tables with relationships. I've worked on migrations where really old archived objects just didn't make sense anymore in the new data model, and figuring out a safe migration became a difficult, error-prone project.


My partner and I have been playing this almost every morning. We're really enjoying it!

Some feedback:

1) It would be great if the incomplete clues could move to the top. This would avoid having to scroll down towards the end of the puzzle.

2) Better collision behavior: it would be nice if we could drag a chunk of words and it would just "move the other words" out of the way. Sometimes we have to spend time making a path to move chunks of words around.

Thanks for building this!


Hey, thanks for playing!

1) This is an interesting idea! I’ll play with that when I have time.

2) I am experimenting with this but have gotten mixed feedback from players. Some people don’t like it. I’m curious what you think! If I don’t do this I’ll explore other options: https://sunny.garden/@paulhebert/115698266272946749


Nah, that's too smart of a behavior. What exists now may have some edge cases, but it is otherwise straightforward and intuitive. The only real "hassle" is swapping two large assembled pieces closer to the end of the game round, but it's not really a hassle. Not a big deal, really.


Yeah, I’ve heard that from a few people.

I’m thinking of adding a “shuffle” button to rearrange the tiles if you get really stuck. It’s theoretically possible to get into an unwinnable state where you can’t swap two tiles.


Perhaps do what you showed at the link, but only activate it on long tap-and-hold?

That is, if you hover a piece over some spot for X seconds, then it will shuffle other pieces out of the way.


I like that demo, looking forward to seeing what you come out with.


Just some quick feedback: I can't copy & paste in the connection URL input form, on a Mac.

Once loaded, I get the error "Table must contain a UUID column for vector visualization."

I'm assuming it's trying to find an ID column for grouping? Can we manually specify this? My ID columns are varchars.


Same here. I'm using LangChain, which creates a varchar id column. It also stores different collections in the same table.


This is a great backlink play, kudos


HOLY! Yes, that's what this is about.

I was coming to the comments to ask about this, as I noticed other (finance) companies [1] were providing this for free and I wanted to know what the game was about.

[1] https://www.feylogos.com/


>Kudos

Great work on the enshittification.


Both CoreWeave & Lambda Labs have fairly predatory pricing, making it impossible to rent GPUs without a yearly contract.

This doesn't make sense for training models, where a training run is on the scale of days & weeks.

I wish the TechCrunch article had mentioned other companies, like sfcompute, which offer hourly compute instead of yearly contracts.


I don't think it is predatory; I think it just happened over time due to demand. It is a marketplace, after all.

https://news.ycombinator.com/item?id=40101377


Lambda Labs has on-demand GPUs. Just put in a credit card and you're able to launch. I launched a 1x H100 server just 10 minutes ago on Lambda Labs.

The price is also $2.49/hr, which does not seem predatory at all.


I have never seen 1x H100 available on Lambda Labs. Don't know why though.


I've been checking about twice a week for the last 6 months, and they are very rare, but it does happen. I caught one on video 2 weeks ago! https://youtu.be/NkNx6tx3nu0?t=744


That would make the TechCrunch article highly misleading since AWS and the other clouds offer big savings for reserved instances.


I've been using Supermaven for the past week and it's obviously better than Copilot. Excited to see this product evolve!


Do you mind linking to the video?



I recently found https://react.email and really enjoy using it instead of mjml, or whatever else is available!


Despite some pushback, I worked on something similar (internally) for a previous company.

Having React/JSX for email templating (even if it was on top of MJML) is a great win for productivity.

All our front-end devs knew React; only a couple knew Jinja, Pug, or Mustache. And every time a team needed to add a new email template, their front-end devs had to learn those again.

Instead, they could just write email templates the way they write their regular components every day.
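
For flavor, a template in that style might look something like this (a sketch using react.email's components; prop names from memory, so check the docs):

    import { Body, Button, Container, Head, Html, Text } from "@react-email/components";

    // An email template written like any other React component.
    export function WelcomeEmail({ name }: { name: string }) {
      return (
        <Html>
          <Head />
          <Body style={{ fontFamily: "sans-serif" }}>
            <Container>
              <Text>Hi {name}, thanks for signing up!</Text>
              <Button href="https://example.com/confirm">
                Confirm your email
              </Button>
            </Container>
          </Body>
        </Html>
      );
    }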

Glad to be validated on this!


Uhh, I am saving this one! Thanks! I remember the last time I had to set up a few mail templates; it was so incredibly painful. Especially the testing, which is painfully slow.


I find the meta implications on this thread quite interesting:

How cheaply can you run this cron job? The basics:

- 8,765 invocations a year

- Lambda: $0.20 per 1M requests

- Cloudwatch Events: $1.00 per 1M requests

$1.20 / 1,000,000 * 8,765 ≈ $0.01 (about 1 cent)

Is anybody going to host that as a service for this price? Absolutely not.

