rgbrenner's comments

But the security risk wasn't taken by OpenClaw. Releasing vulnerable software that users run on their own machines isn't going to compromise OpenClaw itself. It can still deliver value for its users while requiring those same users to handle the insecurity themselves (by either ignoring it or setting up sandboxes, etc., to reduce the risk, and then weighing that reduced risk against the novelty and value of the software to decide whether it's worth the setup).

On the other hand, if OpenClaw were structured as a SaaS, this entire project would have burned to the ground the first day it was launched.

So by releasing it as something you needed to run on your own hardware, the security requirement was reduced from essential, to a feature that some users would be happy to live without. If you were developing a competitor, security could be one feature you compete on--and it would increase the number of people willing to run your software and reduce the friction of setting up sandboxes/VMs to run it.


This argument has the same obvious flaws as the anti-mask/anti-vax movement (which unfortunately means there will always be a fringe that doesn't care). These things are allowed to interact with the outside world; it's not as simple as "users can blow their own system up, it's their responsibility".

I don't need to think hard to speculate on what might go wrong here - will it answer spam emails sincerely? Start cancelling flights for you by accident? Send nuisance emails to notable software developers for their contribution to society[1]? Start opening unsolicited PRs on matplotlib?

[1] https://news.ycombinator.com/item?id=46394867


We really needed to have made software engineering into a real, licensed engineering practice over a decade ago. You wanna write code that others will use? You need to be held to a binding set of ethical standards.

Even though it means I probably wouldn't have a job, I think about this a lot and agree that it should. Nowadays suggesting programmers should be highly knowledgeable at what they do will get you called a gatekeeper.

While it is literally gatekeeping, it's necessary. Doctors, architects, lawyers should be gatekept.

I used to work on industrial lifting crane simulation software. People used it to plan out how to perform big lift jobs to make sure they were safe. Literal, "if we fuck this up, people could die" levels of responsibility. All the qualification I had was my BS in CS and two years of experience. It was lucky circumstance that I was actually quite good at math and physics and was able to discover that there were major errors in the physics model.

Not every programmer is going to encounter issues like that, but neither can we predict where things will end up. Not every lawyer is going to be a criminal defense lawyer. Not every doctor is going to be a brain surgeon. Not every architect is going to design skyscrapers. But they all do work that needs to be warrantied in some way.

We're already seeing people getting killed because of AI. Brian in middle management "getting to code again" is not a good enough reason.


> While it is literally gatekeeping, it's necessary. Doctors, architects, lawyers should be gatekept.

That was exactly my point. It's one of those things where you deliberately use a word that is technically correct in a context where it doesn't, or shouldn't, hold true. Does this mean I want to stop people from "vibe coding" flappy bird? No, of course not. But as per your original comment, yes, there should be stricter regulations when it comes to hiring.


Yeah, I know what you mean. It is a weapon people throw around on social media sites.

At least with regard to the Covid response, your concerns over anti-mask and anti-vaccine issues seem unwarranted.

The claims being shared by officials at the time were that anyone vaccinated was immune and couldn't catch it. Claims were similarly made that we needed roughly a 60% vaccination rate to reach herd immunity. With that precedent being set, it shouldn't matter whether one person chose not to mask up or get the jab; most everyone else could do so to fully protect themselves, and those who couldn't would only be at risk if more than 40% of the population weren't on board with the masking and vaccination protocols.


> that anyone vaccinated was immune and couldn't catch it.

Those claims disappeared rapidly when it became clear the vaccines offered some protection and reduced severity, but not immunity.

People seem to be taking a lot more “lessons” from COVID than are realistic or beneficial. Nobody could get everything right. There couldn’t possibly be clear “right” answers, because nobody knew for sure how serious the disease could become as it propagated, evolved, and responded to mitigations. Converging on consistent shared viewpoints, coordinating responses, and working through various solutions to a new threat on that scale was just going to be a mess.


Those claims were made after studies that ran over only a short duration and specifically only watched for subjects who reported symptoms.

I'm in no way taking a side here on whether anyone should have chosen to get vaccinated or wear masks, only that the information being pushed out by experts at the time doesn't align with an after-the-fact condemnation of anyone who chose not to.


I specifically wasn't referring to that instance (if anything I'm thinking more of the recent increase in measles outbreaks), I myself don't hold a strong view on COVID vaccinations. The trade-offs, and herd immunity thresholds, are different for different diseases.

Do we know that 0.1% prevalence of "unvaccinated" AI agents won't already be terrible?


Fair enough. I assumed you had Covid in mind with an anti-mask reference. At least in modern history in the US, we have only even considered masks during the Covid response.

I may be out of touch, but I haven't heard about masks for measles, though it does spread through aerosol droplets so that would be a reasonable recommendation.


I think you're right - outside of COVID, it's not fringe, it's an accepted norm.

Personally I at least wish sick people would mask up on planes! Much more efficient than everyone else masking up or risking exposure.


Oh I wish sick people would just not get on a plane. I've cancelled a trip before, the last thing I want to do when sick is deal with the TSA, stand around in an airport, and be stuck in a metal tube with a bunch of other people.

Love passing off the externalities of security to the user, and then the second-order externalities of an LLM that blackmails people in the wild. Love how we just don't care anymore.

You should join the tobacco lobby! Genius!

More straightforwardly, people are generally very forgiving when people make mistakes, and very unforgiving when computers do. Look at how we view a person accidentally killing someone in a traffic accident versus when a robotaxi does it. Having people run it on their own hardware makes them take responsibility for it mentally, so gives a lot of leeway for errors.

I think that’s generally because humans can be held accountable, but automated systems can not. We hold automated systems to a higher standard because there are no consequences for the system if it fails, beyond being shut off. On the other hand, there’s a genuine multitude of ways that a human can be held accountable, from stern admonishment to capital punishment.

I’m a broken record on this topic but it always comes back to liability.


Thats one aspect.

Another aspect is that we have much higher expectations of machines than humans in regards to fault-tolerance.


Traffic accidents are the same symptom of fundamentally different underlying problems among human-driven and algorithmically-driven vehicles. Two very similar people differ more than the two most different robo taxis in any given uniform fleet— if one has some sort of bug or design shortcoming that kills people, they almost certainly all will. That’s why product (including automobile) recalls exist, but we don’t take away everyone’s license when one person gets into an accident. People have enough variance that acting on a whole population because of individual errors doesn’t make sense— even for pretty common errors. The cost/benefit is totally different for mass-produced goods.

Also, when individual drivers accidentally kill somebody in a traffic accident, they’re civilly liable under the same system as entities driving many cars through a collection of algorithms. The entities driving many cars can and should have a much greater exposure to risk, and be held to incomparably higher standards because the risk of getting it wrong is much, much greater.


Oh please, why equate IT BS with cancer? If the null pointer was a billion dollar mistake, then C was a trillion dollar invention.

At this scale of investment countries will have no problem cheapening the value of human life. It's part and parcel of living through another industrial revolution.


Exactly! I was digging into the OpenClaw codebase for the last 2 weeks and the core ideas are very inspiring.

The main work he has done to enable a personal agent is his army of CLIs, like 40 of them.

The harness he used, pi-mono, is also a great choice because of its extensibility. I was working on a similar project (1) for the last few months with Claude Code, and it's not really the best fit for a personal agent and it's pretty heavy.

Since I was planning to release my project as a Cloud offering, I worked mainly on sandboxing it, which turned out to be the right choice given OpenClaw is open source and I can plug in its runtime to replace Claude Code.

I decided to release it as open source because at this point software is free.

1: https://github.com/lobu-ai/lobu


I don't agree that making your users run the binaries means security isn't your concern. Perhaps it doesn't have to be quite as buttoned down as a commercial product, but you can't release something broken by design and wash your hands of the consequences. Within a few months, someone is going to deploy a large-scale exploit which absolutely ruins OpenClaw users, and the author's new OpenAI job will probably allow him to evade any real accountability for it.

> But the security risk wasn't taken by OpenClaw

This is the genius move at the core of the phenomenon.

While everyone else was busy trying to address safety problems, the OpenClaw project took the opposite approach: They advertised it as dangerous and said only experienced power users should use it. This warning seemingly only made it more enticing to a lot of users.

I’ve been fascinated by how well the project has just dodged any consequences for the problems it has introduced. When it was revealed that the #1 skill was malware masquerading as a Twitter integration, I thought for sure there would be some reporting on the problems. The recent story about an OpenClaw bot publishing hit pieces seemed like another tipping point for journalists covering the story.

Though maybe this inflection point made it the most obvious time to jump off the hype train and join one of the labs. It takes a while for journalists to sync up and decide to flip to negative coverage of a phenomenon after they cover its rise, but now it appears the story has changed again before any narratives could build about the problems with OpenClaw.


I am guessing there will be an OpenClaw "competitor" targeting Enterprise within the next 1-2 months. If OpenAI, Anthropic or Gemini are fast and smart about it they could grab some serious ground.

OpenClaw showed what an "AI Personal Assistant" should be capable of. Now it's time to get it in a form-factor businesses can safely use.


With the guard rails up, right? Right?

the best time to learn anything is tomorrow when a better model will be better at doing the same work

doesn’t that presume no value is being delivered by current models?

I can understand applying this logic to building a startup that solves today’s ai shortcomings… but value delivered today is still valuable even if it becomes more effective tomorrow.


I think it also presumes that the skills of today won't be helpful in making you better, faster, stronger at knowing what to learn tomorrow. Skateboarding ain't snowboarding but I guarantee the experience helps.

Yeah but neither makes a difference to taking a taxi.

And your skills at catching a cab don't matter for booking a self driving car online.


The keen observer will of course know that there's no such thing as "federal immunity"

The scary thing is that there is... you should look up "sovereign immunity". The government has complete immunity, except where and how the law permits it to be held accountable. And while we have a constitution, defending those rights through the courts requires legislation to permit it. For the most part, federal law permits lawsuits against states that violate the constitution, but it permits far less accountability for federal actions that violate the constitution.

For example, Section 1983 of the Civil Rights Act only permits individuals to sue state and local governments for rights violations. It can't be used to sue the federal government.

There are many court cases, dating back decades, tossing out claims against the federal government for rights violations. Look at how SCOTUS has limited the precedent set by Bivens over the years, basically neutering it entirely.


using utc on servers was very common in 2005


I’d say it was common enough, but not universal, given the number of arguments I had from 2005 to 2015 about this exact issue.


Hold on, I'm not a sysadmin guy. Are you folks saying the server should not know what part of the world it's in, that basically it should think it's in Greenwich?

I would have thought you configure the server to know where it is, have its clock set correctly for the local time zone, and have the software running on the server operate on UTC.


From a logging perspective, there is a time when an event happens, and the timestamp for that should be absolute. Then there's the interaction with the viewer of the event, the person looking at the log, and where they are. If the timestamp is absolute, the event can be translated into the viewer's local time. If the event happens in a different TZ, for example a sysadmin sitting in PST looking at a box in EST, it's easier to translate using the sysadmin's TZ env (and any other sysadmin's TZ anywhere in the world) than to fiddle with the timestamp of the original event. It's a minor irritation if you run your server in UTC and you have to add or subtract the offset, e.g. if you want your cron to run at 6PM EDT, you have to write the cron for 0 22 * * *. You also have to do this mental arithmetic when you look at your local system logs: activity at 22:00:00 seems suspicious, but is it really? Avoid the headaches, set all your systems to UTC, and throw the logs into a tool that does the time translation for you.

The server does not "know" anything about the time, that is, it's really about the sysadmin knowing what happened and when.
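
To make the translation concrete, here's a minimal sketch (assuming Python 3.9+ for the standard zoneinfo module; the timestamp is made up) of the PST-sysadmin-reading-an-EST-box case: record the event once in UTC, and let each viewer convert it at read time.

    from datetime import datetime, timezone
    from zoneinfo import ZoneInfo

    # The event is recorded once, as an absolute UTC timestamp.
    event_utc = datetime(2024, 6, 1, 22, 0, 0, tzinfo=timezone.utc)

    # Each viewer translates it into their own zone when they read the log.
    print(event_utc.astimezone(ZoneInfo("America/New_York")))     # 2024-06-01 18:00:00-04:00 (6PM EDT)
    print(event_utc.astimezone(ZoneInfo("America/Los_Angeles")))  # 2024-06-01 15:00:00-07:00 (3PM PDT)

This is the same arithmetic as the cron example above (6PM EDT = 22:00 UTC), just done by the tooling instead of in your head.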


1) Most software gets its timestamps from the system clock.

2) If you have a mismatch between the system time and the application time, you just get log timestamps that don't match up; it's a nightmare - even more so around DST/ST transitions.


you've got it backwards - the server clock should be in UTC, and if an individual piece of software needs to know the location, that should be provided to it separately.

for example, I've got a server in my garage that runs Home Assistant. the overall server timezone is set to UTC, but I've configured Home Assistant with my "real" timezone so that I can define automation rules based on my local time.

Home Assistant also knows my GPS coordinates so that it can fetch weather, fire automation rules based on sunrise/sunset, etc. that wouldn't be possible with only the timezone.


I kind of assumed all computer clocks were UTC, but that you also specified a location, and when asked what time it is, it did the math for you.


Windows assumes computer clocks are local time. It can be configured to assume UTC. Other operating systems assume computer clocks are UTC. Many log tools are not time zone aware.


The computer clock is just a counter: if you set the starting point to UTC, then it's UTC; if you set it to local time, then it's local time.


that's the difference between "aware" and "naive" timestamps. Python has a section explaining it in their docs (though the concept applies to any language):

https://docs.python.org/3/library/datetime.html#aware-and-na...
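
For a concrete illustration of that distinction (standard library only; nothing here is specific to any particular server setup):

    from datetime import datetime, timezone

    naive = datetime(2025, 1, 1, 12, 0)                       # no tzinfo: "naive"
    aware = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)  # carries its offset: "aware"

    print(naive.tzinfo)  # None
    print(aware.tzinfo)  # UTC

    # Mixing the two is an error, which is the point: a naive timestamp
    # doesn't say which wall clock it refers to.
    try:
        print(aware - naive)
    except TypeError as err:
        print(err)  # can't subtract offset-naive and offset-aware datetimes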


AKA time zones


A server doesn't need to "know" where in the world it is (unless it needs to know the position of the sun in the sky for some reason).


A server doesn't "think" and the timezone has no relevance to where it is located physically.


I'm not sure why you're getting downvoted.

Yes, that's exactly what I'm saying :). In fact, I've run servers where I didn't even know where they were physically located. It wouldn't have been hard to find out given some digging with traceroute, but it didn't matter. It was something I could SSH into and do everything I needed to without caring where it was.

Everyone else down-thread has clarified the why of it. Keep all of your globally distributed assets all running on a common clock (UTC) so that you can readily correlate things that have happened between them (and the rest of the world) without having to do a bunch of timezone math all the time.


Common, but not universal - from 2005 to as late as 2014 I worked for companies that used Pacific time on their servers.


the second case isn’t illegal in the USA because it’s not a specific credible threat.


openrouter requires an openai api key.


Where did you get that from? I am currently using GPT-5 via OpenRouter and never added an OpenAI key to my account there. Same for any previous OpenAI model. BYOK is an option, not a necessity.


You had to use your own key for o3 at least.

> Note that BYOK is required for this model. Set up here: https://openrouter.ai/settings/integrations

https://openrouter.ai/api/v1/models


> {"id":"openai/gpt-5-chat","canonical_slug":"openai/gpt-5-chat-2025-08-07","hugging_face_id":"","name":"OpenAI: GPT-5 Chat","created":1754587837,"description":"GPT-5 Chat is designed for advanced, natural, multimodal, and context-aware conversations for enterprise applications.","context_length":400000,"architecture":{"modality":"text+image->text","input_modalities":["file","image","text"],"output_modalities":["text"],"tokenizer":"GPT","instruct_type":null},"pricing":{"prompt":"0.00000125","completion":"0.00001","request":"0","image":"0","audio":"0","web_search":"0","internal_reasoning":"0","input_cache_read":"0.000000125"},"top_provider":{"context_length":400000,"max_completion_tokens":128000,"is_moderated":true},"per_request_limits":null,"supported_parameters":["max_tokens","response_format","seed","structured_outputs"]},

If you look at the JSON you linked, it does not enforce BYOK for openai/gpt-5-chat, nor for openai/gpt-5-mini or openai/gpt-5-nano.
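
For anyone who wants to verify this themselves, here's a small sketch (assuming the requests library and the usual {"data": [...]} wrapper on that endpoint; adjust if the response shape differs) that pulls the public model list and dumps the GPT-5 entries, so you can look for a BYOK note like the one o3 carried:

    import json
    import requests

    # Public model listing referenced upthread.
    resp = requests.get("https://openrouter.ai/api/v1/models")
    resp.raise_for_status()

    for model in resp.json()["data"]:
        if model["id"].startswith("openai/gpt-5"):
            # Print the full entry and check whether a BYOK requirement is mentioned.
            print(json.dumps(model, indent=2))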


Did I say GPT-5? I said o3. :) That was a rebuttal to your claim that you had never needed to add your key to use an OpenAI model before.


Fair, I should not have said "any".


It does for the model this thread is about: openai/gpt-5.


What's openai/gpt-5 vs openai/gpt-5-chat?


the media, but also the llm providers, actively encourage this to fuel their meteoric valuations, which are based on the imminent value that would be provided by AGI replacing human labor.

the entire thing — from the phrasing of errors as “hallucinations”, to the demand for safety regulations, to assigning intention to llm outputs — is all a giant show to drive the hype cycle. and the media is an integral part of that, working together with openai et al.


why would the llm share any of the blame? it has no agency. it doesn’t “understand” anything about the meaning of the symbols it produces.

if you go put your car in drive and let it roll down the street.. the car has 0% of the blame for what happened.

this is a full grown educated adult using a tool, and then attempting to deflect blame for the damage caused by blaming the tool.


The LLM isn't to blame.

The human parties on both sides of it share some.

As the meme goes: "A computer can never be held accountable; Therefore a computer must never make a management decision."


As the saying also goes, "To make a mistake is human, but to really fuck things up you need a computer"


pro is the $20/mo plan that they recently started allowing claude code access on.. but i’ve heard users hit the rate limit within a few queries.. so imo that sounds about right. the chat interface has its own limits separate from claude code.


Has to be wrong. I'm on that subscription as I wanted to reinforce my opinion that it's still shit for devs that actually have experience, like it was a few months ago.

While my plan didn't pan out, cuz it was way too effective, I can confidently say that I'm going through 3-6k tokens per prompt on average, and usually get around 3 hours of usage before I'm hitting the rate limit.

The limit is probably closer to 300k than <10k.

Also the chat interface doesn't have a separate limit, once you hit it via Claude code, you cannot use the website either anymore.

Maybe it's a 7k limit per prompt? Dunno if I exceeded that before


I found I can hit the limit very quickly if I have it scan large quantities of code for analysis. If I try to be more surgical, and give it terse but accurate documentation and instructions, the budget lasts longer.


Same here. Sometimes I direct it just to specific files related to the feature requested.


Claude Code is chugging away on a step (6/10) for me right now:

Transforming… (212s · 26.1k tokens · esc to interrupt)

I reset just under 2 hours ago, probably been going at this pace for the last hour or so.


7k is literally nothing, even for a trial. 7k of tokens is basically 1-2 files written. That doesn't seem right; if it is, then I don't see why anyone would pay for that and not the 250 prompts/month from augment or one of the others.


I hadn't even heard of augment, but Claude Code's UX is _mostly_ very nice (despite the problematic UX this particular project attempts to solve). So perhaps Claude Code has a better UI/UX?


I tried it with Roo Code (with 3.7 Sonnet, not Code). For agentic use you will probably hit the limit from your first prompt/assignment if it does some browser tool use.


if you work on a team most code you see isn’t yours.. ai code review is really no different than reviewing a pr… except you can edit the output easier and maybe get the author to fix it immediately


Reviewing code is harder than writing code. I know staff engineers that can’t review code. I don’t know where this confidence that you’ll be able to catch all the AI mistakes comes from.


I was about to say exactly this—it's not really that different from managing a bunch of junior programmers. You outline, they implement, and then you need to review certain things carefully to make sure they didn't do crazy things.

But yes, these juniors take minutes versus days or weeks to turn stuff around.


> if you work on a team most code you see isn’t yours.. ai code review is really no different than reviewing a pr… except you can edit the output easier and maybe get the author to fix it immediately

And you can't ask "why" about a decision you don't understand (or at least, not with the expectation that the answer holds any particular causal relationship with the actual reason)... so it's like reviewing a PR with no trust possible, no opportunity to learn or to teach, and no possibility for insight that will lead to a better code base in the future. So, the exact opposite of reviewing a PR.


Are you using the same tools as everyone else here? You absolutely can ask "why" and it does a better job of explaining with the appropriate context than most developers I know. If you realize it's using a design pattern that doesn't fit, add it to your rules file.


You can ask it "why", and it gives a probable English string that could reasonably explain why, had a developer written that code, they made certain choices; but there's no causal link between that and the actual code generation process that was previously used, is there? As a corollary, if Model A generates code, Model A is no better able to explain it than Model B.


I think that's right, and not a problem in practice. It's like asking a human why: "because it avoids an allocation" is a more useful response than "because Bob told me I should", even if the latter is the actual cause.


> I think that's right, and not a problem in practice. It's like asking a human why: "because it avoids an allocation" is a more useful response than "because Bob told me I should", even if the latter is the actual cause.

Maybe this is the source of the confusion between us? If I see someone writing overly convoluted code to avoid an allocation, and I ask why, I will take different actions based on those two answers! If I get the answer "because it avoids an allocation," then my role as a reviewer is to educate the code author about the trade-off space, make sure that the trade-offs they're choosing are aligned with the team's value assessments, and help them make more-aligned choices in the future. If I get the answer "because Bob told me I should," then I need to both address the command chain issues here, and educate /Bob/. An answer is "useful" in that it allows me to take the correct action to get the PR to the point that it can be submitted, and prevents me from having to make the same repeated effort on future PRs... and truth actually /matters/ for that.

Similarly, if an LLM gives an answer about "why" it made a decision that I don't want in my code base that has no causal link to the actual process of generating the code, it doesn't give me anything to work with to prevent it happening next time. I can spend as much effort as I want explaining (and adding to future prompts) the amount of code complexity we're willing to trade off to avoid an allocation in different cases (on the main event loop, etc)... but if that's not part of what fed in to actually making that trade-off, it's a waste of my time, no?


Right. I don't treat the LLM like a colleague at all, it's just a text generator, so I partially agree with your earlier statement:

> it's like reviewing a PR with no trust possible, no opportunity to learn or to teach, and no possibility for insight that will lead to a better code base in the future

The first part is 100% true. There is no trust. I treat any LLM code as toxic waste and its explanations as lies until proven otherwise.

The second part I disagree somewhat. I've learned plenty of things from AI output and analysis. You can't teach it to analyze allocations or code complexity, but you can feed it guidelines or samples of code in a certain style and that can be quite effective at nudging it towards similar output. Sometimes that doesn't work, and that's fine, it can still be a big time saver to have the LLM output as a starting point and tweak it (manually, or by giving the agent additional instructions).


Although it cannot understand the rhetorical why as in a frustrated “Why on earth would you possibly do it that brain dead way?”

Instead of the downcast, chastened look of a junior developer, it responds with a bulleted list of the reasons why it did it that way.


Oh, it can infer quite a bit. I've seen many times in reasoning traces "The user is frustrated, understandably, and I should explain what I have done" after an exasperated "why???"


>And you can't ask "why" about a decision you don't understand (or at least, not with the expectation that the answer holds any particular causal relationship with the actual reason).

To be fair, humans are also very capable of post-hoc rationalization (particularly when they're in a hurry to churn out working code).


Yes you can

