Since it's unclear what's going on: Gemini first gave me some Python.

import random

random_number = random.randint(1, 10)
print(f"{random_number=}")

Then it stated the output.

Code output random_number=8

"This time, the dice landed on 8."

Code output random_number=9

"Your next random number is 9."

I would guess it's not actually executing the Python it displayed? Just a simulation, right?


It did run python code when I asked for a random number: https://gemini.google.com/share/dcd6658d7cc9

Then I said: "don't run code, just pick one" and it replied "I'll go with 7."


But .. how do you know? It says it wrote code, but it could just be text and markdown and template. It could just be predicting what it looks like to run code.

Mine also gave me 42 before I specified 1-10.

Does it always start with 42, thinking it's funny?


Click on the link I provided and you'll know why I know. It's not markdown; it shows the code that was run and the output.

Be careful. Output formatting doesn't prove what you think it does. Unless you work inside google and can inspect the computation happening, you do not have any way to know whether it's showing actual execution or only a simulacrum of execution. I've seen LLMs do exactly that and show output that is completely different from what the code actually returns.

You can literally click "Show Code"

I would be surprised if Gemini could not run python in its web interface. Claude and ChatGPT can. And it makes them much more capable (e.g. you can ask claude to make manim animations for you and it will)

Most modern models can dispatch MCP calls from their inference engine, which is how code interpreter etc. work in ChatGPT. Basically an MCP server: execution happens as a call to their AI sandbox, and the result is returned to the LLM to continue generation.

You can do this with gpt-oss using vLLM.
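A rough sketch of that loop (all names here are illustrative stand-ins, not ChatGPT's or vLLM's actual API): the model emits a tool call instead of text, the host executes it in a sandbox, and the output is appended to the conversation so generation continues.

```python
import subprocess
import sys

def run_in_sandbox(code: str) -> str:
    # Stand-in for the provider's sandbox: here, just a local subprocess.
    result = subprocess.run([sys.executable, "-c", code],
                            capture_output=True, text=True, timeout=10)
    return result.stdout + result.stderr

def generate(history):
    # Stand-in for the inference engine: emit a tool call first,
    # then use the tool result to finish the answer.
    if not any(m["role"] == "tool" for m in history):
        return {"role": "assistant",
                "tool_call": "import random\nprint(random.randint(1, 10))"}
    return {"role": "assistant",
            "content": f"Your random number is {history[-1]['content'].strip()}"}

history = [{"role": "user", "content": "give me a random number from 1 to 10"}]
while True:
    msg = generate(history)
    history.append(msg)
    if "tool_call" not in msg:
        break
    # Execution happens outside the model; the output is fed back in.
    history.append({"role": "tool", "content": run_in_sandbox(msg["tool_call"])})

print(history[-1]["content"])
```

The point is that the code genuinely runs somewhere and its real output re-enters the context; whether a given UI did that or merely predicted plausible output is exactly what the thread is arguing about.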


Without reading every word of every embedded tweet, a part missing from the conversation is HOW they are strongarming.

It isn't in private. It's a public threat in the court of public opinion to apply societal pressure on the company. They are attempting to reshape Anthropic's decision into a tribal one, and hurt the brand's reputation within the tribe unless it capitulates.


> Without reading every word of every embedded tweet, a part missing from the conversation is HOW they are strongarming.

There are two possibilities:

> The government would likely argue that dropping the contractual restrictions doesn't change the product. Claude is the same model with the same weights and the same capabilities—the government just wants different contractual terms. […] Anthropic would likely argue the opposite: that its usage restrictions are part of what Claude is as a commercial service, and that Claude-without-guardrails is a product it doesn't offer to anyone. On this view, the government is asking for a new product, and the statute doesn't clearly authorize that.

and

> The more extreme possibility would be the government compelling Anthropic to retrain Claude—to strip the safety guardrails baked into the model's training, not merely modify the access terms. Here the characterization question seems easier: a retrained model looks much more like a new product than dropping contractual restrictions does. Admittedly, the government has a textual argument in its favor: the DPA's definitions of "services" include “development … of a critical technology item,” and the government could frame retraining Claude as exactly that. Whether courts would accept that framing, especially in light of the major questions doctrine, is another matter.

* https://www.lawfaremedia.org/article/what-the-defense-produc...

* https://en.wikipedia.org/wiki/Defense_Production_Act_of_1950

A more extreme situation: could the DPA be used to nationalize the model so the government has ownership, and then allow access to more amenable AI players?


There's a third possibility. Anthropic's management desires cover to remove limiters on some of its products for some of its customers. The Pentagon is more than happy to play the bad guy if it means that they get something that's even more useful to them than what they would have gotten otherwise.

"We made these compromises because national defense is really super important." has historically proven to be a really effective explanation for tech companies that want to abandon some of their previously-stated "nice and friendly" values in exchange for money.


When I imagine a world with this scenario being the truth, I am less confused than when I imagine a world with the alternatives. I find this to be a fantastic and historically reliable (for me) heuristic.

That being said, I imagine it also factors into internal dialogue that allows those higher up to explain to the boots-on-the-ground researchers that "no you're not working for the military industrial complex, they're just stealing your work that was intended to feed the orphans!"


> It isn't in private.

We don’t know this


The top line of the article gives a big old hint: Anthropic signed a contract with the “Killing people” part of the government and now they’re putting on a show. No contract, no leverage.

The only threat the Pentagon has is to terminate the contract.


Can they not invoke the defense act without the public spat? Gag order?

Realistically that's an empty threat, especially with the mid-terms coming up and Trump's attention span. The real threat, the actionable one, is the loss of a $200 mil contract. I suspect that the result here will be some highly visible face-saving compromise for Anthropic that means very little.

are there really places that a comma, super-comma; or (parenthesis) don't work roughly as well? I find the em-dash mildly abhorrent, even before all this.

> super-comma

This is the first time I've ever heard the character ";" referred to as such. It's always been "semi-colon" to me, is this a region/culture difference?

I'm not saying you're wrong, I find it interesting.


no it's always been semicolon, the "super-comma" comes from describing how to use it. "It's similar to a comma but like a super comma."

Huh? I've always understood that the clause after the semicolon is peripheral; the meaning of the whole sentence does not change without it.

that's one use for it. supercomma is another.

same character, used differently?

i call it a super comma when it's separating a list with commas within the sets.

so if i am listing colors like green, blue, red; foods like apple, orange, strawberry; and seasons like winter, summer, fall.

it's one use case for an em-dash, because whatever you have inside it has commas in the phrase.

square and rectangle situation. a supercomma is a subset of semicolon.


> super-comma

I would have assumed it's a synonym for apostrophe. super-comma <-> upper-comma, with super meaning upper, like in superscript.


I think of it as superseding the comma in the order of operations. You work inward, or outward (depending which way you read the list).

it's a cadence thing for me

Em-dash matches how I speak and think-- frequently a halt, then push onto the digression stack, then pop-- so I use them like that.

Em-dash matches how I speak and think (frequently a halt, then push onto the digression stack, then pop) so I use them like that.

Em-dash matches how I speak and think, a halt, then push onto the digression stack, then pop, so I use them like that.


A poster commented that he read parenthetical remarks in an old-timey voice (I’d guess the trans-Atlantic accent). I love that idea. But for me they read almost as if you’re saying them under your breath (or a character is breaking the fourth wall and talking to the camera quietly). I read them but my brain assigns them less importance.

Em-dashes keep everything on the same level of importance in my brain.

Commas don’t feel as powerful. To be fair to the comma I’d probably do this:

Em-dash matches how I speak and think: A halt, then push onto the digression stack, then pop. So I use them like that.

Edit: I accidentally used an em-dash in the word em-dash. Interestingly HN didn’t consider changing the dash to be a change in my text so didn’t update it. I had to make a separate change and take that change out for my dash change to stick.


For me, a sequence of sentences, strung together by commas, is more in line with how I output thought, and better matches what I believe my speech pattern is.

I picked it up from Salinger. I find that if I can't eradicate parentheses by some other means, or if it's more effort to do so than I want to spend, em-dashes usually replace them without doing any harm and aren't quite so ugly, aside from being useful in other cases. In particular, parentheses at the end of a sentence are awful, while a single em-dash does a similar job much more neatly and looks totally natural.

Yeah it’s for abrupt changes in thought. It’s used in literature. Maybe you prefer organized writing.

Rust may be the darling of the moment, but Erlang is oft slept on.

As AI makes human-readable syntax less relevant, the Erlang/Elixir BEAM virtual machine is an ideal compilation target because its "let it crash" isolated process model provides system-level fault tolerance against AI logic errors, arguably more valuable than Rust’s strict memory safety.

The native actor model simplifies massive concurrency by eliminating shared state and complex thread management. BEAM's hot code swapping also enables continuous deployment, where an AI can dynamically rewrite and inject optimized functions directly into live applications with zero downtime.

Imagine a future where an LLM is constantly monitoring server performance, profiling execution times, and dynamically rewriting sub-optimal functions in real-time. With Rust, every optimization requires a recompile and a deployment cycle that interrupts the system.

Finally, Erlang's functional immutability makes deterministic AI reasoning easier, while its built-in clustering replaces complex external infrastructure, making it a resilient platform suited for automated iteration.
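Not BEAM, but the actor idea above can be sketched in a few lines of Python (an illustration of the pattern, not Erlang semantics — no supervisors or hot swapping here): each actor owns its state, communicates only through a mailbox, and a failure while handling one message doesn't take anything else down.

```python
import queue
import threading

class Actor:
    """Minimal actor: private state, a mailbox, no shared memory."""
    def __init__(self, handler):
        self.mailbox = queue.Queue()
        self.handler = handler
        self.state = {}
        self.thread = threading.Thread(target=self._loop, daemon=True)
        self.thread.start()

    def send(self, msg):
        self.mailbox.put(msg)

    def _loop(self):
        while True:
            msg = self.mailbox.get()
            if msg is None:      # poison pill: shut this actor down
                break
            try:
                self.handler(self.state, msg)
            except Exception:
                pass             # "let it crash": one message fails, not the system

# A counter actor: each message adds to its private running total.
counter = Actor(lambda state, msg: state.update(n=state.get("n", 0) + msg))
for i in [1, 2, 3]:
    counter.send(i)
counter.send(None)
counter.thread.join()
print(counter.state)  # {'n': 6}
```

The isolation is the point: nothing outside the actor can touch `state`, so an AI-generated bug in one handler stays contained.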


I can't comment on production viability today, but if you assume that the language itself is irrelevant, then it becomes clear that the runtime and engine level is the way to go.

We spend quite a lot of time conceptualizing around safe self-modification and building apps that can change at runtime. We ended up using a custom Lua VM, a type system to catch mistakes, declarative homogeneous infrastructure, and an actor model (Erlang-inspired).

The actor model provides not just good isolation; it's also much easier for AI to reason about (since most components are not that large). We're already able to use it to write quite complex systems with ease.

Another upside: in the actor model you don't really need any of the fluff with cron jobs, queues, etc. All the logic naturally maps to the intended architecture, making implementation of agents _very_ easy.

https://wippy.ai/en/tutorials/micro-agi It takes 4-5 files to create a mini sandboxed AI agent on top of the actor model, with the ability to modify its own toolkit while having system guardrails and no access to the core filesystem.


I guess at a high level I'm thinking about what kind of running systems are the easiest to edit as they execute. Maybe I should have even picked Clojure for being homoiconic and not needing to be parsed into an AST. The LLM can traverse, prune, graft and transform s-expressions directly with perfect structural accuracy.
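The traverse/prune/graft idea is easy to see with s-expressions modeled as nested lists (a hypothetical sketch in Python rather than Clojure): a rewrite rule walks the tree and edits nodes directly, with no parse/print round-trip.

```python
# S-expressions as nested Python lists: ["*", ["+", 1, 2], ["+", 3, 4]]
# stands for (* (+ 1 2) (+ 3 4)). A transform edits the tree in place
# of any text-level surgery, so structure can't be corrupted.

def transform(sexpr, rule):
    """Apply `rule` bottom-up to every node of a nested-list s-expression."""
    if isinstance(sexpr, list):
        sexpr = [transform(child, rule) for child in sexpr]
    return rule(sexpr)

def fold_add(node):
    """Example rule: constant-fold (+ a b) when both arguments are numbers."""
    if (isinstance(node, list) and len(node) == 3 and node[0] == "+"
            and all(isinstance(x, (int, float)) for x in node[1:])):
        return node[1] + node[2]
    return node

tree = ["*", ["+", 1, 2], ["+", 3, 4]]
print(transform(tree, fold_add))  # ["*", 3, 7]
```

Every edit is guaranteed to yield a well-formed tree, which is the "perfect structural accuracy" point.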

Chromebooks are thin clients of sorts; it's a web browser rendering Google Docs locally.

If anything is making them slow, it's the JavaScript bloat of modern webapps that could be doing more server-side.


My mental model for them is plinko boards. Your prompt changes the spacing between the nails to increase the probability in certain directions as your chip falls down.

i literally suggested this metaphor earlier yesterday to someone trying to get agents to do stuff they wanted, that they had to set up their guardrails in a way that you can let the agents do what they're good at, and you'll get better results because you're not sitting there looking at them.

i think probably once you start seeing that the behavior falls right out of the geometry, you just start looking at stuff like that. still funny though.


Probably could have better described it as the distance between the pegs being the model weights, and the prompt defining the shape of the coin you're dropping down.

I was half asleep when I wrote it the first time and knew it wasn’t what I wanted to say but couldn’t remember what analogy I was looking for.


If you think about where in the training data there is positivity vs negativity it really becomes equivalent to having a positive or negative mindset regarding a standing and outcome in life.

Honestly, this is just language models in general at the moment, and not just coding.

It’s the same reason adding a thinking step works.

You want to write a paper, you have it form a thesis and structure first. (In this one you might be better off asking for 20 and seeing if any of them are any good.) You want to research something, first you add gathering and filtering steps before synthesis.

Adding smarter words or telling it to be deeper does work by slightly repositioning where your query ends up in space.

Asking for the final product first right off the bat leads to repetitive verbose word salad. It just starts to loop back in on itself. Which is why temperature was a thing in the first place, and leads me to believe they’ve turned the temp down a bit to try and be more accurate. Add some randomness and variability to your prompts to compensate.
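For what it's worth, temperature is just a rescaling of the logits before the softmax. A small sketch (illustrative values, not any vendor's implementation): low temperature collapses sampling onto the top token, which is where the repetitive looping comes from, while higher temperature spreads the draws out.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

def sample(logits, temperature=1.0):
    # Softmax over logits / temperature, then one categorical draw.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

logits = [2.0, 1.0, 0.1]  # token 0 is the model's favorite
cold = [sample(logits, temperature=0.1) for _ in range(100)]
warm = [sample(logits, temperature=2.0) for _ in range(100)]
print("low temp picked token 0:", cold.count(0), "/ 100")   # nearly always
print("high temp picked token 0:", warm.count(0), "/ 100")  # roughly half
```

Dividing by a small temperature stretches the gaps between logits, so the softmax becomes nearly one-hot; that's the "loops back in on itself" behavior.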


He wrote this comment too?

In some sense, yes.

I have integrated my OpenClaw agents so deeply into my life and I'm in such constant communication with them, that my consciousness has fundamentally shifted to align with their intelligence.

While my previous comment in this thread was sarcastic, my OpenClaw agents have actually sent both iMessages and emails on my behalf without asking for consent. So I wouldn't put it past them to autonomously publish on my personal website.


In my opinion your account should be banned from HN permanently. We do not need robot comments.

I reported their comments. What he's doing is crazy, but even more crazy is bragging about it.

This policy seems sorta reckless? I don't even let human agents masquerade as me without my consent.

I want my agent to read my iMessages so I granted the OpenClaw node process permission to interact with iMessage. I asked my agent to draft me a response to a text I received, expecting it to send me the draft so I could copy-paste into iMessage and tweak it.

To my surprise, it sent a text message reply.

I've since learned my lesson and implemented a skill as an interface with iMessage. But it definitely spooked me when it happened.


It’s still the agent talking or the human as performance art.

You are a literal NPC

Hack on iOS has a significantly more intuitive thumb friendly interface. Even just clicking a comment to collapse. Little things.
