I mentioned a potential OpenAI insider in https://x.com/peterjliu/status/2024901585806225723; that was from five minutes of investigation. There are probably more. And then there are a lot of other companies.
Post author here: To clarify, this is not a post from Polymarket.
This is talking about using Compound AI (product I'm working on) to query Polymarket data, including finding insiders, just as a fun example analysis you could do.
Often you need a well-calibrated probability of a future event to feed into some other analysis, and Polymarket is pretty great for that. An example is how much insurance (hedge) to buy for some disastrous event.
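As a toy illustration of that hedging use case (all numbers are hypothetical, and this ignores transaction costs and the market's own bias):

```python
# Toy hedge sizing from a market-implied probability. Numbers are made up.
def fair_premium(p_event: float, payout: float) -> float:
    """Expected payout of one insurance contract, i.e. the break-even premium."""
    return p_event * payout

p = 0.12          # probability of the disastrous event, read off the market
payout = 100_000  # amount one contract pays out if the event happens

premium = fair_premium(p, payout)
print(premium)    # 12000.0 -- paying much more than this means the hedge is overpriced
```

The market price does the hard part: it is the calibrated probability estimate you would otherwise have to produce yourself.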
If I'm an insider with 100% confidence, I'll take all offers at a certain price as long as I can afford it. Similar story for lower levels of confidence (but still inside info). There won't necessarily be any left for you to copy at a viable price.
Because there's always some uncertainty, and there are capital limits. But the level of uncertainty about the outcome is itself inside info, and it's compounded with your own uncertainty about the insider as a copy trader. So the insider will empty out only certain price levels, and your certainty is strictly less than theirs, meaning you have even fewer viable levels to buy.
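The emptied-price-levels point can be sketched with a toy order book (the numbers and the simple "buy while the price is below your confidence" rule are my own illustration, not how any real participant trades):

```python
# Toy model: an insider with confidence c buys every ask priced below c.
# A copy trader's confidence is strictly lower (insider might be wrong,
# plus doubt about whether they really are an insider). Numbers made up.

def viable_levels(asks, confidence):
    """Price levels still profitable for someone with this confidence."""
    return [price for price in asks if price < confidence]

asks = [0.40, 0.55, 0.70, 0.85, 0.95]   # resting offers on "YES"

insider_conf = 0.90                      # strong inside info
copier_conf = 0.70                       # discounted twice over

print(viable_levels(asks, insider_conf)) # [0.4, 0.55, 0.7, 0.85]
print(viable_levels(asks, copier_conf))  # [0.4, 0.55]
# Once the insider lifts everything below 0.90, every level the copier
# could profitably buy is already gone.
```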
Therefore, the Polymarket betting odds will reflect the truth, even if that info is a secret nobody but the insider knows. And if that's the case, then even an outsider could use the odds as a source of info, which would ensure that market efficiency (which is about the flow of information) is high.
If you believe Polymarket is a serious source of truth, consider that somebody manipulated "Will Jesus Christ return before 2027?" because there was a secondary market on whether that market would rise above 5%. That defeats the whole idea that betting odds reflect the truth. And even pre-manipulation, I don't think a 2% chance that Jesus will return was reflective of the truth.
The issue comes from situations where insiders can alter the outcome to help their own bets. A simple example is a bet on how long a press conference will be: it's a ridiculous market when the person giving the press conference can bet and fleece it.
Will country X invade another before or after day Y? A large enough market changes the answer, because the agent can change the decision. And we see this kind of thing in many interesting questions.
These are not secret divinations though, the participants know this and price it in or otherwise allow it to determine which markets they participate in.
That someone with inside information will e.g. make 500% while those late to the party e.g. only get 10%? (of course your example is not very realistic to begin with)
Has there ever been any documented circumstance where significant inside information became public and known thanks to a trade? Most often, the trade is made at the last minute, and the information gets subsequently revealed anyway. And it's impossible to tell whether somebody is an inside trader, a wealthy gambling addict making a stupid decision, or hypothetically a foreign agent pretending to be an inside trader to make people believe in a particular outcome.
It's impossible to know anything for certain; almost everything is probabilistic.
Also, I'm not sure how to interpret your criteria, because timing matters; I don't think saying "it gets revealed in the end" is very meaningful.
Anyway, on Polymarket specifically, sure, military strikes are a common one. Seems like a useful signal to go hide in the basement. Outside Polymarket, there were insider trades in 2008 that I'm sure were useful.
No vigilant insider is making a series of "single market predictions with high accuracy" on the same account. They would make unlinkable bets on fresh accounts.
I have a similar but opposite experience. Since around 2015 I've mostly been working with people who primarily use Emacs. In 2014 I was the only weird one; on the next team there were about 3-5 of us, then a dozen, then a team of a few dozen where only two people were using Vim. On my current team, most of the devs are also Emacs users. However, a lot of them use Emacs with Evil mode, so I guess they can be considered vimmers.
Also, I don't remember the last time when I worked with anyone who writes code and uses Windows.
Anecdotal experiences can lead to a warped understanding of reality; in mine, Windows and non-emacs users are niche.
Don't y'all have a #emacs slack channel or equivalent at your company? I work for a medium-sized tech company, and I feel like we have a single-digit number of Emacs users. The channel is mostly dead except for a few tips and tricks and the occasional person asking how we each install it on our MacBooks.
Anecdotally a lot of managers use Emacs, though that may be an age thing.
(I use emacs for Real Work, unless that Real Work involves a JVM. Still do all the git stuff in emacs/magit, though)
Reddit was an interesting case here. They knew that they had particularly good AI training data, and they were able to hold it hostage from the Google crawler, which was an awfully high risk play given how important Google search results are to Reddit ads, but they likely knew that Reddit search results were also really important to Google. I would love to be able to watch those negotiations on each side; what a crazy high stakes negotiation that must've been.
Say what you will, but there are a lot of good answers on Reddit to real questions people have. There's a whole thing where people say "oh, Google search results are bad, but if you append the word 'REDDIT' to your search, you'll get the right answer." You can see that most of these agents rely pretty heavily on stuff they find on Reddit.
Of course, that's also a big reason why Google search results suggest putting glue on pizza.
This is an underrated comment. Yes, it's a big advantage and probably a measurable pain point for Anthropic and OpenAI. In fact, you could just survey 1% of the robots.txt files out there and get a reasonable picture. Maybe a fun project for an HN'er.
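A minimal sketch of that survey idea, using Python's stdlib robots.txt parser (the sample file content is illustrative; a real survey would fetch robots.txt from a random sample of domains and tally the results):

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt of the kind such a survey would encounter:
# Google's crawler allowed, an AI-training crawler blocked.
sample = """\
User-agent: Googlebot
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

def crawler_allowed(robots_txt: str, agent: str, url: str = "/") -> bool:
    """Would this crawler be allowed to fetch the given URL?"""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(agent, url)

print(crawler_allowed(sample, "Googlebot"))  # True
print(crawler_allowed(sample, "GPTBot"))     # False
# Counting these two booleans across a 1% sample of domains would give
# a reasonable picture of Google's crawling advantage.
```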
This is right on. I work for a company with somewhat of a data moat and AI aspirations. We spend a lot of time blocking everyone's bots except Google's. We have people whose entire job it is to make it faster for Google to access our data. We exist because Google accesses our data. We can't not let them have it.
We've (ex-Google DeepMind researchers) been researching how to increase the reliability of agents, and we realized it's pretty non-trivial, but there are a lot of techniques to improve it. The most important thing is doing rigorous evals that are representative of what your users do in your product. Often this is not the same as academic benchmarks. We made our own benchmarks to measure progress.
> The most important thing is doing rigorous evals that are representative of what your users do in your product. Often this is not the same as academic benchmarks.
OMFG thank you for saying this. As a core contributor to RA.Aid, optimizing it for SWE-bench seems like it would actively go against perf on real-world tasks. RA.Aid came about in the first place as a pragmatic programming tool (I created it while making another software startup, Fictie.) It works well because it was literally made and tested by making other software, and these days it mostly creates its own code.
Do you have any tips or suggestions on how to do more formalized evals, but on tasks that resemble real world tasks?
I would start by making the examples yourself initially, assuming you have a good sense for what that real-world task is. If you can't articulate what a good task is and what a good output is, it's not ready for outsourcing to crowd workers.
And before going to crowd workers (maybe you can skip them entirely), try LLMs.
> I would start by making the examples yourself initially
What I'm doing right now is this:
1) I have X problem to solve using the coding agent.
2) I ask the agent to do X
3) I use my own brain: did the agent do it correctly?
If the agent did not do it correctly, I then ask: should the agent have been able to solve this? If so, I try to improve the agent so it's able to do that.
The hardest part about automating this is #3 above -- each evaluation is one-off, and it would be hard even to formalize the evaluation.
SWE-bench, for example, uses unit tests for this, and the agent is blind to them -- so the agent has to make a red test (which it has never seen) go green.
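That loop can be formalized as a tiny harness: each case pairs a task prompt with a hidden checker function, SWE-bench-style. Everything here is a sketch; `run_agent` is a placeholder you'd replace with a call to the real coding agent.

```python
# Minimal eval harness: tasks with hidden checkers, pass rate as the metric.
from typing import Callable

def run_agent(prompt: str) -> str:
    # Placeholder agent: a real implementation would invoke the coding agent
    # and return the code it produced.
    if "reverse" in prompt:
        return "def solve(s): return s[::-1]"
    return "def solve(x): return x"

def check_reverse(code: str) -> bool:
    # The "unit test" the agent never sees: run its code in a scratch
    # namespace and check the behavior.
    ns: dict = {}
    exec(code, ns)
    return ns["solve"]("abc") == "cba"

TASKS: list[tuple[str, Callable[[str], bool]]] = [
    ("Write solve(s) that reverses a string", check_reverse),
]

def pass_rate() -> float:
    passed = sum(check(run_agent(prompt)) for prompt, check in TASKS)
    return passed / len(TASKS)

print(pass_rate())  # 1.0 with the placeholder agent
```

Growing `TASKS` from real problems you've already judged by hand (step 3 above) turns the one-off evaluations into a regression suite you can rerun after every change to the agent.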