It took me quite a while to come round to OpenRouter. Originally I didn't unders...

GodelNumbering · 2026-05-30T18:30:24 1780165824

Another neat thing is, they publish hourly caching states for ALL model/provider combinations. I did some research on it to come up with a provider tiers list and found a bunch of open-source 3rd party hosts are simply trash tier https://dirac.run/posts/cache-hit-rates-agents

kflansburg · 2026-05-30T20:46:06 1780173966

I would recommend tracking this data over time. I work on Cloudflare's KV cache for Kimi K2.6, and while there are periods where our cache rate is low, we are frequently in the 80-90% range. OpenRouter shows us at 87.3% at the time of this post. We observe cache rates change quite a bit from hour to hour.

GodelNumbering · 2026-05-30T21:54:37 1780178077

True for Kimi, but the results I published are average across the models (CF has over 10 models on openrouter). Your current Kimi K2.6 is over 80% but Gemma 4 26B A4B is 0%. https://openrouter.ai/google/gemma-4-26b-a4b-it

This is also the reason providers like Anthropic scored lower because while Opus 4.7 is close to 90%, Opus 4.5 is 45%

kflansburg · 2026-05-31T04:03:45 1780200225

My point was not about our ranking specifically, but the methodology of taking a point-in-time sample.

gnulinux · 2026-05-30T19:00:27 1780167627

Thank you so much for this! I've been working on exactly this problem this week (which OpenRouter providers have the highest cache rate on average) because cache cost is sometimes half your cost: I'd much rather use a provider with more input caching with a more expensive/better LLM. Your results and lists seem more comprehensive than what I've done so far. Very helpful!

rkagerer · 2026-05-30T20:23:27 1780172607

Agents push the full conversation history into context every turn

Why?

Maybe this is a dumb question, but why wouldn't an agent "keep the conversation going", like I do when interacting with an LLM through a web page? (I understand how it's impractical for long-running tasks where the agent has to wait days for the next input, but assume that's not the majority of use cases)

sosodev · 2026-05-30T20:31:29 1780173089

I’m not sure I understand your question. Every interaction you have with a model in a web page does the same thing in the backend. It feeds the whole conversation history, perhaps with a bit of processing, into the model so it can process the next generation. Filling the context window is how these models retain coherence.

isbvhodnvemrwvn · 2026-05-30T20:33:15 1780173195

LLMs are stateless, to predict next tokens they need the history. When you write your own agents you will be very selective and might trim context and heavily segment different tasks, but generic ones don't do that (at best they spawn subjects to handle smaller tasks)

lxgr · 2026-05-30T22:36:23 1780180583

That said, the KV cache is very much not stateless, so internally inference APIs will be highly incentivized to route requests to instances with as much a shared prefix cached as possible.

rkagerer · 2026-05-31T09:56:29 1780221389

Thanks. If I ran it local, presumably I could keep the state cached forever. Can you "reserve" resources from a frontier provider to guarantee your state stays "hot"? (Analogous to reserving a whole VM instead of a slice)

gpugreg · 2026-05-31T10:53:39 1780224819

For Anthropic, 5 minute caching costs 1.25x base input price and 1 hour costs 2x base input price. https://platform.claude.com/docs/en/about-claude/pricing#pro...

For OpenAI, it seems like you can't prolong the caching duration for money. Duration is longer during off-peak hours for in-memory caching and up to 24 hours for extended prompt caching. https://developers.openai.com/api/docs/guides/prompt-caching

For DeepSeek, caching duration of at least 12 hours (and likely longer) have been observed. Cache writes are free. https://zhuanlan.zhihu.com/p/2035737726952194774

eknkc · 2026-05-30T21:28:19 1780176499

BTW, the openai responses api has a store parameter and a thread id input. Makes it possible to send a thread id and append a new message, ask for completion. So it feels like keeping the conversation going.

Technically it does retrieve the entire history and reevaulate it since the LLM is stateless. Just more ergonomic for the developer.

And prompt caching helps cut the costs down when a conversation drags on.

drewnick · 2026-05-31T01:21:40 1780190500

Wow, this is refreshing DX compared to iterating all messages like we did back in '24.

ghrl · 2026-05-31T14:04:35 1780236275

I would disagree. Having all the messages locally and sending them with the request means you can switch inference providers or even models mid-conversation. It also means that the provider doesn't store the entire context, which often contains massive parts of proprietary codebases, secrets and PII and instead the agent harness manages all that. While a simple `continue thread` API field might seem more convenient, the cost is still determined by the input token count and cache rate, so it just abstracts this crucial implementation detail away.

BoredPositron · 2026-05-30T20:26:37 1780172797

The "web page" does the same you just don't see it.

Aurornis · 2026-05-30T18:06:02 1780164362

Good points. The easy experimentation factor is helpful for development, though I would gently encourage everyone to migrate to the 1st party APIs for pricing at scale.

OpenRouter is also a good place to find free LLM access with a catch: You should expect that any inputs and outputs are going into someone's training database. Clearly anyone who can pay should be using paid models with privacy protections, but the free models have been great for learning and experimenting. Especially for younger people learning API programming and LLMs who may not have access to a credit card or funds.

bix6 · 2026-05-30T18:24:32 1780165472

It’s interesting all the focus on opt-out from training. Sometimes I worry there is an intentional focus on that so people don’t think about the other ways the company might be profiting off our data. Like I pay for Anthropic and they don’t train on that but are they selling my “anonymized” usage data in some other way?

derefr · 2026-05-30T18:38:01 1780166281

From what I recall, these companies don't offer any option to opt out of your session transcript data being used (and sold!) for "regular" adtech targeting purposes.

nl · 2026-05-31T02:23:07 1780194187

Anthropic explicitly state that they don't do this, even if you use the free plan and even if you don't opt-out of letting them use your data for training:

"We do not sell users’ data to third parties."

https://www.anthropic.com/news/updates-to-our-consumer-terms

derefr · 2026-05-31T14:45:48 1780238748

That answers for the "sold" part but not for the "used" part.

I.e. nothing about this statement prevents Anthropic from running ads within Claude, as long as they run the ad-placement auctions themselves, and so aren't leaking any of the data they're using to decide which placements are relevant to which users+sessions. (This is the same thing Google does for SERP ad auctions.)

But actually, and perhaps more interestingly, nothing about this statement prevents Anthropic from building a Google AdSense competitor either. Other sites (or mobile apps, etc) could plop in an Anthropic ad iframe; and it'd be Anthropic's knowledge of your interactions with Claude that would drive what ads would show up in that iframe. The embedding site doesn't know what ads the users are seeing, so that's still not "selling users' data to third parties", per se.

nl · 2026-05-31T02:14:47 1780193687

> You should expect that any inputs and outputs are going into someone's training database.

OpenRouter explicitly lets you filter by zero-data-retention providers: https://openrouter.ai/models?zdr=true

derefr · 2026-05-30T18:33:33 1780166013

> You should expect that any inputs and outputs are going into someone's training database.

True enough, in theory; but what exactly are you imagining would be a useful-enough signal in the OpenRouter request+response stream, that any company would want their data as training material?

Even a single OpenRouter-API-key-identified subscriber's traffic, may consist of an mixture of traffic from multiple different sessions, under potentially multiple different end-users. (Where, if the subscriber is doing security correctly, then their OpenRouter key lives on a gateway rather than in a frontend app; and so the only IP address / UA / etc OpenRouter sees is that of the gateway itself.)

And the traffic stream may also invoke multiple models, and provide multiple different system prompts for those models; which, while marked in the traffic (i.e. conveyed as part of each request), makes the resulting data much less useful in aggregate, than if it were all training data for one model with one system prompt.

Plus, there are no RLHF signals in OpenRouter data. Even if OpenRouter wanted to build a general model-neutral framework for collecting RLHF-type data, it can't force subscriber apps to do the UI-level stuff necessary to collect it (i.e. the things ChatGPT/Claude do, with "thumbs-down" buttons, A/B tested responses, etc.) Analysis would have to rely on pure transcript-level user sentiment extraction.

reed1234 · 2026-05-30T20:16:00 1780172160

You get a 1% discount if you give OpenRouter your traces so at least they think there's some (a lot) of value.

aargh_aargh · 2026-06-01T16:58:27 1780333107

I had no idea what traces are in this context. While looking, I found this post from @OpenRouter:

https://x.com/OpenRouter/status/2041193329270878707

  > Privacy:
  > 
  > Private I/O logging and the 1% data sharing discount are separate settings. You control each independently.
  > 
  > Input & Output Logging stores prompts and completions for your use only and makes them visible in your logs. OpenRouter does not access this data. You can configure it in your observability settings.
  > 
  > As always, more in the docs!

https://openrouter.ai/docs/guides/features/input-output-logg...

https://openrouter.ai/docs/guides/features/broadcast/overvie...

reed1234 · 2026-06-02T15:43:42 1780415022

The logging thing is like langfuse

nl · 2026-05-31T02:19:46 1780193986

> Plus, there are no RLHF signals in OpenRouter data. Even if OpenRouter wanted to build a general model-neutral framework for collecting RLHF-type data, it can't force subscriber apps to do the UI-level stuff necessary to collect it (i.e. the things ChatGPT/Claude do, with "thumbs-down" buttons, A/B tested responses, etc.)

The majority of RLHF data doesn't need this. The majority is software development and/or tool calling where the agent gets a signal back as to if it succeeded (eg compilation errors, test errors). It's true that end-of-trajectory signals (eg, did this task do what you wanted) are even more useful but even partial signals are great for RL training.

lxgr · 2026-05-30T22:42:26 1780180946

> what exactly are you imagining would be a useful-enough signal in the OpenRouter request+response stream, that any company would want their data as training material?

Isn't this a treasure trove for any model distillation effort?

gbro3n · 2026-05-30T18:58:25 1780167505

I've wondered this too - exactly how are our inputs and outputs useful as training data? So I asked Gemini. Apparently using negative sentiment in user or llm responses can serve as RLHF, and the human prompts can also serve as useful data for what problems the llms need to be able to solve. There's also that smaller models can train on and improve from data from larger models but that's less relevant when not switching models in context.

mannanj · 2026-05-30T22:05:14 1780178714

How about protection of intellectual property? Doesn’t have to be patented to be valuable.

tasuki · 2026-05-30T19:14:11 1780168451

> Clearly anyone who can pay should be using paid models with privacy protections

Clearly, anyone who needs privacy should be using models with privacy protections. Some people build open source and the models will get the code anyway.

derac · 2026-05-30T18:36:15 1780166175

I recommend nvidia nim for completely free dev access for young people.

acka · 2026-05-30T19:48:49 1780170529

It's free, but not unlimited. Besides rate limits, new sign-ups get 1000 credits (requests), and once those are gone, they're gone for good. Only business accounts might get a couple of free refills.

derac · 2026-05-31T21:39:36 1780263576

It is unlimited under the free NVIDIA Developer Program. You're talking about a different sort of acct I think. The dev program acct is 40 rpm unlimited for personal use.

ssivark · 2026-05-31T02:55:02 1780196102

Is there a way to check/track your available credits?

jampekka · 2026-05-30T22:54:00 1780181640

The main friction reduction, for me at least, is the consolidated billing that avoids extra bureaucracy in corporate environments. The API-translation/abstraction tends to cause more problems than it solves.

I’d prefer something that consolidates billing, but still lets me use providers' APIs directly (or via some "raw HTTP" proxy). There are plenty of unified API gateways, but I haven’t seen one that is just billing/auth in front of the native provider APIs.

alecco · 2026-05-30T18:15:03 1780164903

At the moment for DeepSeek V4 it messes up caching and that's a key pricing feature for V4.

https://news.ycombinator.com/item?id=48319827

mrtksn · 2026-05-31T05:08:08 1780204088

Did you know that if you put some money into your OpenAI account it expires after a year? I was very annoyed when that happened, no refund no warning it’s just gone as if it was a promo credit.

Openrouter is very nice since it puts a barrier between you and those suppliers that were supposed to be like utilities. I got the feeling that if OpenAI was left alone they would be nice as a telco.

zorked · 2026-05-30T18:19:16 1780165156

The way how you manage the caps in OpenRouter is how every metered API provider should do it: keys have limits, and you can change the limits, and you set the limits to refill periodically, and you can create as many keys as you want.

fontain · 2026-05-30T18:03:39 1780164219

Out of interest, why OpenRouter over a free option like Cloudflare’s AI gateway or another paid option like Vercel’s — any specific benefit to OpenRouter you’ve found, or just first you used that’s good enough?

simonw · 2026-05-30T18:08:19 1780164499

I'll be honest, I hadn't clocked that Cloudflare and Vercel were offering equivalent products.

Looks like Vercel even have their own leaderboard: https://vercel.com/ai-gateway/leaderboards/models

Surprising that they have Opus 4.8 and 4.6 listed on the leaderboard but not Opus 4.7.

c-hendricks · 2026-05-30T18:36:34 1780166194

Huh, Claude Opus 4.8 is number 4 in number of tokens at 10%, not even in the top 10 in terms of requests, yet is #1 in costs at a whopping 43%!

yencabulator · 2026-05-31T15:45:14 1780242314

> a free option like Cloudflare’s AI gateway

Free? They take the same 5% fee as OpenRouter does.

https://developers.cloudflare.com/ai-gateway/features/unifie...

zenoprax · 2026-05-30T19:07:33 1780168053

I didn't know about these options either. I am using Cline: Cloudflare isn't an option but Vercel is. My spending is pretty low overall now that I'm using local models much more but good to know that there are cheaper alternatives to try or at least suggest to others.

Other features I've just noticed: - configurable prompt injection protection using OWASP regex (https://cheatsheetseries.owasp.org/cheatsheets/LLM_Prompt_In...) - configurable PIM protection for outbound prompts - input/output logging - "JSON healing" to auto-correct minor hallucinations

Lots of other stuff too. The business model seems pretty simple and the value-add features don't look particularly expensive or difficult to copy.

js4ever · 2026-05-30T20:35:47 1780173347

Separation, I don't want to have my domains blocked the day AI bill go Brrr

wahnfrieden · 2026-05-30T20:37:27 1780173447

Make two accounts...

arw0n · 2026-05-30T19:29:21 1780169361

I love their product and use them myself. But where's the value proposition for investors? Unless they get purchased by one of the large cloud providers, they will get pushed out of the market sooner or later.

What's the value proposition for the typical AWS startup to go with openrouter, if Amazon offers similar rates with direct integration into all their other offerings?

The only reason OpenRouter can exist at the moment is because we are in the wild-west phase of this technology, and lots of people and companies are exploring. In 5 years they will have to have transformed their business fundamentally, or go the way of the dinosaurs.

sowbug · 2026-05-30T21:35:23 1780176923

If you believe there will be lots of LLM providers in the future, then OpenRouter could be a DoorDash play.

Established restaurants didn't need DoorDash because they were already on everyone's speed dial. But new or small restaurants couldn't afford to advertise or maintain a team of delivery people. DoorDash created a two-sided marketplace that made it a lot easier for new entrants to bootstrap. Today even the established restaurants have to pay them their tithe because hungry people have learned to start with the DoorDash app. A bit of a prisoner's dilemma.

If OpenRouter plays its cards right and gets very lucky, a large number of people will configure their hungry LLM clients to start with OpenRouter, and then LLM providers will have to join the marketplace or else miss out on all those customers.

octoberfranklin · 2026-05-31T05:19:34 1780204774

DoorDash is viable only because the restaurant business (minus national chains) is extremely balkanized. Restauranteurs have very little power.

remexre · 2026-05-31T02:14:26 1780193666

not sure that works as well when they don't own their API though; how much software is openrouter-only in a way that's not 5min of deepseek to patch the source for, or 15min of opus to patch the binary instead

sowbug · 2026-05-31T16:46:19 1780245979

I agree that technical lock-in wouldn't cause the consolidation. Instead, if it happened, it would be because of the network effects of the two-sided platform.

People could email cat photos and resumes. But Facebook and LinkedIn are where everyone already is, so that's what they use instead.

yencabulator · 2026-05-31T15:39:58 1780241998

Everyone (except Anthropic) seems to be settling on the same API, so nobody "owns it" anymore. I expect there to be practically no software that's OpenRouter-only.

https://openresponses.org/

rat9988 · 2026-05-30T19:30:57 1780169457

They never claimed it was technically hard. Brand recognition is their forte. They found out there is a need, developped a product around it.

pizzly · 2026-05-30T21:23:44 1780176224

AWS does not provide nearly as many different models as OpenRouter. Perhaps they have an incentive to not do that, move slower as a big company or more legal risks to consider. If AI model outputs becomes commoditized then having one place where you can switch effortlessly from one to the next based on price might just justify OpenRouter. It could become a commodity marketplace/exchange.

rsalus · 2026-05-30T20:03:33 1780171413

functionally they operate as a marketplace for cloud providers. I feel like there is value there, especially as API costs rise and companies explore cost-saving/efficiency. IMO, this is a particularly attractive value prop in the SMB space, where it is common to interoperate between multiple SaaS/software stacks.

brianwawok · 2026-05-30T19:43:38 1780170218

Yah I don’t think they have a long term play without a pivot

MillionOClock · 2026-05-30T19:47:08 1780170428

Billing caps are underrated! I don't understand why they aren't present everywhere. As an indie dev there are some services I'm really hesitant on trying by fear of getting an enormous bill for a mistake, this is even more true with vibe coding IMO.

brianwawok · 2026-05-30T19:43:03 1780170183

I’m just not sure they have a moat or a long term play? I put $20 in and tried a few models. Then I went right to the model provider to put in $1000 and avoid the middleman tax. Now imagine a big corp spending millions on AI. That’s a lot of middleman tax.

polski-g · 2026-05-30T21:17:26 1780175846

And what do you do when Fireworks is down? If you stuck with Openrouter, when Fireworks is down it would auto route you to Friendli.

What if Fireworks stops offering your preferred model?

brianwawok · 2026-05-30T23:27:10 1780183630

Honestly I am 98% on Claude, and when claude is down I suffer through GPT.

brianjking · 2026-05-30T19:50:48 1780170648

I tend to agree, but there's also a lot of tax to build and maintain the different provider abstractions that OpenRouter eliminates.

Everything has a cost of some sort. It's just who you're going to pay and what the currency is.

kristianp · 2026-05-31T05:31:51 1780205511

The value of openrouter isn't as a middleman for users of claude, gemini or chatgpt, it's for those looking to find a model that fills the use case at a lower price than the top 3.

Art9681 · 2026-05-31T12:48:07 1780231687

Except the latency is significant and not suitable for clients with advanced agent features. The experience between using a frontier model via first party API and the best open weight models via OpenRouter is night and day. Can't get any real work done with it.

kristianp · 2026-06-01T00:57:14 1780275434

Good point. When I use it, the inference doesn't seem very fast compared to the big providers, esp Time to First (non-reasoning)Token.

TurdF3rguson · 2026-05-30T21:16:41 1780175801

The top model / prices are changing all the time though. Lately I've been auditioning 4-5 models before a big ingest and I wouldn't be able to do that easily without OR.

BoredPositron · 2026-05-30T20:28:56 1780172936

There are enough services that don't want the model provider to know who they are.

scosman · 2026-05-31T02:33:12 1780194792

They also do a good job working over the little differences between APIs. Tool calling sometimes breaks on major providers, and OR will patch it before the provider does. Libraries like LiteLLM do this too, but OR is faster.

michaelbuckbee · 2026-05-31T12:22:23 1780230143

It's not just comparing all the models, it's also comparing all the providers and configurations of those models.

If you're doing any kind of production AI work you'll end up with outages caused by calling a single provider, OpenRouter seamlessly switching between providers is a godsend for uptime.

But even more than that there's meaningful cost+speed differences.

Here's Sonnet 4.6 being served direct, via Amazon and via Google

https://la9q13gg8w.evvl.io/

(spoiler: Google was both fastest and cheapest)

a13n · 2026-05-30T18:08:54 1780164534

Both OpenAI and Anthropic have billing caps… who doesn’t?

simonw · 2026-05-30T18:12:32 1780164752

Huh, so they do.

Anthropic: https://support.claude.com/en/articles/8977456-how-do-i-pay-... - you can pre-pay and get a hard cutoff.

OpenAI: https://community.openai.com/t/how-to-set-billing-limits-and... - last time I looked OpenAI had a soft but not hard limit, I guess they fixed that last year.

I remember bugging them both about this last year, I need to update my mental model!

kaufmann · 2026-05-30T19:36:26 1780169786

I tried Alibaba Cloud. They have no caps. This was the reason to cancel my account there.

Deepseek has a prepaid model. (Pretty impressive, what fits into 10 Dollar)

brianwawok · 2026-05-30T19:44:57 1780170297

Literally every credit card I own allows me to make a virtual card that is either single use or has a cap.

Hardwired8976 · 2026-05-30T23:12:10 1780182730

Does not matter, you still owe what you used for a service

dayone1 · 2026-05-30T21:57:14 1780178234

Like which one? Most I know don’t have this feature

brianwawok · 2026-05-30T23:26:25 1780183585

My business card is a Cap one spark business 2%.. get 2% cash on everything which is nice.

siva7 · 2026-05-31T20:12:49 1780258369

Who doesn't have hard billing caps for inference? Microsoft, Google and AWS my friend. And you know who uses Microsoft, Google and AWS? Almost all big corporations do use them instead of direct OAI or Anthropic API because all their contracts and infra are built around the big cloud providers.

BoredPositron · 2026-05-30T20:31:58 1780173118

There is a scheme to send gifts with a compromised anthropic key even if the limit is reached.

tadfisher · 2026-05-30T18:16:20 1780164980

Based on experience, Google Cloud. No idea if that translates to Gemini usage billing.

simonw · 2026-05-30T18:38:04 1780166284

Gemini added prepaid billing and spending caps a few weeks ago: https://twitter.com/OfficialLoganK/status/204451626215244231...

js4ever · 2026-05-30T20:34:19 1780173259

I cancelled my whole GCP account a month ago because I was too afraid of getting charged hundreds of thousands overnight like all peoples on Reddit

totaa · 2026-05-30T18:11:06 1780164666

Google Vertex

srameshc · 2026-05-30T18:22:19 1780165339

https://aistudio.google.com/spend ? Monthly spend cap

squeaky-clean · 2026-05-30T19:10:59 1780168259

> Long-running tasks like batch mode completions and agent sessions may incur overages beyond your project spend cap.

> Billing data processing times can be delayed in AI Studio, up to around 10 minutes. You may experience overages beyond your project cap if billing data hasn't processed before more charges are accrued.

https://ai.google.dev/gemini-api/docs/billing#project-spend-...

That's a soft cap, not a hard cap

mips_avatar · 2026-05-30T18:32:36 1780165956

I spent two hours the other day trying to figure out how to manage spend on gcp, i gave up and used openrouter and cloudflare.

movedx01 · 2026-05-31T07:25:34 1780212334

AI studio added it recently, Vertex not.

wahnfrieden · 2026-05-30T20:36:54 1780173414

Microsoft

stymaar · 2026-05-30T18:39:43 1780166383

And what is their business model?

minimaxir · 2026-05-30T18:42:42 1780166562

Credit is prepaid at a 5% surcharge.

behnamoh · 2026-05-30T18:43:07 1780166587

Commission on API calls extracted from you when you charge your account.

MangoCoffee · 2026-05-30T18:42:53 1780166573

middle man? Model providers/hyperscalers -> OpenRouter -> consumers?

coffee farmers -> middle man -> you

JumpCrisscross · 2026-05-30T20:38:29 1780173509

> By far the lowest friction way to support and try out all the models

Check out Kagi Ultimate.

MicrosoftShill · 2026-05-30T20:53:53 1780174433

Would you recommend Kagi Ultimate over OpenRouter? I'm already a customer of Kagi and would rather give them my money, but only if I'm not really compromising.

JumpCrisscross · 2026-05-30T22:34:30 1780180470

> Would you recommend Kagi Ultimate over OpenRouter?

For personal use, yes. The all-in pricing model encourages experimentation. And the privacy pitch seems tighter.

HDBaseT · 2026-06-01T05:24:16 1780291456

How do you use Kagi Ultimate for programming though?

The difference between GLM 5.1, ChatGPT 5.4 and Opus don't really matter to me when I'm asking/talking, but programming is a different beast that needs a harness.

MicrosoftShill · 2026-05-31T03:43:38 1780199018

The privacy part is definitely important. Appreciate you!

Maybe someday the VM I run agents in will have a dedicated GPU so that I can stop using APIs altogether. One can dream...

octoberfranklin · 2026-05-31T05:17:31 1780204651

Unfortunately the model companies will simply reinject the friction by mandating BYOK (Bring Your Own Key -- i.e. the end user must onboard with each model company individually).

OpenAI and Anthropic have already done this.

Mandated BYOK will sink OpenRouter.

sarjann · 2026-05-30T23:49:56 1780184996

There is also the ability to fallback is one of the clouds degrades in performance.

scrollop · 2026-05-31T11:31:22 1780227082

Though you pay 5% fees? Not worth it for me with the volume of tokens used.

maxloh · 2026-05-30T18:10:49 1780164649

OpenRouter is merely only a proxy. They also host some open-weight models

simonw · 2026-05-30T18:15:05 1780164905

I don't think they do. They proxy to a bunch of open-weight model hosts, but I've not seen that they host them themselves.

They don't list themselves on https://openrouter.ai/providers

minimaxir · 2026-05-30T18:31:09 1780165869

OpenRouter has stealth models that are indicated as "by OpenRouter" but indicate an external provider.

https://openrouter.ai/openrouter/owl-alpha

what · 2026-05-31T01:20:08 1780190408

What kind of compensation are they giving you?

simonw · 2026-05-31T12:34:38 1780230878

None. It's sometimes possible for people to say something positive about a project without being paid!

(If they were paying me they got a bad deal, since I called out the flaws in their leaderboard approach half way through my post.)

SilverElfin · 2026-05-30T18:01:12 1780164072

The biggest benefit is that it creates competition among models. If more people use open weight models or models from other providers, it’ll be harder to ban them. Which is what OpenAI and Anthropic will try to accomplish. OpenAI by lobbying the Trump administration for favorable treatment (see Brockman’s MAGA PAC donations), Anthropic by using religious leaders and nonprofits to push “safety” justifications for difficult regulations.