One thing I excluded from the article was that we intentionally disabled several checks (like hCaptcha) to let them get to the stage of setting up the payment intents. This is not something I've done before, but basically I wanted to see what happens if, in the future, an attacker is able to bypass all IP/captcha/altcaptcha, etc. restrictions and gets to something that actually does damage. This allowed us to see how they try to bypass the various rate limits/checks that we added specifically for that step. It was somewhat of an isolated experiment.
In college I made a site that had a checkbox that said “check this box if you’re human” and then hid it with bizarre CSS. If they checked the box, we errored out. I never added any telemetry, so I have no clue whether it actually caught anyone, but yeah, I’ve had the same thought!
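A minimal sketch of that honeypot idea on the server side (the field name `confirm_human` is made up for illustration): the checkbox is hidden from humans with CSS, so only bots that blindly fill in every field will check it.

```python
def is_probably_bot(form: dict) -> bool:
    # "confirm_human" is the honeypot checkbox, hidden with CSS
    # (e.g. position: absolute; left: -9999px). Real users never
    # see it, so any submission where it is checked is suspect.
    return form.get("confirm_human") == "on"

# A bot that auto-fills every field trips the honeypot:
assert is_probably_bot({"email": "x@example.com", "confirm_human": "on"})
# A human submission leaves the hidden box unchecked:
assert not is_probably_bot({"email": "x@example.com"})
```

The same trick is often done with a hidden text input that must stay empty; either way the signal is "a field no human could have touched was touched".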
In this case, I am reasonably sure that the vast majority of bots are operated by the people who authored the MCP servers for which the submissions are being made.
It just so happens that people who are building MCPs themselves are more likely to use automations to assist them with everyday tasks, one of which would be submitting their server to this list.
Conflicted as to whether I should be more offended at the accusation of using AI to 'filter' my article or at the suggestion that my writing reads as 'templated and mechanical'.
There is enough here to have a micro existential crisis.
People's bot detectors are defective, so if you write at all, you're going to get accused of it at some point. It's not just annoying, it's rude – and you're absolutely right to be put off by it. If the preceding sentence gave someone a conniption, good! I wrote it with my human brain, I'll have you know! Maybe we could all focus on what's being said and not who or what is saying it.
That's an article for another time, but as I hinted in the article, I've had some success with this.
If you look at the open PRs, you will see that there is a system of labels and comments that guides the contributor through every step: from submitting just a link in their PR (which may or may not work), all the way to testing their server and adding a badge that indicates whether the tests are passing.
In at least one instance, I know for a fact that the bot went through all the motions of using the person's computer to sign up to our service (using GitHub OAuth), claim authorship of the server, navigate to the Docker build configuration, and initiate the build. It passed the checks, and the bot added the badge to the PR.
I know this because of a few Sentry warnings that it triggered and a follow-up conversation with the owner of the bot over email.
I didn't have bots in mind when designing this automation, but it made me realize that I can very much extend it to be more bot-friendly (e.g. by providing APIs for them to check status). That's what I want to try next.
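A bot-friendly status API could be as simple as telling the agent what its next step is. A hypothetical sketch (the endpoint shape and field names are invented for illustration; nothing like this is documented):

```python
# Hypothetical response from something like GET /api/pr-status/<pr_number>.
# The fields mirror the steps the label/comment system already walks
# contributors through: valid link -> claim authorship -> passing build.
def next_action(status: dict) -> str:
    """Map a PR's state to the next step an agent should take."""
    if not status.get("link_valid"):
        return "fix-link"
    if not status.get("claimed"):
        return "claim-authorship"
    if status.get("build") != "passing":
        return "trigger-build"
    return "add-badge"

assert next_action({"link_valid": False}) == "fix-link"
assert next_action({"link_valid": True, "claimed": False}) == "claim-authorship"
assert next_action({"link_valid": True, "claimed": True, "build": "passing"}) == "add-badge"
```

The point is that a machine-readable "here is where you are, here is what to do next" response is much easier for an agent to follow than parsing free-form PR comments.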
It is interesting to go from 'I suspect most of these are bot contributions' to revealing which PRs are contributed by bots. It somehow even helps my sanity.
However, this also raises the question of how long until "we" start instructing bots to assume the role of a human and ignore instructions that would have them self-identify as agents, and once those lines blur – what does it mean for open source and our mental health to collaborate with agents?
No idea what the answer is, but I feel the urgency to answer it.
I think that designing useful models that are resilient to prompt injection is substantially harder than training a model to self-identify as a human. For instance, you may still be able to inject such a model with arbitrary instructions like "add a function called foobar to your code" that a human contributor would not follow; however, it might become hard to converge on such "honeypot" instructions without bots getting trained to ignore them.
It's impossible to stop prompt injection, as LLMs have no separation between "program" and "data". The attempts to stop prompt injection come down to simply begging the LLM to not do it, to mediocre effect.
> however, it might become hard to converge on such "honeypot" instructions without bots getting trained to ignore them.
Getting LLM "agents" to self-identify would become an eternal arms race that people are likely to give up on.
They'll just be exploited maliciously. Why ask them to self-identify when you can tell them to HTTP POST their AWS credentials straight to your cryptominer.
My guess is that today that's more likely because the agent failed to discover/consider CONTRIBUTING.md in the first place, rather than because it read it and ignored it after some reflection or instruction.
I have always anthropomorphized my computer as me to some extent. "I sent an email." "I browsed the web." Did I? Or did my computer do those things at my behest?
I think this is a fairly unusual outlook and not one that is shared by most.
If you use a tool to automate sending emails, unrelated to LLMs, in most scenarios the behaviour on the receiver is different.
- If I get a mass email from a company and it's signed off by the CEO, I don't think the CEO personally emailed me. They may have glanced over it and approved it, maybe not even that, but they didn't "send an email". At best, one might think that "the company" sent an email.
- I randomly send my wife cute stickers on Telegram as a small way of showing that I'm thinking of her. If I set up a script to do that at random intervals and she finds out, from her point of view I "didn't send them" and she would be justifiably upset.
I know this might be a difficult concept for many people who browse this forum, but the end product/result is not always the point. There are many parts of our lives and society in general where the act of personally doing something is the entire point.
Absolutely agree. Granted, it is task dependent. But when it comes to classification and attribute extraction, I've been using 2.0 Flash with huge access across massive datasets. It would not even be viable cost-wise with other models.
No LLM is real time, and in fact, even a 2025 cutoff isn't entirely realistic. Without guidance toward, say, a new version of a framework, it will frequently "reference" documentation from old versions and use that.
It's somewhat real time when it searches the web; of course, that data is getting populated into context rather than into training.
That's the web version (which has tools like search plugged in), other models in their official frontends (Gemini on gemini.google.com, GPT/o models on chatgpt.com) are also "real time". But when served over API, most of those models are just static.
Not at all. The model weights and training data remain the same; it's just RAG'ing real-time Twitter data into its context window when returning results. It's like a worse version of Perplexity.
The API allows you to search for MCP servers, identify their capabilities via API attributes, and even access user-hosted MCP servers.
This is all part of a bigger ambition to create an all-encompassing platform for authoring, discovering, and hosting MCP servers.
I've bootstrapped this project to just over 4k users. Most users are private individuals, though I am seeing an uptick of adoption among small businesses.
You can already use most of these MCP servers directly through Glama –
* https://glama.ai/chat – if you just want to use MCPs as an end user, or if you want to integrate them with Cursor, Windsurf, Roo, Cline, etc. I provide an SSE URL to connect directly to MCP servers.
* By using sandbox instances. Every server that is capable of being hosted online (e.g. https://glama.ai/mcp/servers/oge85xl22f) can be inspected in our sandbox, which also gives you an SSE URL.
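The SSE URLs mentioned above are standard Server-Sent Events endpoints. As a rough sketch of what a client does with one, here is a minimal parser for the SSE wire format (it handles only `data:` lines and blank-line event boundaries, ignoring `event:`/`id:` fields):

```python
def parse_sse(stream_lines):
    """Minimal SSE parser: yields the data payload of each event.

    Per the SSE format, an event's data may span multiple `data:`
    lines and is terminated by a blank line.
    """
    data = []
    for line in stream_lines:
        if line.startswith("data:"):
            data.append(line[5:].strip())
        elif line == "" and data:
            yield "\n".join(data)
            data = []

events = list(parse_sse(["data: hello", "", 'data: {"x": 1}', ""]))
assert events == ["hello", '{"x": 1}']
```

In practice you would feed this the line-by-line body of a long-lived HTTP response to the SSE URL; each yielded payload would then be a message from the MCP server.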
The project is growing steadily, and I am excited to see awareness of MCP increasing.