Supply Chain Vuln Compromised Core AWS GitHub Repos & Threatened the AWS Console (wiz.io)
147 points by uvuv 1 day ago | 35 comments




Breaking this down, several of AWS's core repos like the JS SDK use an allowlist of which contributor ids can run workflow actions in their PRs. The list was a regex, contained several short ids, and wasn't anchored with ^$, so if it allowed user 12345, then any userid containing 12345 could run their own actions on the PR, including one that exfiltrated access tokens. So they spammed GH with user creation requests, got an id that matched, and they were in like Flynn.
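To make the failure mode concrete, here is a minimal Python sketch. The IDs, the pattern, and the function names are made up for illustration (the actual allowlist isn't quoted in this thread), so treat it as a picture of the bug class rather than the real workflow code:

```python
import re

# Hypothetical allowlist pattern; the real IDs are not shown in this thread.
ALLOWLIST_PATTERN = r"12345|67890"


def is_allowed_unanchored(user_id: str) -> bool:
    # search() succeeds if the pattern occurs anywhere inside the string,
    # so any longer ID that merely contains an allowed ID also passes.
    return re.search(ALLOWLIST_PATTERN, user_id) is not None


def is_allowed_strict(user_id: str) -> bool:
    # fullmatch() requires the entire string to match one alternative.
    return re.fullmatch(ALLOWLIST_PATTERN, user_id) is not None


print(is_allowed_unanchored("12345"))     # True: the intended user
print(is_allowed_unanchored("99123450"))  # True: an attacker-registered ID slips through
print(is_allowed_strict("99123450"))      # False
```

Shorter allowed IDs make this worse: the fewer digits in the pattern, the more newly issued IDs will contain it as a substring.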

Said tokens didn't have admin access, but had enough privileges to invite other users to become full admins. Not sure if they were rotated, but GitHub tokens are usually long-lived, like up to a year. Hey, isn't AWS the one always lecturing us to use temporary credentials? To be fair, AWS did more than just fix the regex, they introduced an "approve workflow run" UI into the PR process that I think GH is also using now (not sure about that).


As a security dude I spend way too much of my time fixing missing anchors or unescaped wildcards in regex. The good news is that it's trivial to detect with static analysis tooling. The bad news is that broken regex is often used for security checks.

Sometimes I wish regexes were full matches by default and required prefixing and postfixing with `.*` to get the current behaviour

Java's Pattern.matches() method works that way. Python has three separate functions: re.fullmatch anchors at both ends, re.match anchors only at the start, and re.search does not anchor at all.
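For anyone who wants to check the Python side, here is a small sketch using only the standard `re` module. The pattern and the sample strings are arbitrary examples, not anything from the article:

```python
import re

PATTERN = r"12345"  # arbitrary example pattern

# re.search: the pattern may occur anywhere in the string.
search_hit = re.search(PATTERN, "9912345") is not None

# re.match: the pattern must start at position 0, but trailing text is allowed.
match_prefix_hit = re.match(PATTERN, "1234567") is not None
match_middle_hit = re.match(PATTERN, "9912345") is not None

# re.fullmatch: the whole string must match the pattern.
fullmatch_exact = re.fullmatch(PATTERN, "12345") is not None
fullmatch_longer = re.fullmatch(PATTERN, "1234567") is not None

print(search_hit, match_prefix_hit, match_middle_hit)  # True True False
print(fullmatch_exact, fullmatch_longer)               # True False
```

So for an allowlist check, `re.match` alone would still have let through any ID that starts with an allowed one; `re.fullmatch` is the call that behaves like a whole-string comparison.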

A match isn't a boolean, it's a substring. The original (and more common) use cases would become excessively verbose.


> Said tokens didn't have admin access, but had enough privileges to invite other users to become full admins.

Ah... GitHub permissions. What fun.

GitHub actually has a way to federate with AWS for short-lived credentials, but then it screws everything up by completely half-assing the ghcr.io implementation. It's only available using the old deprecated classic access tokens.


Right? How is it that you still need a PAT or a custom app installation to access a registry?

> The list was a regex ...

Regexes for security allow lists: what could possibly ever go wrong, huh!?


At least the vuln was old enough that they couldn't blame AI for it, otherwise the article would read differently ;)

Ironically (?) an AI code review would very likely have noticed the overly-permissive regex.

This doesn't really matter as long as they also find 10x more nits that create noise for the human reviewer.

This is a good point. On my GH I’ve disabled Copilot reviews because the vast majority of them are false positives, but I’m reconsidering that position as it might still be worth it to wade through the spurious reviews just to catch some real issues.

I filter for false positives with language like this:

    For each bug you find, write a failing test. Run the test to make sure it fails. If it passes, try 1-3 times to fix the test. If you can't get it to work, delete the test and move on to the next bug.

It's not perfect, you still get some non-bugs where the test fails because its premises are wrong. E.g., recently I tossed out some tests that were asserting they could index a list at `foo.len()` instead of `foo.len() - 1`. But I've found a bunch of bugs this way too.

Nice, I’ll give this a try

Another success story for Regexes! Let's keep using this cryptic mess!

I met regexes when I was 13, I think. I spent a little time reading the Java API docs on the language's regex implementation and played with a couple of regex testing websites during an introductory programming class at that age. I've used them for the rest of my life without any difficulty. Strict (formal) regexes are extremely simple, and even when using crazy implementations that allow all kinds of backreferences and conditionals, 99.999% of regexes in the wild are extremely simple as well. And that's true in the example from TFA! There's nothing tricky or cryptic about this regex.

That said, what this regex wanted to be was obviously just a list. AWS should offer simpler abstractions (like lists) where they make sense.


> That said, what this regex wanted to be was obviously just a list. AWS should offer simpler abstractions (like lists) where they make sense.

Agree. I would understand if there was some obvious advantage here, but it doesn’t really seem like there is a dimension here where regex has an advantage over a list. It’s (1) harder to implement, (2) harder to review, (3) much harder to test comprehensively, (4) harder for users to use (correctly/safely).
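A rough sketch of what the list-based version might look like, in Python. The IDs and names here are hypothetical; the point is only that exact membership has no anchoring or escaping to get wrong:

```python
# Hypothetical allowlist of numeric contributor IDs (not the real ones).
ALLOWED_ACTOR_IDS = frozenset({12345, 67890})


def is_allowed(actor_id: int) -> bool:
    # Exact membership test: an ID either is in the set or it is not.
    return actor_id in ALLOWED_ACTOR_IDS


print(is_allowed(12345))     # True
print(is_allowed(99123450))  # False: containing "12345" as digits is irrelevant
```

Reviewing a change to this is also simpler: a diff adds or removes one ID, and there is no pattern syntax to reason about.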


Presumably the advantage was ease and speed of developing the filtering feature.

Wrong tradeoff, to be sure.


[flagged]


This is too hot a take. Regular expressions are used in some cases where they shouldn’t be, yes, but there’s also been a ton of code which used other string operations but had bugs due to the complexity or edge-cases which would have been easier to avoid with a regex. You should know both tools and when they’re appropriate.

From an educational perspective, regular expressions are also a great way to teach about state machines, computational complexity, formal languages, and grammars in a way that has direct applications to tools that are long-lived and ubiquitous in industry.

It's also this context that reveals how much simpler strict regular expressions are than general purpose programming languages like Python or JavaScript. That simplicity is also part of what makes regexes so ubiquitous: due to its lower computational complexity, regex parsing is really fast and doesn't take much memory.

When I say regexes are simple, I'm not really talking about compactness. I mean low complexity in a computational sense! As someone who rather likes regex, I think it would be totally fair for a team to rule out all uses of PCRE2 that go beyond the scope of regular languages. Those uses of regex may be compact, but they're no longer simple.

I'm also someone who is sensitive to readability-centered critiques of terse languages. Awk, sed, and even Bash parameter expansion can efficiently do precise transformations, too. But sometimes they should be avoided in favor of solutions that are more verbose, more explicit, and involve less special syntax. (Note also that Bash, awk, and sed are also all much more complex than regex!)


Regex is not used for parsing HTML or C++ code. So it is not good for complex tasks.

What is the claim? That it is compact for simple cases. Well Brainfuck is a compact programming language but I don't see it in production. Why?

Because the whole point of programming is that multiple eyeballs of different competence are looking at the same code. It has to be as legible as possible.


> Regex is not used for parsing HTML or C++ code. So it is not good for complex tasks.

Again, this is too binary a way of thinking. There are string matching operations which are not parsing source code, and regular expressions can be a concise choice there. I’ve had cases where someone wrote multiple pages of convoluted logic trying to validate things where the regular expression was not only much easier to read but also correct, because while someone was writing the third else-if block they missed a detail.


> To escalate privileges, we abused the token’s repo scope, which can manage repository collaborators, and invited our own GitHub user to be a repository administrator.

From everything I know about pentesting, they should have stopped before doing this, right? From https://hackerone.com/aws_vdp?type=team :

> You may only interact with accounts you own or with explicit written permission from AWS or the account owner


I think it comes down to what you do with the access. Since this is a public repo I don't think I'd be too upset at the addition of a new admin so long as they didn't do anything with that access. It's a good way to prove the impact. If it were a private repo I might feel differently.

This comes entirely down to the scope of the agreement for the assessment. Some teams are looking for you to identify and exploit vulns in order to demonstrate the potential impact that those vulnerabilities could have.

This is oftentimes political. The CISO wants additional budget for secure coding training and to hire more security engineers, let the pentesting firm demonstrate a massive compromise and watch the dollars roll in.

A lot of the time, especially in smaller companies, it's the opposite. No one is responsible for security and customers demand some kind of audit. "Don't touch anything we don't authorize and don't do anything that might impact our systems without explicit permissions."

Wiz is a very prominent cloud security company that probably has incredibly lucrative contracts with AWS already, and their specialty, as I understand it, is identifying full "kill chains" in cloud environments: from access issues all the way to compromise of sensitive assets.


It’s possible that AWS is a Wiz customer, which would allow them to do more stuff.

I’d guess that we would not have had the pleasure of reading this article if Wiz was paid by AWS. There were multiple high-impact bugs in 2025 that we read about here, where security researchers had to turn down small six-figure bounties to avoid NDAs…

I worked on docs at GitHub which are open source, synced to an internal repo, and deployed on internal infra. I recall jumping through many hoops to make it work safely. These were workflows that had secrets access for deployments, and I recall zipping files, doing some weird handoffs/file filtering between different workflows based on the triggers and permissions. Security folks were really quick to find any gaps =)

Glad to see a few more security knobs on actions these days!


I always wondered if their decision to limit availability of CodeCommit had something to do with the overall quality of the underlying implementation. It always came off as an "also ran" product without any real care or effort put into it. Either that or the team responsible for creating it ultimately left the company.. anyways..

This article lends some credibility to that notion.


How did they create so many GitHub accounts? I used login with GitHub in the past to prevent spam but I feel like, after hearing this, I need to check for something like account age to prevent spam.

They explain in the article how they created hundreds of “bot” accounts using GitHub Apps, which seemingly aren't subject to the same rate limiting and captchas as user accounts.

I try to avoid regexes like the plague; they are right up there with passing stuff into SQL strings. They are tempting enough to be used, but it always goes wrong, no matter how good your sanitization. Even if the original author gets it right, sooner or later someone will tweak the regex just a little to allow some edge case and accidentally open the door to a whole pile of other cases. It's just too finicky and too powerful.

Oh no, is the AWS Console ok?

happens to the best of us


