As someone who's been quite heavily involved with web-platform-tests, I'd cautio...

tssva · 2025-10-06T20:13:49 1759781629

The tweet mentions that this is an arbitrary metric thrust upon them by Apple, so I don’t think they would necessarily disagree with you. During the monthly updates they do also show the passing number of tests without including the encoding tests because of how much they skew things.
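To illustrate how much one large directory can skew a raw pass count, here is a minimal sketch. The directory names and all numbers are invented for illustration; they are not real Ladybird or wpt.fyi figures, and `pass_rate` is a hypothetical helper.

```python
# Hypothetical per-directory results as (passed, total).
# These numbers are made up for illustration only.
results = {
    "encoding": (900_000, 1_000_000),
    "css": (60_000, 100_000),
    "dom": (40_000, 50_000),
}


def pass_rate(results, exclude=()):
    """Return passed/total over all directories not listed in `exclude`."""
    passed = sum(p for name, (p, _) in results.items() if name not in exclude)
    total = sum(t for name, (_, t) in results.items() if name not in exclude)
    return passed / total


overall = pass_rate(results)
without_encoding = pass_rate(results, exclude={"encoding"})

print(f"overall pass rate:      {overall:.1%}")
print(f"without encoding tests: {without_encoding:.1%}")
```

With these made-up numbers the single large directory dominates the total, which is why reporting the figure both ways is informative.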

troupo · 2025-10-06T20:57:17 1759784237

The problem is, there's no other good metric. We used to have Acid tests for CSS, but in the absence of that, it's as good a metric as any.

nicoburns · 2025-10-07T02:20:16 1759803616

Some modern ACID-style tests are a nice idea actually.

oblio · 2025-10-06T23:32:37 1759793557

Are Acid tests no longer available?

phire · 2025-10-07T02:08:18 1759802898

Acid 2 bakes in the assumption that you will be displaying it on a desktop/laptop monitor with 100% scaling; it depends on pixel accuracy.

This was a reasonably universal assumption in 2005, but became less and less valid over time. We now have high-dpi screens, and the whole idea of pixel accuracy has fallen out of favour (it was never a good idea, but 2005) as phone browsers are expected to rescale websites for better readability/usability.

The result is that Acid 2 fails on my phone, and on my laptop it will pass/fail depending on which screen the window is on.

Acid 3 was too forward-looking and rigid. While Acid 2 was (mostly) testing accepted standards (which IE6 implemented very poorly), Acid 3 tested a bunch of draft standards. It was very strict on many things that weren't well defined, and later versions of the standards took the opposite approach.

Basically, Acid 2 was very good at shaming Microsoft into fixing Internet Explorer; but in the long run the whole concept of popular cherry-picked torture tests proved to be of limited usefulness (and actually counterproductive) to promoting standards-compliant browsers.

alganet · 2025-10-06T23:39:17 1759793957

They no longer reflect what the average user expects their browser to support. You can pass them and miss out on several important things that are considered widespread features nowadays.

ac29 · 2025-10-06T23:46:52 1759794412

They are, but they aren't great tests of what a browser is capable of. For example, Firefox does not pass Acid2 or Acid3.

munchlax · 2025-10-06T21:43:41 1759787021

Ladybird will be faster than anything with an arbitrary metric thrust

cdaringe · 2025-10-08T21:14:24 1759958064

mmm yes and lift

koolala · 2025-10-06T23:11:43 1759792303

Could a hand-picked subset be selected to make that metric?

culi · 2025-10-07T01:48:22 1759801702

Everything you said sounds very reasonable, yet the "Browser-Specific Failures" graph on the main page of the wpt.fyi website explicitly misleads us into thinking

PS I'm a big fan of the work and appreciate what you do. I check the interop page about once a week!

jebronie · 2025-10-07T09:17:01 1759828621

As someone who's been quite heavily involved with having a brain, I'd advocate for using the test pass rate as a metric for how many tests are passed.

manmal · 2025-10-06T20:12:02 1759781522

Why are you bringing this up, when it’s being used here not as a chosen metric but because Apple requires it for iOS?

Klonoar · 2025-10-06T22:18:10 1759789090

This is a headline that is very easy to misread and/or misunderstand. I don’t find their comment to be that out of place at all.

manmal · 2025-10-07T07:57:49 1759823869

Root comment is lecturing the Ladybird team about not using this suite as a metric, which is totally uncalled for. That’s what I’m trying to convey.

Klonoar · 2025-10-08T03:39:26 1759894766

"lecturing" is carrying a lot of needless weight here. Their comment doesn't read like that, they're just pointing out that the metric itself isn't what it seems to be.

hamandcheese · 2025-10-06T20:17:09 1759781829

> but because Apple requires it for iOS

Therefore it is a metric used by Apple.

fmajid · 2025-10-06T20:52:13 1759783933

In the spirit of malicious compliance, this being a bad metric would probably be a feature in their book.

anextio · 2025-10-07T02:10:52 1759803052

Malicious compliance?

The EU DMA says they have to allow third party browser engines access to the same resources (the JIT) that Safari has. It specifically allows them to place reasonable requirements on those third party alternatives:

> The gatekeeper shall not be prevented from taking, to the extent that they are strictly necessary and proportionate, measures to ensure that third-party software applications or software application stores do not endanger the integrity of the hardware or operating system provided by the gatekeeper, provided that such measures are duly justified by the gatekeeper.

Access to rwx memory is inherently dangerous, and it's completely reasonable to expect third parties to have proven that they are serious about producing a usable browser engine before putting such a risky product on the market for consumers to download. The law does not require them to allow any third party application to access the JIT, only a third party application that competes with Safari (a usable web browser).

impossiblefork · 2025-10-07T09:48:16 1759830496

Yes, but that doesn't allow requiring rendering performance or anything like that, only the absence of security problems.

You can't justify a requirement for a minimum level of performance or some capability. You can justify a requirement of a guaranteed absence of security bugs, provided that that's a standard you impose on yourself throughout the system.

troupo · 2025-10-06T21:00:46 1759784446

There are literally no other metrics.

Web Platform Tests were literally a project to align browsers on compatible implementations of a bunch of web APIs. Started by Opera and the W3C and maintained by the W3C: https://www.bocoup.com/blog/wpt-an-overview-and-history

sleepybrett · 2025-10-06T19:57:32 1759780652

Then talk to Apple. They are the ones who put this bar in place.