Your terrifying idea mischaracterizes the nature of false positives. Any photo i...

tjmc · on Aug 26, 2021

So that picture of my driver's license I took for an ID check or that sensitive work document I scanned with my phone are just as likely to be sent? Great.

simondotau · on Aug 26, 2021

The image would need to be vaguely similar in terms of gross shapes and arrangement. It's exceedingly unlikely that any CSAM would ever be remotely similar to an ID card or a sheet of paper.

If there are ever going to be any "natural" matches to any CSAM hashes, it's probably going to be a photograph of people who are coincidentally in a similar pose, at a nearly identical angle, and with strikingly similar shading.
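To make the "gross shapes and arrangement" point concrete, here is a minimal sketch of a toy perceptual hash (a plain "average hash"). It is not Apple's NeuralHash, and the pixel grids are made up for illustration. It only shows the general idea that this kind of hash follows the coarse light/dark layout of an image rather than its exact bytes.

```python
# Toy "average hash" (aHash). This is NOT NeuralHash; it only illustrates
# how a perceptual hash depends on coarse light/dark layout.

def average_hash(pixels):
    """Return a tuple of bits: 1 where a pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if p > mean else 0 for p in flat)

def hamming(a, b):
    """Count the bit positions where two hashes differ."""
    return sum(x != y for x, y in zip(a, b))

# Hypothetical 4x4 grayscale images (0 = black, 255 = white).
dark_left = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 198, 220],
    [18, 28, 202, 208],
]
# Same layout, uniformly brighter: different bytes, same coarse structure.
brighter = [[min(255, p + 30) for p in row] for row in dark_left]
# Different layout: dark top half, bright bottom half.
dark_top = [
    [10, 20, 15, 25],
    [12, 22, 18, 28],
    [200, 210, 205, 215],
    [198, 220, 202, 208],
]

same_layout_distance = hamming(average_hash(dark_left), average_hash(brighter))
other_layout_distance = hamming(average_hash(dark_left), average_hash(dark_top))
print(same_layout_distance, other_layout_distance)
```

The brightened copy hashes identically to the original, while the image with a different arrangement differs in half of its bits. A real perceptual hash is far more elaborate, but the matching criterion is similarity of structure, which is why an ID card and an unrelated photo would be expected to land far apart.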

waz0wski · on Aug 26, 2021

Among the myriad articles about this system's many issues, there have been comments from people who have worked with the NCMEC upstream database and who note that it's filled with mundane photos, empty rooms, etc. I think it was in one of the hackerfactor article discussions.

This entire system is ripe for false positives AND adversarial attacks.

simondotau · on Aug 27, 2021

I've no doubt the totality of the database contains a lot of photos, but only photos tagged as A1, A2, B1, or B2 would be considered illegal to possess. And then only the absolute worst of the worst (images categorised as "A1") are being included in the hash set on iOS. The category definitions are:

  A = prepubescent minor
  B = pubescent minor
  1 = sex act
  2 = "lascivious exhibition"

The categories are described in further detail (ugh) in this PDF, page 22: https://www.prosecutingattorneys.org/wp-content/uploads/Pres...

shuckles · on Aug 26, 2021

The NCMEC database is large and graded to distinguish types of photos. There’s evidence in the false positive calculations that Apple is only using a subset, presumably the one where photos are graded as depicting active abuse.

It’s not reasonable to dispute the 1 in 1e12 false positive claim on mere speculation.

brokenmachine · on Aug 27, 2021

>It’s not reasonable to dispute the 1 in 1e12 false positive claim on mere speculation.

It's entirely reasonable. Have you seen https://thishashcollisionisnotporn.com/ ?

Extraordinary claims require extraordinary evidence.

simondotau · on Aug 27, 2021

Collision attacks make for a fun tech demo, but I've yet to hear anyone suggest any plausible scenario where they could be used against Apple's implementation. It would require absurdly elaborate, Ocean's Eleven-style espionage to achieve any outcome whatsoever. And it would be immediately apparent to anyone involved that a collision attack had been used.

It would be far easier (and far more effective) to just acquire child porn, break into your victim's house, stash physical prints under their mattress, and then contact the police.

Furthermore, the website includes numerous misleading statements about Apple's system, and makes critical omissions in its description of that system. Whatever side you're on, misleading arguments should be dismissed for what they are.

shuckles · on Aug 27, 2021

This is apples to oranges. The whole thread was about random false positives and not adversarial ones.

brokenmachine · on Aug 28, 2021

If it's that easy to generate a false positive then I believe it will be more common to accidentally have one.

Once again, extraordinary claims require extraordinary evidence.

simondotau · on Aug 28, 2021

The ease of adversarial collisions has no relationship to the probability of natural collisions.

It's entirely possible to make a cryptographic hash algorithm that has an exceptionally low probability of natural collisions but where adversarial collisions are trivial.

It's also possible to create a cryptographic hash algorithm where occasional natural collisions are expected, but adversarial collisions require brute force.
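As a rough illustration of the first case, here is a toy construction (an assumption for the sake of example, not any real system's hash and not a secure design): XOR together the SHA-256 digests of each 16-byte block. Unrelated inputs are astronomically unlikely to collide by accident, yet anyone can produce a collision on purpose simply by reordering blocks, because XOR ignores order.

```python
import hashlib

BLOCK = 16  # block size in bytes (arbitrary choice for this sketch)

def xor_block_hash(data: bytes) -> bytes:
    """Toy hash: XOR of the SHA-256 digests of each 16-byte block."""
    acc = bytes(32)
    for i in range(0, len(data), BLOCK):
        digest = hashlib.sha256(data[i:i + BLOCK]).digest()
        acc = bytes(a ^ b for a, b in zip(acc, digest))
    return acc

original = b"A" * 16 + b"B" * 16
swapped = b"B" * 16 + b"A" * 16    # same blocks, different order
unrelated = b"C" * 16 + b"D" * 16  # different content entirely

print(original != swapped)                                     # different inputs
print(xor_block_hash(original) == xor_block_hash(swapped))     # deliberate collision
print(xor_block_hash(original) == xor_block_hash(unrelated))   # no accidental collision
```

The deliberate collision costs nothing, while the accidental one would need two independent 256-bit values to coincide. How easy it is to force a collision therefore says little about how often one turns up by chance.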

shuckles · on Aug 26, 2021

The chance that any pictures from your library are revealed at all is at most one in one trillion (assuming you are not storing CSAM or being attacked by someone trying to plant evidence on you). Contrast this to a server side scanning system where every photo in your library will be accessed with unknown false positive characteristics.
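For a sense of how a per-account figure like that can come out of a threshold scheme, here is a back-of-the-envelope sketch. Every number in it is an assumption chosen for illustration: the per-image false positive rate, the library size, and the match threshold are hypothetical, and the model treats false positives as independent, which real photo libraries may not satisfy.

```python
import math

def tail_probability(n: int, p: float, t: int) -> float:
    """P(at least t false matches among n photos), binomial model, 0 < p < 1.

    Terms are computed in log space so large n does not overflow.
    """
    total = 0.0
    for k in range(t, n + 1):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p)
            + (n - k) * math.log1p(-p)
        )
        total += math.exp(log_term)  # underflows harmlessly to 0.0 when tiny
    return total

# Hypothetical inputs, NOT published figures for any real system.
library_size = 20_000        # photos in one account (assumed)
per_image_fp_rate = 1e-6     # chance a single innocent photo matches (assumed)
threshold = 30               # matches required before anything is revealed (assumed)

account_risk = tail_probability(library_size, per_image_fp_rate, threshold)
print(f"{account_risk:.3e}")
print(account_risk < 1e-12)
```

Under these assumed inputs the account-level probability falls far below one in a trillion, because many independent rare events must coincide. The result is only as good as the per-image rate and the independence assumption, and those are exactly the parts an outside observer cannot verify.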