White-Box Attacks on PhotoDNA Perceptual Hash Function
PhotoDNA is a widely deployed perceptual hash function used for the detection of illicit content such as Child Sexual Abuse Material (CSAM). This paper presents the first mathematical description of Alleged PhotoDNA, a new function whose outputs are identical to those of PhotoDNA on a large database of test images. From this description, several design weaknesses are identified: the algorithm is piecewise linear and differentiable, the hash value depends only on the sum of the RGB values of each pixel, and it is trivial to find images whose hash value is all zeroes.
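To make the second weakness concrete, the following is a minimal sketch of why dependence on the per-pixel RGB sum matters. It does not implement PhotoDNA or Alleged PhotoDNA: `toy_hash`, `rgb_sum_plane`, and `shift_colors` are hypothetical names, and the tile-sum hash is an assumed stand-in chosen only because it shares the stated property. Any hash built on the per-pixel sum R+G+B cannot distinguish two images whose sums agree, even when their colours differ.

```python
# Toy stand-in, NOT PhotoDNA: it only shares the property that the hash
# is computed from the per-pixel sum R + G + B.

def rgb_sum_plane(image):
    """Collapse an RGB image (rows of (r, g, b) tuples) to per-pixel sums."""
    return [[r + g + b for (r, g, b) in row] for row in image]

def toy_hash(image, block=2):
    """Hypothetical hash: total of the RGB-sum plane over block x block tiles."""
    plane = rgb_sum_plane(image)
    h, w = len(plane), len(plane[0])
    return [
        sum(plane[y][x]
            for y in range(by, min(by + block, h))
            for x in range(bx, min(bx + block, w)))
        for by in range(0, h, block)
        for bx in range(0, w, block)
    ]

def shift_colors(image):
    """Move intensity from red to blue while keeping each pixel's R+G+B fixed."""
    out = []
    for row in image:
        new_row = []
        for (r, g, b) in row:
            d = min(r, 255 - b)  # largest shift that keeps both channels in 0..255
            new_row.append((r - d, g, b + d))
        out.append(new_row)
    return out
```

Under these assumptions, `toy_hash(image) == toy_hash(shift_colors(image))` holds for every image, although the two images can look very different in colour.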
The paper further demonstrates that gradient-based optimization techniques and quadratic programming can exploit the mathematical weaknesses of Alleged PhotoDNA and PhotoDNA to produce visually appealing exact collisions and second preimages; for near-collisions and near-second-preimages, the image quality can be improved further. The same techniques can be used to recover the rough shapes in an image from its hash value, disproving the designer's claim that PhotoDNA is irreversible. Finally, it is shown that it is easy to produce high-quality, perceptually identical images whose hash value is far from that of the original image, which allows detection to be avoided. We have implemented our attacks on a large set of varied images and tested them on both Alleged PhotoDNA and PhotoDNA. Our attacks have success rates close to or equal to 100% and run in seconds or minutes on a personal laptop; they are a substantial improvement over earlier work, which requires hours on parallel machines and yields only near-collisions. We believe that with additional optimization of the parameters, the image quality and/or the attack performance can be further improved.
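The general idea behind a gradient-based second-preimage search can be sketched on a toy example. This is not the attack from the paper: `toy_hash` is a hypothetical linear hash (block means over a flat list of pixel values) and `second_preimage` is an assumed, simplified gradient descent on the squared hash distance. A realistic attack would also have to constrain pixel ranges and preserve visual quality, which this sketch leaves out.

```python
# Toy illustration only: a linear "hash" and plain gradient descent.

def toy_hash(pixels, block=4):
    """Hypothetical linear hash: mean of each consecutive block of pixels."""
    return [sum(pixels[i:i + block]) / block
            for i in range(0, len(pixels), block)]

def second_preimage(start, target_hash, block=4, lr=0.5, steps=200):
    """Minimise sum_j (mean_j(x) - t_j)^2 by gradient descent from `start`."""
    x = list(start)
    for _ in range(steps):
        current = toy_hash(x, block)
        for j, (c, t) in enumerate(zip(current, target_hash)):
            grad = 2.0 * (c - t) / block  # derivative w.r.t. each pixel in block j
            for i in range(j * block, min((j + 1) * block, len(x))):
                x[i] -= lr * grad
    return x
```

Because the toy hash is differentiable everywhere, each step moves the hash of the working image closer to the target while the image itself stays close to the starting image. Starting from an unrelated image, the result hashes to the target value without being equal to the image that originally produced it.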
Our work demonstrates that PhotoDNA is unreliable for the detection of illicit content: it is easy to incriminate someone by sending them false content with a hash value close to that of illicit content (a false positive), and to avoid detection of illicit content with minimal modifications to an image (a false negative). False positives and leakage of information are particularly problematic in a Client Side Scanning (CSS) scenario as envisaged by several countries, where large hash databases would be stored on every user device and billions of images would be hashed with PhotoDNA every day. Overall, our research casts serious doubts on the suitability of PhotoDNA for the large-scale detection of illicit content.
eprint.iacr.org · IACR Cryptology ePrint Archive