How Can We Measure Privacy Risk in Machine Learning?

Jul 4, 2025

This week, we're looking at the measurable risks of data leakage in machine learning, including what tools exist for auditing model privacy and what it takes to operationalise AI compliance.

One Paper

Membership inference is often treated as a binary “yes/no” question. This paper goes deeper: by attacking GPT-2 models trained on up to 20B tokens, it asks how much training data leaks, under what conditions, and at what computational cost.
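For a sense of the raw signal such attacks build on, here is a minimal sketch (not the paper's method) that scores candidate texts by their average per-token loss under an off-the-shelf GPT-2 checkpoint; unusually low loss hints that the model may have seen the text during training. The model name and example texts are placeholders.

```python
# Minimal sketch: per-sequence loss from a GPT-2 checkpoint as a raw
# membership signal. Lower loss (higher likelihood) suggests the model
# may have seen the text during training. Model name and example texts
# are placeholders, not the paper's setup.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

def sequence_loss(text: str) -> float:
    """Average per-token negative log-likelihood of `text` under the model."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)
    return out.loss.item()

candidates = [
    "an example sentence that might have appeared in the training data",
    "a freshly written control sentence the model has never seen",
]
for text in candidates:
    print(f"{sequence_loss(text):.3f}  {text!r}")
```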

One YouTube Video

In this talk, Nicholas Carlini shows how a membership inference attack—built from first principles—can pinpoint which data was in the training set with surprising accuracy. The takeaway? Even models that seem secure on average can expose individual examples under scrutiny.
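To make the "first principles" idea concrete, here is a minimal sketch of the simplest loss-threshold membership inference attack, run on synthetic per-example losses rather than a real model. Carlini's actual attack is considerably more refined, but the decision rule is the same in spirit: flag an example as a member if the model's loss on it is suspiciously low.

```python
# Minimal sketch of a loss-threshold membership inference attack.
# The loss arrays are synthetic stand-ins for the target model's
# per-example losses; a real audit would compute them on actual
# train / held-out data.
import numpy as np

rng = np.random.default_rng(0)
member_losses = rng.normal(loc=2.0, scale=0.5, size=1000)      # seen in training
nonmember_losses = rng.normal(loc=2.6, scale=0.5, size=1000)   # held out

# Attack rule: predict "member" if the loss is below a threshold.
threshold = np.median(np.concatenate([member_losses, nonmember_losses]))
tpr = np.mean(member_losses < threshold)      # members correctly flagged
fpr = np.mean(nonmember_losses < threshold)   # non-members wrongly flagged

print(f"threshold={threshold:.2f}  TPR={tpr:.2%}  FPR={fpr:.2%}  advantage={tpr - fpr:.2%}")
```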

One Article

In this article, Reza Shokri walks through a formal framework for privacy risk assessment and introduces ML Privacy Meter, a tool designed to quantify how much personal data a model may be exposing.
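The sketch below is not ML Privacy Meter's API; it only illustrates, on synthetic attack scores, the kind of aggregate metrics a privacy audit like this typically reports: attack ROC AUC and the true-positive rate at a strict false-positive rate.

```python
# Generic sketch of aggregate privacy-audit metrics (not ML Privacy Meter's API).
# Scores are synthetic placeholders; higher score means the attacker leans "member".
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

rng = np.random.default_rng(1)
member_scores = rng.normal(loc=1.0, scale=1.0, size=5000)     # e.g. negative loss on members
nonmember_scores = rng.normal(loc=0.0, scale=1.0, size=5000)  # same score on non-members

labels = np.concatenate([np.ones_like(member_scores), np.zeros_like(nonmember_scores)])
scores = np.concatenate([member_scores, nonmember_scores])

auc = roc_auc_score(labels, scores)
fpr, tpr, _ = roc_curve(labels, scores)
tpr_at_low_fpr = tpr[fpr <= 0.001].max()  # how many members leak at FPR <= 0.1%

print(f"attack AUC = {auc:.3f}")
print(f"TPR at FPR <= 0.1% = {tpr_at_low_fpr:.2%}")
```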

One New Method

In this LinkedIn post, Yves-Alexandre de Montjoye previews a zero-cost way to flag memorised training samples simply by analysing their loss curves. The accompanying paper reports 92% precision at no additional compute, all with a single line of code.
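The post doesn't spell out the exact rule, so the sketch below only captures the flavour of the idea on simulated loss curves: log each sample's loss across epochs and flag those whose final loss sits far below the cohort, a common signature of memorisation. The flagging step is the promised single line; the 92% figure comes from the paper, not from this toy.

```python
# Toy illustration (not the authors' exact rule): flag likely-memorised
# samples from per-sample loss curves logged during training.
import numpy as np

rng = np.random.default_rng(2)
n_samples, n_epochs = 10_000, 20

# Simulated loss curves: exponential decay plus noise, shape (n_samples, n_epochs).
loss_curves = 2.5 * np.exp(-0.1 * np.arange(n_epochs)) + rng.normal(0, 0.1, size=(n_samples, n_epochs))
memorised = rng.random(n_samples) < 0.01   # pretend ~1% of samples get memorised
loss_curves[memorised] *= 0.1              # their loss collapses towards zero

final = loss_curves[:, -1]
# The "single line": flag samples whose final loss is >3 standard deviations below the cohort mean.
flagged = final < final.mean() - 3 * final.std()

print(f"flagged {flagged.sum()} of {n_samples} samples")
print(f"overlap with simulated memorised set: {(flagged & memorised).sum()} / {memorised.sum()}")
```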

One Meme

Source: ProgrammerHumor.io