permalink: /
icon: 'fa-house'
---
### Overview

This document serves as a guide for understanding and using the [repository](https://docs.google.com/spreadsheets/d/1yG-B58i29vz0xp-yKjFPy1yj6fchGfp4keASS-zXdq8/edit#gid=0) on output privacy attacks and auditing. The repository is designed to be an open resource for the community, cataloging a wide range of scientific papers that explore various output privacy attacks. The repository classifies papers according to various dimensions such as:

* the type of data targeted
* the adversarial threat model employed
* the success metrics used to evaluate the effectiveness of these attacks
See the page on [How to use the repository](/how-to-use-the-repository) for detailed instructions and an overview of the rationale behind its design.

**NOTE**: This repository is a living resource. We aim to keep it up to date, but relevant work may occasionally be missing. If you notice an omission, we welcome your contributions to help improve and expand this collection.
### What is output privacy?

Privacy is multifaceted, with many qualitatively different types of attacks being described as “privacy violations.” Our repository, and this document, only consider what we call “**output privacy**” in the context of “**statistical data releases**.” These are privacy violations that arise when an attacker uses the intentional outputs of some kind of statistical system (e.g. summary statistics or predictive models) to make inferences about individuals. Examples of the attacks on output privacy that we consider in this work include:
* **_Reconstruction attacks_** that use the summary statistics released by the Census to recover information about specific individuals in the population
* **_Membership-inference attacks_** that can determine if a given image was used in training a photo-tagging model
* **_Data extraction attacks_** that cause a language model like ChatGPT to output specific sensitive information from its training data
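To make the flavor of a reconstruction-style attack concrete, here is a minimal differencing sketch in Python. The data, names, and releases are entirely hypothetical illustrations: two exact means over overlapping groups are the system's "intended outputs," yet combining them recovers one person's value exactly.

```python
# Toy differencing attack: inferring an individual's value from two
# legitimately released summary statistics. All names and numbers here
# are hypothetical illustrations, not real data.

salaries = {"alice": 52_000, "bob": 61_000, "carol": 58_000, "dave": 70_000}

def mean_salary(names):
    """The 'intended output' of a statistical release: an exact mean."""
    return sum(salaries[n] for n in names) / len(names)

# The attacker observes two releases: one over everyone, and one over
# everyone except Dave (e.g. the result of filtering on a quasi-identifier).
everyone = list(salaries)
all_but_dave = [n for n in everyone if n != "dave"]

mean_all = mean_salary(everyone)        # released statistic 1
mean_rest = mean_salary(all_but_dave)   # released statistic 2

# Two means plus the group sizes pin down Dave's salary exactly:
recovered = mean_all * len(everyone) - mean_rest * len(all_but_dave)
print(recovered)  # 70000.0
```

Real reconstruction attacks (e.g. against Census tabulations) scale this same idea up to thousands of interlocking statistics, typically solved as a constraint-satisfaction or optimization problem rather than by simple subtraction.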
Attacks outside the scope of this repository include those that do not rely on the _intended_ outputs of the system, such as re-identification of anonymized microdata, system-level exploits, and side-channel attacks.