[CVPR 2023] Boundary Unlearning: Rapid Forgetting of Deep Networks via Shifting the Decision Boundary

이성훈 Ethan 2023. 9. 17. 01:00

- Introduction

What is Machine Unlearning?

Machine Learning: the study of computer algorithms that improve automatically through experience

As the definition suggests, machine learning learns from new data. Machine unlearning, as the name says, does the opposite: it makes a model forget what it has already learned.

Existing unlearning methods destroy the information about the forgetting data by scrubbing the model parameters

However, the parameter space is very high-dimensional, so this approach is expensive (e.g. it relies on the Fisher Information Matrix)

The intuition comes from visualizing the decision space of a retrained model

The intuition: the decision space after unlearning 1 class of CIFAR-10 = the decision space of a model trained (retrained) on only the other 9 classes of CIFAR-10

Observation

Forgetting samples are scattered across the decision space
Most forgetting samples move to the boundaries of other clusters

► Boundary Unlearning: Unlearning an entire class

Conveniently, the observations above match the two goals of machine unlearning

Utility guarantee - Generalize badly on forgetting data
Privacy guarantee - Unlearned model should not leak any information of the forgetting data
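The utility side of these goals can be checked with plain accuracy, measured separately on the forgetting class and on the remaining classes. The snippet below is a minimal sketch on made-up predictions; the arrays and the helper name `accuracy` are illustrative assumptions, not values from the paper. It says nothing about the privacy guarantee, which is usually probed with other tools such as membership inference attacks.

```python
import numpy as np

def accuracy(pred, target):
    """Fraction of predictions that match the targets."""
    return float((pred == target).mean())

# Hypothetical labels and predictions of an unlearned model (class 0 was forgotten).
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([1, 2, 1, 1, 2, 2])

forget_mask = (y_true == 0)
acc_forget = accuracy(y_pred[forget_mask], y_true[forget_mask])    # should be low
acc_remain = accuracy(y_pred[~forget_mask], y_true[~forget_mask])  # should stay high
print(acc_forget, acc_remain)
```

A low accuracy on the forgetting class together with an unchanged accuracy on the remaining classes is what the utility guarantee asks for.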

Boundary Shrink: Breaks the decision boundary of the forgetting class by splitting the forgetting features into other classes

Boundary Expanding: Disperses the activation about the forgetting class by remapping and pruning an extra shadow class

- Preliminaries and Notation

Train dataset: $\mathcal{D}=\left\{\mathbf{x}_i, \mathbf{y}_i\right\}_{i=1}^N \subseteq \mathcal{X} \times \mathcal{Y}$

Labels: $\mathcal{Y}=\{1, \ldots, K\}$, Total number of classes: $K$

Forget set: $\mathcal{D}_f$ consists of all samples of the class to be forgotten

Remain set: $\mathcal{D}_r=\mathcal{D} \backslash \mathcal{D}_f$

Model trained on $\mathcal{D}$: $f_{\mathbf{w}_0}$ which is parameterized by $\mathbf{w}_0$

Model retrained on $\mathcal{D}_r$: $f_{\mathbf{w}^*}$ which is parameterized by ${\mathbf{w}^*}$

Model unlearned on $\mathcal{D}_f$: $f_{\mathbf{w}^{\prime}}$ which is parameterized by ${\mathbf{w}^{\prime}}$
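As a small illustration of the notation, the forget set and the remain set can be built with a boolean mask over the labels. This is a sketch on toy data; the function name `split_forget_remain` is an assumption, and labels are 0-indexed here although the text writes $\mathcal{Y}=\{1, \ldots, K\}$.

```python
import numpy as np

def split_forget_remain(x, y, forget_class):
    """Split (x, y) into D_f (every sample of one class) and D_r = D \\ D_f."""
    forget_mask = (y == forget_class)
    d_f = (x[forget_mask], y[forget_mask])
    d_r = (x[~forget_mask], y[~forget_mask])
    return d_f, d_r

# Toy dataset: 6 samples with 2 features each, labels in {0, 1, 2}.
x = np.arange(12, dtype=float).reshape(6, 2)
y = np.array([0, 1, 2, 1, 0, 2])

(x_f, y_f), (x_r, y_r) = split_forget_remain(x, y, forget_class=1)
print(x_f.shape, x_r.shape)
```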

- Method

Boundary Shrink

A neighbor searching method guides each forgetting sample toward the nearest but incorrect class

Find the nearest but incorrect label for each forgetting sample

Initial forgetting sample: $\mathbf{x}_f$

Cross sample: $\mathbf{x}_f^{\prime}=\mathbf{x}_f+\epsilon \cdot \operatorname{sign}\left(\nabla_{\mathbf{x}_f} \mathcal{L}\left(\mathbf{x}_f, \mathbf{y}, \mathbf{w}_0\right)\right)$

A cross sample is the initial forgetting sample with noise added, an idea borrowed from adversarial attacks

The original model predicts a new label for this cross sample, and that label is assigned to the forgetting sample: $\mathbf{y}_{nbi} \leftarrow \operatorname{softmax}\left(f_{\mathbf{w}_0}\left(\mathbf{x}_f^{\prime}\right)\right)$
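The neighbor search can be sketched in a few lines. To stay self-contained, the sketch assumes a linear softmax classifier with weights `W` and bias `b`, so that the input gradient of the cross-entropy has a closed form; the paper works with deep networks, where that gradient would come from automatic differentiation. The function name `neighbor_labels`, the step size `eps=0.5`, and the explicit masking of the true class (to force an incorrect label) are assumptions made for illustration.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # shift logits for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def neighbor_labels(W, b, x_f, y_f, eps):
    """Take one signed-gradient step away from the true label, then relabel.

    W: (K, d) weights, b: (K,) bias of a linear softmax classifier (assumption).
    Returns the cross samples and their nearest-but-incorrect labels.
    """
    p = softmax(x_f @ W.T + b)                  # (n, K) class probabilities
    onehot = np.eye(W.shape[0])[y_f]            # (n, K) one-hot true labels
    grad_x = (p - onehot) @ W                   # d(cross-entropy)/dx for a linear model
    x_cross = x_f + eps * np.sign(grad_x)       # cross sample x_f'
    p_cross = softmax(x_cross @ W.T + b)        # original model on the cross sample
    p_cross[np.arange(len(y_f)), y_f] = -1.0    # rule out the true class (assumption)
    return x_cross, p_cross.argmax(axis=1)

# Toy original model with 3 classes in 2-D, and one forgetting sample of class 0.
W0 = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]])
b0 = np.zeros(3)
x_f = np.array([[1.0, 0.8]])
y_f = np.array([0])

x_cross, y_nb = neighbor_labels(W0, b0, x_f, y_f, eps=0.5)
print(x_cross, y_nb)
```

In this toy setup the sample sits close to the boundary with class 2, so the signed-gradient step pushes it there and class 2 becomes its neighbor label.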

Next, the original model is fine-tuned on the original forgetting samples with their newly assigned labels

$\mathbf{w}^{\prime}=\underset{\mathbf{w}}{\arg \min } \sum_{\left(\mathbf{x}_i, \mathbf{y}_{n b i}\right) \in \mathcal{D}_f} \mathcal{L}\left(\mathbf{x}_i, \mathbf{y}_{n b i}, \mathbf{w}_0\right)$
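The fine-tuning step amounts to ordinary cross-entropy training on the forgetting samples with their neighbor labels, starting from the original weights. The sketch below again assumes a linear softmax classifier and plain gradient descent; `finetune`, the learning rate, the step count, and the hand-picked neighbor labels are illustrative assumptions rather than the paper's settings.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # shift logits for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def finetune(W, b, x, y_new, lr=0.5, steps=200):
    """Gradient descent on cross-entropy, initialised at the original (W, b)."""
    W, b = W.copy(), b.copy()                # keep the original model untouched
    onehot = np.eye(W.shape[0])[y_new]
    for _ in range(steps):
        p = softmax(x @ W.T + b)
        g = (p - onehot) / len(x)            # gradient w.r.t. the logits, averaged
        W -= lr * g.T @ x
        b -= lr * g.sum(axis=0)
    return W, b

# Toy original model and two forgetting samples of class 0.
W0 = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]])
b0 = np.zeros(3)
x_f = np.array([[1.0, 0.8], [1.2, -0.3]])
y_nb = np.array([2, 2])                      # neighbor labels, assumed given

W1, b1 = finetune(W0, b0, x_f, y_nb)
pred_before = (x_f @ W0.T + b0).argmax(axis=1)
pred_after = (x_f @ W1.T + b1).argmax(axis=1)
print(pred_before, pred_after)
```

Before fine-tuning both samples are classified as class 0; afterwards the model assigns them to the neighbor class, which is the boundary shift the method relies on.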

Utility guarantee

Deactivates forgetting class
Barely hurts accuracy on the remaining classes
Achieves the privacy guarantee better than fine-tuning with random labels

Boundary Expanding

Limitation of Boundary Shrink: the neighbor search takes too much time

Create an extra label called the shadow class and train the model so that the forgetting samples are assigned to it

Add one more output node to the classifier and fine-tune

$\mathbf{w}^{\prime}=\underset{\mathbf{w}}{\arg \min } \sum_{\left(\mathbf{x}_i, \mathbf{y}_{\text {shadow }}\right) \in \mathcal{D}_f} \mathcal{L}\left(\mathbf{x}_i, \mathbf{y}_{\text {shadow }}, \mathbf{w}_0\right)$

Once training is finished, the added node is deactivated (pruned) to remove the information about the forgetting class
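The expand, fine-tune, and prune steps can be sketched as follows. As before, the sketch assumes a linear softmax classifier trained with plain gradient descent; the zero initialisation of the shadow node, the function names `expand_and_finetune` and `prune_shadow`, and the hyperparameters are assumptions for illustration.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # shift logits for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def expand_and_finetune(W, b, x_f, lr=0.5, steps=200):
    """Add a shadow output node and train the forgetting samples toward it."""
    K = W.shape[0]
    W_ext = np.vstack([W, np.zeros((1, W.shape[1]))])  # shadow node at index K
    b_ext = np.append(b, 0.0)
    onehot = np.eye(K + 1)[np.full(len(x_f), K)]       # every target is the shadow class
    for _ in range(steps):
        p = softmax(x_f @ W_ext.T + b_ext)
        g = (p - onehot) / len(x_f)                    # gradient w.r.t. the logits
        W_ext -= lr * g.T @ x_f
        b_ext -= lr * g.sum(axis=0)
    return W_ext, b_ext

def prune_shadow(W_ext, b_ext):
    """Drop the last (shadow) output node, returning a K-class classifier."""
    return W_ext[:-1], b_ext[:-1]

# Toy original model and two forgetting samples of class 0.
W0 = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]])
b0 = np.zeros(3)
x_f = np.array([[1.0, 0.8], [1.2, -0.3]])

W_ext, b_ext = expand_and_finetune(W0, b0, x_f)
pred_ext = (x_f @ W_ext.T + b_ext).argmax(axis=1)      # prediction before pruning
W_new, b_new = prune_shadow(W_ext, b_ext)
print(pred_ext, W_new.shape)
```

No neighbor search is needed here: a single fine-tuning pass toward one fixed extra label replaces the per-sample gradient step, which is where the time saving comes from.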

- Experiment

Datasets

CIFAR-10
Vggface2

Baseline

Finetune
Random Labels
Negative Gradient
Fisher Forgetting
Amnesiac Unlearning

- Discussion

- Reference

[1] Chen, Min, et al. "Boundary Unlearning: Rapid Forgetting of Deep Networks via Shifting the Decision Boundary." CVPR 2023.
