(Figure courtesy of Dr Ali Hatamizadeh, the author of GradViT)

Recent studies have shown that training samples can be recovered from their corresponding gradients. Such attacks are called Gradient Inversion (GradInv) attacks. The principle behind GradInv is to optimise a dummy input so that the gradients it produces match the observed target gradients. So when can GradInv happen? Should we worry about it? And how can we protect our data from it?
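
To make the gradient-matching idea concrete, below is a minimal sketch in PyTorch, in the spirit of early gradient-leakage attacks. The toy model, input shapes, soft-label formulation, and iteration count are illustrative assumptions, not the configuration of any particular paper.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))   # toy classifier
criterion = nn.CrossEntropyLoss()

# --- Victim side: the gradients that get shared (and observed) ---
x_true = torch.rand(1, 1, 28, 28)            # private training sample
y_true = torch.tensor([3])                   # private label
loss = criterion(model(x_true), y_true)
target_grads = [g.detach() for g in torch.autograd.grad(loss, model.parameters())]

# --- Attacker side: optimise a dummy (input, label) pair so that its
# gradients match the observed ones ---
x_dummy = torch.rand(1, 1, 28, 28, requires_grad=True)
y_dummy = torch.randn(1, 10, requires_grad=True)      # soft label, also learned
optimizer = torch.optim.LBFGS([x_dummy, y_dummy])

for _ in range(50):
    def closure():
        optimizer.zero_grad()
        # Soft-label cross-entropy so the label can be optimised as well
        dummy_loss = -(y_dummy.softmax(dim=-1)
                       * model(x_dummy).log_softmax(dim=-1)).sum()
        dummy_grads = torch.autograd.grad(dummy_loss, model.parameters(),
                                          create_graph=True)
        # Reconstruction objective: distance between dummy and target gradients
        grad_diff = sum(((dg - tg) ** 2).sum()
                        for dg, tg in zip(dummy_grads, target_grads))
        grad_diff.backward()
        return grad_diff
    optimizer.step(closure)

# After optimisation, x_dummy is an approximation of the private sample x_true.
```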

For the first question, I will quote this passage from [1], which describes a popular scenario:

Distributed learning or federated learning is a popular paradigm for achieving collaborative training and data privacy at the same time. In a centralized training process, the parameter server initially sends a global model to each participant. After training with local data, the participants are only required to share gradients for model updates. Then the server aggregates the gradients and transmits the updated model back to each user. However, recent studies have shown that gradient sharing is not as secure as it is supposed to be. We consider an honest-but-curious attacker, who can be the centralized server or a neighbour in decentralized training. The attacker can observe the gradients of a victim at time t, and he/she attempts to recover data x(t) or labels y(t) from gradients. In general, we name such attacks as Gradient Inversion (GradInv) attacks.
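
To make the threat model concrete, here is a minimal sketch of one FedSGD-style gradient-sharing round in PyTorch, where an honest-but-curious server follows the protocol but also retains the per-client gradients it observes. The toy model, client names, and learning rate are illustrative assumptions.

```python
import torch
import torch.nn as nn

global_model = nn.Linear(10, 2)              # toy global model
criterion = nn.CrossEntropyLoss()
observed_grads = {}                          # what a curious server could retain

def client_update(x_local, y_local):
    """One participant: compute gradients on its local data and share them."""
    loss = criterion(global_model(x_local), y_local)
    grads = torch.autograd.grad(loss, global_model.parameters())
    return [g.detach() for g in grads]

# One communication round with two participants (hypothetical client names)
clients = {
    "hospital_a": (torch.randn(8, 10), torch.randint(0, 2, (8,))),
    "hospital_b": (torch.randn(8, 10), torch.randint(0, 2, (8,))),
}
client_grads = []
for cid, (x, y) in clients.items():
    grads = client_update(x, y)
    observed_grads[cid] = grads              # the attacker's view at time t
    client_grads.append(grads)

# Server aggregates the gradients and updates the global model
lr = 0.1
with torch.no_grad():
    for param, *grads in zip(global_model.parameters(), *client_grads):
        param -= lr * torch.stack(grads).mean(dim=0)
```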

Federated learning is favourable in AI for health care since it keeps training data private: only gradients leave each institution. However, GradInv poses a challenge to this security guarantee. The Vision Transformer, one of the state-of-the-art vision models, has been reported to be even more susceptible to Gradient Inversion attacks than convolutional neural networks, the previously dominant architecture in computer vision [2].

Methods to protect private training data are categorised into three approaches:

  • Obscuration of the original data: directly protect the raw data before training, for example by mixing inputs as in mixup [3]. The expectation is that the private input is difficult to reconstruct while model utility does not degrade too much.
  • Improvement of training models: investigate which modifications to the network architecture can defend against GradInv attacks, e.g. adding dropout layers or changing the number of filters in convolutional layers. Increasing the number of local training iterations before sharing also protects the training data better.
  • Protection of shared gradients: protect the gradients themselves before they are shared, via cryptography-based methods, gradient perturbation, and compression-based methods (a minimal sketch of gradient perturbation follows this list).
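
As an illustration of the third category, here is a minimal sketch of gradient perturbation: clip the overall gradient norm and add Gaussian noise before sharing, in the spirit of differentially private SGD. The clipping threshold and noise scale are illustrative assumptions, not recommended values.

```python
import torch

def perturb_gradients(grads, clip_norm=1.0, noise_std=0.01):
    """Clip the overall gradient norm, then add Gaussian noise to every tensor."""
    total_norm = torch.sqrt(sum((g ** 2).sum() for g in grads))
    scale = torch.clamp(clip_norm / (total_norm + 1e-12), max=1.0)
    return [g * scale + noise_std * torch.randn_like(g) for g in grads]

# Usage: protect the locally computed gradients before sharing them
# grads = [g.detach() for g in torch.autograd.grad(loss, model.parameters())]
# shared_grads = perturb_gradients(grads)
```

More noise makes reconstruction harder but also hurts accuracy, so choosing the noise scale is a privacy/utility trade-off.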

Refer to the papers in the References for more details.

References:

[1] Zhang, R., Guo, S., Wang, J., Xie, X., & Tao, D. (2022). A survey on gradient inversion: Attacks, defenses and future directions. arXiv preprint arXiv:2206.07284.

[2] Hatamizadeh, A., Yin, H., Roth, H. R., Li, W., Kautz, J., Xu, D., & Molchanov, P. (2022). GradViT: Gradient inversion of vision transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 10021-10030).

[3] Zhang, H., Cisse, M., Dauphin, Y. N., & Lopez-Paz, D. (2017). mixup: Beyond empirical risk minimization. arXiv preprint arXiv:1710.09412.