The Internet of Things (IoT) is permeating our daily lives through a multitude of intelligent services and applications empowered by Artificial Intelligence (AI). Traditionally, AI techniques require centralized data collection and processing, which may not be feasible in realistic application scenarios due to the sheer scale of modern IoT networks. The main issue with centralized data collection is that it can expose individuals to privacy risks and organizations to legal risks if the data is not properly managed. Federated Learning (FL) has emerged as a distributed, collaborative AI approach that can enable many intelligent IoT applications by allowing AI training to be performed at distributed IoT devices without the need for data sharing. Numerous open-source frameworks implementing FL have been released, e.g., Flower [1], NVIDIA FLARE (NV Flare) [2], and TensorFlow Federated [3]. In this blog we focus on NV Flare, analyzing the privacy-preserving techniques it provides.

NV Flare mainly consists of a main node (the server) and federated nodes (the clients). They communicate continuously: the server broadcasts tasks (e.g., train, validate) to the clients, and after the clients have executed their tasks, they return the results to the server, where they are aggregated. FL certainly solves many privacy problems compared to traditional centralized AI training, but it is still prone to malicious attacks. To reduce the risk of such attacks, NV Flare provides several privacy-preserving mechanisms, such as percentile privacy and homomorphic encryption. These mechanisms are applied as filters when information is sent or received between peers, and are briefly described below.
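To make the round-trip concrete, here is a minimal sketch of one FL round in plain Python/NumPy, with a hook where result filters run before a client's update leaves the device. The function and variable names are our own illustration, not NV Flare's actual API.

```python
import numpy as np

def client_train(weights, local_data):
    # Stand-in for local training; a real client would run SGD
    # on its own dataset (local_data is unused in this toy).
    return weights + 0.01 * np.random.randn(*weights.shape)

def apply_filters(update, filters):
    # Each filter transforms the outgoing update in turn
    # (exclude vars, percentile privacy, SVT noise, encryption, ...).
    for f in filters:
        update = f(update)
    return update

def fl_round(global_weights, clients, result_filters):
    updates = []
    for local_data in clients:
        update = client_train(global_weights, local_data)
        # Filters run before the result leaves the client.
        updates.append(apply_filters(update, result_filters))
    # Server-side aggregation (simple FedAvg).
    return np.mean(updates, axis=0)

global_w = fl_round(np.zeros(10), clients=[None] * 3, result_filters=[])
print(global_w)
```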

  • Exclude Vars

The first filter is “Exclude Vars”, whose behavior depends on its input. If the input is a list of variable/layer names, only the specified variables are excluded. If the input is a string, it is converted into a regular expression, and only the matching variables are excluded.
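As an illustration, the sketch below reproduces both behaviors on a dictionary of named weights. It is a plain-Python approximation of the idea (the exact matching rule is an assumption), not NV Flare's implementation.

```python
import re

def exclude_vars(weights, spec):
    """Drop variables by exact name (list input) or by regex (string input)."""
    if isinstance(spec, list):
        return {k: v for k, v in weights.items() if k not in spec}
    pattern = re.compile(spec)  # a string input becomes a regular expression
    return {k: v for k, v in weights.items() if not pattern.search(k)}

weights = {"conv1.weight": 1, "conv1.bias": 2, "fc.weight": 3}
print(exclude_vars(weights, ["fc.weight"]))   # list: exact names excluded
print(exclude_vars(weights, r"conv1\..*"))    # string: matching names excluded
```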

  • Percentile Privacy

The second filter supported by NV Flare is “Percentile Privacy”. This filter is based on the “largest percentile to share” privacy-preserving policy presented by Shokri and Shmatikov [4]. The main idea is that participants train independently on their own datasets and share only small subsets of their models’ parameters during training. The number of shared parameters depends on the filter’s percentile variable, which acts as a threshold. Using the “Percentile Privacy” filter, a client can thus control the fraction of parameters it is willing to share.
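A minimal sketch of the underlying policy: keep only the parameter updates whose magnitude is at or above the chosen percentile, zeroing out the rest. NV Flare's filter differs in its exact selection and clipping details; this only illustrates the idea.

```python
import numpy as np

def percentile_privacy(delta_w, percentile=90):
    # Share only the largest updates: values whose magnitude falls
    # below the chosen percentile of |delta_w| are zeroed out.
    threshold = np.percentile(np.abs(delta_w), percentile)
    return np.where(np.abs(delta_w) >= threshold, delta_w, 0.0)

delta_w = np.random.randn(1000)            # local model update
shared = percentile_privacy(delta_w, 90)   # roughly 10% of values survive
print(np.count_nonzero(shared))
```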

  • SVT Privacy

Another supported filter is a differential privacy method, which NV Flare provides through the Sparse Vector Technique (SVT) [5]. This filter applies a fundamental method for satisfying differential privacy by adding noise to the ML model weights. SVT takes a sequence of queries and a certain threshold T, and outputs a vector in {⊥, ⊤}^ℓ, where ℓ is the number of queries answered; ⊤ specifies that the corresponding query answer is above the threshold, while ⊥ indicates that it is below [6]. After identifying the meaningful queries in this way, the algorithm adds standard differentially private noise drawn from the Laplace distribution.
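The toy sketch below illustrates the SVT idea applied to weight values: each |weight| (the “query”) is compared against a noisy threshold, and the values selected as meaningful are released with Laplace noise. The privacy-budget accounting here is deliberately simplified and the parameters do not match NV Flare's filter.

```python
import numpy as np

def svt_privacy(weights, tau, epsilon=0.1, sensitivity=1.0):
    # Toy SVT: compare each |weight| against a noisy threshold and
    # release the selected weights with fresh Laplace noise.
    # Budget handling is simplified for illustration only.
    rng = np.random.default_rng()
    noisy_tau = tau + rng.laplace(0.0, 2.0 * sensitivity / epsilon)
    out = np.zeros_like(weights)
    for i, w in enumerate(weights):
        query = abs(w) + rng.laplace(0.0, 4.0 * sensitivity / epsilon)
        if query >= noisy_tau:                        # the ⊤ answer
            out[i] = w + rng.laplace(0.0, sensitivity / epsilon)
        # otherwise ⊥: the weight stays suppressed at zero
    return out

print(svt_privacy(np.random.randn(20), tau=0.5))
```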

  • Homomorphic Encryption

Homomorphic encryption (HE) is also available in NV Flare as a privacy-preserving option. With HE, the clients receive keys with which they homomorphically encrypt their model updates before sending them to the server. The server does not hold a decryption key; it only sees the encrypted model updates, yet it can still aggregate these encrypted weights. Once the weights are aggregated, the server sends the updated model back to the clients, which can decrypt the model weights because they hold the keys.
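NV Flare's HE support builds on the TenSEAL library; the snippet below sketches only the cryptographic flow with TenSEAL's CKKS scheme (clients encrypt, the key-less server adds ciphertexts, a client decrypts), not NV Flare's actual filter code. In a real deployment the server would receive a public copy of the context without the secret key.

```python
import tenseal as ts

# Client-side context: holds the secret key; CKKS supports addition
# (and other operations) directly on encrypted vectors.
ctx = ts.context(ts.SCHEME_TYPE.CKKS, poly_modulus_degree=8192,
                 coeff_mod_bit_sizes=[60, 40, 40, 60])
ctx.global_scale = 2 ** 40

# Two clients encrypt their model updates.
update_a = ts.ckks_vector(ctx, [0.1, 0.2, 0.3])
update_b = ts.ckks_vector(ctx, [0.3, 0.0, -0.1])

# The server only ever handles ciphertexts and can still add them.
encrypted_sum = update_a + update_b

# Back on a client, the secret key decrypts the aggregate;
# dividing by the client count gives the average update.
print([x / 2 for x in encrypted_sum.decrypt()])  # ~[0.2, 0.1, 0.1]
```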

In some cases, using the mechanisms mentioned above involves a trade-off between privacy and model performance. In the case of SVT privacy, for instance, adding noise to the ML model weights perturbs them and can therefore degrade performance. It is also possible to combine two privacy-preserving techniques at the same time: for example, “SVT Privacy” can be combined with “Homomorphic Encryption”, where the first adds differential privacy to the weights and the second homomorphically encrypts the model, as sketched below.
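Conceptually, combining two filters is just composition on the outgoing update: each filter transforms the output of the previous one. The sketch below chains a simplified stand-in for the SVT filter (the hypothetical svt_noise) with TenSEAL encryption, under the same assumptions as the earlier snippets.

```python
import numpy as np
import tenseal as ts

def svt_noise(update, epsilon=0.1):
    # Simplified stand-in for the SVT filter: Laplace noise on weights.
    rng = np.random.default_rng()
    return update + rng.laplace(0.0, 1.0 / epsilon, size=update.shape)

ctx = ts.context(ts.SCHEME_TYPE.CKKS, poly_modulus_degree=8192,
                 coeff_mod_bit_sizes=[60, 40, 40, 60])
ctx.global_scale = 2 ** 40

update = np.random.randn(8)
# Filters applied in order: differential privacy first, then
# homomorphic encryption of the already-noised update.
noised = svt_noise(update)
encrypted = ts.ckks_vector(ctx, noised.tolist())
print(encrypted.decrypt())  # the server would never see this plaintext
```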

FL is a distributed AI approach that has sparked great interest as a way to realize privacy-enhancing and scalable IoT services and applications. Malicious attacks have prompted many researchers to investigate methods to prevent them. NV Flare, as described above, provides four different privacy-preserving mechanisms to avert such attacks and to support researchers in protecting data privacy across a multitude of application scenarios.


References

[1] Flower, “A friendly federated learning framework,” [Online]. Available: https://flower.dev/. [Accessed 7 Nov. 2022].
[2] NVIDIA Developer, “NVIDIA FLARE,” [Online]. Available: https://developer.nvidia.com/flare. [Accessed 7 Nov. 2022].
[3] TensorFlow, “TensorFlow Federated,” [Online]. Available: https://www.tensorflow.org/federated. [Accessed 7 Nov. 2022].
[4] R. Shokri and V. Shmatikov, “Privacy-preserving deep learning,” in Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, 2015.
[5] C. Dwork, M. Naor, O. Reingold, G. N. Rothblum, and S. Vadhan, “On the complexity of differentially private data release: efficient algorithms and hardness results,” in Proceedings of the 41st Annual ACM Symposium on Theory of Computing, 2009, pp. 381–390.
[6] M. Lyu, D. Su, and N. Li, “Understanding the sparse vector technique for differential privacy,” arXiv preprint arXiv:1603.01699, 2016.