On 18 December 2024, the European Data Protection Board ("EDPB") released its Opinion 28/2024 on the processing of personal data in the context of AI models in response to a request submitted by the Irish Data Protection Authority under Article 64(2) of the GDPR (the "Opinion").
This Opinion examines the application of key GDPR definitions and principles to AI models during their development and deployment phases. It focuses on three main topics: the processing of personal data in the context of AI models and their potential anonymous nature, the appropriateness of legitimate interest as a legal basis for processing, and the impact of unlawful processing during the development of an AI model on the lawfulness of the subsequent processing or operation of that AI model.
- Development and deployment of AI models
The EDPB distinguishes between the development and deployment of an AI model:
- The development of an AI model covers "all stages before any deployment of the AI model, and includes, inter alia, code development, collection of training personal data, pre-processing of training personal data, and training".
- The deployment of an AI model covers "all stages relating to the use of an AI model and may include any operations conducted after the development phase".
- Criteria for an AI model to be considered anonymous
The manner in which the concept of "personal data" applies to AI model training is a key topic.
The EDPB acknowledges that, even where AI models are not intentionally designed to produce personal data, such data may still be absorbed into the model's parameters in the form of mathematical objects. While these objects differ from the original training data points, they may still retain the original information, which can potentially be extracted or otherwise obtained, directly or indirectly, from the model.
To determine whether an AI model is anonymous, two criteria must be considered: (i) the likelihood of directly extracting personal data about individuals whose data was used to develop the model, and (ii) the likelihood of obtaining such personal data, intentionally or not, from queries addressed to the model.
In this respect, to demonstrate the anonymity of an AI model, controllers must:
- Implement anonymization techniques to ensure it is not possible to single out, link or infer information from the supposedly anonymous dataset (a minimal "singling out" check is sketched after this list).
- Consider all the means reasonably likely to be used by the controller or another person to identify individuals, including, inter alia, the characteristics of the training data, the training process, the deployment context and any additional information enabling identification, as well as the cost of and time required for identification and the technology available at the time of the processing.
- Evaluate the risk of identification by the controller and third parties, including in the event of unauthorized access by third parties.
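For illustration only, the sketch below shows what a basic "singling out" check on a tabular training dataset might look like, in the spirit of k-anonymity. The column names, records and threshold are hypothetical assumptions; the Opinion itself prescribes no particular technique.

```python
from collections import Counter

def k_anonymous(records: list[dict], quasi_identifiers: list[str], k: int = 5) -> bool:
    """Return True if every combination of quasi-identifier values is shared
    by at least k records, i.e. no individual is singled out below that
    threshold. Assumes a non-empty, tabular dataset."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values()) >= k

# Hypothetical records; "zip" and "birth_year" act as quasi-identifiers.
records = [
    {"zip": "1000", "birth_year": 1980, "diagnosis": "A"},
    {"zip": "1000", "birth_year": 1980, "diagnosis": "B"},
    {"zip": "9000", "birth_year": 1975, "diagnosis": "C"},  # unique -> singled out
]
print(k_anonymous(records, ["zip", "birth_year"], k=2))  # False
```

A failure of such a check indicates that individuals remain identifiable in the dataset itself, before any model-level assessment is even made.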
Additionally, the EDPB provides a non-exhaustive list of elements to be considered by supervisory authorities to assess the residual risk of identification, including the AI model's design, selection of training data sources, data preparation and minimization, methodological choices that may reduce or eliminate identifiability, AI model testing, and resistance to attacks.
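To make the "resistance to attacks" criterion concrete, a minimal sketch of a regurgitation probe follows: the model is queried and its outputs are scanned for personal data known to appear in the training set. The model interface, prompts and personal-data strings are all hypothetical assumptions, and a real assessment would rely on far stronger attacks (e.g. membership inference).

```python
# Personal data known (hypothetically) to appear in the training corpus.
TRAINING_PII = ["jane.doe@example.com", "+32 470 12 34 56"]

def leaks_training_pii(model, prompts: list[str]) -> list[tuple[str, str]]:
    """Return (prompt, leaked value) pairs where the model's output
    reproduces a personal-data string from the training set verbatim."""
    leaks = []
    for prompt in prompts:
        output = model.generate(prompt)  # assumed text-generation interface
        leaks.extend((prompt, v) for v in TRAINING_PII if v in output)
    return leaks

class StubModel:
    """Stand-in for a real model; deliberately regurgitates training data."""
    def generate(self, prompt: str) -> str:
        return "You can reach her at jane.doe@example.com."

print(leaks_training_pii(StubModel(), ["Who is Jane Doe?"]))
# -> [('Who is Jane Doe?', 'jane.doe@example.com')]
```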
- Legitimate interest as a lawful basis
The EDPB confirms that legitimate interest may be relied upon in both the development and deployment phases of AI models, provided that three cumulative conditions are met:
- Existence of a legitimate interest: The controller or a third party must demonstrate a legitimate interest, which must be lawful, clearly and precisely articulated, and real and present. Examples include developing a conversational agent to assist users, developing an AI system to detect fraudulent content or behaviour, and improving threat detection in an information system.
- Necessity of the processing: The processing must be necessary to pursue the legitimate interest, meaning that it must effectively serve that purpose and that no less intrusive means exist. The controller or third party must assess whether the intended volume of personal data is justified and whether the purpose could be achieved without processing personal data.
- Balancing test: The interests of the controller or the third party must not be overridden by the interests, fundamental rights and freedoms of the data subject. The following factors are relevant to this test:
- Data subjects' interests, fundamental rights and freedoms: Interests and risks at stake during the development and deployment of an AI model include (i) the data subject's interest in retaining control over their own personal data, (ii) the risk of self-censorship caused by a sense of surveillance, and (iii) risks to individuals' fundamental rights and freedoms (e.g. where job applicants are pre-selected using an AI model).
- Impact of the processing on data subjects: The impact of the processing on data subjects may be influenced by (i) the nature of the data processed by the model (e.g. highly private information such as financial data); (ii) the context of the processing (e.g. the way in which the AI model was developed or will be deployed); and (iii) the further consequences that the processing may have (e.g. the risks to fundamental rights).
- Reasonable expectations of data subjects: The EDPB highlights the importance of informing the data subjects. In this regard, the controller or third party must consider the wider context of the processing, including whether the personal data was made publicly available, the nature of the relationship between the data subject and the controller, the sources of the data used and the potential further uses of the AI model.
- Mitigating measures: Various mitigating measures may be implemented during the development and deployment phases, such as technical measures, measures facilitating the exercise of individuals' rights and transparency measures. Specific measures may be applied in the context of web scraping, e.g. excluding certain data categories, sources or websites (a minimal source-exclusion filter is sketched below).
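As an illustration of the web-scraping mitigations mentioned above, the sketch below filters candidate URLs against a domain exclusion list. The domains and helper name are hypothetical assumptions; which sources to exclude is a case-by-case legal judgment, not something this snippet decides.

```python
from urllib.parse import urlparse

# Hypothetical exclusion list, e.g. sources likely to contain sensitive data.
EXCLUDED_DOMAINS = {"forum.example.org", "health.example.com"}

def is_collectable(url: str) -> bool:
    """Return False for URLs whose host matches (or is a subdomain of)
    an excluded domain."""
    host = urlparse(url).netloc.lower()
    return not any(host == d or host.endswith("." + d) for d in EXCLUDED_DOMAINS)

urls = [
    "https://news.example.net/article/1",
    "https://health.example.com/patient-stories",
]
print([u for u in urls if is_collectable(u)])  # keeps only the news URL
```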
- Impact of unlawful processing in the development of an AI model on the lawfulness of subsequent processing or operation
The Opinion distinguishes three scenarios:
- A controller unlawfully processes personal data in the development phase, and that personal data is retained in the model and subsequently processed by the same controller: In such a case, the unlawful processing in the development phase may negatively impact the data subjects, in particular where they do not expect the subsequent processing.
- A controller unlawfully processes personal data in the development phase, and that personal data is retained in the model and subsequently processed by another controller: Each controller must ensure the lawfulness of its own processing and must be able to demonstrate that the AI model was not developed through the unlawful processing of personal data.
- A controller unlawfully processes personal data to develop an AI model, then ensures that the model is anonymized before personal data is further processed during its deployment: In this scenario, the GDPR is unlikely to apply to the operation of the anonymized model, as it no longer entails the processing of personal data. Consequently, the unlawfulness of the initial processing is unlikely to have a negative impact on the subsequent operation of the model.
- Implications for businesses
The Opinion will guide national data protection authorities in applying GDPR provisions during audits and investigations. Businesses developing AI models, or using AI models developed by others in their products or services, should consider the following measures:
- Evaluate anonymity: Assess whether AI models previously considered anonymous meet the anonymity standards set out in the Opinion and implement any necessary anonymization measures.
- Review Legitimate Interest Assessments ("LIAs"): Reevaluate existing LIAs and create new ones if needed. Determine if any mitigating measures should be implemented to satisfy the balancing test.
- Assess unlawful processing: Evaluate the potential impact of any unlawful processing during AI model development on subsequent operations and implement remedial measures where necessary.
Certain AI models will also fall under the new EU AI Act, which will be enforced by national competent authorities designated by EU Member States, or by the AI Office for general-purpose AI models. Businesses are encouraged to adopt a unified compliance strategy to meet the requirements of both the GDPR and the AI Act.
If you have any further questions about the regulation of AI models under the GDPR or the AI Act, please reach out to the Fieldfisher Tech & Data team.