Bridging the Gap: AI and Privacy Legislation in Perspective
By Shradhanjali Sarma and Shatakshi Shekhar
Introduction
Artificial intelligence (AI) has attracted unparalleled attention and become the focal point of widespread discourse and analysis. One aspect that has drawn particular scrutiny is its implications for data protection and privacy. As AI rapidly integrates into diverse sectors, understanding the intricate relationship between AI and privacy has become increasingly significant.
Existing privacy legislation across different jurisdictions addresses the interplay of AI with privacy; nevertheless, the discussion remains in its early stages. The General Data Protection Regulation (GDPR) addresses automated decision-making. Similarly, in the US, the concept of automated decision-making has been incorporated into the privacy laws of several states, including California, Colorado, Connecticut and Virginia.
In order to understand how to regulate AI and privacy, it is important to understand how AI models work. AI today is not limited to robots; it encompasses a broad range of related technologies. In the past, AI was understood in relation to robots and supercomputers, and it was only in 1955 that the term “artificial intelligence” was coined. Since then, the field has progressed rapidly, especially following OpenAI’s release of ChatGPT in late 2022. Comprehending the intricate workings and nuances of these emerging models is crucial for understanding the interplay between artificial intelligence and privacy.
Automation and Privacy
The opaque nature of AI poses challenges in comprehending how data collected by the data controller influences the output. AI models are recognized for their capacity to derive new data from the input data points. For instance, in the context of a resume screening AI model, basic data points like name, address, and date of birth are provided. However, the AI model goes beyond these known data points to generate entirely new sets of information for resume assessment. Although individuals may have consented to the use of their personal data for this purpose, they did not explicitly agree to the creation of new information through data generation. The inherent complexity of AI models makes it difficult to gauge the impact of such data generation on individuals’ privacy.
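The resume-screening example above can be sketched in a few lines. The code below is a hypothetical illustration, not any real screening product: it shows how a feature pipeline can derive attributes (an inferred age, a crude location proxy) that the applicant supplied no consent for and never provided directly.

```python
from datetime import date

# Hypothetical illustration: a toy screening pipeline that derives
# new attributes the applicant never supplied.
def derive_features(applicant: dict) -> dict:
    """Infer new data points from the basic fields provided."""
    dob = date.fromisoformat(applicant["date_of_birth"])
    today = date(2024, 1, 1)  # fixed reference date for reproducibility
    return {
        # Age is generated, not collected: the applicant consented to
        # sharing a date of birth, not to an age-based feature.
        "inferred_age": today.year - dob.year
            - ((today.month, today.day) < (dob.month, dob.day)),
        # A crude proxy derived from the address string -- the kind of
        # inference that can encode socioeconomic or regional signals.
        "inferred_region": applicant["address"].split(",")[-1].strip(),
    }

applicant = {
    "name": "A. Candidate",
    "address": "12 Example Street, Springfield",
    "date_of_birth": "1990-06-15",
}
print(derive_features(applicant))
# {'inferred_age': 33, 'inferred_region': 'Springfield'}
```

Even in this deliberately simple sketch, the output dataset differs from the input dataset, which is the crux of the consent problem described above: real models derive far subtler and less inspectable features.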
No legislation prevents automation through AI or the collection of data by AI systems. However, the nature of AI makes it difficult to apply privacy and data protection principles. While these principles are designed to give individuals control over their data, extending that control to AI-generated data is challenging.
The concept of control outlined in data protection legislation, such as the General Data Protection Regulation (GDPR), grants rights to data subjects, including access, rectification, erasure, and data portability. This provision places the responsibility for protecting these rights squarely on the shoulders of data subjects. To effectively manage their data privacy, individuals must be well-informed and equipped to comprehend how their data is collected and utilised. However, privacy policies are often lengthy, intricate, and challenging to grasp, leading many data subjects to skip reading them altogether and consequently remain unaware of how their data is processed.
This challenge is compounded when artificial intelligence (AI) is employed for data collection, as the opaque nature of AI algorithms presents a significant barrier to understanding how data is utilised. In such cases, data subjects find it even more challenging to comprehend the mechanisms by which AI processes their data due to the "black box problem" associated with AI systems. This opacity also makes it difficult for data subjects to gauge the extent of harm that could result.
For someone without technical expertise, understanding the intricacies of an algorithm can be challenging, making it hard for them to assess the potential harm caused by the collection and disclosure of their personal data.
Data Subject Access Requests under Privacy Legislation
As previously discussed, data subjects have the right under the GDPR to access their personal data collected by businesses, exercised through Data Subject Access Requests (DSARs). DSARs illustrate well the challenges that AI brings to the current privacy legislation ecosystem.
According to Article 15 of the GDPR, a data subject can request a data controller to provide them with access to various information, including the purposes of the processing, categories of personal data involved, recipients or categories of recipients to whom the data will be disclosed, duration of data storage, rights to rectify or erase personal data, the right to lodge a complaint, the existence of automated decision-making, and the logic behind it, as well as the potential consequences for the data subject.
Under Article 12 of the GDPR, data controllers are obligated to communicate information related to DSARs clearly and in plain language. This information must be provided within one month, and if there's a delay, the reasons must be communicated to the data subjects. If the controller doesn't take action on the request, they must inform the data subject promptly, explaining the reasons and the option to lodge a complaint or seek judicial remedy. Additionally, this information must be provided free of charge, although the controller may charge a fee for repetitive requests, provided they can demonstrate the request's excessive or unfounded nature.
Similar to GDPR, the California Consumer Privacy Act (CCPA) and its follow-up, the California Privacy Rights Act (CPRA), provide consumers with similar rights regarding their personal information. Under the CCPA, consumers have the right to request that businesses disclose certain information. This includes the categories of personal information collected, the sources from which it was collected, the business or commercial purpose for collecting, selling, or sharing the information, the categories of third parties with whom the information is shared, and the specific pieces of personal information collected about the consumer.
While these privacy laws grant data subjects the right to access their personal data, they also present certain challenges. Firstly, in the case of personal data collected through AI, it becomes difficult for the data controller to provide precise information. This is due to the nature of AI, where collected data is used to generate new data through inference, resulting in a dataset that may differ significantly from the original data provided by the data subject. This phenomenon, described by Daniel Solove as the 'aggregation effect,' sees small pieces of data combined to produce a substantial amount of new data, often exceeding the initial purpose for which they were collected. While entities and businesses obtain consent through privacy notices for data 'collection and usage', those notices rarely mention data 'generation'. Data generation based on inferences defeats the principle of 'data minimization', one of the foundations of data protection legislation.
Secondly, DSARs necessitate that the data controller elucidate the rationale behind decisions made. While this is relatively straightforward for manual decision-making processes, it becomes challenging when decisions are automated by artificial intelligence (AI). As previously discussed, algorithms are intricate, and their opaque nature makes it arduous to comprehend how a specific decision was reached. In an era where AI increasingly governs decision-making processes, fulfilling such a right for data subjects becomes practically unattainable.
Road Ahead
As the influence of AI expands, it becomes imperative to reassess how privacy legislation addresses automated decision-making processes. In addition to granting rights to data subjects, it would be advantageous to subject AI systems to risk assessments. Such assessments would facilitate an understanding of the potential risks associated with data collection and generation, allowing for the implementation of appropriate measures to mitigate those risks. This could be complemented by robust data auditing practices. Furthermore, adopting a regulatory framework tailored to the evolving challenges posed by AI would be beneficial. This approach, coupled with transparent explanations of AI models, would enable data controllers to effectively manage DSAR inquiries and strategically plan their responses.