On 19 June 2025, the French Data Protection Authority (CNIL) adopted guidance setting out the obligations of data controllers that collect personal data through web scraping, particularly when they rely on legitimate interest as the legal basis for developing Artificial Intelligence (AI) systems. The guidance applies to organisations that harvest publicly accessible data and emphasises the need to minimise the impact on individuals’ rights under the General Data Protection Regulation (GDPR). It highlights mandatory safeguards: defining collection criteria in advance, excluding sensitive or unnecessary data, respecting technical and legal opposition to scraping, avoiding data from vulnerable populations or private contexts, and ensuring transparency and a means for individuals to object. Additional measures include pseudonymisation, anonymisation, and preventing inappropriate cross-referencing of identifiers. The guidance also urges controllers to assess whether such processing aligns with individuals’ reasonable expectations and to ensure compliance with other applicable laws, including copyright.
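Two of the safeguards described above, respecting technical opposition to scraping and pseudonymising identifiers, can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not a method prescribed by the CNIL: `robots.txt` is taken here as one example of a technical opposition signal, and the crawler name, sample rules, URLs, and the helper functions `may_fetch` and `pseudonymise` are all hypothetical. Note that keyed hashing is pseudonymisation, not anonymisation, because whoever holds the key can recompute the mapping.

```python
import hashlib
import hmac
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would fetch it from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /members/
"""

USER_AGENT = "example-ai-crawler"  # hypothetical crawler name


def may_fetch(url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True only if the given robots.txt rules allow this URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(USER_AGENT, url)


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed SHA-256 digest (HMAC)."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


if __name__ == "__main__":
    print(may_fetch("https://example.org/blog/post"))
    print(may_fetch("https://example.org/members/alice"))
    print(pseudonymise("alice@example.org", b"keep-this-key-secret"))
```

A keyed digest is used rather than a plain hash so that identifiers cannot be recovered by hashing guessed values without the key. Using a separate key per dataset is one way to limit cross-referencing of identifiers across datasets.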