On 20 May 2024, Assembly Bill No. 2013, "Generative artificial intelligence: training data transparency", was passed by the California State Assembly and forwarded to the Senate. The Bill defines generative AI as artificial intelligence that can create synthetic content, such as text, images, video, and audio, based on its training data. It mandates that, by 1 January 2026, developers of generative AI systems or services released after 1 January 2022 publish documentation on their websites detailing the data used to train those systems. The same obligation applies each time a generative AI system or service is publicly released after that date. The documentation would need to include a high-level summary of the datasets, covering their sources, purposes, and data points, and whether they contain any copyrighted or personal information. It would also need to state whether synthetic data generation was used. Certain AI systems would be exempt from these requirements, including those designed solely for security and integrity.