Does the General-Purpose AI Code of Practice (GPAI CoP) require Data Governance?
European Union • enforcing
Yes — 1 provision
Requirements at a glance
This regulation imposes 5 specific requirements for Data Governance across 1 provision:
- Copyright compliance policy — Implement and maintain a policy for compliance with EU copyright law throughout the training data pipeline
- Robots.txt compliance — Honor robots.txt opt-out protocols when crawling data for training
- Infringing output prevention — Establish mechanisms to prevent generation of copyright-infringing outputs
- Complaint mechanism — Create a complaint mechanism for rights holders regarding copyright infringements
- Training data disclosure — Publicly disclose a summary of training data used, including data sources and characteristics
Training Data and Copyright Governance (Article 53)
All GPAI providers must implement copyright-compliant training data policies, including robots.txt compliance, mechanisms to prevent infringing outputs, a complaint mechanism for rights holders, and public training data disclosure. This applies to every foundation model provider operating in or serving the EU, wherever the training itself takes place, which makes EU copyright law a de facto data governance standard for global AI training pipelines.
Requirements
| Requirement | Details |
|---|---|
| Copyright compliance policy | Implement and maintain a policy for compliance with EU copyright law throughout the training data pipeline |
| Robots.txt compliance | Honor robots.txt opt-out protocols when crawling data for training |
| Infringing output prevention | Establish mechanisms to prevent generation of copyright-infringing outputs |
| Complaint mechanism | Create a complaint mechanism for rights holders regarding copyright infringements |
| Training data disclosure | Publicly disclose a summary of training data used, including data sources and characteristics |
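As an illustration of the robots.txt requirement, the following is a minimal sketch of a pre-crawl check built on Python's standard-library `urllib.robotparser`. It assumes the site's robots.txt has already been downloaded as text. The crawler name `ExampleTrainingBot`, the helper `may_crawl`, and the sample rules are hypothetical. The sketch covers only the opt-out check, not a full compliance policy.

```python
from urllib.robotparser import RobotFileParser


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt: the site opts out of one training crawler entirely
# and blocks /private/ for every other crawler.
EXAMPLE_ROBOTS_TXT = """\
User-agent: ExampleTrainingBot
Disallow: /

User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for agent, url in [
        ("ExampleTrainingBot", "https://example.com/articles/1"),
        ("OtherBot", "https://example.com/articles/1"),
        ("OtherBot", "https://example.com/private/report"),
    ]:
        allowed = may_crawl(EXAMPLE_ROBOTS_TXT, agent, url)
        print(f"{agent} -> {url}: {'allowed' if allowed else 'skip'}")
```

In a real pipeline the check would run before each fetch, and skipped URLs could be logged to support the copyright compliance policy.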
Penalties
| Violation | Fine |
|---|---|
| AI Act Article 53 infringement | Up to €15 million or 3% of total worldwide annual turnover in the preceding financial year (whichever is higher) |
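The "whichever is higher" rule means the €15 million figure applies whenever 3% of turnover falls below it, which is the case for turnover under €500 million. The short sketch below works through that arithmetic. The function name `max_fine_eur` and the turnover figures are illustrative assumptions. The result is the statutory ceiling, not a prediction of any actual fine.

```python
FIXED_AMOUNT_EUR = 15_000_000   # fixed amount from the table above
TURNOVER_PERCENT = 3            # percentage of worldwide annual turnover


def max_fine_eur(worldwide_annual_turnover_eur: int) -> float:
    """Return the upper bound of the fine: the higher of the two amounts."""
    turnover_based = worldwide_annual_turnover_eur * TURNOVER_PERCENT / 100
    return max(FIXED_AMOUNT_EUR, turnover_based)


if __name__ == "__main__":
    # Hypothetical turnover figures, for illustration only.
    for turnover in (100_000_000, 500_000_000, 2_000_000_000):
        print(f"turnover EUR {turnover:,}: ceiling EUR {max_fine_eur(turnover):,.0f}")
```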