Data Governance Plan (EU AI Act Art. 10)
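Legal basis: Art. 10 (data and data governance for high-risk AI systems), with GDPR interface controls and documentation links to Art. 9 (risk management), Art. 11 (technical documentation), Art. 12 (record-keeping), and Art. 15 (accuracy, robustness and cybersecurity) where relevant.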
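Objective: Ensure training, validation, and testing datasets are relevant, sufficiently representative, and, to the best extent possible, free of errors and complete in view of the intended purpose, and that they are governed end-to-end.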
1) Governance Foundation
- Plan owner: [Role]
- Data steward(s): [Roles]
- Scope (systems/datasets): [List]
- Review frequency: [Monthly/Quarterly]
- Change control process: [Link]
2) Dataset Inventory and Provenance
- Dataset ID/name
- Source (internal/vendor/public)
- Collection method
- License/usage restrictions
- Geography and population coverage
- Provenance evidence location
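The inventory fields above can be held as one structured record per dataset, so that blank entries are caught before a dataset is approved for use. The sketch below is illustrative only: the class name `DatasetRecord`, the helper `inventory_gaps`, and the allowed source values are assumptions made for this example, not part of any standard or required schema.

```python
from dataclasses import dataclass, fields

# Assumption: the three source types listed in the template are the only valid ones.
ALLOWED_SOURCES = {"internal", "vendor", "public"}


@dataclass
class DatasetRecord:
    """One inventory entry; field names are hypothetical and mirror the bullets above."""
    dataset_id: str
    source: str               # "internal", "vendor", or "public"
    collection_method: str
    license_terms: str
    coverage: str             # geography and population coverage
    provenance_evidence: str  # where the provenance evidence is stored


def inventory_gaps(record: DatasetRecord) -> list:
    """Return the names of fields that are blank or hold an unrecognised value."""
    gaps = [f.name for f in fields(record) if not getattr(record, f.name).strip()]
    # A non-blank source that is not in the allowed set is also a gap.
    if record.source.strip() and record.source not in ALLOWED_SOURCES:
        gaps.append("source")
    return gaps


record = DatasetRecord(
    dataset_id="ds-001",
    source="vendor",
    collection_method="API export",
    license_terms="CC BY 4.0",
    coverage="EU-27 adults",
    provenance_evidence="",  # missing on purpose
)
print(inventory_gaps(record))
```

A non-empty result can be used as a gate: the dataset stays out of scope until every listed gap has an owner and a fix.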
3) Data Quality Controls
- Completeness thresholds
- Consistency checks
- Label quality validation
- Missing value handling policy
- Outlier detection strategy
- Data drift indicators
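Two of the controls above, completeness thresholds and outlier detection, can be expressed as small checks. This is a minimal sketch assuming rows are plain dictionaries; the function names, the 1.5 x IQR rule, and the sample values are illustrative choices, not prescribed by Art. 10.

```python
from statistics import quantiles


def completeness(rows, column):
    """Share of rows in which `column` is present and not None."""
    if not rows:
        return 0.0
    present = sum(1 for row in rows if row.get(column) is not None)
    return present / len(rows)


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the first/third quartile."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


rows = [{"age": 30}, {"age": None}, {"age": 41}, {}]
print(completeness(rows, "age"))                          # 0.5
print(iqr_outliers([10, 11, 12, 13, 14, 15, 16, 100]))    # [100]
```

The measured completeness would then be compared against whatever threshold the plan sets for that column.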
4) Representativeness and Bias Testing
- Target population definition
- Segment coverage matrix
- Fairness metrics used
- Bias testing cadence
- Bias remediation actions
- Residual bias acceptance criteria
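As an illustration of a fairness metric paired with an acceptance criterion, the sketch below computes per-group selection rates and the demographic parity difference (largest rate minus smallest rate). The choice of metric, the group labels, and the 0.10 tolerance are assumptions for this example; the plan should record whichever metrics and limits the organisation has actually adopted.

```python
from collections import defaultdict


def selection_rates(records):
    """records: iterable of (group, predicted_positive) pairs."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, positive in records:
        totals[group] += 1
        positives[group] += int(bool(positive))
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_difference(records):
    """Largest minus smallest selection rate across groups."""
    rates = selection_rates(records)
    if not rates:
        raise ValueError("no records supplied")
    return max(rates.values()) - min(rates.values())


# Hypothetical predictions for two segments, A and B.
records = [
    ("A", True), ("A", True), ("A", False), ("A", False),
    ("B", True), ("B", False), ("B", False), ("B", False),
]
TOLERANCE = 0.10  # hypothetical residual bias acceptance limit

gap = demographic_parity_difference(records)
print(selection_rates(records))   # {'A': 0.5, 'B': 0.25}
print(gap, gap <= TOLERANCE)      # 0.25 False
```

A result above the tolerance would feed the remediation actions listed in this section rather than being accepted as residual bias.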
5) Data Preparation and Lineage
- Preprocessing pipeline steps
- Feature engineering documentation
- Data transformation logs
- Reproducibility controls
- Lineage tracking tooling
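Transformation logs and reproducibility controls can be tied together by recording a content hash of the data before and after each pipeline step. The following is a minimal sketch, assuming the rows are JSON-serialisable; `fingerprint` and `log_step` are hypothetical helper names, and a real pipeline would more likely rely on dedicated lineage tooling.

```python
import hashlib
import json


def fingerprint(rows):
    """SHA-256 over a canonical JSON encoding, so key order does not matter."""
    canonical = json.dumps(rows, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def log_step(log, step_name, rows_in, rows_out, params):
    """Append one lineage entry describing a transformation, then pass the data on."""
    log.append({
        "step": step_name,
        "params": params,
        "input_sha256": fingerprint(rows_in),
        "output_sha256": fingerprint(rows_out),
    })
    return rows_out


raw = [{"id": 1, "age": None}, {"id": 2, "age": 40}]
lineage = []

# Example step: drop rows with a missing age.
cleaned = log_step(
    lineage,
    "drop_missing_age",
    raw,
    [row for row in raw if row["age"] is not None],
    params={"column": "age"},
)
print(json.dumps(lineage, indent=2))
```

Re-running the pipeline on the same input should reproduce the same hashes; a mismatch signals that the data or a transformation has changed.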
6) Access and Security Controls
- Access model (least privilege)
- Authentication/authorization controls
- Encryption at rest/in transit
- Audit logging for access events
- Third-party access controls
7) GDPR Interface
- Personal data categories
- Special category data handling (including any processing for bias detection and correction under Art. 10(5))
- Lawful basis summary
- Data minimisation controls
- Data subject rights workflow
- Retention/deletion schedule
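A retention/deletion schedule can be checked mechanically once each data category has a retention period. The sketch below is an assumption-laden example: the category names, the retention periods in `RETENTION_DAYS`, and the item layout are all hypothetical and would come from the organisation's own retention policy.

```python
from datetime import date, timedelta

# Hypothetical retention periods per data category, in days.
RETENTION_DAYS = {"training_raw": 365, "access_logs": 180}


def deletion_due(collected_on, category):
    """Date on which an item of the given category becomes due for deletion."""
    return collected_on + timedelta(days=RETENTION_DAYS[category])


def overdue(items, today):
    """IDs of items whose deletion date has already passed."""
    return [
        item["id"]
        for item in items
        if deletion_due(item["collected_on"], item["category"]) < today
    ]


items = [
    {"id": "r1", "collected_on": date(2023, 1, 1), "category": "training_raw"},
    {"id": "r2", "collected_on": date(2024, 6, 1), "category": "access_logs"},
]
print(overdue(items, today=date(2024, 7, 1)))   # ['r1']
```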
8) Validation, Monitoring, and Retraining
- Validation dataset refresh cadence
- Monitoring KPIs
- Drift/quality alert thresholds
- Trigger rules for retraining
- Post-retraining validation checklist
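One way to connect drift alert thresholds to retraining triggers is to compute a drift score and map it to an action. The sketch below uses the Population Stability Index (PSI) over pre-binned proportions as an example indicator. The choice of PSI, the function names, and the 0.10 and 0.25 cut-offs (a commonly quoted rule of thumb) are assumptions for illustration; the plan should state its own indicators and limits.

```python
import math


def psi(expected, actual, eps=1e-6):
    """Population Stability Index between two binned distributions (proportions)."""
    if len(expected) != len(actual):
        raise ValueError("both distributions must use the same bins")
    total = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, eps), max(a, eps)  # floor to avoid log(0) on empty bins
        total += (a - e) * math.log(a / e)
    return total


def retraining_decision(score, warn=0.10, retrain=0.25):
    """Map a drift score to an action; the thresholds are hypothetical defaults."""
    if score >= retrain:
        return "retrain"
    if score >= warn:
        return "investigate"
    return "ok"


baseline = [0.25, 0.25, 0.25, 0.25]   # bin shares in the validation data
current = [0.10, 0.20, 0.30, 0.40]    # bin shares observed in production

score = psi(baseline, current)
print(round(score, 3), retraining_decision(score))
```

Logging both the score and the resulting decision gives an audit trail showing why a retraining run was or was not started.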
9) Documentation Artifacts
- Dataset cards
- Labeling guidelines
- Quality reports
- Bias reports
- Access audit reports
- Exception logs
10) Common Mistakes to Avoid
- Using vendor datasets without provenance checks
- Running a one-time bias test with no recurring cadence
- Leaving drift signals unlinked to retraining decisions
- Assigning no owner for data quality remediation
11) Approval and Review
- Prepared by: [Name/Date]
- Reviewed by (Data): [Name/Date]
- Reviewed by (Compliance): [Name/Date]
- Approved by: [Name/Date]
- Next review date: [YYYY-MM-DD]