Hello :) Today is Day 353!
A quick summary of today:
- started reading a book about the EU's AI Act
The AI Engineer’s Guide to Surviving the EU AI Act - Chapter 1 - Understanding the AI Regulations
As people, organizations, and the public sector increasingly rely on AI to drive decision-making, the technology must be trustworthy. The EU AI Act aims to provide a legal framework for developing, deploying, and using AI technologies within the EU, emphasizing safety, transparency, and ethical considerations, and setting specific requirements for AI systems in different risk categories.
The Motivation for the EU AI Act: Trustworthy AI
As AI integrates into critical areas like human resources, finance, and medicine, ensuring trustworthiness in AI systems is essential. Predictive accuracy alone is insufficient. AI must adhere to principles like fairness, privacy, and accountability to be deemed trustworthy. Trustworthy AI, also known as responsible or ethical AI, encompasses technical requirements (like robustness, explainability, and transparency) and ethical requirements (like fairness and privacy).
Grounded in lawfulness, ethics, and robustness, trustworthy AI ensures compliance with regulations, ethical inclusivity, and risk-aware design. The European Commission’s High-Level Expert Group on AI outlined seven requirements for trustworthy AI:
- Human agency and oversight
- Technical robustness and safety
- Privacy and data governance
- Transparency
- Diversity, non-discrimination and fairness
- Societal and environmental well-being
- Accountability
Human Agency and Oversight
Human oversight ensures humans can monitor, evaluate, and intervene in AI systems. It involves governance processes, transparency, interpretability, and control mechanisms, allowing humans to understand and influence AI decisions, including overriding or stopping the system when necessary.
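One way to make this concrete: a minimal human-in-the-loop sketch (my own illustration, not from the book), assuming a hypothetical scikit-learn-style `model`, where low-confidence predictions are escalated to a human reviewer rather than acted on automatically.

```python
# Illustrative human-oversight gate (hypothetical example): predictions below
# a confidence threshold are routed to a human reviewer for intervention.
def decide(model, x, threshold=0.9):
    proba = model.predict_proba([x])[0]   # class probabilities for one input
    confidence = float(proba.max())
    if confidence < threshold:
        return {"action": "escalate_to_human", "confidence": confidence}
    return {"action": "auto_decide", "label": int(proba.argmax()),
            "confidence": confidence}
```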
Technical Robustness and Safety
AI systems need to be accurate, reliable, and reproducible. They also need a fallback plan in case something does not work properly.
The author distinguishes robustness at different levels of an AI system, namely data, algorithms, and the underlying software systems.
In academia, robustness has two main aspects:
- non-adversarial robustness: refers to how well the model performs when it is given corrupted or altered inputs that may not match the original data distribution. It means that the model can handle unintentional image corruptions, distributional shifts, or changes in the data generating process while maintaining its performance
- adversarial robustness: refers to the model’s ability to resist adversarial examples, which are intentionally crafted input perturbations designed to fool the model into making incorrect predictions
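To make the adversarial case concrete, here is a minimal FGSM-style sketch (my own illustration, not from the book) for a logistic-regression model: the input is perturbed along the sign of the loss gradient with respect to the input.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_example(x, y, w, b, eps=0.1):
    """Craft an adversarial input for logistic regression p = sigmoid(w.x + b).

    For binary cross-entropy loss, dL/dx = (p - y) * w, so the fast gradient
    sign method shifts x by eps times the sign of that gradient.
    """
    p = sigmoid(w @ x + b)
    grad = (p - y) * w              # gradient of the loss w.r.t. the input
    return x + eps * np.sign(grad)
```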
We can evaluate the robustness based on certain metrics:
- accuracy
- error rate
- sensitivity
- specificity
We can also evaluate robustness through robustness curves, which show how the model's performance changes as a function of parameters affecting data quality, such as noise level, missing values, outliers, data size, or feature selection.
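As an illustration (my sketch, assuming scikit-learn), a robustness curve can be traced by sweeping the level of Gaussian noise added to the test inputs and recording accuracy at each step:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

# Robustness curve: accuracy as a function of input noise level.
rng = np.random.default_rng(0)
for noise in [0.0, 0.5, 1.0, 2.0, 4.0]:
    X_noisy = X_te + rng.normal(scale=noise, size=X_te.shape)
    acc = accuracy_score(y_te, model.predict(X_noisy))
    print(f"noise={noise:.1f}  accuracy={acc:.3f}")
```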
Privacy and Data Governance
AI systems must ensure full respect for privacy and data protection. Since privacy is a fundamental right within the European Union, it is a prerequisite for building trustworthy AI systems.
Data governance is becoming more important, and its ultimate goal is to enhance the trustworthiness of data. To ensure trust in data, three key aspects of data governance must be addressed: discoverability, security, and accountability.
- discoverability: refers to the availability of the dataset’s metadata, data provenance (lineage), and domain entities glossary; data quality is also crucial for trust
- data security and privacy: about protecting data and ensuring adherence to regulations such as GDPR
- accountability: treat data as a product - a data unit that is valid within a business domain and for which the domain team has clear responsibility
Example metrics to define and track for privacy and data governance include:
- data encryption levels
- access control
- data retention and deletion policies
- data product usage metrics
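A sketch of how such "data as a product" metadata and policies might be recorded (the schema and field names are my own illustration):

```python
from dataclasses import dataclass, field

@dataclass
class DataProduct:
    """Hypothetical governance record for a data product."""
    name: str
    owner_team: str        # accountability: clear domain ownership
    provenance: str        # discoverability: lineage of the data
    retention_days: int    # retention/deletion policy
    encrypted_at_rest: bool  # encryption level
    allowed_roles: list = field(default_factory=list)  # access control

orders = DataProduct(
    name="orders_daily",
    owner_team="sales-domain",
    provenance="CRM export -> cleaning pipeline v2",
    retention_days=365,
    encrypted_at_rest=True,
    allowed_roles=["analyst", "ml-engineer"],
)
```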
Transparency
- explainability
- data transparency
- algorithmic transparency
- governance transparency
- communication transparency
Diversity, Non-Discrimination and Fairness
AI systems must mitigate unfair bias. Addressing potential bias should happen at every stage of the AI system development. The EU AI Act ensures that every entity along the AI system's value chain, from producer to deployer, is responsible for providing users with a fair and ethical experience.
There are several metrics used to evaluate fairness in ML systems:
- demographic parity (measures the difference in the probability of receiving a positive outcome between different demographic groups)
- equal opportunity (measures whether the model has an equal chance of correctly predicting positive outcomes for individuals from different groups who actually deserve a positive outcome)
- equalised odds (ensures that the model not only has equal opportunity but also equal mistreatment across groups)
- predictive parity (evaluates whether the precision of the model is consistent for all groups)
- individual fairness (ensures that similar individuals are treated similarly by the model, based on a predefined similarity metric)
- counterfactual fairness (evaluates whether changing an individual’s sensitive attribute would change the model’s prediction for that individual)
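A minimal numpy sketch (my own, not from the book) of the first two metrics for a binary classifier and a binary sensitive attribute:

```python
import numpy as np

def demographic_parity_diff(y_pred, group):
    """Difference in positive-prediction rates between group 1 and group 0."""
    return y_pred[group == 1].mean() - y_pred[group == 0].mean()

def equal_opportunity_diff(y_true, y_pred, group):
    """Difference in true-positive rates (on y_true == 1) between the groups."""
    tpr = lambda g: y_pred[(group == g) & (y_true == 1)].mean()
    return tpr(1) - tpr(0)

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 1])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 1])
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(demographic_parity_diff(y_pred, group))          # 0.25
print(equal_opportunity_diff(y_true, y_pred, group))   # ~0.33
```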
Societal and Environmental Well-being
AI systems should benefit all human beings. The EU AI Act specifically addresses the societal and environmental aspects of trustworthy AI.
The primary environmental concerns associated with AI include energy consumption, carbon footprint, e-waste, and indirect environmental impact.
The amount of computing resources used to train deep learning models increased 300,000x in the six years from 2012 to 2018.
The proposed 'Green AI' approach considers efficiency as an additional evaluation criterion alongside the usual evaluation metrics. 'Green AI' efficiency is measured as the number of floating-point operations (FLOPs) required to generate a result.
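As a back-of-the-envelope illustration of the FLOP-based measure (my own sketch): the forward pass of a dense layer with n_in inputs and n_out outputs costs roughly 2 * n_in * n_out floating-point operations, one multiply and one add per weight.

```python
def mlp_forward_flops(layer_sizes):
    """Rough FLOP count for one forward pass of a fully connected network."""
    return sum(2 * n_in * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# e.g. a small MLP with layers 784 -> 256 -> 64 -> 10:
print(mlp_forward_flops([784, 256, 64, 10]))  # 435456 FLOPs per example
```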
The rise of AI has led to job polarization, with high-skilled jobs increasing and low-skilled jobs facing obsolescence. This worsens income inequality and creates challenges for those without access to education and training.
Lastly, in addition to ethical concerns and job changes, AI systems can negatively impact society at large and democracy through misinformation and surveillance.
Accountability
AI accountability means establishing mechanisms for holding AI developers and users accountable for their systems' impacts along the AI system's complete development cycle. It implies that information about the AI system's purpose, design, data, and processes is available to internal and external auditors.
Google has a Responsible Generative AI Toolkit that covers risk and mitigation techniques to address safety, privacy, fairness, and accountability.
Several mechanisms should be implemented to create accountability. Here are a few:
- clear responsibility guidelines and processes
- transparency and explainability
- human oversight and intervention
- auditing and evaluation
- redress and complaint mechanisms
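As a sketch of the record-keeping side of this (my own illustration), each prediction can be logged with its inputs, model version, and timestamp so that internal and external auditors can later reconstruct decisions:

```python
import json
import time

def log_prediction(path, model_version, inputs, output):
    """Append one auditable prediction record as a JSON line."""
    record = {
        "timestamp": time.time(),
        "model_version": model_version,
        "inputs": inputs,
        "output": output,
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")

log_prediction("audit.jsonl", "credit-model-1.3.0",
               {"income": 52000, "loan": 15000}, {"default_prob": 0.07})
```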
EU AI Act in a Nutshell
The EU AI Act is about the adoption of human-centered and trustworthy artificial intelligence. The objective is to guarantee a high level of protection for health, safety, and fundamental rights as outlined in the Charter, including democracy, the rule of law, and environmental protection, against the damaging effects of AI systems in the European Union, all while fostering innovation.
The subject of the EU AI Act is the following:
- “Harmonized rules for the placing on the market, the putting into service, and the use of AI systems in the Union;
- Prohibitions of certain AI practices;
- Specific requirements for high-risk AI systems and obligations for operators of such systems;
- Harmonized transparency rules for certain AI systems;
- Harmonized rules for the placing on the market of general-purpose AI models;
- Rules on market monitoring, market surveillance, governance and enforcement;
- Measures to support innovation, with a particular focus on SMEs, including startups.”
AI Definition
'AI system' means a machine-based system that:
- is designed to operate with varying levels of autonomy,
- may exhibit adaptiveness after deployment, and
- for explicit or implicit objectives, infers, from the input it receives, how to generate outputs such as predictions, content, recommendations, or decisions that can influence physical or virtual environments.
The AI Act's definition of machine-based systems covers the following AI techniques and approaches:
- ML methods (such as supervised, unsupervised, semi-supervised machine learning)
- DL methods
- Reinforcement Learning
- Logic- and Knowledge-based Methods (such as logic programming, expert systems, inference and deductive engines, reasoning engines)
- Statistical approaches
- Bayesian Methods
- Search and Optimization Approaches
The AI Act provides a definition of the ‘general-purpose AI model’:
"AI model, including where such an AI model is trained with a large amount of data using self-supervision at scale, that displays significant generality and is capable of competently performing a wide range of distinct tasks regardless of the way the model is placed on the market and that can be integrated into a variety of downstream systems or applications, except AI models that are used for research, development or prototyping activities before they are placed on the market."
Key Players From Creation to Market Operation
- provider - a person or organization that develops an AI system or model, or has one developed, and puts it on the market or uses it, whether for payment or for free, under their own name or brand
- importer - a natural or legal person located in the EU who places on the EU market an AI system bearing the name or trademark of a person established outside the EU
- distributor - a person or company in the supply chain, other than the provider or the importer, that makes an AI system available on the Union market
- authorized representative - a person or organization in the EU who has been given written permission by an AI system or general-purpose AI model provider to carry out the responsibilities and procedures outlined in this Regulation on their behalf
- deployer - refers to a person or organization using an AI system for professional activities
- user - persons impacted by an AI system
Classification of the AI Systems by Risk Levels
- prohibited
- high-risk
- limited-risk
- minimal-risk
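A toy mapping (my own illustration; the actual obligations are spelled out in the Act itself) of how the risk category drives what must be done:

```python
from enum import Enum

class RiskLevel(Enum):
    PROHIBITED = "prohibited"   # banned outright (Article 5 practices)
    HIGH = "high-risk"          # strict requirements plus conformity assessment
    LIMITED = "limited-risk"    # transparency obligations (e.g., disclose AI use)
    MINIMAL = "minimal-risk"    # no mandatory obligations; voluntary codes

def needs_conformity_assessment(level: RiskLevel) -> bool:
    # Only high-risk systems go through pre-market conformity assessment.
    return level is RiskLevel.HIGH

print(needs_conformity_assessment(RiskLevel.HIGH))  # True
```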
Enforcement and Implementation of the EU AI Act
The Full Picture of Compliance
| Steps | Guiding Questions |
| --- | --- |
| 1. Creating AI System Landscape and AI System Risk Classification | How many AI systems are in place and intended to be put into production? What risk category do these AI systems belong to? |
| 2. Compliance Requirements Structuring for AI Systems and GPAI | What requirements do we need to fulfill? |
| 3. Compliance Operationalization | What processes, structures, engineering practices, and roles need to be established to comply with the AI Act? |
| 4. Pre-market Compliance Verification | What has to be done before placing AI systems on the market and putting them into service? What are the internal and external conformity assessments for compliance verification? How do we CE-mark our AI systems? Where should our AI system be registered (database)? |
| 5. Post-market Continuous Compliance | What has to be done to ensure compliance after putting the AI system into service? |
Knowing the risk category will shape all the requirements that must be fulfilled to become compliant. It's important to distinguish between obligations for AI systems and obligations for the use of general-purpose AI (GPAI) models within those systems. For each of these groups, the AI Act contains different articles with concrete requirements.
Penalties for EU AI Act Violation
| Violation Details | Penalties |
| --- | --- |
| Prohibited AI practices (AI practices listed in Article 5, such as exploiting vulnerabilities, social scoring, real-time biometric identification in public spaces, etc.) | Administrative fines of up to €35 million or 7% of total worldwide annual turnover for the preceding financial year, whichever is higher. |
| High-risk AI systems (non-compliance with requirements for high-risk AI systems under Article 10). | Administrative fines of up to €30 million or 6% of total worldwide annual turnover, whichever is higher. |
| Other non-compliance with obligations under the AI Act, apart from Articles 5 and 10. | Administrative fines of up to €20 million or 4% of total worldwide annual turnover, whichever is higher. |
| Providing incorrect, incomplete, or misleading information to authorities. | Administrative fines of up to €10 million or 2% of total worldwide annual turnover, whichever is higher. |
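The "whichever is higher" clause translates into a simple max, as in this sketch (my own illustration):

```python
def max_fine_eur(cap_eur, turnover_pct, annual_turnover_eur):
    """Upper bound of an administrative fine: a fixed cap or a percentage
    of worldwide annual turnover, whichever is higher."""
    return max(cap_eur, turnover_pct * annual_turnover_eur)

# Prohibited-practice violation for a company with 2 billion EUR turnover:
print(max_fine_eur(35_000_000, 0.07, 2_000_000_000))  # 140,000,000 EUR
```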
Existing AI Regulations and Standards
- UNESCO AI Ethics Recommendations
- U.S. Executive Order on Trustworthy AI
- China Generative AI Service Law
- NIST AI Risk Management Framework
- IEEE P2863 - Recommended Practice for Organizational Governance of Artificial Intelligence
Chapter 2 - MLOps: A Proactive Compliance Catalyst
Designing the ML-powered Application
An AI product:
- delivers value
- incorporates AI technology
- interacts and adapts
- integrates into operations
Backward Thinking For Designing AI Systems
It's important to start an AI endeavor with the business goal in mind because 80-90% of AI initiatives never make it to production. The few that do make it usually succeed because they are aligned with the business problem and solve a real user's pain.
‘Backward Thinking’ provides a strategic framework for scoping, planning, and executing ML projects. Starting with the end in mind helps maximize the chances of delivering a useful ML solution while avoiding common pitfalls around poorly defined objectives, irrelevant features, and misaligned architectures.
Benefits:
- clear scope upfront
- business-aligned AI products
- minimum viable data
- narrowing solution space
- iterative-incremental dev process
The ML canvas is referenced (again, as in other books).
(This is a good chapter to reference when filling in an ML canvas, as it provides detailed guidance on what to consider in each section of the canvas.)
Value Proposition for AI Products
Suggested value proposition statement:
For [target customer]
Who [statement of the need or opportunity]
The [product name] is a [product category]
That [statement of key benefit – compelling reason to buy or use]
Unlike [primary competitive alternative]
Our AI product [statement of primary differentiation]
Business Value Validation Through Metrics
- business metrics
- AI product health metrics
- AI technical metrics
- system technical metrics
The success of AI products is measured through a holistic system of metrics, encompassing business impacts, product health, and technical performance, as drafted in the Monitoring section of the ML Canvas.
Prediction Part of the AI System
- prediction task and heuristic benchmark:
Define the task by specifying inputs, outputs, and problem type. Use heuristic baselines (e.g., rules or statistics) as initial benchmarks instead of simple ML models; for instance, anomaly detection can use the 99th percentile value as a baseline (see the sketch after this list)
- workflow integration:
Incorporate predictions into workflows, aligning decisions with predictions to achieve business goals. Use cases illustrate predictions leading to decisions (e.g., credit scoring predicts default probability, and the decision is to reject the loan if it exceeds a threshold)
- prediction serving:
Specify prediction frequency, volume, and latency needs. Choose between batch (e.g., periodic demand forecasting) or real-time serving (e.g., instant anomaly detection). Optimize serving for cost-efficiency and performance using techniques like REST APIs, Kubernetes, and quantization.
- deployment decision:
Simulate deployment to evaluate system performance on representative test cases. Use meaningful metrics and recent data for realistic assessment. Establish governance, audits, and safety measures to ensure reliable deployment. Combine offline and online evaluations to measure real-world performance and align with business objectives.
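The 99th-percentile heuristic mentioned above, as a minimal numpy sketch (my own illustration):

```python
import numpy as np

# Heuristic anomaly baseline: flag values above the 99th percentile of
# historical observations, before reaching for an ML model.
rng = np.random.default_rng(0)
history = rng.normal(loc=100.0, scale=10.0, size=10_000)  # past metric values
threshold = np.percentile(history, 99)

new_values = np.array([95.0, 104.0, 131.0])
print(threshold, new_values > threshold)  # ~123, [False False  True]
```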
Training Part of the AI System
The training part of the AI system is responsible for everything required to build an ML model: data preparation, new data collection, feature engineering, and building and rebuilding models. The origin of ML models is data, and here are the steps we need to consider:
- data landscape
- data collection
- model training frequency
- feature engineering strategy
Requirements Engineering for ML and alignment with the EU AI Act Requirements
Using the ML canvas, we end up with clearly defined, user-centric ML solutions; it also promotes cross-functional collaboration between data scientists, ML engineers, and business stakeholders to build production-ready systems that deliver value.
Each of the ten building blocks of the canvas considers one particular aspect of the ML project, be it training or prediction. In total, we distinguish between six categories of requirements that we need to gather to build an AI product:
- strategic business alignment
- reusability of ML subsystems
- retraining frequency of ML models
- prediction serving mode
- ML code change frequency
- audit requirements (what kind of audit needs to be implemented?)
With the application of the EU AI Act, the ML Canvas becomes an additional layer of requirements. According to the EU AI Act, we should consider the following requirements for high-risk AI systems:
- high-quality datasets
- technical documentation and record-keeping
- transparency and provision of information to users
- robustness, resilience, accuracy, and security
- human oversight
- specific requirements for AI systems in critical infrastructure
- respect for fundamental rights (privacy, data protection, non-discrimination, and consumer protection laws)
- post-market monitoring
Structuring Machine Learning Development Process With CRISP-ML(Q)
CRISP-ML(Q) defines six key phases in the ML development process:
- Business and Data Understanding (determine business objectives, identify data sources, collect initial data, describe data, explore data, verify data quality)
- Data Preparation (select, clean, construct, integrate, and format the data)
- Modeling (select modeling techniques, generate test design, build model, assess model)
- Evaluation (evaluate prediction results, review process, decide on the next steps)
- Deployment (plan deployment, plan monitoring and maintenance, produce final report, review project)
- Monitoring and Maintenance (monitor the model, evaluate the model, retrain/update the model as needed)
Working with CRISP-ML(Q) will help incorporate the requirements of the AI Act for high-risk and limited-risk AI systems. Furthermore, for each task in an ML development phase, CRISP-ML(Q) defines requirements and constraints to identify risks, such as bias, overfitting, or lack of reproducibility, that can impact the success of the ML system. Moreover, CRISP-ML(Q) mandates that AI systems must be thoroughly tested prior to deployment to ensure they perform consistently for their intended purpose and comply with requirements. Importantly, CRISP-ML(Q) requires that you document the entire ML development process, including the risk management measures.
[Figure: the MLOps graph]
Defining Technical Components With MLOps Stack Canvas
Value Proposition
By answering the following questions, you can confidently articulate your value proposition:
What are we trying to do for the end-user(s)?
What is the problem?
Why is this an important problem?
Who is our persona? (ML Engineer, Data Scientist, Operation/Business user)
Who owns the models in production?
From the example value proposition, we can identify key elements such as:
- target customer: data science and ML teams
- need: efficiently building and deploying ML models at scale
- product name and category: MLOps Platform, an end-to-end machine learning platform
- key benefit: rapidly develop and operationalize ML to drive business value
- competition: other ML platforms that are complex and lack features such as observability and monitoring
- differentiation: fully managed, intuitive, automated, with experiment tracking, AutoML, and seamless deployment to accelerate the full ML lifecycle
Data and Code Management
- data sources and versioning
- data analysis and experiment management
- feature store and workflows
- reflect upon DevOps foundations
- ML pipeline orchestration
Model Management
- model registry and model versioning
- model deployment
- prediction serving
- ML model, data, and system monitoring
Metadata Management
Three Dilemmas of MLOps
- tooling: should we purchase, use existing open-source, or develop in-house tools for any of the MLOps components? What are the risks, trade-offs, and impacts of each decision?
- platforms: should we standardize on a single MLOps platform or create a hybrid solution? What are the risks, trade-offs, and impacts of each decision?
- skills: what is the cost associated with either hiring or training our own machine learning engineering talent?
Conclusion
The chapter closes with the synergy between MLOps and CRISP-ML(Q): CRISP-ML(Q) defines a structured process, while MLOps provides the technical and organizational best practices.
Definitely a great book so far.
That is all for today!
See you tomorrow :)