Hello :) Today is Day 353!
A quick summary of today:
- started reading a book about the EU's AI Act
The AI Engineer’s Guide to Surviving the EU AI Act - Chapter 1 - Understanding the AI Regulations
As people, organizations, and the public sector increasingly rely on AI to drive decision-making, the technology must be trustworthy. The EU AI Act aims to provide a legal framework for developing, deploying, and using AI technologies within the EU, emphasizing safety, transparency, and ethical considerations, and setting specific requirements for AI systems in different risk categories.
The Motivation for the EU AI Act: Trustworthy AI
As AI integrates into critical areas like human resources, finance, and medicine, ensuring trustworthiness in AI systems is essential. Predictive accuracy alone is insufficient. AI must adhere to principles like fairness, privacy, and accountability to be deemed trustworthy. Trustworthy AI, also known as responsible or ethical AI, encompasses technical requirements (like robustness, explainability, and transparency) and ethical requirements (like fairness and privacy).
Grounded in lawfulness, ethics, and robustness, trustworthy AI ensures compliance with regulations, ethical inclusivity, and risk-aware design. The European Commission’s High-Level Expert Group on AI outlined seven requirements for trustworthy AI:
- Human agency and oversight
- Technical robustness and safety
- Privacy and data governance
- Transparency
- Diversity, non-discrimination and fairness
- Societal and environmental well-being
- Accountability
Human Agency and Oversight
Human oversight ensures humans can monitor, evaluate, and intervene in AI systems. It involves governance processes, transparency, interpretability, and control mechanisms, allowing humans to understand and influence AI decisions, including overriding or stopping the system when necessary.
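One way to make this concrete: a minimal human-in-the-loop sketch (my own illustration, not from the book), assuming a hypothetical scikit-learn-style `model`, where low-confidence predictions are escalated to a human reviewer rather than acted on automatically.

```python
# Illustrative human-oversight gate (hypothetical example): predictions below
# a confidence threshold are routed to a human reviewer for intervention.
def decide(model, x, threshold=0.9):
    proba = model.predict_proba([x])[0]   # class probabilities for one input
    confidence = float(proba.max())
    if confidence < threshold:
        return {"action": "escalate_to_human", "confidence": confidence}
    return {"action": "auto_decide", "label": int(proba.argmax()),
            "confidence": confidence}
```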
Technical Robustness and Safety
AI systems need to be accurate, reliable, and reproducible. They also need a fallback plan in case something does not work properly.
The author distinguishes robustness at different levels of an AI system, namely data, algorithms, and the underlying software systems.
In academia, robustness has two main aspects:
- non-adversarial robustness: refers to how well the model performs when it is given corrupted or altered inputs that may not match the original data distribution. It means that the model can handle unintentional image corruptions, distributional shifts, or changes in the data generating process while maintaining its performance
- adversarial robustness: refers to the model’s ability to resist adversarial examples, which are intentionally crafted input perturbations designed to fool the model into making incorrect predictions
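To make the adversarial case concrete, here is a minimal FGSM-style sketch (my own illustration, not from the book) for a logistic-regression model: the input is perturbed along the sign of the loss gradient with respect to the input.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_example(x, y, w, b, eps=0.1):
    """Craft an adversarial input for logistic regression p = sigmoid(w.x + b).

    For binary cross-entropy loss, dL/dx = (p - y) * w, so the fast gradient
    sign method shifts x by eps times the sign of that gradient.
    """
    p = sigmoid(w @ x + b)
    grad = (p - y) * w              # gradient of the loss w.r.t. the input
    return x + eps * np.sign(grad)
```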
We can evaluate the robustness based on certain metrics:
- accuracy
- error rate
- sensitivity
- specificity
We can also evaluate robustness through robustness curves, which show how the model's performance changes as a function of parameters affecting data quality, such as noise level, missing values, outliers, data size, or feature selection.
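As an illustration (my sketch, assuming scikit-learn), a robustness curve can be traced by sweeping the level of Gaussian noise added to the test inputs and recording accuracy at each step:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

# Robustness curve: accuracy as a function of input noise level.
rng = np.random.default_rng(0)
for noise in [0.0, 0.5, 1.0, 2.0, 4.0]:
    X_noisy = X_te + rng.normal(scale=noise, size=X_te.shape)
    acc = accuracy_score(y_te, model.predict(X_noisy))
    print(f"noise={noise:.1f}  accuracy={acc:.3f}")
```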
Privacy and Data Governance
AI systems must ensure full respect for privacy and data protection. Since privacy is a fundamental right within the European Union, it is a prerequisite for building trustworthy AI systems.
Data governance is becoming more important, and its ultimate goal is to enhance the trustworthiness of data. To ensure trust in data, three key aspects of data governance must be addressed: discoverability, security, and accountability.
- discoverability: refers to the availability of the dataset’s metadata, data provenance (lineage), and domain entities glossary; data quality is also crucial for trust
- data security and privacy: about protecting data and ensuring adherence to regulations such as GDPR
- accountability: treat data as a product - a data unit that is valid within a business domain and for which the domain team has clear responsibility
Example metrics to define and track for privacy and data governance include:
- data encryption levels
- access control
- data retention and deletion policies
- data product usage metrics
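A sketch of how such "data as a product" metadata and policies might be recorded (the schema and field names are my own illustration):

```python
from dataclasses import dataclass, field

@dataclass
class DataProduct:
    """Hypothetical governance record for a data product."""
    name: str
    owner_team: str        # accountability: clear domain ownership
    provenance: str        # discoverability: lineage of the data
    retention_days: int    # retention/deletion policy
    encrypted_at_rest: bool  # encryption level
    allowed_roles: list = field(default_factory=list)  # access control

orders = DataProduct(
    name="orders_daily",
    owner_team="sales-domain",
    provenance="CRM export -> cleaning pipeline v2",
    retention_days=365,
    encrypted_at_rest=True,
    allowed_roles=["analyst", "ml-engineer"],
)
```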
Transparency
- explainability
- data transparency
- algorithmic transparency
- governance transparency
- communication transparency
Diversity, Non-Discrimination and Fairness
AI systems must mitigate unfair bias. Addressing potential bias should happen at every stage of the AI system development. The EU AI Act ensures that every entity along the AI system's value chain, from producer to deployer, is responsible for providing users with a fair and ethical experience.
There are several metrics used to evaluate fairness in ML systems:
- demographic parity (measures the difference in the probability of receiving a positive outcome between different demographic groups)
- equal opportunity (measures whether the model has an equal chance of correctly predicting positive outcomes for individuals from different groups who actually deserve a positive outcome)
- equalised odds (ensures that the model not only has equal opportunity but also equal mistreatment across groups)
- predictive parity (evaluates whether the precision of the model is consistent for all groups)
- individual fairness (ensures that similar individuals are treated similarly by the model, based on a predefined similarity metric)
- counterfactual fairness (evaluates whether changing an individual’s sensitive attribute would change the model’s prediction for that individual)
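A minimal numpy sketch (my own, not from the book) of the first two metrics for a binary classifier and a binary sensitive attribute:

```python
import numpy as np

def demographic_parity_diff(y_pred, group):
    """Difference in positive-prediction rates between group 1 and group 0."""
    return y_pred[group == 1].mean() - y_pred[group == 0].mean()

def equal_opportunity_diff(y_true, y_pred, group):
    """Difference in true-positive rates (on y_true == 1) between the groups."""
    tpr = lambda g: y_pred[(group == g) & (y_true == 1)].mean()
    return tpr(1) - tpr(0)

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 1])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 1])
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(demographic_parity_diff(y_pred, group))          # 0.25
print(equal_opportunity_diff(y_true, y_pred, group))   # ~0.33
```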
Societal and Environmental Well-being
AI systems should benefit all human beings. The EU AI Act specifically addresses the societal and environmental aspects of trustworthy AI.
The primary environmental concerns associated with AI include energy consumption, carbon footprint, e-waste, and indirect environmental impact.
The amount of computing resources used to train deep learning models increased 300,000x in the six years from 2012 to 2018.
The proposed 'Green AI' approach considers efficiency as an additional evaluation criterion alongside the usual evaluation metrics. 'Green AI' efficiency is measured as the number of floating-point operations (FLOPs) required to generate a result.
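As a back-of-the-envelope illustration of the FLOP-based measure (my own sketch): the forward pass of a dense layer with n_in inputs and n_out outputs costs roughly 2 * n_in * n_out floating-point operations, one multiply and one add per weight.

```python
def mlp_forward_flops(layer_sizes):
    """Rough FLOP count for one forward pass of a fully connected network."""
    return sum(2 * n_in * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# e.g. a small MLP with layers 784 -> 256 -> 64 -> 10:
print(mlp_forward_flops([784, 256, 64, 10]))  # 435456 FLOPs per example
```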
The rise of AI has led to job polarization, with high-skilled jobs increasing and low-skilled jobs facing obsolescence. This worsens income inequality and creates challenges for those without access to education and training.
Lastly, in addition to ethical concerns and job changes, AI systems can negatively impact society at large and democracy through misinformation and surveillance.
Accountability
AI accountability means establishing mechanisms for holding AI developers and users accountable for their systems' impacts along the AI system's complete development cycle. It implies that information about the AI system's purpose, design, data, and processes is available to internal and external auditors.
Google has a Responsible Generative AI Toolkit that covers risk and mitigation techniques to address safety, privacy, fairness, and accountability.
Several mechanisms should be implemented to create accountability. Here are a few:
- clear responsibility guidelines and processes
- transparency and explainability
- human oversight and intervention
- auditing and evaluation
- redress and complaint mechanisms
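As a sketch of the record-keeping side of this (my own illustration), each prediction can be logged with its inputs, model version, and timestamp so that internal and external auditors can later reconstruct decisions:

```python
import json
import time

def log_prediction(path, model_version, inputs, output):
    """Append one auditable prediction record as a JSON line."""
    record = {
        "timestamp": time.time(),
        "model_version": model_version,
        "inputs": inputs,
        "output": output,
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")

log_prediction("audit.jsonl", "credit-model-1.3.0",
               {"income": 52000, "loan": 15000}, {"default_prob": 0.07})
```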
EU AI Act in a Nutshell
The EU AI Act is about the adoption of human-centered and trustworthy artificial intelligence. The objective is to guarantee a high level of protection for health, safety, and fundamental rights as outlined in the Charter, including democracy, the rule of law, and environmental protection, against the damaging effects of AI systems in the European Union, all while fostering innovation.
The subject of the EU AI Act is the following:
- “Harmonized rules for the placing on the market, the putting into service, and the use of AI systems in the Union;
- Prohibitions of certain AI practices;
- Specific requirements for high-risk AI systems and obligations for operators of such systems;
- Harmonized transparency rules for certain AI systems;
- Harmonized rules for the placing on the market of general-purpose AI models;
- Rules on market monitoring, market surveillance, governance and enforcement;
- Measures to support innovation, with a particular focus on SMEs, including startups.”
AI Definition
'AI system' means a machine-based system that:
- is designed to operate with varying levels of autonomy,
- may exhibit adaptiveness after deployment, and
- for explicit or implicit objectives, infers, from the input it receives, how to generate outputs such as predictions, content, recommendations, or decisions that can influence physical or virtual environments.
The AI Act's definition of machine-based systems covers the following AI techniques and approaches:
- ML methods (such as supervised, unsupervised, semi-supervised machine learning)
- DL methods
- Reinforcement Learning
- Logic- and Knowledge-based Methods (such as logic programming, expert systems, inference and deductive engines, reasoning engines)
- Statistical approaches
- Bayesian Methods
- Search and Optimization Approaches
The AI Act provides a definition of the ‘general-purpose AI model’:
"AI model, including where such an AI model is trained with a large amount of data using self-supervision at scale, that displays significant generality and is capable of competently performing a wide range of distinct tasks regardless of the way the model is placed on the market and that can be integrated into a variety of downstream systems or applications, except AI models that are used for research, development or prototyping activities before they are placed on the market."
Key Players From Creation to Market Operation
- provider - a person or organization that develops an AI system or model, or has one developed, and puts it on the market or uses it, whether for payment or for free, under their own name or brand
- importer - a natural or legal person located in the EU who places on the EU market an AI system bearing the name or trademark of a person established outside the EU
- distributor - a person or company in the supply chain, other than the provider or the importer, that makes an AI system available on the Union market
- authorized representative - a person or organization in the EU who has been given written permission by an AI system or general-purpose AI model provider to carry out the responsibilities and procedures outlined in this Regulation on their behalf
- deployer - refers to a person or organization using an AI system for professional activities
- user - persons impacted by an AI system
Classification of the AI Systems by Risk Levels
- prohibited
- high-risk
- limited-risk
- minimal-risk
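A toy mapping (my own illustration; the actual obligations are spelled out in the Act itself) of how the risk category drives what must be done:

```python
from enum import Enum

class RiskLevel(Enum):
    PROHIBITED = "prohibited"   # banned outright (Article 5 practices)
    HIGH = "high-risk"          # strict requirements plus conformity assessment
    LIMITED = "limited-risk"    # transparency obligations (e.g., disclose AI use)
    MINIMAL = "minimal-risk"    # no mandatory obligations; voluntary codes

def needs_conformity_assessment(level: RiskLevel) -> bool:
    # Only high-risk systems go through pre-market conformity assessment.
    return level is RiskLevel.HIGH

print(needs_conformity_assessment(RiskLevel.HIGH))  # True
```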
Enforcement and Implementation of the EU AI Act
The Full Picture of Compliance
| Steps | Guiding Questions |
| --- | --- |
| 1. Creating AI System Landscape and AI System Risk Classification | How many AI systems are in place and intended to be put into production? What risk category do these AI systems belong to? |
| 2. Compliance Requirements Structuring for AI Systems and GPAI | What requirements do we need to fulfill? |
| 3. Compliance Operationalization | What processes, structures, engineering practices, and roles need to be established to comply with the AI Act? |
| 4. Pre-market Compliance Verification | What has to be done before placing AI systems on the market and putting them into service? What are the internal and external conformity assessments for compliance verification? How do we CE-mark our AI systems? Where should our AI system be registered (database)? |
| 5. Post-market Continuous Compliance | What has to be done to ensure compliance after putting the AI system into service? |
Knowing the risk category will shape all the requirements that must be fulfilled to become compliant. It's important to distinguish between obligations for AI systems and obligations for the use of general-purpose AI (GPAI) models within those systems. For each of these groups, the AI Act contains different articles with concrete requirements.
Penalties for EU AI Act Violation
| Violation Details | Penalties |
| --- | --- |
| Prohibited AI practices (AI practices listed in Article 5, such as exploiting vulnerabilities, social scoring, real-time biometric identification in public spaces, etc.) | Administrative fines of up to €35 million or 7% of total worldwide annual turnover for the preceding financial year, whichever is higher. |
| High-risk AI systems (non-compliance with requirements for high-risk AI systems under Article 10). | Administrative fines of up to €30 million or 6% of total worldwide annual turnover, whichever is higher. |
| Other non-compliance with obligations under the AI Act, apart from Articles 5 and 10. | Administrative fines of up to €20 million or 4% of total worldwide annual turnover, whichever is higher. |
| Providing incorrect, incomplete, or misleading information to authorities. | Administrative fines of up to €10 million or 2% of total worldwide annual turnover, whichever is higher. |
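The "whichever is higher" clause translates into a simple max, as in this sketch (my own illustration):

```python
def max_fine_eur(cap_eur, turnover_pct, annual_turnover_eur):
    """Upper bound of an administrative fine: a fixed cap or a percentage
    of worldwide annual turnover, whichever is higher."""
    return max(cap_eur, turnover_pct * annual_turnover_eur)

# Prohibited-practice violation for a company with 2 billion EUR turnover:
print(max_fine_eur(35_000_000, 0.07, 2_000_000_000))  # 140,000,000 EUR
```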
Existing AI Regulations and Standards
- UNESCO AI Ethics Recommendations
- U.S. Executive Order on Trustworthy AI
- China Generative AI Service Law
- NIST AI Risk Management Framework
- IEEE P2863 - Recommended Practice for Organizational Governance of Artificial Intelligence
Chapter 2 - MLOps: A Proactive Compliance Catalyst
Designing the ML-powered Application
An AI product:
- delivers value
- incorporates AI technology
- interacts and adapts
- integrates into operations
Backward Thinking For Designing AI Systems
It's important to start an AI endeavor with the business goal in mind because 80-90% of AI initiatives never make it to production. The few that do make it usually succeed because they are aligned with the business problem and solve a real user's pain.
‘Backward Thinking’ provides a strategic framework for scoping, planning, and executing ML projects. Starting with the end in mind helps maximize the chances of delivering a useful ML solution while avoiding common pitfalls around poorly defined objectives, irrelevant features, and misaligned architectures.
Benefits:
- clear scope upfront
- business-aligned AI products
- minimum viable data
- narrowing solution space
- iterative-incremental dev process
The ML canvas is referenced (again, as in other books).
(This is a good chapter to reference when filling in an ML canvas, as it provides detailed guidance on what to consider in each section of the canvas.)
Value Proposition for AI Products
Suggested value proposition statement:
For [target customer]
Who [statement of the need or opportunity]
The [product name] is a [product category]
That [statement of key benefit – compelling reason to buy or use]
Unlike [primary competitive alternative]
Our AI product [statement of primary differentiation]
Business Value Validation Through Metrics
- business metrics
- AI product health metrics
- AI technical metrics
- system technical metrics
The success of AI products is measured through a holistic system of metrics, encompassing business impacts, product health, and technical performance, as drafted in the Monitoring section of the ML Canvas.
Prediction Part of the AI System
- prediction task and heuristic benchmark:
Define the task by specifying inputs, outputs, and problem type. Use heuristic baselines (e.g., rules or statistics) as initial benchmarks instead of simple ML models; for instance, anomaly detection can use the 99th percentile value as a baseline (see the sketch after this list)
- workflow integration:
Incorporate predictions into workflows, aligning decisions with predictions to achieve business goals. Use cases illustrate predictions leading to decisions (e.g., credit scoring predicts default probability, and the decision is to reject the loan if it exceeds a threshold)
- prediction serving:
Specify prediction frequency, volume, and latency needs. Choose between batch (e.g., periodic demand forecasting) or real-time serving (e.g., instant anomaly detection). Optimize serving for cost-efficiency and performance using techniques like REST APIs, Kubernetes, and quantization.
- deployment decision:
Simulate deployment to evaluate system performance on representative test cases. Use meaningful metrics and recent data for realistic assessment. Establish governance, audits, and safety measures to ensure reliable deployment. Combine offline and online evaluations to measure real-world performance and align with business objectives.
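The 99th-percentile heuristic mentioned above, as a minimal numpy sketch (my own illustration):

```python
import numpy as np

# Heuristic anomaly baseline: flag values above the 99th percentile of
# historical observations, before reaching for an ML model.
rng = np.random.default_rng(0)
history = rng.normal(loc=100.0, scale=10.0, size=10_000)  # past metric values
threshold = np.percentile(history, 99)

new_values = np.array([95.0, 104.0, 131.0])
print(threshold, new_values > threshold)  # ~123, [False False  True]
```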
Training Part of the AI System
The training part of the AI system is responsible for everything required to build an ML model: data preparation, new data collection, feature engineering, and building and rebuilding models. The origin of ML models is data, and here are the steps we need to consider:
- data landscape
- data collection
- model training frequency
- feature engineering strategy
Requirements Engineering for ML and alignment with the EU AI Act Requirements
Using the ML canvas, we end up with clearly defined, user-centric ML solutions; it also promotes cross-functional collaboration between data scientists, ML engineers, and business stakeholders to build production-ready systems that deliver value.
Each of the ten building blocks of the canvas considers one particular aspect of the ML project, be it training or prediction. In total, we distinguish between six categories of requirements that we need to gather to build an AI product:
- strategic business alignment
- reusability of ML subsystems
- retraining frequency of ML models
- prediction serving mode
- ML code change frequency
- audit requirements (what kind of audit needs to be implemented?)
With the application of the EU AI Act, the ML Canvas becomes an additional layer of requirements. According to the EU AI Act, we should consider the following requirements for high-risk AI systems:
- high-quality datasets
- technical documentation and record-keeping
- transparency and provision of information to users
- robustness, resilience, accuracy, and security
- human oversight
- specific requirements for AI systems in critical infrastructure
- respect for fundamental rights (privacy, data protection, non-discrimination, and consumer protection laws)
- post-market monitoring
Structuring Machine Learning Development Process With CRISP-ML(Q)
CRISP-ML(Q) defines six key phases in the ML development process:
- Business and Data Understanding (determine business objectives, identify data sources, collect initial data, describe data, explore data, verify data quality)
- Data Preparation (select, clean, construct, integrate, and format the data)
- Modeling (select modeling techniques, generate test design, build model, assess model)
- Evaluation (evaluate prediction results, review process, decide on the next steps)
- Deployment (plan deployment, plan monitoring and maintenance, produce final report, review project)
- Monitoring and Maintenance (monitor the model, evaluate the model, retrain/update the model as needed)
Working with CRISP-ML(Q) will help incorporate the requirements of the AI Act for high-risk and limited-risk AI systems. Furthermore, for each task in an ML development phase, CRISP-ML(Q) defines requirements and constraints to identify risks, such as bias, overfitting, or lack of reproducibility, that can impact the success of the ML system. Moreover, CRISP-ML(Q) mandates that AI systems must be thoroughly tested prior to deployment to ensure they perform consistently for their intended purpose and comply with requirements. Importantly, CRISP-ML(Q) requires that you document the entire ML development process, including the risk management measures.
[Figure: the MLOps graph]
Defining Technical Components With MLOps Stack Canvas
Value Proposition
By answering the following questions, you can confidently articulate your value proposition:
What are we trying to do for the end-user(s)?
What is the problem?
Why is this an important problem?
Who is our persona? (ML Engineer, Data Scientist, Operation/Business user)
Who owns the models in production?
From the example value proposition, we can identify key elements such as:
- target customer: data science and ML teams
- need: efficiently building and deploying ML models at scale
- product name and category: MLOps Platform, an end-to-end machine learning platform
- key benefit: rapidly develop and operationalize ML to drive business value
- competition: other ML platforms that are complex and lack features such as observability and monitoring
- differentiation: fully managed, intuitive, automated, with experiment tracking, AutoML, and seamless deployment to accelerate the full ML lifecycle
Data and Code Management
- data sources and versioning
- data analysis and experiment management
- feature store and workflows
- reflect upon DevOps foundations
- ML pipeline orchestration
Model Management
- model registry and model versioning
- model deployment
- prediction serving
- ML model, data, and system monitoring
Metadata Management
Three Dilemmas of MLOps
- tooling: should we purchase, use existing open-source, or develop in-house tools for any of the MLOps components? What are the risks, trade-offs, and impacts of each decision?
- platforms: should we standardize on a single MLOps platform or create a hybrid solution? What are the risks, trade-offs, and impacts of each decision?
- skills: what is the cost associated with either hiring or training our own machine learning engineering talent?
Conclusion
The chapter closes with the synergy between MLOps and CRISP-ML(Q): CRISP-ML(Q) defines a structured process, while MLOps provides the technical and organizational best practices.
Definitely a great book so far.
That is all for today!
See you tomorrow :)