AI/ML Security in Retrospect: Insights from Season 1 of The MLSecOps Podcast (Part 2)

Audio Only:

0:00 - Opening
0:25 - Intro by Charlie McCarthy
2:29 - S1E9 with Guest Diya Wynn
6:32 - S1E4 with Guest Dr. Cari Miller, CMP, FHCA
11:03 - S1E17 with Guest Nick Schmidt
16:46 - S1E7 with Guest Shea Brown, PhD
22:06 - S1E8 with Guest Patrick Hall
26:12 - S1E14 with Guest Katharine Jarmul
32:01 - S1E13 with Guest Jennifer Prendki, PhD
36:44 - S1E18 with Guest Rob van der Veer


[Intro] 0:00

Charlie McCarthy 0:25

Welcome back, everyone, to The MLSecOps Podcast. We’re thrilled to have you with us for Part 2 of our two-part season finale, as we wrap up Season 1 and look forward to an exciting and revamped Season 2.     

In this two-part season recap, we’ve been revisiting some of the most captivating discussions from our first season, offering an overview on essential topics related to AI and machine learning security.     

Part 1 of this series focused on topics like adversarial machine learning, ML supply chain vulnerabilities, and red teaming for AI/ML. Here in Part 2, we've handpicked some standout moments from Season 1 episodes that will take you on a tour through categories such as model provenance; governance, risk, & compliance; and Trusted AI. Our wonderful guests on the show delve into topics like defining responsible AI, bias detection and prevention, model fairness, AI audits, incident response plans, privacy engineering, and the significance of data in MLSecOps.

These episodes have been a testament to the expertise and insights of our fantastic guests, and we're excited to share their wisdom with you once again. Whether you're a long-time listener or joining us for the first time, there's something here for everyone, and if you missed the full-length versions of any of these thought-provoking discussions or simply want to revisit them, you can find links to the full episodes and transcripts on our website at www.mlsecops.com/podcast.

Without further ado, thank you for being a part of our Season 1 journey on the MLSecOps Podcast, and we look forward to seeing you in Season 2. Stay tuned and thanks for listening.


[From Responsible AI: Defining, Implementing, and Navigating the Future; with Guest Diya Wynn]

D Dehghanpisheh 2:29 

So, let's talk about that, Diya. Your title has the term Responsible AI in it. And here on MLSecOps Podcast, we've talked with many guests who use terms like “Trusted AI,” “Robust AI,” “Ethical AI.” Can you talk to us a little bit about how you define Responsible AI? And is it the same thing as Robust AI, or Trusted AI, or Ethical AI? Are they all the same? Are they all different? 

Diya Wynn 2:55 

I think they largely are used interchangeably, although I feel like Responsible AI gives us a little bit more in terms of the full breadth, or areas of focus, that we can have. So, it's not just about ethics, although you could say what happens in the environment and all that kind of stuff might be considered that, but it really is broader than that. 

So my definition of Responsible AI, and this is core to the way in which we engage with our customers, is thinking about Responsible AI as an operating approach that encompasses people, process, and technology so that you can minimize the unintended impact and harm, or unintended impact and risks, and maximize the benefit. Right? And so that gives us some things to think about or a structure in order to be able to bring a lot of things into the picture.

So our value alignment, considerations around privacy, security, and sustainability, that all becomes part of the conversation. I think by and large, I like the notion of trustworthy as well, because the intent is that we want to do these things, or put in place these practices and have this governance structure in a way so that we can engender trust. 

You all know people fear what they don't understand, and that generates a lack of trust. And we ultimately want to have trust in the technology because that's going to have an impact on our ability to be able to use it, and use it for the good things that it's actually capable of doing. 

I think that the foundation of trust is having Responsible AI. But by and large, I think all of these terms are used interchangeably with some of the same core tenets or areas of focus, like privacy, security, transparency, explainability. We add in there value alignment, inclusion, training and education, along with accountability. 

D Dehghanpisheh 4:56 

So you add in extras beyond just the technology, or extras beyond just that and go to the heart of the matter, if you will, pun intended, in terms of making sure that you're aligning to the values of a company, that if you consider yourself to be ethical in a certain dimension or have values of diversity in a certain dimension, that that flows through from technical selection all the way to data curation to bringing about people to determine whether or not the use case is appropriate. 

Is that a fair articulation? 

Diya Wynn 5:25

Absolutely. You got it. 

So we're in an interesting position, of course, right? Because we are providing services for people in all sorts of places, and jurisdictions, and countries, and different industries, and we don't have full visibility into all the ways in which they might be using them and we're certainly not in a position of trying to police them. 

But I think that we get to come in, partnering with our customers to help them unpack, in whatever use cases they might be exploring, one, what matters to them and make sure that's being infused into the way in which they are looking at leveraging the technology. But also thinking about the things that they should be concerned about, or consider, to make sure that they're building on a solid foundation that will yield inclusive and responsible technology on the tail end. 

And when they do that, then they're going to have the trust of their customers and that also is going to help support the trust that the marketplace or industry will have in the technology and the products that are similar to those.

[From Unpacking AI Bias: Impact, Detection, Prevention, and Policy; with Guest Dr. Cari Miller]

Charlie McCarthy 6:32 

Cari, how prevalent would you say is bias in AI, and how do we know? What information do we have to support that prevalence? Do we have numbers? What are the numbers? 

Dr. Cari Miller, CMP, FHCA 6:43 

That's a trick question. It's use case by use case and developer by developer. So, for example, there are some developers that they're very aware, they're very evolved, and they really try hard to mitigate. So they have processes in place, they have ethicists in place.

They really challenge themselves on every single question. I know, like, for example, I've talked to some people at Modern Hire, and they really go out of their way.

They have I/O psychologists on staff, and they really think through, okay, are these people giving us this purposefully or involuntarily? What does that mean for how we should let the machine treat that information? I mean, they are very deep in what they're doing.

There are other developers that, they're only schooled in churning data and so they don't bring in that psychology or the sociology of what's behind the data. And those are going to present more harms because they haven't been as thoughtful about how they're preparing their system.

Now that's more - I'm going to use a phrase here - structured data is one set of problems. The audio and the video data is a different set of problems depending on how it's trained.

So, for example, NIST set out and did a set of tests on facial recognition providers and what they found was even if they were very diligent about what they were doing, decay happens on images. So they trained the images on a bunch of really good photographs and the training set was really robust and very dynamic over time.

The decay happens because people age and you wear glasses and you grow a beard and you get a little saggy in your face and all of a sudden the facial recognition doesn't work as well. And then your pigment changes and pigment issues are always going to be a problem in facial recognition. So you asked a trick question, so I gave you a very long answer. So you're welcome. 

Charlie McCarthy 8:54 

Okay, so AI is biased. People are biased. Regarding AI bias, why should people care? Why should the public care? Why are we considering AI bias to be harmful and maybe more harmful than human bias?

You mentioned the scaling issue. How are people affected by this? How are they going to be affected by this?

Dr. Cari Miller, CMP, FHCA 9:16 

Yeah, so the EU is sort of the front runners in this. They've drawn a line – a red line – and they said some systems are okay, and some systems, we're going to qualify them as high risk systems, which is very interesting.

So, for example, a system that's only going to be used to maybe paint a car door, and it's not going to have flaws in the paint, is that high risk? No. Is that going to hurt someone? No, it's not.

Here's the high risk system: does it affect someone's education, health, housing, employment, security, policing? These are high risk systems.

So anything that's going to affect your dignity, your human rights, your economic status, those things are high risk. And so the reason we care is because there are no laws about this stuff right now.

There's some regulation, but there are no laws that are really good that say, hey, developers, when you create a high risk system, we're going to need you to go ahead and make sure you don't do this, that you do do that, and whatever those things need to be.

And there is a lot, actually. And we want you to register your system. We want you to have a checkup every year, whatever the things are. So that doesn't exist right now. So we're wholly banking on people doing the right thing, which is fine, except for we haven't always taught them to do the right thing.

And sometimes it's expensive to do the right thing. So you're kind of like, we'll do that later. We just want to get to market. We put a lot of money in this.

So there's a lot of reasons why we don't do the right things all the time, which means the risk gets passed right onto the buyer, and the buyer just [says], “It sounded good when I bought it.” And we all accept the risks.

[From ML Model Fairness: Measuring and Mitigating Algorithmic Disparities; with Guest Nick Schmidt]

D Dehghanpisheh 11:03

At its core, I would imagine there's a lot of principles of fairness that you're trying to imbue into artificial intelligence applications and the use of AI.

With that, I assume, comes model governance. And when you think about model governance, what are the key principles and aspects that organizations should consider?

Nick Schmidt 11:21

That’s right. 

And I think that it's important to realize that fairness is one piece of model governance. And if you don't really have a good model governance structure, then you're probably not going to have fair algorithms. And so when I think about model governance, what I'm really thinking about are a few different things. 

One of them is having accountability and ownership. And so, what this means is that everybody who touches the model has some level of ownership and accountability for it. And then I think about transparency and explainability, and that is that everybody in your organization who is involved in the modeling process ought to know as much as they need to know in order to ensure that the model is safe and fair and robust. 

Explainability is another important one, where what I would say is you don't want to have a model that is unnecessarily opaque. And even when a model is opaque or a black box, like a large language model, you want to understand as much as you can about it and be able to understand where your limits are. So that's another principle of governance.

And then fairness and robustness. And this goes to both the question of what's affecting the customers and what's affecting the business. You can't have a model in production that isn't being fair to the customers, that's a legal, regulatory, and reputational risk. And you can't have a model that's in production that's going to bankrupt the business. So you really need to have a model governance process that's looking at the model and making sure that it's robust to any business conditions that might change or might fluctuate once you get it into production. 

And then data quality and integrity. Making sure that you're not putting garbage in. Because even though there's a lot of people who seem to think you can do anything with machine learning and it's this sort of magical thing, if you put garbage in, you get garbage out. 

And then the final thing is monitoring and validation. And that's really just making sure that your model is doing what you expect it to do after it goes into production. So regularly monitoring and checking for things like fairness, robustness, model performance and so forth. 
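As a minimal sketch of the monitoring step described above, the Python below recomputes one fairness metric on a batch of production decisions and flags groups that fall below a threshold. The choice of metric (an adverse impact ratio), the 0.8 threshold, the group labels, and every function name here are assumptions made for illustration; none of them is named in the episode, and a real monitoring process would track several metrics alongside model performance.

```python
from collections import defaultdict


def approval_rates(decisions):
    """Return the approval rate per group from (group, approved) pairs."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, was_approved in decisions:
        totals[group] += 1
        if was_approved:
            approved[group] += 1
    return {g: approved[g] / totals[g] for g in totals}


def adverse_impact_ratio(decisions, reference_group):
    """Ratio of each group's approval rate to the reference group's rate."""
    rates = approval_rates(decisions)
    ref = rates[reference_group]
    if ref == 0:
        raise ValueError("reference group has no approvals; ratio is undefined")
    return {g: r / ref for g, r in rates.items() if g != reference_group}


def fairness_alerts(decisions, reference_group, threshold=0.8):
    """List groups whose ratio falls below the (assumed) alert threshold."""
    ratios = adverse_impact_ratio(decisions, reference_group)
    return sorted(g for g, ratio in ratios.items() if ratio < threshold)


# Hypothetical batch of production decisions: (group label, approved?)
batch = (
    [("A", True)] * 60 + [("A", False)] * 40
    + [("B", True)] * 40 + [("B", False)] * 60
)

print(adverse_impact_ratio(batch, reference_group="A"))
print(fairness_alerts(batch, reference_group="A"))
```

Running a check like this on a schedule, and comparing the result with the value measured at validation time, is one way to notice when a model that looked fair before deployment stops behaving as expected in production.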

D Dehghanpisheh 13:30

So you just briefly spoke about how model governance is applied at all of the stages of the model lifecycle or of an ML pipeline, whether it's talking about data at the input components all the way out through understanding what happens at the inference, i.e. the engagement of a person or an entity with a machine learning model in production. 

When you think about that entire pipeline, from experimentation to inference, to re-training, all of it – when you think about the complexity of that, why is it important to ensure that fairness is integrated into all those steps? 

What are specific steps or actions that you would recommend engineers employ at each stage of the life cycle? Could you decompose that for us a little bit? 

Nick Schmidt 14:03

Sure. And I think maybe before I jump into what you should do, I think that it's worth touching on why you should do it. 

And to me, there are a number of reasons why we have to integrate fairness throughout the machine learning pipeline. But at a principal level, or a top-down level, they come down to two things. One is ethics, and one is business. 

Humans have ethical standards, and if we're putting models into production that don't meet those standards, we're really not living up to what we want to be. And then on the business side, suppose you're a non-ethical person and all you care about is money. Well, that's fine, but one thing that's really not good for your job is if your CEO is in front of a Senate subcommittee apologizing for what your AI model has done.

D Dehghanpisheh 14:46

I do not recall, Senator. 

Nick Schmidt 14:48

Exactly. That is not a good career move. 

And so I think that's the broad reason for why we want to make sure that fairness is incorporated throughout the pipeline, but then breaking it down and thinking about what's important there. It's asking the question, “where can discrimination come into a model? How can a model become unfair?” 

And that can happen anywhere, really. 

D Dehghanpisheh 15:13

So is there a particular stage of the model lifecycle where it's more likely to occur than not? 

Nick Schmidt 15:21

Yeah, I think there are two places. One is in the data itself, what you've decided to put into the model– 

D Dehghanpisheh 15:28 

In terms of the data training set? 

Nick Schmidt 15:30

The data training set, exactly. 

And then there's the model itself, the built model. There's a lot of chance that you could have bias in that. And so in the data stage, I think there's been so much press about data being biased, it may even seem like it's relatively obvious. But if you think about credit data for determining whether or not people get loans, there's historical and present day discrimination that's encoded in that data. 

For example, African Americans are much more likely to be fired without cause than are whites. And if you look at why people default on loans, very frequently it's because they lose their jobs.

And so if you've got that kind of information that's going into the data, you're going to encode that kind of bias. So even if you have the best intentions in the world, if you're using a data set that's biased, it's going to give you a biased result. 

The other way that data can be biased, and this is a more subtle one, I think is in coverage. One of the things that we see very frequently is problems with missing data, and missing data being more frequently occurring for minorities. 
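The coverage problem described above can be checked directly before training. The following is a minimal sketch, assuming records are plain dictionaries and that a missing value shows up either as `None` or as an absent key; the field names (`group`, `credit_history_years`) and the sample records are hypothetical.

```python
def missing_rate_by_group(records, group_key, feature):
    """Fraction of records per group where `feature` is absent or None."""
    counts = {}
    for record in records:
        group = record[group_key]
        total, missing = counts.get(group, (0, 0))
        is_missing = record.get(feature) is None
        counts[group] = (total + 1, missing + int(is_missing))
    return {g: missing / total for g, (total, missing) in counts.items()}


# Hypothetical applicant records; None marks a missing credit history length.
applicants = [
    {"group": "A", "credit_history_years": 12},
    {"group": "A", "credit_history_years": 7},
    {"group": "A", "credit_history_years": None},
    {"group": "A", "credit_history_years": 3},
    {"group": "B", "credit_history_years": None},
    {"group": "B", "credit_history_years": 5},
    {"group": "B", "credit_history_years": None},
    {"group": "B"},  # feature was never collected for this record
]

rates = missing_rate_by_group(applicants, "group", "credit_history_years")
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.0%} missing")
```

A large gap between groups suggests that whatever the pipeline does with missing values (dropping rows, imputing a default) will affect one group far more than the other.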

[From AI Audits: Uncovering Risks in ML Systems; with Guest Shea Brown, PhD]

D Dehghanpisheh 16:46 

So with that Dr. Brown, bias in AI and the need for understanding these AI systems–AI auditing, if you will–is entering the mainstream conversations, thanks to ChatGPT, large language models, and it's really kind of getting a lot of buzz. And you talked about the recidivism predictive rates that ProPublica was talking about. 

What are some harms that you're concerned about right now that you think more AI auditing needs to be applied towards?

Shea Brown 17:15 

Almost anything that you can think of, there is a potential for harm. And so, I'll speak very generally real quick, because I think often times people will focus on a particular use case. For instance, recidivism algorithms, or AI used in hiring processes that could be biased, or giving people loans. Those are all things where it's very obvious that if you have a biased algorithm, that it could cause harm, could discriminate against people. If you have an algorithm in some cases, like facial recognition, where simply misusing, it doesn't have to be biased even–and often they are, of course–but it could simply be misused and it could cause harm. 

But I think in general, if you take a process that has typically been done by a human in the past, and then you try to automate that process, almost by definition you have these filter effects that happen because you're taking something where you have this very holistic decision that was made by a human and you have to squish it down and conform it into some automation. Select certain features. Select information and turn it into an automated process that is going to give rise to some sort of biases or some sort of filter effect. Almost without fail. You can't avoid it. 

And so I think all of these systems, whether they use machine learning or not, those sort of things have to be considered and typically they haven't really in a kind of rigorous way. So in my mind, almost any automated system that makes some consequential decision about a human's life–whether they get a loan, or a job, or get social services–has to be inspected through this lens of potential for harm.

Charlie McCarthy 18:48 

Switching from what we might consider to be public harms to more organizational risk. What types of risks do organizations face without audits and risk assessments of these AI/ML systems and/or their automated decision making tools that are powered by algorithms? 

Shea Brown 19:05 

The biggest one is probably reputational risk. The biggest one meaning the first one. I think the biggest risk that happened early on was reputational risk. Someone would write something about an algorithm that was doing something wrong that was not good for the companies that were operating those algorithms. And that was sort of the first wave of interest in doing something about this. 

Recently, there has been another wave of liability risk. People are starting to sue companies for this, and now regulatory risk is rearing its head. Because the EU is very active, local laws in the United States, federal laws or federal enforcement agencies are starting to talk about these issues. And so for an organization, these are kind of the top level risks. 

There's also a whole lot of things that could happen on the ground, so to speak, where you try an AI initiative, you spend a lot of money on it, it turns out to be not good for a variety of reasons, and you end up wasting a lot of money. And so there's some sort of organizational efficiency and money lost basically, for not having considered potential downsides of these tools as opposed to just focusing on the upside. 

D Dehghanpisheh 20:10 

You've talked about reputational risk. You've talked about, let's call it brand risk, or what used to be called “New York Times front page risk,” right? And you also talk about the risk, if you will, of capital, time, and human beings' capabilities largely being wasted in doing these types of audits.

I guess that brings back the center question of who is asking for AI audits? Especially since they're not necessarily being mandated yet. Who in organizations are raising their hands saying, “You know what, we absolutely need this, we must invest in this,” whether it's time, people, technology, or something else?

Shea Brown 20:46 

Good question. So people are asking for them, and it kind of comes with a similar wave. So the first wave that we saw of interest, where people really wanted this done, was coming from that reputational wave. So people who had reputational pressure, it was very obvious that they had to do something about it in a way that was thoughtful, and increased transparency. And auditing or assessments of their algorithms by external parties is kind of a clear way to go. And now there are laws like New York City Local Law 144, which requires auditing for hiring algorithms– that's forcing a lot of people. A lot of people we're working with right now have to comply with that law and need to get audits done. 

But what we're also seeing is procurement. So, these bigger companies, the enterprise companies, Fortune 50 companies which have had some reputational risk and are seeing the regulatory risk on the horizon, are now in the procurement process, asking tougher questions of their vendors. So, someone who produces an AI and is trying to sell it to an enterprise organization, they're going to have to answer some tough questions about, have you tested for bias? Have you taken care of potential adversarial or cybersecurity risks that are unique to AI? And those sort of things are being required of– these large enterprises, are requiring them of the vendors. And so we're starting to see this market push of trying to push the ball down into the vendor's court and have them actually do something.

[From ML Security: AI Incident Response Plans and Enterprise Risk Culture; with Guest Patrick Hall]

D Dehghanpisheh 22:06 

Most public companies have a risk and compliance function. And it feels to me like on one hand there's kind of risk and compliance, but underneath that risk and compliance entity is the function of governance, right? 

So, if you think about governance in that nested element and you think about governance of AI applications and machine learning systems in terms of mitigating risk, with the introduction of an AI application which is powered by an ML system, what are the aspects of governing those machine learning models and those machine learning systems as you see it? And why does that matter? 

Patrick Hall 22:42 

I’m a big proponent of governance of people. Now, of course, data and technology systems also need governance. But forgive me for not believing the paper that says this is the beginning of Artificial General Intelligence. Like, we're still in the days where if a computer misbehaves, you just go unplug it. And so, governance should be mostly about people. And there's many different ways to do this. 

One is likely not realistic for most companies, but I like to point it out because I think it is very effective. In large US banks there's what's known as a Chief Model Risk Officer. And this is actually recommended by the Federal Reserve and other banking regulators. And so, this is a single person, a single throat to choke when things go wrong. And it's also important to say that this person is rewarded when things go well and that this person, in contrast to many directors of Responsible AI out there, this person has a large budget, very high stature in the organization, lots of staff. They have the ability to do things.

And so I would point out that if you can put one person in charge who gets in trouble when things go wrong and gets rewarded when things go right, that's a very, very serious sort of governance mechanism. Now, most companies will have to–because they don't have the regulatory mandate to do this–they'll have to come up with sort of softer structures that involve committees and that involve sort of regular audits and things like this. And not to disparage those, because I think they're very important, and that's most of what companies can do. I just like to throw out this idea that I think is very effective of the Chief Model Risk Officer, which is a real thing in the largest banks in the US.

D Dehghanpisheh 24:19 

Well, we’ve seen that with roles like that of Agus at Wells Fargo, who we both know. Agus is Executive Vice President, Head of Corporate Model Risk at Wells Fargo. And there are others in finance where this role of managing “model and AI risk” is getting increasingly important. 

Patrick, from your experience, what is that role in terms of guiding and determining everything from model selection to monitoring various feedback loops, et cetera? Talk to us a little bit about that.

Patrick Hall 24:48

Well, whoever it is, ideally it's a Chief Model Risk Officer, maybe it's an AI Risk Committee, maybe it's some other more standard risk function. They need technical people reporting to them. And what really is important, and again I see a lot of technology organizations struggle with this, is that the people on the testing and validation side have equal stature and equal pay with developers. 

And it's really important that even if you can't do that, even if you can't pull off that trick where the testers are at the same level of organizational stature as the developers, you need very qualified people on the risk and validation side who can engage in what's known as effective challenge. Who can say, why did you pick this model? Why are you only monitoring this six months? Last time we had a model like this, it blew up after three months. Why did you use this algorithm that you just made up instead of this very similar algorithm from this big textbook that everybody likes? 

So, ideally, these people operate with the same stature as the developers, but you need the testers and validators to have the same kind of technical background as the developers so that they can essentially ask hard questions. And that promotes effective challenge and critical thinking and common sense. A lot of these technology problems are human problems, as you well know.

[From Privacy Engineering: Safeguarding AI & ML Systems in a Data-Driven Era; with Guest Katharine Jarmul]

D Dehghanpisheh 26:12

Talk to us a little bit about some of the main data privacy and security risks associated with the ML models. And I'm wondering if, for our audience, you could categorize those risks based on where they occur in the model lifecycle. 

Katharine Jarmul 26:24

Yeah, so, I mean, obviously we have the entire data infrastructure that might collect the data. And depending on how advanced your machine learning setup might be, this might be a series of pipelines and a series of lakes or warehouses or however you have it. And if people are using feature stores, then you might even have a situation where you're ingesting data and putting it directly into a feature store.

The biggest problem that we have in these systems, whether or not you have, like, a staging and then a production and then a feature store that's derived from production, is what happens when somebody comes and they say, like, “Oh, my data is incorrect,” or, “Oh, I'd like to delete my data,” or you have data retention questions, or you have consent questions with regard to data privacy regulation or something like this.

And we often don't have very good lineage tracking in these systems. And often when we have lineage tracking, it's fairly, let's say, not well connected and well understood how to query that in an automated way. And so we often run into these problems where in the feature store we actually have some features that we've put maybe even into production use cases or that we've essentially engineered and put into either a feature store or indirect pipelines that go into training models.

And nobody can answer the question of consent information. Nobody can answer the question, when, where, how is the data collected? Nobody could answer the question, did we get the ability to use this for machine learning? And nobody can answer the question, if somebody comes and they ask for their data to be removed, what are we supposed to do with the feature pipelines or the artifacts that were created with those features? And this is a huge mess. 

So this is the first stage, which is basically data processing, preparation, and maybe even feature engineering. We haven't even got to training and we have these big problems. Sorry, go ahead. 

D Dehghanpisheh 28:28

Yeah, and I mean, just on that alone it seems like most of the regulatory frameworks that guide a lot of the thinking in terms of data privacy as it exists now, take GDPR or others. They're not encompassing for that model lifecycle, which means I may have solved it and said, okay, I got a request to remove this person's information, but it's memorized in a training data set. 

And if that training data set is memorized in the model, it just kind of cascades through and it's like, well, that was a good, valiant attempt, but it failed because it's already kind of like spawned literally into all the other models and maybe even this kind of cascading design of model training where the output of one is training, the input of another. 

Is there really a way to comprehensively address this in your mind? 

Katharine Jarmul 29:15

Yeah. So as part of the book, I wrote an idea that I had which is kind of based off of–I'm certain somebody's referenced it, but let me just reference it first for listeners to this episode–the idea of model cards. This model card mindset of, hey, what data went into it? What were the training mechanisms? What was the evaluation criteria? And really, we should have this anyways.

Just as you all know, from an MLOps perspective, we need to have a more organized and better able to query and use way of comparing models. And we have some systems, like some of the things that I know Weights & Biases first led, but now many, many people work on, which is, how do we evaluate from a performance perspective, the many different models that we might be training. We might even have shadow model systems, and how do we evaluate and switch models?

But we don't have that really, from a governance perspective very often. And from a governance perspective, data governance perspective. And also, I guess you could call it also model governance perspective. How do I quickly look up what models have been trained with data, from what locations? How do I quickly segregate, let's say, models trained on data of Californians, right? Whenever the new privacy law in California goes into effect versus other places, how do I look at sunsetting contributions of certain regions over time? 

And these are also really important questions from just model drift and data drift perspective, as well as to better understand where is the data coming from? What's the population represented here? In what use cases do we want to divide things by regions or divide things by particular user groups or something like this? And coming back to my social scientist roots, population selection is a nontrivial problem in data science by default from a statistical perspective. And I think that it's an undervalued problem in a lot of machine learning workflows to actually think through what is population selection. 

And for companies that operate in Europe, as the AI Act gets going, you will have to put a lot of thought into population selection. And these types of things are definitely going to need to be better documented.

So in the book, I describe what I call a privacy card for machine learning models, where I suggest some things that you can record, like what privacy technologies you evaluated. Where does the data come from? Does the model need to be deleted at a certain point in time? How are you allowed to use the model based on the data that was used to train it?

And I think thinking through model retention periods is probably just a good practice by default. It's sad when I meet teams that say they're using the same model for a year, because I think that that probably means they're not actually evaluating the model in any type of real-time sense.
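
As a rough illustration of the privacy card idea, here is a minimal sketch that records the questions listed above and checks a retention deadline. The class name `PrivacyCard`, its field names, and the example values are assumptions made for illustration; the book's actual privacy card may be structured differently.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class PrivacyCard:
    """Hypothetical record of the privacy questions asked about a model."""
    model_name: str
    data_sources: List[str]
    privacy_technologies_evaluated: List[str]
    permitted_uses: List[str]
    delete_by: Optional[date] = None  # None means no retention deadline was set

    def is_past_retention(self, today: date) -> bool:
        """True if a deadline exists and the given day falls after it."""
        return self.delete_by is not None and today > self.delete_by


# Illustrative values only; not a real model or data set.
card = PrivacyCard(
    model_name="churn-v3",
    data_sources=["crm_export_2023"],
    privacy_technologies_evaluated=["differential privacy", "pseudonymization"],
    permitted_uses=["internal churn forecasting"],
    delete_by=date(2024, 6, 30),
)

print(card.is_past_retention(date(2024, 7, 1)))
```

A check like `is_past_retention` could run in a scheduled job so that models past their retention period are flagged for retraining or deletion rather than discovered a year later.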

[From The Intersection of MLSecOps and DataPrepOps; with Guest Jennifer Prendki, PhD]

D Dehghanpisheh 32:01

Beyond security, there's this whole world and realm of governance and compliance, right? How should we think about DataPrepOps with regard to AI regulations, particularly regulations like the EU AI Act that's getting ready to be activated, or GDPR?

How does DataPrepOps intersect with data protection regulations? Can you talk a little bit about that? 

Jennifer Prendki, PhD 32:25

Of course, yeah. 

To put more perspective on your question, let's take a step back. I'll try to explain basically where I see DataPrepOps on the market. Right now people are familiar with the concept of MLOps, which in my mind is almost an overloaded term, because MLOps has really become the space of model preparation, model tuning, model deployment, and the operations of managing the model. 

And when you look at companies in that space, even though most landscape maps still have this little data preparation box in there, on the other end of the spectrum, you have the DataOps space, which is data privacy, data management, data storage, and everything that's the management of a raw data set. And so, in fact, DataOps is not necessarily associated to machine learning. You also need DataOps for business analytics of any sort, basically like sort of an in between kind of thing, right? 

So practitioners would know that in order to take that safely-stored data set on something like Databricks or Snowflake or one of the cloud provider's products, you basically need to have those manual operations. Normally we would do that manually, where you would transform data, you would annotate the data, you would remove outliers, you would potentially apply augmentations or whatnot, and then that would go into your machine learning model, right? 

DataPrepOps sits in between. So it's like the operational side and an attempt at automating and creating an end-to-end workflow from the moment the data leaves the DataOps layer and enters the MLOps layer. And so I think there is a lot of awareness to protect the data and provide security solutions in the DataOps layer. If you buy a product from a DataOps provider, you can have some faith that they're doing something to protect your data, avoid data leakage and whatnot. 

However, from what we said earlier, the model layer is not necessarily that protected because there is still no real awareness that there is a copy of the information that used to live in your data set and is now stored in your model. That's literally what MLSecOps is about, right? So, DataOps being in the middle, there needs to be more regulation on the model layer, the model preparation. Because if you have a machine learning model and you're going to deploy it somewhere, that's a vulnerability to your organization: you could have adversarial attacks and lots of problems from that. Right? I mean, in fact, somebody who could just trigger an inference endpoint could figure out a lot of things.

DataPrepOps being in the middle is even worse because, being in between, you're exposed to both the data side and the model side. The way we operate is really this logic where, while you're training the model, you're assessing which subset of the data would be useful and pulling that data at that specific point in time. So whatever new technologies and regulations exist and are going to be created for data security are going to apply to DataPrepOps, and the security implications for the model are also going to apply to DataPrepOps. So this is where everything needs to go.

Actually, this is something I just recently realized. This is very important, I think, for security people to realize in the context of AI regulation. GDPR comes from Europe, right? Culturally speaking, people in the US have a culture of asking for forgiveness more than permission, while Europe is the other way around, which basically gives a researcher in the US an edge to move faster, right? Let's move, let's figure out the regulations afterwards. But that gives an edge in developing technology faster.

Europeans, meanwhile, typically have the other position, which is that we cannot move technology forward unless everything's properly regulated. It's like two sides of the same coin in many ways. So the way I see it, a global awareness that there needs to be more security conversations for AI and for machine learning models is going to come from an enhanced conversation and the collision of the European mindset and the US mindset into something that's acceptable to both parties.

And I think we're going in this direction because, as you probably know, the CEO of OpenAI, Sam Altman, just called for regulation himself. So I think we're getting there.

[From A Holistic Approach to Understanding the AI Lifecycle and Securing ML Systems: Protecting AI Through People, Processes & Technology; with Guest Rob van der Veer]

Ian Swanson 36:44

That starts to get me to think about a lot of the regulation and as we go through governance and compliance of AI, especially as it relates to the EU AI Act. 

How can somebody leverage 5338 in relation to the EU AI Act? 

And what are your thoughts, as we talk about the risk of AI and understanding the development of AI with 5338, on how it correlates with the EU AI Act?

Rob van der Veer 37:06

Yes, the EU AI Act asks organizations to be aware of the AI initiatives and do a risk analysis. And indeed, to help do that risk analysis they can use 5338. 

5338 mentions several aspects that are new for AI but doesn't detail them. So if you want to learn more about fairness, for example, there are other standards and guidelines that deeply go into that. So I think that 5338 helps with regards to the risk analysis process in taking care of that part of the AI Act compliance. Yeah. 

Ian Swanson 37:45

Yeah, I agree. And as I read the EU AI Act, I definitely think that there's a tie-in to be able to say, okay, as we are trying to de-risk AI, understand that we're building responsible AI, I think there's a good framework in 5338 that we can lean on to make sure that we're starting to set ourselves up as perhaps an enterprise or organization for the EU AI Act that's coming.

For you and your team: people are seeking experts, and they're reaching out to you. Is there a particular area, perhaps as it relates to 5338, where they're looking for expertise?

Rob van der Veer 38:20

Mostly, people are wondering how they should prepare for the AI Act – the EU AI Act, which will set an example for regulations elsewhere. 

It's hard for them to predict how the standardization, how the exact technical standards will turn out. I'm in the middle of the action there. I'm part of the working group that works on these standards. And it's not an easy thing. Therefore, it's also hard to predict what technical criteria and process criteria will come out of that. So people want to have a better understanding of that. 

And also they want to have a better understanding of elements like fairness. That's what we see a lot, and next to that, security, I would say. And we see many AI initiatives struggling with transferring AI initiatives between teams. And that's not for the organization per se an AI question or issue. It's an engineering issue. Nevertheless, the gap that we were talking about in this session, that's the exact root cause of these engineering issues. 

So we get approached to make sure that portfolios are becoming better maintainable, better transferable, and then, well, in many cases, it turns out there's a data science, software engineering disconnect that is the root cause of these problems in software portfolios. 

Ian Swanson 39:47

Sticking on that point, I find that to be really interesting in terms of the transfer of work and kind of breaking down the silos, if you will. 

What is some of your advice that you give organizations there? Is it including data scientists and ML practitioners within the business units? You talked about developing best engineering practices. What is that initial advice that you give companies? 

Rob van der Veer 40:10

Well, first of all, and you mentioned it, the AI Act is about taking responsibility and knowing what you're doing and understanding the risks. I think that's the best place to start. 

So take responsibility for AI, create an inventory of your initiatives, make someone responsible for analyzing and managing those risks. And for the high risk systems then, you need to make sure that you arrange transparency through communication, through documentation, auditability. And these are all things that the EU AI Act mentions. Countermeasures against bias, human or automated oversight, such things, they all belong to taking responsibility. 

And then you can start mixing your software engineers and data scientists in teams. We feel that's one of the best ways to let data scientists be taught about software engineering, and at the same time let software engineers be taught about data science. Because there really is something to learn about AI that's insightful and useful for sort of traditional software engineers. 

So, mix teams and then make your AI teams part of the software engineering and security programs. Step by step. Not all at once, step by step. And you need to involve some expertise there while you're taking every step. So, you could do this for part of your organization, or maybe start with one team, see how it goes, and then, you know, let the success spread. Not big bang change. Just do it step by step. 

And while you're doing this, train your people on the AI particularities that we discussed, including model attacks. And that's about it. 

[Closing] 41:52

Additional tools and resources to check out:

Protect AI Radar

Protect AI’s ML Security-Focused Open Source Tools

LLM Guard - The Security Toolkit for LLM Interactions

Huntr - The World's First AI/Machine Learning Bug Bounty Platform

Thanks for listening! Find more episodes and transcripts at https://mlsecops.com/podcast.