Understanding the AI Lifecycle and Securing ML Systems

Sep 6, 2023 By Guest

Transcription:

[Intro] 0:00

Charlie McCarthy 0:29

Hello MLSecOps community and welcome back to the MLSecOps Podcast where we dive deep into the world of Machine Learning Security Operations!

Today we have a very special episode lined up for you. Joining us for the first time as a guest host is Protect AI’s CEO and founder, Ian Swanson. Ian is joined this week by Rob van der Veer, a pioneer in AI and security. Rob gave a presentation at Global AppSec Dublin earlier this year called “Attacking and Protecting Artificial Intelligence” which was a large inspiration for this episode. And in it Rob talks about the lack of security considerations and processes in AI production systems compared to traditional software development, and the unique challenges and particularities of building security into AI and machine learning systems.

Together in this episode, Ian and Rob dive into things like practical threats to ML systems, the transition from MLOps to MLSecOps, the [upcoming] ISO 5338 standard on AI engineering, and what organizations can do if they are looking to mature their AI/ML security practices.

This is a great dialogue and exchange of ideas overall between two super knowledgeable people in this industry. So thank you so much to Ian and to Rob for joining us on The MLSecOps Podcast this week.

And thanks to all of you out there for listening and supporting the show!

With that, here is an MLSecOps conversation between Ian Swanson and Rob van der Veer.

Rob van der Veer 2:16

So my name is Rob van der Veer. I'm from the Netherlands. I have a long history in AI and in security, starting 31 years ago as a researcher, a programmer, a hacker, a data scientist, consultant, and also a CEO. And right now, I'm Senior Director of AI Security and Privacy at Software Improvement Group (SIG), working with large organizations all around the world on these topics. And I like to be where the action is when it comes to standardization. I initiated Open CRE, which links security standards across industries and across roles. I led the writing group of ISO 5338 on AI lifecycle [still in draft at the time of this episode publishing, but available to download and read]. I created the OWASP AI Security and Privacy Guide, and I'm currently deeply involved in the European Cyber Resilience Act and the European AI Act, working with ISO, CEN-CENELEC, and ENISA. And that's me in a nutshell.

Ian Swanson 3:16

Welcome to The MLSecOps Podcast, Rob. I watched your Global AppSec Dublin presentation titled "Attacking and Protecting Artificial Intelligence," and knew immediately we had to have you as a guest on the show.

My first question to you, Rob, is should consumers leave the microphone enabled all the time on voice assisted technology? Alexa, Siri, Nest, you know, all of those devices.

What are your thoughts?

Rob van der Veer 3:40

Ian, thanks for that question. It was my question for the audience in Dublin, as you know, at the start of my talk. And it turned out that many people in the audience have a voice assistant, and a large part of them disables the microphone. And I think that illustrates the general lack of trust in how privacy and security is dealt with - with technology, especially by security people, of course, but especially with big tech vendors, and especially with AI.

Having the ability to switch it off is a good example of, I think, a useful countermeasure. I do this with one exception, the bathroom, because there's usually in our house, just one person, there are no conversations, maybe some singing. So I basically did my own little risk analysis there.

Ian Swanson 4:30

Yeah, I definitely wanted to start with that question because I thought it was a super interesting one that you asked the audience in your presentation. Speaking more about your presentation, again at the Global AppSec Dublin Conference, you talk about treating production AI as traditional software.

In your experience, is production AI not being treated as traditional software, and what is missing?

Rob van der Veer 4:54

Yeah. So, I see a lot of companies in my work. I work for SIG, I work with many organizations. And I see how they're dealing with AI engineering. And it's traditionally treated in a very different way. Different people than traditional software engineers, different tools than traditional software tools. And the type of programming is more, how you say, lab programming, mostly done by data scientists.

And they're really good at that work, and they're really focused on attaining a working model. That's what they want to do. They try all sorts of things, throw things away, copy paste, whereas traditional software engineers are more focused on creating resilient software for the future. And it's less experimental. So it's a different way of working. So there's a disconnect.

And we did some research at SIG, and it shows that the large majority of AI systems score way below the benchmark when it comes to maintainability. And also the amount of test code in these systems, it's about 2%, where the industry average is 43%. And we recommend 80%, by the way.

It is like as if software engineer best practices don't apply for AI, but they should, because once you have that working AI model, it needs to be secure and then you need to be able to keep on maintaining it. All that code that you use to collect the data, transform the data, train the model, you want to be able to transfer it maybe to another team. And for that you need documentation, which also is typically missing in AI engineering.

So, it's not a bad intention, but it's a different way of working and it's quite a problem, I believe. AI really needs to be included more into regular DevOps and security programs.

Ian Swanson 6:47

Why do you think that's not happening today?

And I think my question here is somewhere around, what are the particularities, if you will, of AI software versus traditional software that is leading to some of these differences around the people, the tools? Why are we not seeing as much testing, if you will, within these systems? Why is this the case?

Rob van der Veer 7:10

Yeah. Talking about particularities, I worked on the ISO Standard 5338, which is the standard on AI engineering, and it describes those particularities. And the most important ones, I believe, are the heavy reliance on data.

The behavior of the AI model relies on data, which means that engineers are directly working with sensitive production data, which is a security risk. And it means there are new development pipelines, data engineering and model engineering. So they're completely new processes to the organization that take place in different departments. More of a sort of a lab department or a research department. So organization-wise, and people-wise, and tool-wise, there's already a disconnect from the start. And that's one of the reasons that there's such a difference and a big gap to be filled.

Then also AI models are probabilistic, incomprehensible in most cases. So what AI mostly does is induce instead of deduce. It is guessing. It will often be incorrect to some extent. It uses a line of reasoning that humans typically cannot follow. And there are exceptions, of course, but they come with their own downsides.

And another particularity is that AI models often interact autonomously with the real world. And that increases the impact of AI risks, which is all the more reason for AI models that come out of the lab into production really need to be taken care of.

Ian Swanson 8:44

So is it fair to say that the machine learning development lifecycle is definitely different than the software development lifecycle? Not only from a standpoint of the process, the tools that are used, but also even the background of the people that are going through that process?

Rob van der Veer 9:02

That's such a good question.

You know, MLSecOps is certainly a separate topic of research and development, but in organizations, it should not be a separate thing. And this is exactly what the ISO standard that I just mentioned advocates. Treat AI engineering not as a separate lifecycle, although from the start, there will be separate departments and separate people, but you need to make one thing out of that.

And then MLSecOps becomes that part of your comprehensive approach that deals with machine learning. Because for sure, AI could and should shift left, so to speak. And not just for security, but for all software engineering best practices. And the best way to do that is to use one single approach to engineering as a whole.

Ian Swanson 9:50

As we think about the security of ML systems and AI applications, what are some of the practical threats? And perhaps you can give threat examples related to the machine learning supply chain.

Rob van der Veer 10:03

Yeah. Well, it helps to see the AI threats in three categories.

First, we have the regular application security threats such as denial of service. Then second, the model attacks. Mostly machine learning model attacks. For example, when you try to mislead a self-driving car into thinking a stop sign is a 55 miles an hour sign, for example. And then third, the attacks on the new development pipelines, the data engineering and the model engineering. And that's where the new supply chain risks are.

For example, data that you obtained can be poisoned to mislead your model. An already trained model that you obtain to fine tune yourself can have been manipulated. So there's more complexity in a supply chain than in traditional software engineering. And there's another problem, which is that these issues are very hard to detect.

Things in the data that are changed somewhere. If you have millions of records in that data set, it's hard to detect. You cannot just code review the data or code review the model. So it is a very high risk, a supply chain risk for AI systems.

And just to elaborate on regular application security, if you look at generative AI, and in particular large language models, there's a lot of talk about prompt injection. And it's a big issue. It's a really big issue. Of course, from an application security perspective, it is like SQL injection and like cross-site scripting [XSS]. It's the same mechanism, but you need different countermeasures, really.

So it's important that the people responsible for application security understand these AI particular risks with large language models. And a great resource for that is the recently released new version of the OWASP Top Ten for Large Language Models. It discusses the ten most prominent issues in large language models, of which prompt injection is one of them.

Ian Swanson 12:12

Yeah, it was great to see the OWASP Top Ten for LLMs published recently. Really powerful information, and it really helps with educating from a security perspective on what the new challenges are. As you mentioned, similar to SQL injections, these prompt injection attacks. But this is a brand new space that security professionals need to understand and understand the differences of AI and machine learning.

Now, I want to transition from your presentation that you gave on attacking and protecting artificial intelligence to, and you mentioned it multiple times, ISO 5338. You're one of the lead authors on ISO 5338, and that's really a standard of AI engineering that you set forth.

Can you provide an overview on what ISO 5338 is?

Rob van der Veer 13:02

Yes. The standard is based on an existing standard on software development lifecycle, which is the 12207, which is a renowned standard. It's from ISO, it's from IEC, it's from IEEE also, and it describes the general software lifecycle.

And what 5338 does, it doesn't define a completely new lifecycle. It goes by all the processes in 12207 and describes the relevant particularities. And in addition, it introduces, also, new processes. That way, it serves as a guide for any organization to extend their current way of working to AI instead of reinventing the wheel for risk analysis or supply chain management. There are so many great best practices and tools already in place in organizations that just need some clever extending, and that's what 5338 intends to do.

Just to name some particularities. The acquisition process for the supply chain needs to include data. That's relatively new. The supply process may need to facilitate continuous monitoring of a machine learning model. Continuously monitoring the performance of it. HR is an aspect of software development lifecycle, and you're dealing suddenly with data scientists with special expertise that you need to work with in an organization. And planning. You need to deal with the higher unpredictability of machine learning experiments compared to standard software engineering, which is already hard to predict as it is. And of course, for security, you're dealing with new threats like we discussed, and there are many hundreds of particularities listed in this standard.

Ian Swanson 14:53

As you have been presenting and sharing with organizations - ISO 5338 - what has been the feedback that you've received? Do they understand the differences between the AI system and that lifecycle versus the traditional software lifecycle?

And you mentioned some of these particularities and differences from traditional software development. And so one, what's the response to ISO 5338? And then two, are there any particular areas that you're having to explain further to say this is how it's different than traditional software development?

Rob van der Veer 15:31

Yeah, the response is different from the two realms. So, there's the data science realm, where there's a traditional way of looking at their lifecycle more based on the whole data mining theory idea. So for them, they need to get used to some of the concepts that are, I think, more common in software engineering.

But as you explain what it means, it's just a matter of how you frame it. It's about the same things. So they expect a certain visualization that they are familiar with, and when the visualization is different, they say, oh, but this won't work. But once you go into it, they love it. The software engineering realm immediately welcomes the additional insights regarding AI engineering.

So, overall, the response is good, but towards the data science community, it requires more explanation.

Ian Swanson 16:26

Got it. Let's discuss one area that I thought was really interesting in ISO 5338, and that was AI risk management process. Can you give us kind of a summary of that process and some of the particularities that live there that are AI specific?

Rob van der Veer 16:46

Yes, you need to deal with a whole new range of issues: unpredictability, transparency, explainability, fairness, staleness of the model, model drift; a whole new range of risks.

Risk analysis is not changing as a process, but there are a couple of new challenges that need to be added to the list of the risk analyst, including dealing with purpose. And that's a very profound one. The question whether the use of the data is legal is a very important one, because often it turns out too late that a certain AI application is actually illegal. And you also see this when it comes to safety and risk of harm.

So this is why we made these questions also part of the business analysis process in 5338, which takes place even before you start with AI, just to see, okay, what are we going to do? What data we're going to use, from what sources? Let's see if this is a good idea or not before we go ahead. And I think that's a very wise thing. AI ideas may very well be unfeasible from a privacy or human rights perspective.

Ian Swanson 18:06

That starts to get me to think about a lot of the regulation and as we go through governance and compliance of AI, especially as it relates to the EU AI Act.

How can somebody leverage 5338 in relation to the EU AI Act?

And what are your thoughts in terms of as we talk about the risk of AI and understanding the development of AI with 5338 and how it correlates with the EU AI Act?

Rob van der Veer 18:37

Yes, the EU AI Act asks organizations to be aware of the AI initiatives and do a risk analysis. And indeed, to help do that risk analysis they can use 5338.

5338 mentions several aspects that are new for AI but doesn't detail them. So if you want to learn more about fairness, for example, there are other standards and guidelines that deeply go into that. So I think that 5338 helps with regards to the risk analysis process in taking care of that part of the AI Act compliance. Yeah.

Ian Swanson 19:17

Yeah, I agree. As I read 5338, I thought it was a very good, if you will, template for us to understand what is the process for AI development. And as I read the EU AI Act, I definitely think that there's a tie in to be able to say, okay, as we are trying to de-risk AI, understand that we're building responsible AI, I think there's a good framework in 5338 that we can lean on to make sure that we're starting to set ourselves up as perhaps an enterprise or organization for the EU AI act that's coming.

Now, in your assessment, I'm curious what your thoughts are in terms of are companies mature today to adopt the entirety of the AI system lifecycle process? And the second part of that are, like, what are the biggest gaps and challenges within an enterprise?

Rob van der Veer 20:12

Well, we observe that organizations are eager to learn, but also that they have a long way to go. So the gap is big. And the biggest gap is that AI engineering needs to be treated as software engineering first. And furthermore, organizations need to understand the specific AI risks more, including those model attacks.

I would say that the biggest challenge for this is the low availability of experts, which is why it's so important to create materials such as the 5338. And I love the fact that you already took a look at it, although it's still in its final draft phase. A lot is currently being produced by the way, it's not just the 5338. NIST has some really good publications, ENISA has– which creates a new challenge, how you find the information that you need. And that's the reason why I created the OWASP AI guide, just to provide a little bit of overview of all the resources that are out there.

Ian Swanson 21:14

You mentioned one thing is there's a lack of knowledge and perhaps expertise in this area. Are you seeing that across all geographies, all industries? I would imagine that the more regulated industries, let's say financial services, healthcare/life science. Are they more mature today than some of the other areas of business?

Rob van der Veer 21:34

Definitely, the more regulated industries are more mature. You can see this in the representation of experts in the standard committees. Many of them are from medical device manufacturers. So, yeah, there's a difference there in industries.

Geographically? No, I'm not sure. The world has become smaller. It's easier for people to work remote. We get requests for expertise from all over the world, no exceptions. So I guess that the shortage is everywhere.

Ian Swanson 22:06

For your team and for you - people are seeking experts, and they're reaching out to you. Is there a particular area, perhaps, maybe as it relates to 5338, that they're looking for expertise?

Rob van der Veer 22:17

Yes. Mostly, people are wondering how they should prepare for the AI Act – the EU AI Act, which will set an example for regulations elsewhere.

It's hard for them to predict how the standardization, how the exact technical standards will turn out. I'm in the middle of the action there. I'm part of the working group that works on these standards, and it's not an easy thing. Therefore, it's also hard to predict what technical criteria and process criteria will come out of that. So people want to have a better understanding of that.

And also they want to have a better understanding of elements like fairness. That's what we see a lot, and next to that, security, I would say. And we see many AI initiatives struggling with transferring AI initiatives between teams. And that's not for the organization per se an AI question or issue. It's an engineering issue. Nevertheless, the gap that we were talking about in this session, that's the exact root cause of these engineering issues.

So we get approached to make sure that portfolios are becoming better maintainable, better transferable, and then, well, in many cases, it turns out there's a data science-software engineering disconnect that is the root cause of these problems in software portfolios.

Ian Swanson 23:50

Sticking on that point, I find that to be really interesting in terms of the transfer of work and kind of breaking down the silos, if you will.

What is some of your advice that you give organizations there? Is it including data scientists and ML practitioners within the business units? You talked about developing best engineering practices. What is that initial advice that you give companies?

Rob van der Veer 24:13

Well, first of all, and you mentioned it, the AI Act is about taking responsibility and knowing what you're doing and understanding the risks. I think that's the best place to start.

So take responsibility for AI, create an inventory of your initiatives, make someone responsible for analyzing and managing those risks. And for the high risk systems then, you need to make sure that you arrange transparency through communication, through documentation, auditability. And these are all things that the EU AI Act mentions. Countermeasures against bias, human or automated oversight, such things, they all belong to taking responsibility.

And then you can start mixing your software engineers and data scientists in teams. We feel that's one of the best ways to let data scientists be taught about software engineering, and at the same time let software engineers be taught about data science. Because there really is something to learn about AI that's insightful and useful for sort of traditional software engineers.

So, mix teams and then make your AI teams part of the software engineering and security programs. Step by step. Not all at once, step by step. And you need to involve some expertise there while you're taking every step. So, you could do this for part of your organization, or maybe start with one team, see how it goes, and then, you know, let the success spread. Not a big bang change. Just do it step by step.

And while you're doing this, train your people on the AI particularities that we discussed, including model attacks. And that's about it.

Ian Swanson 25:54

Yeah, that's great advice, Rob. It's really how enterprises can really look to mature their AI practice as a whole, as well as be compliant to a lot of the new regulations that are coming.

What's next for 5338? As you said, it's in draft form. So what are the next steps and what's the timing there?

Rob van der Veer 26:12

Yeah, we've been working on it for years, and we're really excited that it's very close to being published. Within one or two weeks, I expect to hear the big news that the last formal steps have been taken.

About 40 different countries have given their vote, their positive vote. We had 400 comments to process, so, as you may understand, it's quite a hot topic, so many opinions and many good input. And I think we managed to create consensus and what's next is, therefore, the release of it and hopefully many people downloading it. It's an ISO document, so you need to pay for the download, but it's just a very small fee compared to the value that it will bring. So I have great hopes.

And then, of course, we're going to continue working on it to create a new version because there's never a dull moment in AI.

Ian Swanson 27:10

Oh, I mean the pace of innovation in AI is just rapid.

Rob, what an effort. Like you said, it's been a couple of years. You're close to that finish line of getting the document published in its final form. As you stated, though, the work is never done. But congratulations.

By the time our listeners get a hold of this podcast, that’ll be out there. ISO 5338 will be something that they can go, they can get, they can read. [Not released yet, but available to download and read] It's got some massively important insights within it.

And then just to summarize the conversation that we had today, there's quite a few things that you mentioned that I think are worth repeating.

First off, I agree with you in the first question that we had of, do we turn off some of these devices that are constantly listening to us? We do in my household. So, I agree with your initial input there to that question.

Now, as we jump into ISO 5338, I love the concepts in there around responsibility. I love how we frame up in there the differences between standard software development to the ML development lifecycle. You highlight things like the risk, the responsibility, trusted AI, the security of AI.

And as we bring together all these concepts, you mentioned that it's really important that security is injected into the process that is the development of machine learning and AI.

I really appreciate the conversation, Rob. And I ask all of our listeners, go, please read ISO 5338 [still in draft at the time of this episode publishing, but available to download and read]. It's wonderful. Thanks so much for the time, Rob.

Rob van der Veer 28:48

Nice talking, Ian, thanks very much.

[Closing] 28:50

Additional tools and resources to check out:

Protect AI Radar

Protect AI’s ML Security-Focused Open Source Tools

LLM Guard - The Security Toolkit for LLM Interactions

Huntr - The World's First AI/Machine Learning Bug Bounty Platform

Thanks for listening! Find more episodes and transcripts at https://mlsecops.com/podcast.

Additional tools and resources to check out:

Guest

SUBSCRIBE TO THE MLSECOPS PODCAST