Audio-only version also available on Apple Podcasts, Spotify, iHeart Podcasts, and many more.
Episode Summary:
In this episode of the MLSecOps Podcast, host Dan McInerney from Protect AI and guest Sierra Haex dive into the evolving world of AI security. They share firsthand experiences from researching AI supply chain vulnerabilities—examining tools like Ray and the risks posed by untested model files and insecure deployment practices. The conversation covers how traditional security testing applies to AI, the innovative use of LLMs for code analysis and bug hunting, and the challenges and benefits of deploying AI agents. Tune in to learn more!
Transcript:
[Intro]
Dan McInerney (00:08):
Alright, welcome to this edition of the MLSecOps Podcast. Today we have special guest Sierra Haex, who is actually a former coworker of mine. We were both on the research team at Coalfire doing security research, what, five years ago or something?
Sierra Haex (00:21):
Yeah, definitely. Yeah.
Dan McInerney (00:22):
Yeah, and so now we're both doing research into AI and AI security, which we've been doing for several years now. We kind of started together, doing research for Protect AI on the original huntr bug bounty program.
Sierra Haex (00:34):
Yep. Right before like everything blew up.
Dan McInerney (00:37):
Yeah. I think you were like one of the very first testers of AI supply chain security on that bug bounty program. That was two years ago.
Sierra Haex (00:46):
Oh my goodness!
Dan McInerney (00:46):
I know, it's been a long time. So, now you're doing... Explain what you're doing now with some of the AI research that you've done recently.
Sierra Haex (00:54):
Yeah. So in the past year or so, I've been helping DARPA out with their AIxCC competition, where we are testing the limits of AI models. My current role in that is to build challenges that test AI's ability to find vulnerabilities in programs, with the end goal of actually going and securing all the infrastructure that underlies the internet, all the invisible stuff, so that another Log4j doesn't happen. It's a fun mission and it's a really good time.
Dan McInerney (01:31):
It's interesting because the path we've both taken, essentially, was to start by looking at the whole AI ecosystem and asking, hey, what's new here? When an organization adds AI to its departments, for whatever reason, whether they're doing sales research or using an LLM as a chat bot for customers, they're gonna have to have a machine learning department. And that department has security issues that don't really exist in other departments and that traditional security can't necessarily cover, although there is quite a bit of overlap between traditional security and AI security. So when we started all this, we looked at that and saw that the supply chain was probably the most vulnerable part of an organization that chooses to start employing machine learning models and agents and LLMs. And that's where we started, with tools like Ray, for instance.
Sierra Haex (02:26):
Oh yeah, absolutely.
Dan McInerney (02:26):
What was some of the research that we did on Ray? Cause we actually did this together. We were texting on the phone, I think at the same time looking at some Ray code. What is Ray, first of all?
Sierra Haex (02:35):
Well, Ray is an AI system that is used to ingest data, and then you can perform queries against the model that you trained. And you had messaged me and we kind of just jumped on it at the same time, and we were going through it as fast as possible, and we discovered a couple of very vulnerable things in it. One of the things with AI is that it has components that are unique to AI, but there are also a lot of components that just overlap with infrastructure. So the same skills, web testing and reverse engineering and code analysis, apply to these systems as well. And so we were performing code analysis against Ray.
Dan McInerney (03:19):
Yeah. And what Ray also does is create a cluster of computers to train models. It's used by a lot of major organizations; I think Uber and Netflix use Ray too. So Ray allows you to build models on distributed sets of computers, massive amounts of computing power, and it's used by a lot of large companies. But the issue was, because the speed of development was occurring so fast, and still is, a lot of these tools are being built out by machine learning engineers, and there's not a lot of security testing going into them. Or else they'll just say, "Hey, listen, we put this product out here, it's not that secure, but here are some guidelines to help you secure it." Like, only run Ray in an isolated instance with access to absolutely nothing else, or something like that.
Sierra Haex (04:06):
Yeah, and that's a product of just like how fast the industry's moving. How quickly new things are developing on it.
Dan McInerney (04:11):
This is not a knock against the developers of these AI tools we found all these bugs in, either. It's just that first to market gets the biggest chunk of the utility and the user base and all of that. So it's a function of the speed of development, not necessarily a function of, oh, the engineers are dumb and bad.
Sierra Haex (04:30):
Oh, no, not at all. It's just, yeah.
Dan McInerney (04:33):
But it's something you gotta think about when you are employing machine learning in your environment. A lot of these tools have not been security tested, and so we started doing security testing. We found some remote code executions, some pretty big issues from a hacker's perspective, because these things are remotely attackable. So if someone sends you a phishing email, for instance, with a link pointing to your internal Ray instance, and you click on that link, that's remote code execution, arbitrary code that the hacker wants. So it doesn't matter that Ray isn't exposed to the internet or anything like that; it's still a fairly big security hole.
So, when you're thinking about how to secure your organization from external threats in your AI environment, supply chain should be a fairly large portion of your time spent. Check out the previous CVEs, do some basic security testing on any of the tools that you're using, and you can avoid things like the Ray remote code execution.
Sierra Haex (05:28):
Yeah, and there were a couple of ways we attacked that. We went after the management software, then we went after the individual compute nodes, and we were able to look at it as a holistic attack surface.
Dan McInerney (05:41):
And this was all just traditional security testing. We didn't use LLMs to find these bugs, and we weren't attacking LLM models; we were just attacking the infrastructure that stands these LLMs up. And so this brings us to visibility. In a lot of environments, I feel like a mature IT organization knows exactly where all their computers are. They have it all organized nicely, they know where this IP address goes, they can tell when this computer's connecting to that printer, that sort of thing. I don't really see that in organizations that start deploying models internally. A lot of times it's the wild west, where the machine learning engineer has to just download whatever they can to get the job done, and there's no central organization of, you know, where are these models stored?
Dan McInerney (06:24):
What data was this trained on? Was this trained on personally identifiable information, HIPAA-regulated data, credit card information? All that stuff is kind of nebulous right now. So if you are in the role of trying to secure the AI infrastructure in your organization, figuring out where all these things are, and putting constraints and limits on where machine learning engineers can store models and things like that, is gonna help a lot. Centralized registries like MLflow are a pretty good example: it's a good place to store all your models, a centralized repository. Just tell your engineers, this is the only place we're gonna put this stuff. Or if you're in the cloud, that makes it a little bit easier. Amazon has Bedrock, which gives you a central repository of where all your stuff is and makes it a little easier to organize. But that organization and visibility feels very, very important to eliminating the kinds of bugs we saw when we were pen testers, especially where non-mature organizations would have a Windows XP box just randomly lying around somewhere.
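To make the centralized-registry idea concrete, here is a minimal sketch of logging and registering a model with MLflow. The tracking URI, experiment name, and registry name are placeholders, and the toy model just stands in for whatever your engineers actually train.

```python
# A minimal sketch of the "one place for models" policy using MLflow's registry.
# The tracking URI and all names below are hypothetical placeholders.
import mlflow
import mlflow.sklearn
from sklearn.linear_model import LogisticRegression

mlflow.set_tracking_uri("http://mlflow.internal.example:5000")  # assumed internal server
mlflow.set_experiment("customer-churn")  # placeholder experiment name

X, y = [[0.0], [1.0], [2.0], [3.0]], [0, 0, 1, 1]  # toy stand-in for real training data
model = LogisticRegression().fit(X, y)

with mlflow.start_run():
    mlflow.log_param("C", model.C)
    # Registering under a name makes this the canonical copy everyone pulls from.
    mlflow.sklearn.log_model(model, "model", registered_model_name="churn-classifier")
```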
Sierra Haex (07:22):
Yeah, an old NT 4 command server.
Dan McInerney (07:24):
Yeah, or an ATM running Windows XP; we definitely saw that several times. So to avoid all of that, I feel like your first job is to figure out: what tools are your machine learning engineers installing? What models are they downloading? Where are they downloading those models from? And have all those models been scanned? And I think one of the parts of AI security that people overlook is the model file itself.
Sierra Haex (07:50):
Oh, yeah. Because usually they're Python pickle files that get unpickled into an AI system. And in Python, unpickling things is inherently unsafe, because there are ways to run code through that process. It's not needed most of the time, because models are usually just a set of point vectors; you just need the dimensions and then you load the point vectors in. But some models contain additional assets and ways of tweaking them, and so the industry kind of standardized on this pickled format, if you're looking at Python models. But yeah, it's inherently vulnerable, and if you open it on your box, it'll run code.
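A minimal sketch of what Sierra means, using a made-up class name: loading a pickle is the same as running code. The payload here only echoes a message, but a malicious model file can put anything in the same hook; safer loaders (for example torch.load with weights_only=True) exist precisely to avoid this path.

```python
# A minimal sketch of why loading a pickle is code execution. The class is
# hypothetical and only runs a harmless `echo`; a malicious model file can
# hide anything in the same __reduce__ hook.
import pickle
import subprocess

class NotReallyAModel:
    def __reduce__(self):
        # pickle calls this at load time and executes whatever call it returns
        return (subprocess.run, (["echo", "arbitrary code just ran"],))

blob = pickle.dumps(NotReallyAModel())
pickle.loads(blob)  # loading the "model" prints the message: load == execute
```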
Dan McInerney (08:36):
Right. And antivirus, we did this testing, antivirus doesn't detect malicious models at all. Malicious models just go under the radar. So you can't just rely on traditional antivirus and hope Microsoft Defender flags a model file and says, hey, this is gonna execute some crazy code on your computer. You've gotta actually scan it with model-specific scanners. Protect AI actually runs Guardian, a model-file-specific scanner, on Hugging Face to determine whether files are malicious or not. And of course this is never gonna be absolutely perfect; there are gonna be zero days that sometimes pop out. But this is one area where traditional security measures like antivirus just don't overlap with AI. Model files themselves are very unique to AI.
Sierra Haex (09:23):
Yeah, and Hugging Face is an amazing platform to share and learn from other AI developers, but there's currently no malicious code scanning in place for everything on it. So yeah, scanning is super important before you bring that code down into your environment and run it.
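As a rough illustration of what "scan before you load" can mean, here is a standard-library-only sketch that lists what a plain pickle file would import or call at load time. It is nowhere near a real scanner; purpose-built tools like ModelScan or Guardian also unwrap framework archives and allowlist the globals that legitimate frameworks are expected to use. The file path is a placeholder.

```python
# Inspect a plain pickle file without loading it: list imported globals and flag
# opcodes that call code at load time. Legitimate framework pickles will also
# show some of these, so treat the output as a prompt for review, not a verdict.
import pickletools
import sys

def inspect_pickle(path: str) -> None:
    with open(path, "rb") as f:
        for opcode, arg, pos in pickletools.genops(f):
            if opcode.name == "GLOBAL":
                print(f"offset {pos}: imports {arg}")
            elif opcode.name in {"STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ"}:
                print(f"offset {pos}: {opcode.name} (imports or calls something at load time)")

if __name__ == "__main__":
    inspect_pickle(sys.argv[1])  # e.g. python inspect_pickle.py model.pkl
```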
Dan McInerney (09:41):
Yeah. So this brings up the idea of security of AI versus AI for security. What we've been talking about so far is the security of AI: the infrastructure around it, where you're downloading the models, how to use the model safely so you're not putting out a chat bot that says, "Hey, you can buy this car for a dollar," and then being legally obligated to sell it. But AI for security is an interesting topic too, and it's been developing quite rapidly, like what you were saying about using LLMs to augment your security research. Can you go into a little more detail about how you're using that to augment fuzzing or something like that?
Sierra Haex (10:16):
Yeah! Oh yeah, absolutely. If you look at DARPA's work, they're trying to defend the internet, but another component of my work is looking for bugs and writing exploits for them. And when you have a massive amount of code, it's very hard for a human to go through, you know, 20 million lines of code, retain that state in your head, and then see the issues or trust problems that develop in the code.
So I've been able to take binaries, run them through a disassembler and decompiler, I use Binary Ninja, and get pseudo-C code. Then I can submit that to ChatGPT or Gemini or some of the other systems and say, "Hey, look at this function. Are there any bugs inside of it? And if so, tell me about them." From there you can use the basic bug-finding techniques that the models, the code models specifically, know about. And then you can use directed graph analysis to find paths between user inputs into the program and where the vulnerable function is. It's a really cool developing area that is yielding a lot of good results.
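A minimal sketch of the loop Sierra describes, assuming the OpenAI Python SDK and a placeholder model name; the pseudo-C snippet stands in for real Binary Ninja decompiler output, and in practice you would chunk the program function by function and verify every reported finding by hand.

```python
# Feed decompiled pseudo-C to an LLM and ask for candidate bugs.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Stand-in for decompiler output; a real run feeds one function at a time.
pseudo_c = """
int handle_packet(char *buf, int len) {
    char local[64];
    memcpy(local, buf, len);   /* attacker-controlled length */
    return parse(local);
}
"""

resp = client.chat.completions.create(
    model="gpt-4o",  # placeholder; any strong code model works here
    messages=[
        {"role": "system",
         "content": "You are a vulnerability researcher reviewing decompiled C."},
        {"role": "user",
         "content": "Look at this function. Are there any bugs inside of it? "
                    "If so, name the bug class and the line involved.\n" + pseudo_c},
    ],
)
print(resp.choices[0].message.content)
```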
Dan McInerney (11:28):
Yeah, LLMs just have this ability to like, conceptualize and contextualize massive amounts of data.
Sierra Haex (11:34):
It's so surprising.
Dan McInerney (11:34):
It's crazy. And we're watching this play out in real time. The rise of the hack bots is starting now. There are tools like Expo that claim, based on their benchmarks, to be about equal to a senior-level pen tester for application tests; it finds zero days. They actually have Expo running on HackerOne, and they have a whole bunch of different vulnerabilities they found on HackerOne just using an LLM. But it is a very different topic to talk about security of AI versus using AI for security.
We actually have a little bit of overlap there. We built Vulnhuntr, which is a web application security tool that reads code bases and finds zero days, remotely exploitable zero days. That kind of stuff, I think, is probably where the future of security as a whole is going. Where do you see everything going in, let's say, two years, when we have several new generations of models? How does a security professional start using those models to secure their own environment?
Sierra Haex (12:36):
Well, I think it's just gonna be part of the CI/CD system. You're going to have a model looking at all the code that developers create and submit, as your first line of defense against injecting vulnerabilities into a code base. And then there's reviewing all the existing code that your company has, all the legacy things that built up over the years, the technical debt. Using a system that can look at that data en masse and find bugs is gonna be super valuable.
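As a sketch of how that first line of defense might look in a pipeline, here is a hedged example of a CI step that asks a model to review the pull request diff and fails the build on a high-severity finding. The model name, the "SEVERITY:" reply convention, and the branch names are assumptions, and a human would still triage anything it flags.

```python
# CI step sketch: review the PR diff with an LLM and gate the build on the result.
import subprocess
import sys
from openai import OpenAI

# Diff the pull request branch against main; branch names are assumptions.
diff = subprocess.run(
    ["git", "diff", "origin/main...HEAD"], capture_output=True, text=True
).stdout

client = OpenAI()  # assumes OPENAI_API_KEY is available to the CI job
review = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[
        {"role": "system",
         "content": "Review this diff for security vulnerabilities. Begin your reply "
                    "with 'SEVERITY: high', 'SEVERITY: low', or 'SEVERITY: none'."},
        {"role": "user", "content": diff or "(empty diff)"},
    ],
).choices[0].message.content

print(review)
# Fail the pipeline only on high-severity findings; a human still reads the report.
sys.exit(1 if review.lower().startswith("severity: high") else 0)
```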
Dan McInerney (13:03):
I think the CI/CD pipeline makes perfect sense. GitHub has Copilot, and I can imagine it's very easy to pop something like that into your CI/CD pipeline, so whenever you open a GitHub PR it just scans it and says, "Oops, you've made a little mistake here." I see that absolutely as being the fundamental shift in security. And I think it can probably apply to AI models too: you download an AI model, and another AI analyzes the model itself. It can probably decompile it, open it up...
Sierra Haex (13:30):
It's turtles all the way down. Yeah, yeah, we can definitely...
Dan McInerney (13:32):
Which actually reminds me of DeepSeek. DeepSeek is super cool, but I feel like there are a lot of misconceptions. Can you explain what the DeepSeek release was, how it compares to the other models, and why it was somewhat controversial?
Sierra Haex (13:45):
Yeah. So the reason DeepSeek is such a big deal is that the cost of training models like, say, OpenAI's ChatGPT was in the billions of dollars. It was a ton of money to get the data ingested, and then, using the best hardware on the planet, they were able to train these models. The thing that really shook the industry was that DeepSeek didn't have that upfront cost; it was just in the millions, I believe. So it was orders of magnitude cheaper to develop, and they didn't need the best hardware in the world to train their models. And it's been kind of a revolution.
There are even more recent models. One that came out, like, the day before yesterday, I believe, is called S1, and it was trained on something like $50 worth of Grace Hopper GPUs. Each of these models is tested against a benchmark of questions, and a lot of them are answering PhD-level questions at a PhD level. It's wild to see the progress and how quickly things are jumping.
Dan McInerney (14:58):
Yeah, the moat around the major model creators is just not big anymore. DeepSeek is also a Chinese model, which has brought consternation to some people in the industry because they don't want to share their data with China. Just to clear up a little misconception here: DeepSeek is an open source, open weight model, meaning you can read and edit and change it however you want. There does not appear to be any way for the model file itself to send data back to China. It's been scanned by many other people to make sure the model file itself is safe, when you get it from a reputable source like Hugging Face, for instance.
The only issue you'd have is if you use DeepSeek's API. If you go to chat.deepseek.com, that is sending data back to a Chinese server and a Chinese company, and they're free to do with that data whatever they want. But if you just download the model into your environment, you're not exposing yourself to that risk of data exposure. You can also edit the model to do whatever you want. You just want to make sure you're downloading an official release of it.
Sierra Haex (16:02):
Yeah, and Hugging Face is actually a really good place to get it from because...
Dan McInerney (16:05):
Trustworthy.
Sierra Haex (16:06):
The team there has actually been recreating DeepSeek's model based on the paper DeepSeek released. So yeah, there's some really cool work coming out of Hugging Face.
Dan McInerney (16:15):
The DeepSeek model itself is a test-time compute model. It competes with o1, ChatGPT's o1; that's where it sits. It has a chain of thought, and it's kind of funny to watch it think. It goes, "But wait! There's this other variable in the code that I didn't think about. But wait..." It really chats like a human to you. It's very cool. But I don't think it's really a risk for US companies to use this model, because it's open source and open weight, and that's a very, very important fact to know when you're trying to secure your organization and you're using external models.
Sierra Haex (16:45):
Well, and it's such a productivity boost as well, because if you look at ChatGPT, the amount of compute required to run that, you can't run that locally. But you can run DeepSeek locally.
Dan McInerney (16:57):
Yeah, if you've got what, like four A100s?
Sierra Haex (16:59):
Yeah, yeah...
Dan McInerney (17:00):
It's a little expensive, but I mean, I think it's cheaper than, you know, if you were to run all of the mixture of experts that ChatGPT has.
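For illustration, a hedged sketch of running a smaller distilled DeepSeek variant locally with the Hugging Face transformers library; the model ID is an assumption, and the full R1 release needs far more hardware than a single workstation.

```python
# Run a distilled DeepSeek variant locally instead of calling the hosted API.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",  # assumed Hugging Face model ID
    device_map="auto",  # spread weights across whatever local GPUs are available
)

out = generator(
    "Briefly explain why unpickling untrusted model files is risky.",
    max_new_tokens=200,
)
print(out[0]["generated_text"])
```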
So we talked about supply chain security: the APIs, the infrastructure that you use to build the models, to train the models, to store the data, all that stuff. That's area number one that you wanna focus on when your organization deploys AI.
Then we talked about the model files themselves. That's area number two, which is unique to AI itself. You can use tools like Guardian or ModelScan to make sure your models are safe. But along with that goes visibility: do you know when that model was trained? Do you know what data it was trained on? All that stuff fits into the supply chain aspect of this too.
Area number three would be LLM security. And I think you have some thoughts specifically on threat surfaces for various deployments of LLMs.
Sierra Haex (17:52):
Oh, yeah. If you're deploying LLMs within your company, you have to look at: is this gonna be external facing or internal facing? If it's external facing, how much can your customers, or just random people on the internet, do to your system? Is it plugged into APIs on the backend? Could it be used to compromise your network? And on the internal side, if you want to train your own models, you also need to be very cognizant of the sort of data you feed them. Are you feeding them customer data? Are you feeding them PII? Are you feeding them PHI? What are the relevant laws in your jurisdiction as you do that?
Dan McInerney (18:30):
Data security becomes a major issue when you're using it internally. And it's not just the training data itself; a lot of these tools are using RAG databases, so you've gotta make sure that what the RAG database is ingesting isn't gonna spit out something the user isn't supposed to have. Access controls on LLMs feel like a very underutilized security measure right now.
Sierra Haex (18:57):
Yeah. There's definitely a lack of controls around it and a lack of visibility and introspection.
Dan McInerney (19:02):
Yeah. The most common deployment pattern I see for internal LLMs is a RAG database that an LLM can connect to, usually surrounding some internal documents, either documentation of a service or documentation of some product that your company puts out.
Sierra Haex (19:18):
Right, these are the support bots that you get redirected to when you go to a company website and you're like, "I wanna talk to a human." These are actually querying just a series of documents that the company has published.
Dan McInerney (19:28):
Right. And so you have to be careful to make sure you're not putting sensitive information in the RAG database. And I think the access control problem is that there's just no good framework, nothing out there you can plug and play and say, "I only want these people to have access to this data in the RAG database." All that kind of control can be done in SQL, in traditional databases, but I don't really see a good solution at the moment for doing those access controls on LLM applications like that.
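As a sketch of the kind of control Dan is describing, here is a minimal, hypothetical example that tags each document chunk with allowed groups and filters retrieval results by the requesting user's groups before anything reaches the prompt. The documents, group names, and the "retrieval" itself are made up; a real system would enforce this in the vector store's query layer rather than in application code.

```python
# Group-based filtering of RAG chunks before they ever reach the LLM prompt.
from dataclasses import dataclass, field

@dataclass
class Chunk:
    text: str
    allowed_groups: set[str] = field(default_factory=set)

INDEX = [
    Chunk("Public product FAQ: how to reset your password...", {"everyone"}),
    Chunk("Internal pricing floor is $12 per unit...", {"sales"}),
    Chunk("Employee onboarding: payroll and SSN handling...", {"hr"}),
]

def retrieve(query: str, user_groups: set[str], k: int = 3) -> list[str]:
    # Placeholder "relevance": a real system would rank by embedding similarity.
    visible = [c for c in INDEX if c.allowed_groups & (user_groups | {"everyone"})]
    return [c.text for c in visible[:k]]

# An anonymous support-bot user sees only the public chunk, never the sales or HR ones.
print(retrieve("what is the pricing floor?", user_groups=set()))
```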
Sierra Haex (19:58):
Absolutely, yeah.
Dan McInerney (19:58):
So what about agents? This is the future, I feel like. Today it's the LLM application, but agents... define what an agent is.
Sierra Haex (20:05):
Well, an agent is a discrete model that lives on hardware you own, or your own computer, and you're able to query it and it's able to perform actions on your system for you. An example would be, "I'm a coder and I need you to write a chunk of code for me," and it'll introspect your question and then actually spit out some text into your code editor. Well, that comes with the inherent vulnerability of how much access you give it to your system, right? What power does it have? And how much do you trust the model having access to your system?
Dan McInerney (20:45):
Right, and it comes back to there being really no access controls around the tool usage of these agents right now. So the danger with an internally developed agent would be something like an rm -rf of your whole disk. This agent is supposed to just clean up log files once in a while; you tell it in natural language, "clean up the log files today," and it goes and deletes your entire home directory. How do you control against that? Right now I don't think there's a wonderful way of doing it. It's just a threat you should have in the back of your mind if your organization is deploying agents. And I think this is the year that organizations will really start deploying agents.
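As one example of the manual constraints Dan mentions, here is a hedged sketch of a narrow "delete a log file" tool an agent could be given instead of raw shell access; the directory path and the tool's shape are assumptions, and the point is that the blast radius is capped in code regardless of what the model asks for.

```python
# A narrow tool with a capped blast radius, instead of handing the agent a shell.
from pathlib import Path

ALLOWED_ROOT = Path("/var/log/myapp").resolve()  # assumed log directory

def delete_old_log(relative_path: str) -> str:
    """The only delete capability exposed to the agent: one .log file under ALLOWED_ROOT."""
    target = (ALLOWED_ROOT / relative_path).resolve()
    # Refuse anything that escapes the allowed directory, e.g. "../../home/user/.bashrc".
    if ALLOWED_ROOT not in target.parents:
        return f"refused: {target} is outside {ALLOWED_ROOT}"
    if target.is_file() and target.suffix == ".log":
        target.unlink()
        return f"deleted {target}"
    return f"refused: {target} is not an existing .log file"

# Even if the model asks for something destructive, the tool itself says no.
print(delete_old_log("../../../home/user/important.txt"))
```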
Sierra Haex (21:16):
Well, and I think they're exploding because they're extremely useful. And as we move along, I'm sure bad things are gonna happen. We just have to keep developing things around...
Dan McInerney (21:29):
It's like when we were looking at the infrastructure two years ago. All these applications that support machine learning models, the ones we found, were just full of holes, like Swiss cheese. It was like hacking in the nineties. I think agents are gonna be like that coming up, because everyone wants them and they're so useful. Did you see that OpenAI released a couple of agent frameworks?
Sierra Haex (21:50):
Oh, no. Do tell.
Dan McInerney (21:51):
They had... Anthropic first released MCP and computer use, so Claude can actually take control of your computer. And then OpenAI just released the Operator framework, which you can schedule to do tasks. And then deep research. Deep research is amazing.
Dan McInerney (22:05):
It's so good. It just collects all this information from the internet and digests it for you. What would've taken you eight hours to put together as a mediocre report, it can do in 20 minutes, if that. So those operators are incredibly useful. The paradigm we're seeing with agent deployment right now is the OpenAI paradigm, where they control the agent. But I think it's not a far reach to create your own agents in-house. It's just not that difficult, and they have a lot more flexibility. So I think when you start creating agents in-house, the security risks are there, and there aren't a lot of solutions to the risk of an agent overreaching. You'll have to put those constraints in manually at this moment, I think.
Sierra Haex (22:49):
Absolutely. Yep. And as models get smaller and more efficient and become ubiquitous, just part of the products that we use, there definitely needs to be security oversight as they're implemented.
Dan McInerney (23:01):
Right. Which, I don't have a lot of faith in organizations to do that themselves, because I look at router security, you know...
Sierra Haex (23:07):
It'll be fine.
Dan McInerney (23:08):
Router security, you know. Everyone needs a router, so we have all these different routers, and they're all just full of holes. It's been 15 years and we're still dumping router vulnerabilities, despite how fundamental those devices are to security.
Sierra Haex (23:21):
It's wild.
Dan McInerney (23:22):
So, we are here at Wild West Hacking Fest. Marcello Salvati and I gave a talk here, Sierra is joining us, and we wanna give a big shout out to Wild West Hacking Fest. It's a great conference, and it's really fun to hang out with everybody here in Denver. So, once again, I'm your host, Dan McInerney. Thanks for listening. This is Sierra Haex, an AI security expert, even though she doesn't wanna admit it. The field is so nascent that if you've been doing it for, you know, six months, I think it's pretty fair to call yourself an expert at that point. And we've been doing it for years now.
Sierra Haex (23:52):
Well, thank you so much for inviting me. I really appreciate it.
Dan McInerney (23:55):
Yes. And we'll see you all next time.
[Closing]
Additional tools and resources to check out:
Protect AI Guardian: Zero Trust for ML Models
Recon: Automated Red Teaming for GenAI
Protect AI’s ML Security-Focused Open Source Tools
LLM Guard: Open Source Security Toolkit for LLM Interactions
Huntr - The World's First AI/Machine Learning Bug Bounty Platform
Thanks for checking out the MLSecOps Podcast! Get involved with the MLSecOps Community and find more resources at https://community.mlsecops.com.