Rethinking AI Red Teaming: Lessons in Zero Trust and Model Protection
Audio-only version also available on your favorite podcast streaming service including Apple Podcasts, Spotify, and iHeart Podcasts.
Episode Summary:
This episode is a follow up to Part 1 of our conversation with returning guest Brian Pendleton, as he challenges the way we think about red teaming and security for AI. Continuing from last week’s exploration of enterprise AI adoption and high-level security considerations, the conversation now shifts to how red teaming, zero trust, and privacy concerns intertwine with AI’s unique risks.
Transcript:
[Intro]
Charlie McCarthy (00:08):
Hi again, everybody. Welcome back to this week's episode of the MLSecOps Podcast. This is actually a continuation of last week's episode. Some of our regular listeners may be familiar with our guest this week, Brian Pendleton. He had so many wonderful insights to share that we had to break this into two. So last week, go check out that episode if you haven't already, we talked about themes related to adopting AI within your business and security considerations to take there. This week, we're diving into things like AI red teaming and additional security insights. So we hope you enjoy! You can see all of our episodes at mlsecops.com.
[Interview]
Charlie McCarthy (00:54):
Let's talk about something that we prepped for that I am super excited to get your take on: AI red teaming. Or, I will rephrase that as red teaming for AI, because I want to be clear. Something else that we try to help people understand through the course of this show is that are a lot of terms that are still being defined and used interchangeably, and there's a difference between maybe what some folks refer to as "AI security" versus "security for AI."
So you've got kind of this dual use in the industry where a lot of folks are now using AI - GenAI, LLMs - for red teaming purposes in more traditional cybersecurity. And that's very cool [but] not what we're really talking about in this particular use case. I'm more interested in how would you want tosee the cybersecurity field, Brian, adapt current playbooks for things like penetration testing and threat modeling specifically for AI systems?
Brian Pendleton (01:53):
And I actually...
Charlie McCarthy (01:56):
You may wax poetic. AI red teaming is...
Brian Pendleton (01:59):
Well, what I actually want to do, I will talk about it, but I want to take just a minute and say, one of the things that has become very popular is this term AI red teaming. And a lot of companies go, oh, I've red teamed my model, so you're safe. And what we don't talk about is blue team, right? And it's because, again, a lot of these smaller companies don't seem to have good security practices, or they say, my security, the way I'm going to secure the system is through AI red teaming. But red teaming is one small part of a whole security plan.
Charlie McCarthy (02:39):
Okay.
Brian Pendleton (02:39):
Right? And so what I would really like to hear in cybersecurity is more talk about how are we blue teaming? What, you know, the day that somebody says let's have a model, where's the security team being brought in? And how are they impacting the design of the model from day one, right? Because I don't necessarily think that's always happening. With the bigger companies, absolutely. But even with them, I would say maybe not as much as I would like to see.
Charlie McCarthy (03:14):
So should we be calling this effort like purple teaming instead? Like, do you need both at the same time? Or parts of the process that are close to get, how do you think about that?
Brian Pendleton (03:24):
Okay, so here's where I start going, everybody needs to stop with blue team, red team, purple, green, white, all this. It's like, it's just a way to make a separate team. And when I go back to security is a team sport.
Brian Pendleton (03:41):
We don't need blue, red, blah, blah, blah. And I think about it from when I was in the military, just in the imagery. We didn't just attack. Sometimes we had to know how to dig foxholes and stuff. And everyone says it's way too basic. And I'm like, no. If I want to be a good attacker, I need to know how to create a defense as well. If I want to be a good defender, I need to at least have some experience in how I would be attacked. Right?
And, I don't care if that is on a street fight in the military or on a computer system. And in my opinion, we've become far too focused on I am a defender, or I'm a attacker. And I think to some degree, that's why people recognize that and then said, okay, let's do purple team, because it brings it all together.
Brian Pendleton (04:37):
But when that happened, I was like, well, why can't we just say we're in the security team, because that's what we were before. You know, when I started off, I was in networking and, well, actually I was a CIS admin. Which, when I got in the field, we did it all. You know, we did security, we did networking, we did everything. And then as Cisco started blowing up because networking and the internet started blowing up, and there became people that very needed to be focused on networking equipment, and they became network engineers.
And it was always funny to me, and I was like, two years ago when you were with me, you were just a cis admin like me. Now you're a network engineer, you're making 25,000 more, and you know less than I do because you haven't kept up with everything that I've had to keep up with in knowing everything, right?
Brian Pendleton (05:38):
And so, that's where I see an issue, in my opinion, with a security field is we split up. We have too many teams, and there's not enough collaboration, and there's not enough people that say, okay, I want to red team this year, next year I'm gonna go on defense and take everything that I learned from the red team, and I want to be on the defensive side so that I can try and help defense. Or, you know, somebody else, you know, on the blue team going, you know what? I have been doing the best that I can, creating this defense I need to get into that head space of attacker so I can come back to it and build better defenses.
Charlie McCarthy (06:21):
That's really interesting. I don't think anyone on the show has put that the way that you have before, and it kind of makes sense. Like, you think about, you gave the military example, but even in like boxing, if you're gonna be a great boxer, you need to know how to throw a punch. But you also need to know what your opponent is gonna be doing or be able to predict, you know, both sides to have this well-rounded understanding of what it is you're doing.
Brian Pendleton (06:44):
Sure. In boxing, I mean, you can just be a brawler and just try to beat someone down. But while you're swinging your arms, if someone gets right in punches you right in the face while your arms are out and you drop like a sack, well, great. You were all offense, but you didn't get to them quick enough, so you lost, right? And the other thing that I don't like about red teaming, again, I was in the military, and red teaming is a critical thinking exercise. It's not attackers and defenders. So it's meant to say, here's my plan and here's everything that I've thought about it. Now, you think about it and tell me where I've made mistakes. And the funniest thing is, yes, we can do penetration tests and stuff, but if you are well-versed in networks, let's just talk about a network.
Brian Pendleton (07:52):
If you're well versed in networks and someone shows you their defensive plan, we can do a full pen test. But a someone who's experienced can also just take a look at that plan and probably very easily go, did you think about this, this, and this? Because I don't see it in your plan. And maybe you did and because of expense or whatever stuff, you purposely left it out. But I don't see that. And to me, that's really what a red team is. Somebody saying, did you think about this? Or did you think about someone using your product in this way? Or, you know, why did you even want to make this product? Did you think about a completely different product? You know, it is asking why and trying to point out you know, it's critical thinking.
Brian Pendleton (08:51):
It's not attacking, but everything that I see nowadays really is, oh, I'm red team. I'm an attacker. Like, well, if you're a red team, you're not supposed to necessarily be finding the vulnerabilities. You're supposed to be figuring out the larger, what could be wrong. So maybe if you're like, really low on the totem pole, yeah, it's your job, go find the vulnerabilities. But at a higher level, the people that are looking at red teaming, it should be more about concepts than it should be, I need to know that this one protocol can be attacked. You know, it's, I don't know if I can say it better than that.
It's other thing, there is the army red teaming manual. Which I tell everyone, if you want to be in red teaming or if you just really want to develop your critical thinking skills you should download the army red teaming manual. Until red teaming became popular, it was actually called the critical thinking manual, but then everyone started using red teaming, so they switched it over to red teaming.
Charlie McCarthy (10:05):
Interesting.
Brian Pendleton (10:06):
And I'll give you the link for that as well.
Charlie McCarthy (10:08):
Thank you. Yeah, we'll put it in the show notes. I mean, I have to imagine, you know, I really like the points you're making about like, do we have to label everything with all of this terminology, [the people involved] in security, it helps to have a more well-rounded understanding of all of this. And I do lean into that. On the other side of the coin, I wonder if it's fair to say, like, when we were initially talking at the first part of the show, I can't remember how you described it, Brian, but you had given the reason for your career being so varied and interesting because you're a natural...
Brian Pendleton (10:42):
I'm a troubleshooter.
Charlie McCarthy (10:43):
You're a troubleshooter. Yeah. And so I wonder if some of these people who get really comfortable in this red teaming niche are natural... They like to break things. There are people that are just, they have this natural affinity for getting in and doing this really specified piece. And, you know, I think maybe there's something to that.
Brian Pendleton (11:00):
And I don't disagree with, I mean, it's funny we talk about personalities like that. First off, I'll say, people that are blue teamers need to be able to break things as well, right? Because they have to be able to go in and figure out, oh, everything that we just did isn't gonna work. And they need to be able to see that it's not gonna work even after they put in a bunch of time.
And if it's a team environment, in my opinion, again, I like saying that because I know some people, sometimes I can come off very much like, this is how it should be. Your red team, if you have a separate red team, while you're doing the initial designs, that red team should be a part of that so that they can, instead of waiting until a product is halfway done, and then going "see, look what we can do to it," to help the organization both cut down on development time.
Brian Pendleton (12:06):
And, you know, built in security is always better than baked on security, right? Always will be. So if the red team is talking with the security team from day one and looking over their designs and stuff, and being able to go, Hey guys, remember in our last engagement we beat you because of this, and look, you're doing it again here. So let's, let's take the lessons that we had learned from that. Can we put it in now before you get to that point where you're gonna say, "okay, red team, let's take a look at this again." You know and that's how they both learn from each other is while you're actually doing, building the product, not afterwards where you're nominally saying, "Hey, okay, let's attack and defend."
Brian Pendleton (13:00):
When we're talking about products, right? And even if it's like a network, same thing. It doesn't matter if it is a product going out, if I'm having to build a whole new network, I would rather have my red team come in as I'm building it to help remind me to say, "Hey, remember this, remember this." Like, and even though being kind of the asshole that I am, I'd be like, I do remember that I do. It's still nice because you don't remember everything, right? So having it right there at the beginning.
It's also one of the things, I know you asked about AI red teaming, and there is a lot of AI red teaming that to me, isn't security red teaming, it is safety red teaming, which we need. But when they find a safety issue, then that red team throws it back on the security team and says, well, you screwed up.
Brian Pendleton (14:05):
Well, no, the security team has a set responsibility, and there is a safety team. Or there should be. Now maybe it's a small enough company or organization that maybe the security team is the safety team. But what I'm talking about, for instance, is if I can with an LLM, if I can jailbreak it and get it to say racist stuff, or if I can jailbreak it and get it to tell me how to, to do VX poison. Or I get it to tell me the best way to do something illegal. Right. So those are safety issues. They're, they're not security issues. Right. And then a lot of times they'll also go, oh, I got it to tell me the training data. So that's a privacy issue. Not to the security team. Safety, security, privacy, they're three different fields.
Brian Pendleton (15:11):
Sometimes they intersect, sometimes they don't. But they, at the beginning, we always have to look at them separately. Because, for instance, there's different laws and rules for privacy. Right? There's different laws and rules for safety. So we've gotta start at the bigger levels and then find out where they intersect. And far too often a security team will get yelled at for something that really should have been a safety or privacy thing. Even if the security team was the team that would eventually be tasked with fixing the issue, they may not have had the understanding of privacy or safety to know that they needed to look at it. Whereas an expert in privacy or safety would've been able to say, "Hey, we have to do X, Y, Z. So when you're going through and doing, making your security plans, you need to also think about this."
Charlie McCarthy (16:13):
Right. This might be a good transition to my next question, which is around how do you feel about - I'm throwing a lot of terminology at you because I can tell we love labels <laugh> - zero trust. These stakeholders - safety, security, privacy - how many of those stakeholders are involved in zero trust, or actually, let's just start with what are your thoughts around that whole concept?
Brian Pendleton (16:44):
As a concept? It is great. And where we should go. As an actual implementation, what I have seen so far over the last like two years, eh, hasn't been, it has not lived up to its hype. Even though people are talking about zero trust in AI I don't exactly see an implementation path soon. It's gonna take a little bit more time before we figure out exactly how zero trust is going to fit into these systems I think.
Charlie McCarthy (17:27):
For security professionals in the field now, Brian, who, you know, want to get in on the AI game school themselves on AI driven threats, can provide any recommendations for either upskilling, like where you would start, or maybe even how you got started in your research or even people resources that you've used as learning tools?
Brian Pendleton (17:51):
So, this is one area of, I'm gonna air quotes "research," that I've done over the last, like five years, is I really thought more and more about this ML security engineer position. And the one thing that I have found anecdotally is that as a security person, I believe that you are very well positioned to move into the ML security space. I think that security people by nature tend to be very curious. I think that they tend to be focused and like to solve problems. So when there's something that they just can't solve, they don't just quit, they just, they keep at it. Right? They're used to picking up all different types of skills that help them get a job done.
And, you know, here's where I bag on the ML teams again, in trying to get some of them to learn more about security, it has not gone nearly as well as teaching security people about machine learning concepts. And I think for the most part, it's not because there's not the you know, the ML teams definitely have the intelligence and the skills and everything. It's just that they want to build models or work in data engineering or whatever, you know, they're not thinking of the same way that I think that your typical security person does.
Charlie McCarthy (19:31):
Yeah. They probably just want to build cool s&!*.
Brian Pendleton (19:33):
Right, right. And most of the time your security people want to know how stuff works. And then once they learn how it works, they want to know, okay, now how can I break it, you know? And so it's just a different mentality, but I think it works very well for that ML security engineer position. So I would immediately tell everyone, like, if you're gonna go to a college for it, don't. There is so much free and relatively free stuff on the internet. I mean, Hugging Face has stuff. Free Code Camp is an amazing resource, and it's all free. Deep Learning.AI.
Charlie McCarthy (20:26):
Oh, Andrew Ng, yeah.
Brian Pendleton (20:27):
Yep. Andrew. His stuff, all is free, right? So it's all out there. Now, if you have to be pressured, like for me, I'm so ADHD sometimes if there's not somebody just like there poking at me, you know I'm in the middle of something, it's like squirrel! You know, and I go off and want to do something else. So if you're like that and you need that pressure of a classroom or something, great. Do what you have to do to learn it. But in general, you know, you can learn everything that you need out on the internet. The one thing that will be very controversial, you don't need to learn any mathematics. Anytime you see anybody say, you need to learn math, tell 'em no, you don't.
Charlie McCarthy (21:18):
Are we talking about specifically for like the AI security piece or the ML security engineer role that you're kind of...
Brian Pendleton (21:24):
Yeah. Now, if you want to get into data science and you want to get into machine learning engineering, yes, there's some math you need and you're gonna be - but for a security role, you don't need to, and I've been saying this for years and years, and everyone is like, no, you have to figure out gradient descent, and our attack has to do this one. And I was vindicated by a paper that Microsoft just put out not too long ago, and then, I have it right here, there's two things I want to point out. They say, as one of their points, you don't have to compute gradients to break an AI. And I think we absolutely understand that from LLMs, right? You just have to be able to write and you can jailbreak an LLM.
Brian Pendleton (22:14):
And I will tell you the same thing. If you are trying to attack a system that's being served through an API, again, that's back to API security. You don't need to know necessarily about the model. You need to know about API security, right? The impact will be on the model, but you can break through through the API. The second thing that they put in that, again, I've been talking about, is the human element of AI red teaming is essential. And I think that's the most important thing because you have to think about, coming from any security, you have to think about how somebody is going to use the product, the network, the software, you know, whatever it is. And you have to be able to put yourself in their shoes. And from there, you can then start thinking about how you're going to attack this system.
Brian Pendleton (23:12):
And that has nothing to do with building a model. It has nothing to do with the mathematics of, is it a linear regression model or a neural network or anything like that. This gets back to the, at the end of the day, it's about knowing humans. And, and just the same way that we look at security and we think about like insider threats. And we know that for some reason, like an insider threat, part of it is people are just gonna lose their laptops. And we still consider that a threat, right? It is.
So, you may think about, oh, I'm gonna create, for my company, I'm gonna create a laptop that's chained to their arm or something, and that's how I'm gonna get around it. Well, you know, you've had to think about it from the company's perspective of risk, the people that you know and stuff, doesn't have anything to do with math or anything like that.
Brian Pendleton (24:13):
Now, if you really want to understand machine learning, then maybe you're looking at maybe making that transition into machine learning. You know, there's a lot that's gonna have to go into it. I would still say that you can learn a lot of it online. But for the most part, if you've already got a good foundation in security, all you need to do really is the basics. And then that's your start. And then once you're doing it, you're gonna figure out where you want to go in and what you want to specialize in, or what you're interest in within AI security. Because even though it is a new field, a newer field, it's already starting to get different, you know, different little groups within it. Right? Maybe I want to be focused just on LLMs. Maybe I am really interested in trying to break a model.
Brian Pendleton (25:18):
And so, and at that point, if you are actually trying to break a model, well, first off, you have to remember there's a whole system that you have to get to before you get to the model. But if you've gone through and learned all the other things to get you to the model, great. Now, you probably do need to know something about math. If you are actually working, you know, on the model itself. If your thing is making sure that you are either able to detect or to poison data sets, great. You know, you can learn what you need to do that. But again, it's all out there. And I would tell everyone like a good initial place to start, dreadnode.io. They have what's called the Crucible.
Brian Pendleton (26:13):
And if you really just want to get into AI security and ML security, and you don't know where to start, start with their beginner's challengers, and they walk you through to where if as long as you know, some concepts about security and some concepts about ML, just very basic, you can start working through from their beginner, to their intermediate, to their advanced. And I can guarantee you, if you can get to the advance, you've got a great handle and you should be able to go tell your team, "Hey, if we're gonna attack an AI, I can do this." You know? Hugging Face has great just training on how to start working with models. And because they have that model repository, it can give you a chance to start looking at, okay, how could I attack this model? How could I use this model? And then from how I'm using it, start thinking about, okay, then what attacks come from this person using a model this way?
Charlie McCarthy (27:21):
There are also some good insights on Hugging Face too. A lot of the models are scanned by like, some Protect AI scanners, [HF] Picklescan, and so you can go in and see, you know, models that have been flagged as unsafe, kind of learn about where those vulnerabilities are and get a little back that way as well.
Brian Pendleton (27:38):
Yeah, absolutely. I was about to get to that 'cause it's, I mean, just like any virus or anything like that, if you wanted to play around with viruses, make sure before you download them you've noticed that they have been flagged, but put 'em on a system that you're okay with. But play with the ones that are already flagged as being broken and see what the different. You know, pull up one that's... I'm just gonna say like a very basic beginner model that a lot of people start off with is housing prices. Predicting housing prices in an area. So find one that is flagged as safe and one that's flagged as being, you know, already been broken. See how they operate differently.
Brian Pendleton (28:27):
You know, get that muscle memory of, oh, I can start seeing that this bad inference is happening this way. Or if I put this type of answer, it always comes back like this. Whereas the non broken one is doing things correctly. And I start seeing differences. I mean, you know, we are the ultimate model. We pattern predict or we can see patterns in data, in numbers, in the clouds, better than any machine ever will be. Maybe not as fast, but we recognize patterns. It's what's kept us alive for a million years, right?
Charlie McCarthy (29:09):
Right. Excellent pattern recognition.
Brian Pendleton (29:11):
And so just like anything. I know SysAdmins that can sit there and before even looking at the logs, somebody says, oh, this computer's doing this and this computer this. And they're like, oh, I know what's wrong with this one. Right? Because what they just heard, they've heard so many different times that they know let's go, you know, they already know that their computers are on this OS and have all this different stuff. So at that point they're like, I know what this is. Right?
Charlie McCarthy (29:42):
Right, but the computer in their brain has already gone like, "boop, there you go."
Brian Pendleton (29:45):
Right. So, you know, I always say, if you're able to work with these things, you'll start getting a feel for what you think it could be right or wrong with a model. And if you think there's something wrong with the model, then prove it. You know, if it says it's a good model, but you think there's something wrong, prove it. Work through the things that you might think could show that and prove it. You know?
Charlie McCarthy (30:14):
Go get your hands dirty.
Brian Pendleton (30:16):
Right? I mean, that's what we do as hackers, right? I mean, the hackers weren't about breaking into systems and stuff. The original hackers were about, I want to use something for a different purpose, or I see something that maybe no one else does as a use for this and I'm gonna use it, or everyone says I can't use it for this purpose and I'm gonna prove I can. You know, type of thing.
You know, it wasn't just about breaking into systems, even though, you know, that's kind of what it has become. But if you don't get your hands dirty and that's where you're gonna learn the most. I really think with some of this stuff, just like in regular security, if you're not in it...and that's one of the reasons why I tell everyone, you know, I'm honest with everyone, is I do not break models anymore. I could care less about models. Building a model is to me the most boring thing in the world. And I could probably teach a chimp to do it. And I'm sure there's ML people listening to this later on that, that are gonna go -
Brian Pendleton (31:17):
But where they come in, the way they come up with the features and the way they start understanding the data better... It's not the building, it's not the initial putting together of the model. It is really that refining of the model is where their expertise comes in. But everyone who says, oh, to be in you know... I will tell you most people that I've heard said, oh, to be in ML security you have to be an expert model builder and understand every single aspect of data cleaning and all this... No, bull$4!*. No you don't.
Charlie McCarthy (31:56):
All right. That's fair. We've got a bunch of good controversial stuff here. <laugh> This is gonna be a great episode. Alright, Brian, I feel like I could sit and talk to you for a couple more hours about all of this. You're so engaging and just these insights are awesome. That said, we've just got a few minutes left. So can I ask you, if it's all right, we'll share a link to your LinkedIn profile in the show notes so people can follow along with your professional journey. But just as far as the rest of this year, can you tell us what you might be up to in this space? Are you going to continue work with the AI Village or do you, what do you got going on?
Brian Pendleton (32:35):
So I'm continuing working with ARVA, the AI Risk and Vulnerability Alliance. We are just rebooting our AVID vulnerability database. So we're about to get that kicked off again. One of the founders of ARVA who had built Avid, also started another company. So things kind of slowed down, but we're back focused on that. You know, I continue to go up on the Hill and talk to representatives and staffers about AI security. So I will continue to do that. I would tell anyone who is out there, if you're in the DC area or can come in the January-February timeframes, look for Hackers on the Hill 'cause it's very important for experts that are out there. Not the CEOs of the companies, but the people who are doing things day to day, come and talk to representatives and staffers about what you are seeing out in the world.
Brian Pendleton (33:44):
And and they've got so many questions about this stuff and it helps them create legislation, right? So I would always tell people to do that. And I've been doing that for several years. I am currently applying for jobs, so hopefully I will be going into an organization where I can actually take some of these hot takes and see how they actually work. I will continue work. One of the things that I'm working on right now is a paper on red teaming, which the world does not need another paper on red teaming, except for mine is kind of the anti paper of why we need to stop writing about red teaming.
Charlie McCarthy (34:24):
Can't wait for that to drop.
Brian Pendleton (34:25):
Yeah.
Charlie McCarthy (34:27):
My eyes will be peeled.
Brian Pendleton (34:28):
And then also, you know, there's gonna be the AI Village at DEF CON in August and then CAMLIS in October. So I, again, if you're in applied ML security you should take a look at CAMLIS. It is a great collection of people and some actual security that's happening in companies. It's the one place where people are kind of honest about what they're seeing. So it's really nice to be there. So that's about it that I can think of right now.
Charlie McCarthy (35:10):
I mean, that's, that's quite a good little bit. So, yeah. Alright, well I hate to wrap it, but I think we're at the end here. Hopefully the first of additional conversations we get to have with you, Brian, but I just want to say again, it's an honor to have you on the show. I know we've been trying for it for a long time, so I'm glad we were able to make it happen and thank you so much for being here with us.
Brian Pendleton (35:31):
Yeah, thanks so much for having me. Like I said, I very much appreciate it. I know there's gonna be people go, who the hell is this guy? And I will tell you for a lot of things I have tried to stay away from presenting and I like doing things one-on-one and that's why this to me, it feels good because it's more like I'm talking to you than to an audience.
Charlie McCarthy (35:56):
Good. I mean, that is what it is.
Brian Pendleton (35:57):
That why I like going and talking to people on the Hill because I'm having a one-on-one conversation with people that can actually do something instead of just putting out another position paper or something like that, you know?
Charlie McCarthy (36:10):
Yeah, awesome. Well hope to do it again sometime. I appreciate you.
Brian Pendleton (36:14):
Always here and like I said, I will make time whenever you need it 'cause I enjoy talking with you guys. And like I said, I also have followed [Protect AI] and I think you guys are doing good work.
Charlie McCarthy (36:26):
Thank you so much. And with that, I'll close the show with thanking our sponsor, Protect AI and you, the MLSecOps Community members who keep tuning in every week. We will see you again next time.
[Closing]