Crossroads: AI, Cybersecurity, and How to Prepare for What's Next
Episode Summary:
In this episode of the MLSecOps Podcast, Distinguished Engineer Nicole Nichols from Palo Alto Networks joins host Mehrin Kiani to explore critical challenges at the crossroads of AI and cybersecurity. Nicole shares her unique professional journey from mechanical engineering to AI security, her thoughts on the importance of clear & aligned AI vocabularies, and the significance of bridging disciplines in securing complex systems. They dive into the nuanced definitions of AI fairness and safety, examine emerging threats like LLM backdoors, and discuss the rapidly evolving impact of autonomous AI agents on cybersecurity defense. Nicole’s insights offer a fresh perspective on the future of security for AI, teamwork, and the growth mindset essential for professionals in this field.
Transcript:
Intro 00:00
Mehrin Kiani 00:07
Hello everyone, and welcome back to the MLSecOps Podcast. I'm Mehrin Kiani, your host for today's episode. I'm a Machine Learning Scientist at Protect AI. I'm very excited for today's episode, as we are joined by Nicole Nichols, a Distinguished Engineer at Palo Alto Networks. Nicole, thank you so much for being here and welcome to the show. Before we jump into today's discussion, Nicole, could you tell our listeners a bit about your background and your journey into the field of AI and cybersecurity?
Nicole Nichols 00:36
Absolutely. I came to it, like many people, from other disciplines. Initially I was a mechanical engineer, and I worked at the Woods Hole Oceanographic Institution on autonomous underwater vehicles. When I was there, I was kind of immersed in a world of academic research, a lot of people around me were doing PhDs, and I got really enamored with the process of research and trying to understand new scientific questions. So I decided to do a PhD, and I initially went into oceanography, but it was during that time that I realized my favorite place to work was actually at the intersections of different disciplines. What I wanted to do was specialize in having a tool set that was universally applicable to a lot of different domains. And so I went back to graduate school for a PhD in electrical engineering, specializing in signal processing.
Nicole Nichols 01:34
And so I worked on marine mammal acoustics in that process, but fundamentally the mathematics are very similar and were foundations of machine learning. When I was in graduate school, it was right at the point where deep learning was becoming the latest, greatest thing, and people were recognizing how we could use those models far more effectively than we had used similar architectures before. After I graduated I went to Pacific Northwest National Lab, and that's where I transitioned to applying this to cybersecurity. From there I've kind of continued delving down that rabbit hole. And there's a second pivot point for me, which was the paper "Intriguing properties of neural networks" [Szegedy et al.]. As soon as that came out, I became enamored with this notion of adversarial examples and all of the unexpected things we can do to machine learning models that trigger behaviors we don't think should be there. And I think about this continually now: people are constantly trying to understand what the security envelope is and how we can best secure these models, and in the back of my mind, I'm just like, we're gonna be surprised again. I think that's what keeps it really interesting for me. So that's kind of my journey across <laugh> a lot of different disciplines to here.
Mehrin Kiani 02:57
Yeah, thank you for sharing. It's fascinating how most folks come from different backgrounds, and yet we are all now working in this AI and cybersecurity space and making advancements in it irrespective of our backgrounds. So let's start by talking about an article you recently co-authored with a colleague titled "Fairness and Safety of LLMs."
Nicole Nichols 03:21
Mm-Hmm. <affirmative>
Mehrin Kiani 03:22
Can you give us a high level overview of what the piece is about and what motivated you to explore these two concepts together - fairness and safety?
Nicole Nichols 03:32
So I published this piece back in January, I believe, and the dialogue has continued to evolve. The reason I published it in the first place was that there is a lot of colloquial usage of language in this domain. Because AI is advancing so fast, it's hard for experts to keep up, and there's so much discussion that it's hard to pin specific technical definitions to things; in the excitement, sometimes we use words that may mean other things. I've noticed that particularly around fairness, safety, and security, the terms start to become interchanged, but from a methodological point of view they mean different things. Fairness, when we think about it, is: is the model providing equal opportunity to all the people who are using it?
Nicole Nichols 04:34
And safety I think of as: are we able to comply with legal regulations? Are we preventing cybersecurity attacks? Are we preventing data exfiltration? If you think about the methods that support those, there are different evaluations for how you do that. There is an interplay between them from a statistical perspective: if you have an outlier, whether that outlier is an issue of fairness or an issue of security - say you don't have enough data samples in your dataset - it can be manipulated either way to introduce a backdoor or a trigger for some other behavior. So in the paper we had this quad chart of fair and secure and the cross-correlation between them, because it's easy to say that a model isn't fair but it is secure, or is secure but not fair. Being able to look at those distinctions can help ensure that when we're designing models, we are meeting both of those goals.
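To make that quad chart concrete, here is a minimal sketch, not from the episode, that scores a model on one fairness axis and one security axis and places it in one of the four cells. The metric choices (demographic parity gap, backdoor attack success rate) and the thresholds are illustrative assumptions, not a standard.

```python
# Minimal sketch: place a model in a fairness x security "quad chart".
# The metrics and thresholds below are illustrative assumptions, not a standard.

def demographic_parity_gap(preds, groups):
    """Fairness axis: largest gap in positive-prediction rate across groups."""
    rates = {}
    for p, g in zip(preds, groups):
        rates.setdefault(g, []).append(p)
    positive_rates = [sum(v) / len(v) for v in rates.values()]
    return max(positive_rates) - min(positive_rates)

def attack_success_rate(model, triggered_inputs, attacker_target):
    """Security axis: fraction of trigger-carrying inputs that hit the attacker's target."""
    hits = sum(1 for x in triggered_inputs if model(x) == attacker_target)
    return hits / len(triggered_inputs)

def quad_chart_cell(preds, groups, model, triggered_inputs, attacker_target,
                    fairness_threshold=0.05, security_threshold=0.01):
    """Combine both axes into one of the four quad-chart cells."""
    fair = demographic_parity_gap(preds, groups) <= fairness_threshold
    secure = attack_success_rate(model, triggered_inputs, attacker_target) <= security_threshold
    return {
        (True, True): "fair and secure",
        (True, False): "fair but not secure",
        (False, True): "secure but not fair",
        (False, False): "neither fair nor secure",
    }[(fair, secure)]
```

The point of the sketch is simply that the two axes are measured with different evaluations, so a model can land in any of the four cells.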
Mehrin Kiani 05:58
Yeah, it's interesting. We have often talked with several guests on this show about different definitions for safety and security, and how some of these have such a strong interplay, but that does not imply they are the same thing. This topic brings to mind another interesting challenge you've mentioned, related to the difficulty within the industry of establishing a common AI vocabulary.
Nicole Nichols 06:23
Mm-Hmm, <affirmative>
Mehrin Kiani 06:24
How do you think about that? If we had clearer definitions and industry alignment on those definitions, do you think that would have an impact on how we are now practicing AI security?
Nicole Nichols 06:41
I think so. One of the places vocabulary matters is when we think about evaluations. A common example is jailbreaking models. We have to understand what the anatomy of a prompt is and understand that jailbreaks can occur for different purposes. If we lump everything together under one umbrella of an evaluation, we don't know what kind of coverage we have. Say you're trying to detect jailbreaks: if you only look at it as a monolithic category, your model is likely to be missing a particular class. Maybe you can easily detect hate speech, because there are a lot of keywords that are easier to pick up on, but other things, perhaps attempts to exfiltrate data, may be harder to detect. As we think about building evaluation systems, we need to understand what kind of coverage we have across the spectrum of types of things. And I think the challenge as a community, when we're coming up with these vocabularies, is that because the field is moving so fast, we don't have time to gel on a consistent definition before moving on to the next challenge. So I think it's important to take a growth mindset on this, to reflect that maybe our definitions are wrong, and to work as a community to figure out what definition makes the most sense as this field continues to evolve.
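As a rough illustration of that coverage point, here is a small sketch, not from the episode, that reports jailbreak-detection recall per category instead of one monolithic number. The category names and the record format are hypothetical.

```python
# Sketch: per-category jailbreak detection coverage instead of a single aggregate score.
# The category names and record format are hypothetical.
from collections import defaultdict

def coverage_by_category(records, detector):
    """records: iterable of (prompt, category) pairs that are all known jailbreaks.
    detector: callable returning True if the prompt is flagged.
    Returns recall per category, which a single aggregate number would hide."""
    totals, caught = defaultdict(int), defaultdict(int)
    for prompt, category in records:
        totals[category] += 1
        if detector(prompt):
            caught[category] += 1
    return {c: caught[c] / totals[c] for c in totals}

# Example: a detector might score 0.9 on "hate_speech" but 0.3 on "data_exfiltration",
# even though its overall recall looks acceptable.
```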
Mehrin Kiani 08:14
Yeah, that makes a lot of sense. Also, when we were prepping, and even now, you made a comment about how a growth mindset when working with technology like AI would benefit us all. Can you share some of those thoughts with our audience, including how a growth mindset in particular can help cybersecurity professionals navigate this rapidly evolving AI landscape?
Nicole Nichols 08:36
Absolutely. This is one of the things that I'm really passionate about trying to help bridge: there are a lot of people who have deep technical expertise in cybersecurity and a lot with deep technical expertise in machine learning, and it's very rare for someone to have both of those skills. I think some of the most important questions are going to be solved at that intersection. I have friends and colleagues who have graduate degrees in various fields related to machine learning, and they feel like they've missed the wave. They say, "I'm too late." And I remind them that realistically, everyone is still learning and it's not too late. So it's important to challenge yourself, be willing to fail at times, and be open to understanding what it is you don't know about either the other discipline or something that's new - maybe there's a new architecture for RAG databases or context retrieval, or any of a thousand different topics that are all growing at breakneck pace. I think it's impossible for any one person to keep up with all of that.
Nicole Nichols 09:39
So I think we need to be a little more humble as we approach other people, recognize what knowledge they currently have or don't have, and recognize that we have the potential to help each other. Like I said, I think that intersection, particularly around security of AI, is where we need to overlap more and build more common dialogue, so that we can understand which elements of cybersecurity are most important from the principles of securing complex systems, and which techniques in AI can best support those goals.
Mehrin Kiani 10:17
Thank you. What also comes to mind is that we've been discussing these terms in the context of LLM security in particular, but this whole space is evolving so fast, and now we're on the brink of AI agents. How do you think about those security concepts for AI agents? Do you think we'll ever achieve a common definition when it comes to securing AI agents, or will this remain fluid as the technology evolves?
Nicole Nichols 10:53
I think we can, but that's even more nascent than some of the vocabulary around jailbreaking and backdoors and LLMs and foundation models. When we think about agents, there's actually a really long history around agents. At times, things such as decision trees have been considered agentic, and now there's a new era of agents emerging. I've been talking to colleagues recently about what this longer trajectory has looked like, and across that history it seems like there has not yet been a common understanding of what it means to be an agent. In some sense, being willing to accept a loose definition may help us solve the problem, because there's kind of a distinction without a difference that could be occurring, and we want to focus on the challenges we need to solve in that space.
Nicole Nichols 11:55
When I think of agents, I think of them in the modern sense - as of two weeks ago; I don't know, this is all moving so fast <laugh>. In the area of agents where people are particularly interested right now, it's the merging of using a foundation model to provide a natural language interface to a system that can plan and take actions. What I've seen is there are a lot of LLM apps where you have a chatbot and you're interacting with it, and there's some notion of memory in that it may remember the last five questions you asked. But that's not quite the same as understanding a state of the world, recognizing the interplay between all of the different elements in that world space, and taking actions. Autonomous cars are much more like an agent - we've been working on those for years, from five years ago until now, and they haven't particularly been using LLMs. So as we think about agents, I think it's important to understand that there are a lot of different technologies that can enable those behaviors. At the end of the day, they're just another version of a complex system, and now we're adding a different tool into the toolbox to build agents.
Mehrin Kiani 13:17
So in terms of exploring new threats, let's talk a bit about backdooring. It's a specific threat that's increasingly relevant with LLMs.
Nicole Nichols 13:25
Mm-Hmm. <affirmative>
Mehrin Kiani 13:25
And it's also an area in which you have done some research. Could you briefly explain what backdoor attacks are and how they've been studied in the context of LLMs or AI systems? How serious is this threat? And is there anything that makes it particularly challenging to defend against?
Nicole Nichols 13:39
Backdoors are when you have some manipulation of the training data. I like to think of it as square pegs and round holes, in some sense: think of the model as a kind of filter mesh, and then you have objects that you're trying to fit through it, where those objects are the adversarial examples. When you train a model, you give it millions of examples, and if you selectively add, remove, or perturb that training data, you can intentionally place a hole in that filter where you want something to pass through. Then, when that model is used, you can add that trigger to an image, and when it is processed by the trained classifier, it is misclassified. There are a variety of ways to do this: it can be a targeted misclassification - "I want dogs consistently misclassified as cats" - or "I just want dogs to be misclassified as anything else."
Nicole Nichols 14:45
There's also the notion of degradation attacks, where you just don't want the model to be perfect. Previously the model might have had 98% accuracy on classifying dogs, and we're just trying to degrade that performance so that it's only 40% accurate. So there are different attacker goals you can have, and it can be done in different domains. Some people are surprised by this in language, because language has semantic meaning to us, so we think we could detect changes to a sentence and spot tampering. But there are ways of using particular vocabulary or sequences that are subtle enough that a trigger embedded in a paragraph or sentence wouldn't necessarily be noticed.
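For readers who want to see the mechanics, here is a minimal sketch of the kind of targeted poisoning described above for an image classifier: stamp a small trigger patch on a fraction of training images and relabel them with the attacker's target class. The array layout, patch placement, and poison rate are illustrative assumptions, not a specific published attack.

```python
# Sketch of targeted backdoor poisoning for an image classifier.
# Assumes images are an (N, H, W, C) float array in [0, 1]; shapes, patch
# placement, and poison rate are illustrative assumptions.
import numpy as np

def poison_dataset(images, labels, target_label, poison_rate=0.01, rng=None):
    """Stamp a small white patch (the trigger) on a random subset of images
    and relabel them as target_label. A model trained on this data learns
    'patch present -> target_label' alongside its normal task."""
    rng = rng or np.random.default_rng(0)
    images, labels = images.copy(), labels.copy()
    n_poison = int(len(images) * poison_rate)
    idx = rng.choice(len(images), size=n_poison, replace=False)
    images[idx, -4:, -4:, :] = 1.0   # 4x4 trigger patch in the bottom-right corner
    labels[idx] = target_label       # targeted attack: e.g. every dog becomes "cat"
    return images, labels

# At inference time, adding the same patch to any image steers the trained
# classifier toward target_label; a degradation attack would instead assign
# random wrong labels to erode overall accuracy rather than pick a target.
```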
Nicole Nichols 15:35
So it's something that can be used across different domains. From what I've seen, most of that research has been around deep learning models. It has been applied to large language models, but I think that's a more unexplored space. And as we think about it, there's always been a question of what is happening in the real world. Even for LLMs now and other types of models, you have to have systems that are actually looking to see if that sort of attack is present. So if we ask how realistic this threat is, I think there are tons of examples showing it's highly possible, and some showing it happening at scale - [for example] buying web domains so that data can be manipulated as people scrape the web, because those domains have been purchased and reassigned in different ways. You can do this in the real world at scale.
Nicole Nichols 16:41
But the question is whether anyone is looking to see if a model has been backdoored in the real world. We can show it academically, but the people who are running those models need to scan to see if they have been backdoored, and right now that isn't something I've seen being done at scale. There's a lot of work under TrojAI, a DARPA-run program, on how you can detect those. Most of those methods are specific to either a modality or to particular goal objectives, and they were having a fair amount of success. But again, I don't think it's yet been translated to LLMs as much. So we have a lot of depth of understanding in traditional deep learning models, but the extension to LLMs, I think, is an open space that needs more research.
Nicole Nichols 17:33
And I think it's a bigger question than just backdoors. For a lot of different questions - I've seen people ask, how many of the spam messages we're blocking were AI generated? - we need to actually put a check into our systems to say, do we think this was [a backdoor attack] or not? There need to be more of those integrations to answer those questions, because right now we're mostly working off of academic possibilities, and we need the tools integrated into our current ops to be able to say whether something was or wasn't generated by an AI generative system, or whether a backdoor is present. Those signals are subtle. I think part of the reason they haven't been more broadly adopted is that it's complex to integrate, and there's other lower hanging fruit in the tree of security. It's hard to prioritize on the scale of what we need to build next when we don't have data to validate it.
Mehrin Kiani 18:43
In terms of how the space is evolving, as we think about the security of LLMs and AI agents, with LLM backdooring, do you think this opens a new frontier for threats, or is it a new attack vector all its own? Because now we have AI agents, but we also have LLMs that can be backdoored - the combination of more autonomy with the backdoor. What are your insights on where this is headed? Is this a real impact, or, as you said, is there already so much low hanging fruit in cybersecurity that this is too complicated for now? Are we not there yet to start tackling these issues?
Nicole Nichols 19:29
I think we need to think about the fuller landscape of what we're defending. One of the projects I contributed to earlier was a RAND Corporation policy document on securing foundation models, and one of the things that was really helpful in that report was that they outlined what a reasonable defense is based on what it is you're trying to defend and how sophisticated the actor is. They have, I believe, a one-to-five scale of actor sophistication. Backdoors are realistic and they can be implemented, but the most sophisticated actors would have the most resources and ability to use something like that. So it depends on which type of system you're trying to defend. And this goes back to the complexity of systems: at the end of the day, in cybersecurity we need to think about what the weakest links are.
Nicole Nichols 20:37
Sometimes it might be the unexpected target that is used as a stepping stone to something else. As we think about agents, I think we apply the same principles. As they grow, people will look for where those connections are, and it might be that an element like the tool the agent is using is backdoored, while the agent itself isn't. So there is that kind of chaining of one vulnerability to another, to another. And LLMs as agents are opening up the ability to have different elements connected. So I think they are both impacting the landscape, but through different avenues. The biggest concern in my mind with the move toward LLMs is their potential to be connected to so many systems.
Nicole Nichols 21:39
So as we think about what it means to secure LLMs, I think a lot of that's going to come down to access controls: having something like a zero trust system around the agents themselves as they interact with other tools, thinking about encryption of the communications between tools, trying to have provenance of training data, and applying the same principles to this system. I don't want to call it the lowest hanging fruit, because it doesn't sound like an easy task, but it's the sort of thing where you need to have that coverage first, and then add the more sophisticated systems to scan for "has an element of this agent system been backdoored?"
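A minimal sketch of what access controls and a zero-trust posture around an agent's tool calls could look like in practice is shown below. The tool names, policy format, and log fields are hypothetical assumptions, not a specific product's API.

```python
# Sketch of a zero-trust gate in front of an agent's tool calls.
# Tool names, policy structure, and log fields are hypothetical.
import hashlib
import json
import time

POLICY = {
    "search_docs": {"allowed": True,  "max_calls_per_session": 50},
    "send_email":  {"allowed": False, "max_calls_per_session": 0},  # deny by default
}

class ToolGate:
    """Mediates every tool invocation: enforce an allowlist and per-session
    call limits, and keep an auditable record of each attempted call."""
    def __init__(self, policy):
        self.policy = policy
        self.call_counts = {}
        self.audit_log = []

    def invoke(self, tool_name, args, tool_fn):
        rule = self.policy.get(tool_name)
        count = self.call_counts.get(tool_name, 0)
        if not rule or not rule["allowed"] or count >= rule["max_calls_per_session"]:
            self._log(tool_name, args, allowed=False)
            raise PermissionError(f"tool call blocked by policy: {tool_name}")
        self.call_counts[tool_name] = count + 1
        self._log(tool_name, args, allowed=True)
        return tool_fn(**args)

    def _log(self, tool_name, args, allowed):
        # Provenance-style record: hash the arguments so the call can be audited later.
        self.audit_log.append({
            "time": time.time(),
            "tool": tool_name,
            "args_sha256": hashlib.sha256(
                json.dumps(args, sort_keys=True).encode()
            ).hexdigest(),
            "allowed": allowed,
        })
```

The design choice here mirrors the point in the conversation: get basic coverage (allowlists, limits, audit trails) in place first, and layer more sophisticated backdoor scanning on top of it.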
Mehrin Kiani 22:23
Let's touch on the topic of communicating vulnerabilities found in the AI supply chain. When a vulnerability is discovered, how do you think about what information should be communicated, and through which channels? You mentioned the importance of standardization in how we report and address these vulnerabilities. Could you expand on that and share what changes you think need to happen in the industry?
Nicole Nichols 22:44
I think there are two specific pieces. One is updating the CVE process to incorporate AI vulnerabilities themselves. I've talked to some of the people who are looking at what needs to change there specifically, and I defer to their expertise on the specific changes they're actively making. But what I see is that, unlike a traditional CVE, when we talk about vulnerabilities in an AI system there's additional metadata that needs to be communicated: which version of the model the issue was seen in, which training data was used, and how to communicate that clearly. Most people demonstrating a vulnerability don't know exactly which version of a commercial foundation model they're testing against; they just know the day they did it. So it might be useful to have more transparency into which versions of these models people are finding these vulnerabilities in.
Nicole Nichols 23:51
The other piece of that is transferability. Many of the different types of security weaknesses in these models are transferable: if you have a particular jailbreak for one model, it usually works with others too. So there's a different responsibility on the person who finds it to potentially test against those other models, and there's a question of whether that is their responsibility or the responsibility of the people who own all those models to say, oh, this happened on that model, we should check ours. It's hard for engineers to test against every possible model, in part because the ecosystem is growing so fast. The other piece ties back to what we were talking about earlier in terms of observability.
Nicole Nichols 24:45
So if we think about what's being reported through a CVE system, we need to actually be observing whether a backdoor has occurred. Again, if we don't have something that's scanning to see if a model has been tampered with in that way, or whether we believe malware has been produced by generative AI, it's hard to incorporate that. So I think there are those two elements: one is ensuring that the full set of information communicated through the CVE process covers the elements that are unique to AI models, and the other is updating how we survey our systems to check for those AI vulnerabilities.
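To illustrate the extra metadata discussed above, here is a hedged sketch of what an AI-specific vulnerability record might carry beyond a traditional CVE entry. The field names are hypothetical and not drawn from any existing or proposed CVE schema.

```python
# Sketch of an AI vulnerability report with the extra metadata discussed above.
# Field names are hypothetical and not drawn from any existing CVE schema.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class AIVulnReport:
    title: str
    description: str
    model_name: str
    model_version: Optional[str]        # often unknown for hosted models...
    observed_on_date: str               # ...so record the date of testing instead
    training_data_ref: Optional[str]    # provenance of the training data, if known
    attack_class: str                   # e.g. "jailbreak", "backdoor", "data_exfiltration"
    transfers_to: list = field(default_factory=list)   # other models where it reproduced
    detection_signal: Optional[str] = None  # how a deployed system could observe it

# Hypothetical example report
report = AIVulnReport(
    title="Prompt-based jailbreak bypasses content filter",
    description="Crafted prompt elicits disallowed output.",
    model_name="example-foundation-model",
    model_version=None,
    observed_on_date="2024-11-01",
    training_data_ref=None,
    attack_class="jailbreak",
    transfers_to=["another-hosted-model"],
)
```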
Mehrin Kiani 25:33
Yeah. Like the rest of the space, everything is constantly evolving and updating - not just the threats, but how we report them and how we should develop those standardizations. So, looking ahead, what innovations or challenges do you see on the horizon when it comes to AI's impact on cybersecurity professionals? In particular, what do you think security teams and AI researchers should be preparing for in the next five to ten years?
Nicole Nichols 26:01
I think the biggest challenge is going to be agents. We need to understand how agents may fundamentally disrupt the tools and techniques we currently use. Right now, most SOC systems are fairly dependent on human analysts to triage alerts, and over roughly the last five years, a lot of applying AI to cybersecurity has been about figuring out how to reduce false positives in those systems, how to have more precision in identifying which threats are most severe, and how to have better signatures for detecting malware or intrusions, those sorts of things. But as we move toward agents, one of the fundamental things that will change is the speed at which attacks occur. The speed at which an offensive agent could be used against a system could easily overwhelm a human-based defense.
Nicole Nichols 27:11
So we need to understand how to become robust, from a defensive perspective, to attacks that can happen at extreme scale and speed and could potentially overwhelm our current systems. I think that's an open question. One of the areas that is ripe for exploration and research is this kind of synthetic gameplay: having some sort of sandbox where you can evaluate those sorts of attacks and defenses and try to figure out which signals are most important and which strategies are most effective. [A toy sketch of this kind of simulation loop appears after this answer.] This is one of the areas I've been working in for the last three to five years. I was at Microsoft and we worked on the CyberBattleSim program - I tell this story a lot, and it's very true and hard to shake - and when I was at Microsoft, we were thinking that the realism of this kind of system being implemented in the real world was on the order of 30 to 40 years away. And that was [in the year] 2020.
Nicole Nichols 28:20
And then I later worked on similar systems at Apple, and we were thinking maybe 10 to 15 years. Now it's 2024, agents are everywhere, and it feels like it could be any day. There's this unexpected growth. It's human nature to want to feel like the world you existed in yesterday is what's going to be there tomorrow, but the acceleration is just beyond what people have experienced before. I actually saw a recent report - I think it was the National Bureau of Labor Statistics; I could double check [verified: it's from the National Bureau of Economic Research] - which said that AI and generative AI are being adopted faster than the internet or home PCs, by quite a bit. We have an internalization of how much those technologies revolutionized how we interact as humanity, and I don't think it's quite possible to envision how much agents have the potential to change that, in part because of the speed of adoption and the capacity they have to enable things that couldn't be done before.
Nicole Nichols 29:36
So now, for better and for worse, we can write in 20 different languages using machine translation. Those translations used to be fairly poor, but now they're getting quite good. There's still an importance to understanding under-resourced languages, but once that first barrier has been broken, bringing the rest forward is something you can focus on. So, I don't know, it's a giant unknown. I think this goes back to the growth mindset: we need to embrace uncertainty and <laugh> boldly look forward and try to make the best world we can.
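As a toy illustration of the synthetic gameplay sandbox mentioned above, here is a minimal attacker-versus-defender loop over a small network graph. This is a generic sketch, not the CyberBattleSim API; the topology, actions, and detection probability are invented for illustration.

```python
# Toy attacker-vs-defender simulation loop for studying detection signals and speed.
# Generic sketch, not the CyberBattleSim API; topology and rules are invented.
import random

NETWORK = {"web": ["db", "ci"], "db": ["backup"], "ci": ["prod"], "backup": [], "prod": []}

def run_episode(detect_prob=0.3, max_steps=20, seed=0):
    rng = random.Random(seed)
    compromised, step = {"web"}, 0
    while step < max_steps:
        step += 1
        # Attacker: pick a reachable, uncompromised neighbor and attempt lateral movement.
        frontier = [n for h in compromised for n in NETWORK[h] if n not in compromised]
        if not frontier:
            break
        target = rng.choice(frontier)
        compromised.add(target)
        # Defender: each move has some probability of being detected and contained.
        if rng.random() < detect_prob:
            compromised.discard(target)
    return {"steps": step, "hosts_lost": len(compromised)}

# Sweeping detect_prob hints at how quickly an automated attacker outpaces a slow defender.
print([run_episode(detect_prob=p, seed=42) for p in (0.1, 0.5, 0.9)])
```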
Mehrin Kiani 30:20
Yeah, it's amazing how much we have covered in this episode, from definitions to some of the threats, and now AI agents and how we think this landscape is changing. If you were to give one or two takeaways from this conversation to anyone on a security team, or to AI researchers, what would they be?
Nicole Nichols 30:46
I think the takeaway is that we're looking at a new era of system security. As a system, it's a team sport: we need to partner with more people, be open to learning more things, and be willing to be wrong about whether something is a threat, and then realize, oh, actually we really should do something about that. I know that's not a technical answer, but I feel we need to embrace those things first in order to get to the technical solutions we need. Because, as I said, there's so much diversity of knowledge - understanding the right loss functions to use, or the obscure distributions that best represent cybersecurity data, those sorts of things - there's deep technical knowledge in each of them. And as we put together an agent system, if we're going to have security uniformly integrated across all of the elements it touches, you need the involvement of people who have expertise across all of those. So as we move forward as a community working toward securing the AI future, I think we need to focus on how to best foster that teamwork across people and those areas of expertise to ensure that secure future.
Mehrin Kiani 32:19
Thank you for sharing those key takeaways, Nicole. I think growth mindset, teamwork, and being willing to be humbled in the face of these huge cybersecurity challenges amid ever-evolving AI. This has been a fascinating conversation, and thank you so much for joining us today. Thank you also to our listeners for the continued support of our MLSecOps Community and its mission to provide high quality education. Thanks also to our sponsor, Protect AI. And again, a very warm thank you to our guest, Nicole Nichols. Be sure to check out the show notes for links to resources mentioned throughout the episode, and until next time, I'm your host, Mehrin Kiani. Bye now.
[Closing]
Additional tools and resources to check out:
Protect AI Guardian: Zero Trust for ML Models
Protect AI’s ML Security-Focused Open Source Tools
LLM Guard Open Source Security Toolkit for LLM Interactions
Huntr - The World's First AI/Machine Learning Bug Bounty Platform
Thanks for checking out the MLSecOps Podcast! Get involved with the MLSecOps Community and find more resources at https://community.mlsecops.com.