MLSecOps | Podcast

Expert Talk from RSA Conference: Securing Generative AI

Written by Guest | May 22, 2024 5:12:04 PM
 

 

Audio-only version also available on Apple Podcasts, Google Podcasts, Spotify, iHeart Podcasts, and many more.

Episode Summary:

In this episode, host Neal Swaelens (EMEA Director of Business Development, Protect AI) catches up with Ken Huang, CISSP at RSAC 2024 to talk about security for generative AI. 

Transcription:

[Intro] 00:00

Neal Swaelens 00:06

So we're back at RSA [Conference]. Ken, thanks so much for joining.

Ken Huang, CISSP 00:11

Yes, thank you, Neal.

Neal Swaelens 00:12

Yeah. We had a great talk yesterday. We talked about LLM security. Now we're talking about it in somewhat of a different light. You obviously wrote a book [Generative AI Security: Theories and Practices (Future of Business and Finance)], and that's pretty much the whole topic of today's conversation: generative AI security.

So I would like to focus on that topic today. So maybe, for a start, a brief introduction, and then we'll dive straight in.

Ken Huang, CISSP 00:37

Yeah. So again, I'm Ken Huang. I wrote this book, "Generative AI Security," with contributions from industry and academia. The book has three parts, 10 chapters. We will dive a little bit deeper into it, but my background: I started as a developer and architect in 2000. Actually, I had a project for Capital One, as part of CGI American Management System (at that time), to develop a threat and risk management solution, and we found there was a SQL injection in our code. I was the architect, so that got me interested in security.

And yeah, that was even before the OWASP initiative got started. I did have early involvement in OWASP and also contributed a little bit to the guidelines, with more contributions recently, but that's how I got started.

Neal Swaelens 01:44

Got it. Okay. And what really drove you to write a book? Was it the timing of, you know, the release of ChatGPT that really drove you to start the book or has it been something long in the making?

Ken Huang, CISSP 01:57

Yeah, so before I wrote the generative AI security book, I had another book published, also by Springer, called "Beyond AI." It's about the intersection of generative AI and Web3, because I also cover the blockchain and Web3 space in some books published by Wiley and Cambridge University Press. When I finished that book, during the process I immediately thought that security really should be the focus; everyone is jumping in too fast, trying to develop an LLM application or AI-forward application, but not really thinking about security. So I thought there would be a need. I started first just by myself, right. I laid out the chapters and gave the proposal to Springer, and, as the Springer process goes, you have to have a peer review of the proposal and also of a sample chapter. So I submitted a sample chapter. Actually, the sample chapter I submitted is about MLSecOps.

Neal Swaelens 03:25

Oh, really? That's good to know.

Ken Huang, CISSP 03:27

Yeah. And I associated it with DevSecOps and the CI/CD pipeline, and why we need LLMSecOps. So this is one of the chapters; we can dive a little bit deeper into it. But it has a peer review process during the proposal stage, and after you submit the whole manuscript, there is another review process. So yeah.

Neal Swaelens 03:56

And that's also where, I believe, you drew in some additional editors for the book, right?

Ken Huang, CISSP 04:01

Oh, yeah, yeah. Additional editors and also chapter contributors and reviewers.

Neal Swaelens 04:06

Got it. That's great. I guess, to set the stage for the conversation for those of you who are listening in and watching: in the context of your book, can you delve into what generative AI security is, and also why it really matters?

Ken Huang, CISSP 04:23

Right, yeah. So generative AI, everyone knows it has generative capability, and maybe also reasoning capability; maybe some disagree, but I think it certainly has reasoning and emergent capabilities. But it also hallucinates, so that has some implications for security. And it's not just the hallucination, it's the data itself, right, like data poisoning. So data security is very important.

We do have one chapter that covers that, and also the model: how do you protect the [machine learning] model so it isn't stolen, and no one injects malicious code into your model weights, right? And also the whole MLSecOps, right? And how do you actually build the program?

Neal Swaelens 05:26

I guess, as a next question, right, obviously we're seeing AI and ML applications becoming way more critical within products. I mean, the pace of adoption of AI and ML within applications and products is essentially driving more and more value to the organization. And as they become more rooted in the organization, what do you see as the key risks for these applications when these AI and ML systems may fail?

Ken Huang, CISSP 05:59

Yeah. So I think I can wear different hats for that. One hat is, really, yesterday we talked about the OWASP Top 10 [for LLM Applications], so that's something we can focus on. Another hat is, I'm a co-chair at the Cloud Security Alliance, right? So we analyze the existing vulnerabilities and see how we can put controls in place. I also co-chair the AI Organization Responsibility working group. We already have one paper published about model security, data security, and AI vulnerability management. And the next work stream is focused on generative AI application security, supply chain management, and also the GRC side of it. So we have different work streams.

But to answer your question, some of the categories are certainly prompt injection, deepfakes, data poisoning, supply chain, right? And denial of service attacks. I don't want to rank which is the top; everyone can say theirs is the top, right? But those are some of the major concerns when organizations are trying to develop their LLM applications, right?

Neal Swaelens 07:34

Yeah, and I guess the key point here again is the fact that you have that knowledge gap in the organization. Companies are still, let's say, educating themselves on the differences from traditional application security. So maybe a question before I head to the next one: how do you see that knowledge gap evolving? Are companies more prepared now, or do you think they're still learning a lot?

Ken Huang, CISSP 08:00

Yeah, so in one of the chapters - Chapter Two - I talk about that in my book, right? I said there's a shifting security landscape because of generative AI. And in another chapter, I said we need to build a security program. There is a huge skill gap between what application security people, or AI and data scientists, have and what we actually need to secure these applications, right? One part is of course MLSec Operations, but what about data security, model security, and how you can actually leverage generative AI to operationalize your security operations, right? So yeah, certainly there's a gap.

And I also recorded a video course for EC-Council - they provide training, right? So yeah, I provide a generative AI security training course. You can get a certificate if you go through this video training.

Neal Swaelens 09:10

Okay, I didn't know that. I actually wanna check that out.

Ken Huang, CISSP 09:12

So maybe someone can go through the training; they just pass it, and they get a certificate.

Neal Swaelens 09:19

Okay, that's great. We'll make sure to also add a link to it with this episode. But I guess, I mean, we talk about AI/ML; obviously large language models have brought AI to the forefront everywhere. Particularly speaking about the security of LLMs - we talked about it yesterday, but for the sake of the folks who will be listening to this - what do you see as the particular nuances of the security of large language models compared to some of the other, traditional AI/ML, if you can already call it traditional?

Ken Huang, CISSP 09:53

Traditional

Neal Swaelens 09:55

Models. Like,

Ken Huang, CISSP 09:57

Okay, so your question is, from a security perspective, right, in which areas does LLM security differ from traditional ML models?

Neal Swaelens 10:08

Exactly.

Ken Huang, CISSP 10:10

So generative AI has this non-deterministic behavior, right? So the security focus will be more on reducing hallucination; you cannot 100% eliminate it, right? And some hallucinations have consequences. It may reveal your private data, or it may have some downstream implications if you implement an agent or function calling based on the output, right? So there are downstream implications of the hallucination behavior. Of course, there are other aspects as well that I cover in the book.
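To make that downstream risk more concrete, here is a minimal, purely illustrative sketch (not from the book or the episode) of one common mitigation: validating a model-proposed function call against an allow-list and an argument schema before executing it, so a hallucinated tool call cannot trigger an unintended action. The tool names, schema, and example output below are hypothetical.

```python
# Minimal sketch: guard a model-proposed "function call" before executing it.
# The tool names, schema, and model output below are hypothetical examples.
import json

ALLOWED_TOOLS = {
    # tool name -> required argument names and their types
    "lookup_ticket": {"ticket_id": str},
    "block_ip": {"ip": str, "duration_minutes": int},
}

def validate_tool_call(raw_model_output: str) -> dict:
    """Parse and validate an LLM's proposed tool call; raise on anything unexpected."""
    call = json.loads(raw_model_output)          # hallucinated free text fails to parse here
    name, args = call.get("name"), call.get("arguments", {})
    if name not in ALLOWED_TOOLS:
        raise ValueError(f"Tool '{name}' is not on the allow-list")
    expected = ALLOWED_TOOLS[name]
    if set(args) != set(expected):
        raise ValueError(f"Unexpected arguments for '{name}': {sorted(args)}")
    for arg, arg_type in expected.items():
        if not isinstance(args[arg], arg_type):
            raise ValueError(f"Argument '{arg}' must be {arg_type.__name__}")
    return call

# Example: a hallucinated call to a non-existent destructive tool is rejected
try:
    validate_tool_call('{"name": "delete_all_logs", "arguments": {}}')
except ValueError as err:
    print("Rejected:", err)
```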

Neal Swaelens 11:05

Yes, I think the point you made there is definitely relevant. And we talked about it a little bit yesterday as well, where the current breaches, if you can call them that, of LLM applications have been fairly limited to, you know, brand damage because of the embarrassing headlines they generate. But as we're expanding capabilities with agency and retrieval augmented generation, we're essentially expanding the blast radius, right?

But maybe as a follow-up question, as you already alluded to, in the book you also talk about some great approaches to effectively secure generative AI applications. So can you expand on that? What does that look like?

Ken Huang, CISSP 11:46

Yeah, so the book has three parts, right? The first part is the landscape of security so far with generative AI, and also what generative AI itself is. The second part is really securing your generative AI environment as an ecosystem, right? The data, the model, how you build the program, and eventually MLSecOps. And the third part, alluding to your question, is how you can effectively leverage generative AI skill sets, right, if you have them, or the capability exists. Prompt engineering is one of those.

So in that book, we dive really deep into how you use ReAct [Prompting], chain-of-thought one versus chain-of-thought two, what's the difference, right? And also how to use those kinds of advanced prompt engineering techniques to help you in your security operations, right? Incident response, threat hunting, application security. So that's one part. And I also did a survey of GenAI-powered security tools, in terms of security operations, incident response, and threat hunting. So there are tools.

So in my next book I will also talk about MLSecOps; there's an upcoming book about AI engineering, focused more on developers. It's almost done, and I'm hoping to have that out soon, hopefully by June 1st.
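As a rough illustration of the kind of prompt engineering described above, the sketch below shows what a ReAct-style reason-and-act loop for security alert triage could look like. It is an assumption-laden example rather than anything taken from the book: the prompt wording and tool names are invented, and `chat` and `run_tool` stand in for whichever LLM client and tool dispatcher your environment actually provides.

```python
# Minimal sketch of a ReAct-style prompt loop for security alert triage.
# The prompt text and tool names are illustrative; `chat` and `run_tool`
# stand in for your own LLM client and tool executor.

REACT_TRIAGE_PROMPT = """You are a SOC triage assistant.
You may use these tools: search_logs(query), lookup_ip(ip), get_asset_owner(host).
For each step, write:
Thought: your reasoning about what to check next
Action: one tool call, e.g. lookup_ip("203.0.113.7")
Observation: (will be filled in with the tool result)
When you have enough evidence, finish with:
Final Answer: true positive / false positive, severity, and recommended next step.

Alert: {alert}
"""

def triage(alert: str, chat, run_tool) -> str:
    """Drive a short reason-act loop: the model proposes an Action, we run the
    tool, feed the Observation back, and stop when it emits a Final Answer."""
    transcript = REACT_TRIAGE_PROMPT.format(alert=alert)
    for _ in range(5):                      # cap the number of reasoning steps
        reply = chat(transcript)            # hypothetical LLM call
        transcript += reply
        if "Final Answer:" in reply:
            return reply.split("Final Answer:", 1)[1].strip()
        observation = run_tool(reply)       # execute the proposed Action
        transcript += f"\nObservation: {observation}\n"
    return "Needs human review: step limit reached"
```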

Neal Swaelens 13:40

Yeah, well, I'm definitely looking forward to that. I guess, you know, the point you made there is obviously something that has been quite often discussed. I mean, we have [Microsoft] Security Copilot; we have companies like Dropzone, if I'm not mistaken. How do you think generative AI will also play a role within security organizations? Do you think generative AI will dramatically change the way we look at SecOps, for example?

Ken Huang, CISSP 14:08

Yeah. At the risk of repeating what I did yesterday, I have to cite Caleb Sima again, right? I think he's a visionary in the cybersecurity space, and his "three Cs" really bring the conversation right to the table, right?

It's about coverage, like using generative AI to increase the coverage; it really has this capability to find the needle in the haystack, right? So that's coverage.

And context is trying to find the who, the when, and what the impact is, right? Or maybe analyzing to see if it's a true positive or a false positive, to find the indicators of compromise.

And then communication is basically leveraging generative AI to generate the security plan, right? Of course, you have to have a human in the loop. And you also provide a chat interface for every tool, right? So you can say, okay, what's in my model? Or which of my models maybe has some vulnerabilities that the huntr community, right, has found. And that is in the Sightline [product], so you can just ask it [the GenAI], right? It should tell you.

And I also add one more ["C"]: it's change, right? Because the whole business environment always changes, there's technology convergence, new technology always comes along, with quantum, with IoT, with AI, VR, blockchain, the metaverse, right? So new threats are always coming up. How can you adapt to this change? You will have to leverage generative AI.

Neal Swaelens 15:57

Yeah. That definitely resonates. I guess you basically abstract complexity and you accelerate the time to insight, which is so critical in detection and response. And as you said, it's so critical as well in an environment that is continuously changing, with new threats emerging as a result. So I definitely like that. But to switch gears, I think one of the things that probably a lot of our listeners are wondering about as well is how regulation is evolving around generative AI applications - you know, the European AI regulation, but also the Executive Order in the United States. What are some things that companies should keep in mind when it comes to regulations, especially around these generative AI applications?

Ken Huang, CISSP 16:48

Yeah, I think, it's also covered in my book; Chapter Three kind of serves as a landscape, talking about the EU AI Act, the Executive Order, and also <inaudible> Senators in Congress - what do they propose, right? And we talk about the balance between innovation and security. You have to have a balance, right, between regulation and innovation. And to go back to your question about how an organization adapts, right: I talked with the head of GRC at OpenAI, Nick Hamilton; he was one of my co-chairs for the AI Organization Responsibility working group, right, at Cloud Security Alliance. And he actually said it's a challenge - a shifting landscape, right?

Ken Huang, CISSP 17:52

And they actually have the AI Preparedness Framework already; they're trying to be proactive, right? So I think this approach is good. As an organization, especially the frontier model companies - OpenAI and Google Gemini - I think you go through a self-regulation approach for now. There is already an Executive Order, but it didn't really go through legislation, right? So when the next administration comes, it may not exist anymore, right? So in order to be proactive, you have to self-regulate first, and then you also have to do the balancing, right? And eventually you have to think about the difference between open source models and closed source models. Like in California, actually, there's a bill that just dropped a few days ago, and lots of people debate about it, right? The California bill said if your model causes 500 million [dollars in] losses, you are in trouble, or something like that.

Neal Swaelens 19:09

Oh, interesting.

Ken Huang, CISSP 19:10

Yeah, you have to disclose it. So, as people said, it's in favor of the closed models, like those big companies. But maybe for the open source models, like [Meta] Llama, right, it may be more of a challenge. But I don't know; there is still some controversy about this bill. A bill is a bill; it's not finalized yet, right? It's not voted on yet, but yeah.

Neal Swaelens 19:38

That's interesting. I wasn't aware of that. I'll definitely have to look into that new bill.

Ken Huang, CISSP 19:42

It's called SB 1407 [CA SB1047], if my memory serves me right. SB 1407.

Neal Swaelens 19:49

Right. But I guess, at the end of the day, as I said, it's all about, you know, keeping that readiness. But that bill's quite interesting, because obviously it will potentially be replicated across some other states and maybe even replicated in Europe.

Ken Huang, CISSP 20:00

California is the biggest state [in the US House of Representatives], and it also has the Privacy Act, right? That is close to GDPR in Europe, and when people cite privacy law, it's always GDPR and CCPA, right? So I think maybe in the future it will always be "EU AI Act and SB 1047," whatever the regulation that comes out, right? It's possible. Yeah.

Neal Swaelens 20:30

I mean, it is interesting. You mentioned, you know, Nick Hamilton from OpenAI GRC. As companies are building their AI and ML programs, they will be involving the GRC team. What are some of the best practices that you recommend companies consider as they're working through these issues?

Ken Huang, CISSP 20:52

Yeah, again I have to reference back to my book. In Chapter Three, I talk about how you build a security program. You have to have a security program for generative AI, because everything has changed, shifted. If you use an older security program for generative AI, it will not work. Of course you can piggyback; you cannot just throw it away and build a new program, right? You have to piggyback and augment what you have.

So I talk about security policy, procedures, and processes along with generative AI, and also risk management, especially "shadow AI." I put a lot of emphasis on that, because with generative AI, some companies say [to their employees], "you cannot use ChatGPT," because of private data or enterprise data, right? But that's not the correct approach. Actually, that's a lazy security program approach, just saying no. Eventually people will circumvent it, and this creates shadow AI: you have shadow models, shadow applications. But the business has the needs, right, in terms of leveraging the latest technology because of shifting business requirements. So yeah, shadow AI certainly exists in small, medium, and large enterprises. So how do you deal with it?

Neal Swaelens 22:27

And it's definitely a big gaping hole, just on the basis that it generates such a big surplus in employee productivity that it's very hard not to use it as an employee. But as a brief follow-up question to that - I mean, it's not a new problem - do you think it is something that existing shadow IT security companies will be solving? Or do you think it'll just be something that gets resolved by enterprise vendors that make it easy for companies to start issuing their own managed LLMs, let's say, for security, right?

Ken Huang, CISSP 23:03

Yeah, I think new tools will certainly come, right? Also, leveraging generative AI will help. I don't think it will 100% eliminate shadow AI, right? Shadow IT still exists. So I think generative AI can certainly help to kind of shine a light on shadow AI. But especially when the enterprise gets bigger, you will always have it.

Neal Swaelens 23:39

For sure. It's definitely an interesting topic that we also come across quite a bit, for obvious reasons. But as a final question: you've obviously delved a lot into this space. You have your own hopes for the future of generative AI security. What are those?

Ken Huang, CISSP 23:58

Oh, I would say my hope for the future for, yeah…

Neal Swaelens 24:01

Or maybe let's switch the question: what is your prediction for generative AI security, if you have one, for the coming year?

Ken Huang, CISSP 24:09

For the coming year.

Neal Swaelens 24:11

It's not an easy question. I know.

Ken Huang, CISSP 24:12

Yeah, yeah, I know. Actually, it's my wish, my hope. My hope is you can have as many tools as you want, but you should have just one chat interface that connects them - whether it's using an API, or an agent, or function calling, whatever you can use - and you have one chat interface. As a CISO, you can just say: what's in my network? What's my security posture? How can you fix it? And you use the agent to do the fix. Maybe some fixes have a huge impact on downstream applications; then you ask whether it's relevant, or it can help find the relevant people, right, to say, okay, this I need to fix, it's a vulnerability - will you let me fix it? Okay, the agent will do the work; it's fixed for you. That's my hope. But I think it's maybe not next year, maybe in 10 years. Yeah, hopefully it will happen.

Neal Swaelens 25:13

It'll definitely be needed, that's for sure. But I guess, with that, thank you so much for joining us, Ken. We'll make sure to include the details of your book. And we're also obviously very much looking forward to the new book you're writing.

Ken Huang, CISSP 25:25

Oh, great, thank you.

Neal Swaelens 25:26

Yeah. And we'll definitely have you back once that book is out to discuss it too.

Ken Huang, CISSP 25:30

Cool, cool. Yeah.

Neal Swaelens 25:31

Thank you everyone for tuning in.

Ken Huang, CISSP 25:32

Yeah, thanks.

[Closing] 


Additional tools and resources to check out:

Protect AI Radar: End-to-End AI Risk Management

Protect AI’s ML Security-Focused Open Source Tools

LLM Guard - The Security Toolkit for LLM Interactions

Huntr - The World's First AI/Machine Learning Bug Bounty Platform

Thanks for listening! Find more episodes and transcripts at https://mlsecops.com/podcast.