
The MLSecOps Podcast

Adversarial Robustness for Machine Learning

Mar 29, 2023 25 min read

Pin-Yu Chen, PhD talks with the MLSecOps podcast on Adversarial Robustness for Machine Learning

Hear IBM Principal Research Scientist, Pin-Yu Chen, PhD, discuss compelling themes from his book co-authored with Cho-Jui Hsieh, "Adversarial Robustness for Machine Learning."






Episode Summary: In this episode of The MLSecOps podcast, the co-hosts interview Pin-Yu Chen, Principal Research Scientist at IBM Research, about his book co-authored with Cho-Jui Hsieh, "Adversarial Robustness for Machine Learning." Chen explores the vulnerabilities of machine learning (ML) models to adversarial attacks and provides examples of how to enhance their robustness. The discussion delves into the difference between Trustworthy AI and Trustworthy ML, as well as the concept of LLM practical attacks, which take into account the practical constraints of an attacker. Chen also discusses security measures that can be taken to protect ML systems and emphasizes the importance of considering the entire model lifecycle in terms of security. Finally, the conversation concludes with a discussion on how businesses can justify the cost and value of implementing adversarial defense methods in their ML systems.



Introduction 0:08 

Welcome to the MLSecOps Podcast presented by Protect AI. Your hosts, D Dehghanpisheh, President and Co-Founder of Protect AI, and Charlie McCarthy, MLSecOps Community Leader, explore the world of machine learning security operations, aka MLSecOps. From preventing attacks to navigating new AI regulations, we'll dive into the latest developments, strategies, and best practices with industry leaders and AI experts. This is MLSecOps.

Charlie 0:37

Hi, welcome everyone and thank you for joining us on The MLSecOps Podcast. Today D and I are talking with Pin-Yu Chen, Principal Research Scientist at IBM. Pin-Yu, welcome. 

Pin-Yu 0:50

Hi, everyone. 

D 0:52

Welcome back.

Charlie 0:55

Right. Great to have you here. To get us started, Pin-Yu, do you mind giving us a little bit about your professional and academic background, how you came into the adversarial machine learning space, a little bit about your work maybe? We'd love to hear more about that. 

Pin-Yu 1:08 

Yeah, sure. So I'm a Principal Research Scientist at IBM Research, so I'm in a big team called Trustworthy AI. We do a lot of things related to trust plus AI, right? So I would say we are doing a dual job here because at the same time, we need to keep up with the latest development of AI technology, like deep learning, like ChatGPT, you name it. On the other hand, we would like to add and have a way to evaluate and improve trust related dimensions for AI technology, including some important dimensions like fairness, explainability, robustness, privacy, and so on. So at IBM I'm a technical lead on AI Robustness. I am happy to talk more about that. 

Charlie 1:52 

Wonderful. We'll lead right into that. Our first question is about Trustworthy AI and about your book. So your background on the book that you recently co-authored, Adversarial Robustness for Machine Learning, says, quote, his long term research vision is building trustworthy machine learning systems. End quote. What is a trustworthy machine learning system in your mind? And to kind of frame that up as you're thinking about it, are there specific practical, ethical considerations that ML practitioners should be aware of while deploying, debugging? 

Pin-Yu 2:32 

Yeah, that's a great question. So at a very high level, we try to build an AI system that aligns with human values, right? Because at the end, you will be a human using those AI technologies to benefit everybody, like how ChatGPT has positioned itself. But that being said, a lot of annoying things have happened or have been observed for not being trustworthy. And important dimensions include, for example, those AI decisions could be biased, not fair to a certain group, or to certain races, or to a certain gender, for example. Or because nowadays this AI technology is very complicated. So there is not enough transparency to explain how the AI system reaches that decision or makes that recommendation. And that could be problematic for a lot of transparency demanding applications and domains. And also, although we spend a lot of time to tune up the system, it does not mean the AI system will behave as we expect it in the real world. For example, if the system is observing some different inputs, different environments compared to how we train the system, or even worse, the system is being compromised or being attacked by some bad actors. So in those cases, there's a strong demand to make sure that the AI is trustworthy, that means aligned with the human value and follows the regulation and any law put on this AI technology just to ensure we have responsible and safe use of AI. Right, so we hear a lot of terms like responsible AI, trustworthy AI, safe AI, but they more or less are targeting those similar dimensions that I talked about. 

Charlie 4:20 

Thank you. When we're talking about Trustworthy or Trusted AI in the vein of some of the themes you just mentioned; bias, fairness, and explainability; from your book there's the term trustworthy ML; machine learning. Is trustworthy AI and trustworthy ML something different or how would you describe the connection between both of those things and adversarial robustness? Is there any difference? 

Pin-Yu 4:47 

That's a great question. So I think AI is kind of the buzzword, right? Because there are actually several technologies that lead to AI like a neuro symbolic approach, like a machine learning approach. So in the book we are focusing on a machine learning path toward making, like, a general purpose AI. Right? But by machine learning I mean those neural networks, deep learning related technology. So that's why because this book is targeting a technical audience. But also we are also hopeful that this book can serve the purpose of the new researchers or interested readers who want to get involved with adversarial machine learning to familiarize with the notion or the terminologies we use in our field. What do we mean by attack, what do we mean by defense? What does verification mean? So we kind of have a narrow focus on adversarial robustness for machine learning. And it's certainly a very essential component for current AI technology. GPT is purely built on neural networks plus instructions from human feedback.

D 6:01 

Yeah, along those lines, I guess with adversarial attacks you mentioned ChatGPT and others. Maybe let's start with kind of - like a lot of people probably know about inversion, evasion, data poisoning attacks. But how do you think about adversarial machine learning attacks in a more practical sense? In particular, how it might relate to all of these large language models that are popping up? 

Pin-Yu 6:26 

Yeah, that's a great question. So I would like to first provide a holistic view of adversarial robustness, right? And then we need to talk about the notion of the AI lifecycle. I think it's very important to understand the AI lifecycle, and then we can realize what could go wrong with an AI model. For example, I divide the AI lifecycle into three stages. First is the stage of collecting data, deciding what data to collect. For example, in ChatGPT’s case, they basically scrape text at web scale, like Wikipedia and other sources. And then once we have the data, it comes to the model. What is the right machine learning model to train on that data? So, for example, ChatGPT used a transformer based architecture, like generative pre-trained transformers, to be able to be generative and creative. Right? And after you train your model, the third stage will be the deployment stage. How do you deploy your model? Right, so most AI technologies deploy their model in a black box manner, which means the user can use the function as a service, but the user wouldn't know what's behind the tool they are using, like ChatGPT, right? It's basically very non transparent to users. On the other hand, there is also another mode of deployment, white box deployment, where everything is transparent to the user. And that will be like the Hugging Face scenario, where they provide checkpoints like pre-trained neural network models for users to download for their own purposes. Right. So with this AI lifecycle in mind, we can then talk about, okay, do you expect anything to go wrong? Like any place where bad actors can come in and compromise our system in the AI lifecycle. So for example, in the training phase, if the attacker has the ability to inject some poisoned data or carefully crafted data to affect the training process of the model, that will be like a training time threat, right? 
And there are also recent works showing it's doable even at web scale; it's possible to poison web scale data to affect ChatGPT-like models. On the other hand, adversarial examples and other familiar cases we have shown are actually related to deployment phase attacks, where we assume the attacker has no knowledge about the training data, but the attacker can observe and interact with the target model, play with the interaction, and find examples that evade the prediction or make the model misbehave. Right? So it's a process very similar to how we find bugs in a trained machine learning model. So with this AI lifecycle in mind, we can then divide and conquer by asking, okay, would you worry about your model being compromised? Would you worry about your data being poisoned? Or would you worry about your user information being exposed while serving the service, so that the attacker can use this as a vulnerability to intrude into your system? And so on.
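
To make the deployment phase attack concrete, here is a minimal sketch of a query-based black box search for an adversarial example. It is an illustration only, not a method from the book or the episode: the toy model, its weights, and the names `make_black_box` and `coordinate_attack` are all hypothetical. The attacker never sees the weights; it only queries the model's confidence score and keeps small per-feature changes that lower confidence in the original label.

```python
import math

def make_black_box():
    """Return a scoring function whose weights are hidden from the caller."""
    weights, bias = [2.0, -1.0, 0.5], -0.2  # hypothetical toy model
    def score(x):
        logit = sum(w * v for w, v in zip(weights, x)) + bias
        return 1.0 / (1.0 + math.exp(-logit))  # probability of class 1
    return score

def label(score_fn, x):
    """Turn the model's score into a hard 0/1 prediction."""
    return 1 if score_fn(x) >= 0.5 else 0

def coordinate_attack(score_fn, x, eps):
    """Greedy query-only search: move each feature by at most eps."""
    original = label(score_fn, x)
    def confidence(z):
        # Confidence the model assigns to the original label.
        s = score_fn(z)
        return s if original == 1 else 1.0 - s
    best = list(x)
    for i in range(len(x)):
        for delta in (eps, -eps):
            candidate = list(best)
            candidate[i] = x[i] + delta
            if confidence(candidate) < confidence(best):
                best = candidate  # keep the change that hurts the model most
    return best

score = make_black_box()
clean = [0.3, 0.2, 0.4]
adversarial = coordinate_attack(score, clean, eps=0.2)
print(label(score, clean), label(score, adversarial))
```

On this toy model the search changes the predicted label while keeping every feature within 0.2 of its original value. Real attacks against large models need far more queries and smarter search strategies.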

D 9:20 

So that's an interesting perspective. I guess my question then as a follow up to that would be what you just spoke about, it leads one to think that there might be gaps in how traditional security concerns or traditional security practices are applied to these machine learning systems. And I'm curious if you have a perspective on the gaps that you see in the environments used for these. Do you see a gap in security offerings for various ML pipeline systems? Like the difference between, say, AWS SageMaker or Google Vertex or Databricks or any of the major platforms? Where do you see that gap between the security needs and kind of how these ML systems are developing to get these types of models into production? 

Pin-Yu 10:17 

Yeah, I think that's a terrific question and I will self identify as a machine learning researcher. So my view could be biased and maybe different from what a traditional security researcher would say. But my understanding of current security toolkits is that they focus more on the compute side of security. So for example, we need to ensure, like, one plus one equals two under different scenarios. Or they focus on a very high level, like a social engineering perspective, like how do attackers gain access to the system through social engineering, through phishing and so on. But the thing is they are focusing on attacks and defenses in a rather static world. And what is interesting is that recently these systems are adopting machine learning as part of their components, right? So when we brought in machine learning we saw a lot of opportunities and excitement, like how ChatGPT recently announced plugin support and so on, and people are saying this could be the iPhone moment or even the App Store moment for AI, right? But on the other hand, they introduced this machine learning, and machine learning is such a complicated function, right? So we can train the machine learning model to behave and do a very good job. But actually, we have very limited knowledge of how they learn from this noisy and large scale data and how they process information and make the final output, right? So that kind of non-transparency and black box nature actually created a lot of new security risks and threats from the machine learning perspective. And that's related to the focus of my book and also my research, which basically focuses on new security problems introduced by having machine learning in the system, like training time attacks and inference time attacks. It has a lot to do with the machine learning perspective. For security, I think we need to go one level up. 
We cannot only focus on compute level or access right level security. We need to look into the learning perspective to understand how a machine learns and to understand in the learning process is there any vulnerability or weaknesses that could be brought to the system. 

D 12:37 

So it's almost like a new framework needs to be applied beyond just a zero-trust which is about credentialing and kind of permissions. You're really talking about a higher level of abstraction to achieve robustness, is that correct? 

Pin-Yu 12:54 

Yes. I believe some of the static evaluation or unit tests may not be sufficient in this domain. Right. Because machine learning is a living thing. It is continuously being updated and evolved. Right. So what I would advocate is that we have a way of doing this idea of active testing. Like having a proactive perspective when we develop our model or before we deploy our model to the public. There should be some tools that we can leverage to actively search for possible errors or mistakes or threats or risks, for model developers to understand the limitations of their AI system and then to correct those errors before the actual bad actors use them as a vulnerability to compromise our system. 

Charlie 13:51 

Pin-Yu, can we walk this back just a little bit and for nontechnical listeners, when we use the term robustness, what do we mean by that in relation to machine learning models? 

Pin-Yu 14:02 

Yeah, that's a great question. It's a very heavily loaded term. And again, there are different definitions. So for me, I will describe robustness as the way to measure the gap between AI development and deployment. So one analogy I will often make is that when we develop our AI technology, it's like growing a plant in the greenhouse. So we assume everything is ideal. Like temperature is perfect and, translating to our case, data is perfect, there's no noise, and labels are all correct. And then we know what's the right model to train and so on. But in reality, when we deploy our model in the wild, it starts to face a lot of challenges. Like, oh, the training environment is different from the environment I am deploying in. So, for example, I was collecting data, pictures taken in the daytime, for autonomous driving car training. But actually, I need to drive my car in the nighttime, so the performance could be degraded. It cannot generalize to the nighttime view of the images. And not to mention there are bad actors that try to do some pressure testing just to see how well they can compromise our systems and so on. So in my view, AI robustness is really to try to have a way to quantify the gap between development and deployment, either in natural shifts or adversarial environments. And also, how do we mitigate the gap between development and deployment? 

D 15:31 

So how should ML engineers, data scientists, people who develop machine learning systems, how should they start to invoke the methods that you're discussing to achieve a higher degree of robustness? 

Pin-Yu 15:44 

Yes, that's a great question. We have a conceptual pipeline called AI Model Inspector. So I will relate this notion of AI inspection to something that everybody is more familiar with. So, for cars, we take it for granted that they need to be regularly inspected and maintained to make sure we can drive and use the car safely. So nowadays we may be using AI more often than our cars. So why don't we practice the same idea, the notion of AI maintenance? And back to this AI lifecycle perspective, there are many new tools that we offer that we believe can help the model developers understand the limitations and weaknesses of the AI system. So, for example, between the stage of data collection and model training, there are a lot of tools that we can employ just to have an inspection of the data itself to understand: are there any mislabeled data or spurious data outliers? And we should remove them before training the model on this problematic data set. And between data and the model stage, we can do a lot of active testing by practicing these attack ideas. So here, attack has a broader notion. We are not trying to break our system. Rather, we are using those attack modules to help us generate and find some bugs or identify like [...] of the system or model we are developing. And even after you deploy your model, if you identify some errors or performance degradation through some continuous monitoring tools, then we should bring our model back to the data stage. It's actually a recurring stage in terms of robustness inspection. Even after deployment, that doesn’t mean it is worry free, right? Because the data or the environment may change very quickly. So we still need some way of monitoring the status, the health status of the model, and bringing the model back to the data
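
As a rough illustration of the data inspection step mentioned above, the sketch below flags examples whose label disagrees with most of their nearest neighbors. This is a simple heuristic chosen for the example, not the AI Model Inspector pipeline or any IBM tool. The function name `flag_suspicious_labels` and the toy data are hypothetical.

```python
from collections import Counter

def flag_suspicious_labels(points, labels, k=3):
    """Return indices whose label differs from the majority of k nearest neighbors."""
    flagged = []
    for i, p in enumerate(points):
        # Squared Euclidean distance from point i to every other point.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points) if j != i
        )
        neighbor_labels = [labels[j] for _, j in dists[:k]]
        majority, _ = Counter(neighbor_labels).most_common(1)[0]
        if majority != labels[i]:
            flagged.append(i)
    return flagged

# Two tight clusters; index 3 sits in the "cat" cluster but is labeled "dog".
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1)]
labels = ["cat", "cat", "cat", "dog", "dog", "dog", "dog", "dog"]
suspects = flag_suspicious_labels(points, labels)
print(suspects)
```

Flagged examples are candidates for human review before training, not proof of a labeling error.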

Charlie 17:55

Right. It's not necessarily a linear path. 

Pin-Yu 17:57 


D 17:58

And it sounds like what you're advocating, correct me if I'm wrong here, but it sounds like what you're advocating is, hey, within your model development lifecycle, and in your book you talk about some of these things, you need to start thinking about - in the model lifecycle - how do you start doing neural network verification, complete neural network verification, for verification against semantic perturbations? You're adding very necessary steps to kind of the AI maintenance model you talked about or the model lifecycle, I would assume. What is the most natural starting point for an ML engineer or an ML systems manager? Where would you guide them in how to get started on adding things like neural network verification? Complete neural network verification? Incomplete or complete verification against, like, semantic perturbations?

Pin-Yu 18:49 

Yeah, that's a great question. So based on my experience in the industry and talking to our industrial clients, right, I think the first thing we need to convince them of is that their models are not as robust or not as secure as they would believe. Because the models are their own kids. So sometimes we need to spend some effort just to show their model can be broken or their model is not as generalizable as they expected. The most common mistake, I would say, made by even mature machine learning researchers is that they will rely too much on the metric of accuracy. So oftentimes we will create this independent test set and just evaluate whatever model we develop on the test set.

D 19:31

Yeah, accuracy, F1, recall, precision, all the normal ones. But you're asking them to start thinking about a different kind of performance metric. 

Pin-Yu 19:39 

Exactly. And most of the time, they will relate this accuracy to robustness and security. If my accuracy or any performance metric is better, that will imply my model is more robust or more secure or so. But it's often not true. Right. So the first thing we need to come and say is, hey, your model, although it's more accurate, it's actually less robust, more sensitive, or more easy to generate those adversarial examples. 

D 20:05 

Which means it could be more brittle, right? And it could break more frequently. 

Pin-Yu 20:08 

Exactly. And we have a very famous experiment to show this undesirable trade off between accuracy and adversarial robustness. So, take models that are more accurate on ImageNet, which is a very well known benchmark. At the same time, those latest models are also more fragile to this adversarial [...], which is actually very undesirable to model developers. But that's something they overlooked.

D 20:33 

Well, let me ask about this, though. Maybe if they want to, maybe they're overlooking it, but there's also this reality of budget constraints, right? Like this global economy backdrop. Right now, budgets are tight, and we've found a lot of customers don't even budget enough for just retraining in traditional methods for their models to get that performance improvement. And that retraining budget is there. What you're asking for is almost a different way to think about how to budget inside of your ML development world and those costs, because you're going to need more forms of retraining. You're going to need more forms of, kind of, going through that lifecycle over and over again. How should ML engineers, and thus their managers or development and engineers, managers and leaders of ML teams; how should they think about asking and getting the budget from those who control those purse strings? To achieve some of the things you're talking about. 

Pin-Yu 21:36 

Yeah, so I think that's a very important issue to address and I know a lot of industrial leaders are actually paying a lot of attention to this topic. So, for example, there are a lot of teams being organized or formed working on red teaming or working on auditing the model. And that team is actually independent of the team who is developing the model. Because as developers, we are always very biased and blind to what we are developing. So you need another fresh eye, or even people called a Red Team or sometimes a Purple Team, just for an independent view or evaluation to make sure your system is behaving as expected. And going back to the verification: so, for example, if I'm building an image classification model, then you would hope that if you rotate the image by a small degree, the model prediction should be the same, right? But actually, without paying attention or doing that type of semantic change analysis, you may be surprised how many ImageNet classifiers will have a very significant performance drop just by simply rotating the image by a small degree. So for those simple things, you would expect a model with high accuracy to have learned that notion of rotation invariance. But it turns out the model is not as smart as we believe. So there is a need to have this simple unit test or more comprehensive testing brought by another team just to make sure your model will not break the user's expectation. And I also have another good example to motivate the reason why we should have these efforts, right. That will be the recent Google Bard example. When people were comparing Bard with ChatGPT, right, they realized that Bard answered one question wrong, about who took the first picture related to astronomy. And simply because of answering that one question wrong, Google lost $100 billion in stock market value. So you may argue their system could be 99% correct on all other questions. But that doesn't matter from a reputation perspective. 
As long as you answer one simple question wrong and it becomes non factual, it could be really a loss in terms of reputation and revenue. And that's the reason we should have another team, just to fix those issues and make sure your AI is transparent or responsible in terms of answering those questions. 
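
The rotation check described in this answer can be written as a small unit test. The sketch below is a toy under stated assumptions: images are plain 2D lists, `rotate` uses nearest-neighbor sampling, and `toy_classifier` is a hypothetical stand-in for a real image model. In practice you would pass your own model's prediction function to `rotation_consistency`.

```python
import math

def rotate(image, degrees):
    """Rotate a 2D list about its center using nearest-neighbor sampling."""
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    theta = math.radians(degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            y, x = r - cy, c - cx
            # Inverse mapping: find the source pixel that lands on (r, c).
            src_r = round(cy + y * cos_t - x * sin_t)
            src_c = round(cx + x * cos_t + y * sin_t)
            if 0 <= src_r < h and 0 <= src_c < w:
                out[r][c] = image[src_r][src_c]
    return out

def toy_classifier(image):
    """Hypothetical stand-in model: is the top half brighter than the bottom?"""
    half = len(image) // 2
    top = sum(sum(row) for row in image[:half])
    bottom = sum(sum(row) for row in image[half:])
    return "top" if top > bottom else "bottom"

def rotation_consistency(model, images, degrees):
    """Fraction of images whose prediction is unchanged after rotation."""
    same = sum(model(img) == model(rotate(img, degrees)) for img in images)
    return same / len(images)

top_bright = [[1] * 8 for _ in range(4)] + [[0] * 8 for _ in range(4)]
bottom_bright = top_bright[::-1]
images = [top_bright, bottom_bright]

for angle in (0, 5, 180):
    print(angle, rotation_consistency(toy_classifier, images, angle))
```

A consistency score well below 1.0 at small angles is the kind of failure an independent test team would report back to the model developers.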

Charlie 23:57 

Right. So if we jump from talking about cost, maybe over to, it sounds like what you're starting to dip into is, the value of some of this, Pin-Yu. How should businesses be thinking about this, and how do team members; what can they do or how should they frame it to contextualize it to the decision makers in their companies in terms of making the case for the added time and costs and resources on providing adversarial defense methods that you mentioned?

Pin-Yu 24:28 

I have two answers to that. I think in the short term, we can focus on some metrics that are computable and measurable. So, for example, in fairness, there are some metrics related to disparity, or equal opportunity, and so on. So you can evaluate your model on those metrics and make sure you meet the requirements from law enforcement or from regulations or even from some guidelines like GDPR or something. There are some guidelines that will tell you, okay, you need to make sure your model is fair in what circumstances? Like a 20-80% rule or so on. So in the short term, in many cases, in many domains of trustworthy AI, there will be some clear guidance in terms of the necessary guarantees for your machine learning model. So those are like near term targets, but in the long run, we certainly want to make sure machines can have a way to understand human values and behave consistently with humans’ decision making. And that process may not be done through machine learning alone. It may involve human machine interaction. That's why the magic of making ChatGPT happen has actually involved some reinforcement learning with human feedback, instructing the model to behave nicely. So certainly at some stage, we need to interact with and instruct the machines to do the right thing and also do things right. 
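
One computable fairness metric of the kind mentioned here is the disparate impact ratio, the lowest group selection rate divided by the highest. The sketch below assumes the rule referred to in the answer is the commonly cited four-fifths (80%) guideline; that reading, the function names, and the toy outcomes are assumptions for illustration.

```python
def selection_rate(outcomes):
    """Share of favorable decisions (1 = favorable, 0 = unfavorable)."""
    return sum(outcomes) / len(outcomes)

def disparate_impact_ratio(outcomes_by_group):
    """Lowest group selection rate divided by the highest."""
    rates = [selection_rate(o) for o in outcomes_by_group.values()]
    highest = max(rates)
    if highest == 0:
        return 1.0  # no group receives favorable outcomes, so no disparity
    return min(rates) / highest

# Hypothetical model decisions for two groups.
decisions = {
    "group_a": [1, 1, 1, 0, 1],  # selection rate 0.8
    "group_b": [1, 0, 0, 1, 0],  # selection rate 0.4
}
ratio = disparate_impact_ratio(decisions)
passes_four_fifths = ratio >= 0.8
print(ratio, passes_four_fifths)
```

Here the ratio is 0.5, which falls below the 0.8 threshold, so this toy model would be flagged for review.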

Charlie 25:58 

Yeah, there's that human interaction piece. And when you said that, it actually triggered me to think of what you said earlier about aligning with human values. So humans need to be involved, not just to give that type of feedback, but we're talking about maybe behavioral professionals, psychologists, that pitch in for these sorts of conversations to create the trustworthy AI. Yeah. Interesting. 

D 26:21 

So, Pin-Yu, can you talk to us a little bit about where you've seen some of these real world attacks? There's been a lot of debate about is adversarial machine learning real? And everybody uses the same kind of few examples. I'm curious, in your work as a researcher, are you coming across different clients and different customers at IBM or others where you are seeing these in the wild and we're just not hearing about them? Maybe you can talk a little bit about what you're hearing, what you're seeing, what has actually happened without naming names and give a sense of how real this is.

Pin-Yu 26:56 

Yeah. So I'm very confident every client we talk to, they will mention, yeah, robustness is important in our system. Right. But when we say robustness, there are also different notions of robustness in their mind. So, for example, some clients, they care more about generalization perspective, like the image rotation thing I talk about, they would like to make sure their model will not lose performance, even if the data input has some slight distribution change, like change of rotation or change of color, for example. But they may not necessarily worry about being attacked if they are very confident, for example, they have a full control of their data collection process or they have a very good protection of their model details. In those cases, they will worry less.

D 27:44 

Everybody thinks they're safe until they're proven wrong though, right? 

Pin-Yu 27:47 

Very true. Yeah, but at the same time, right, I also hear a lot of stories about using AI to break the offerings. And one concrete example is actually not in the image space, but more in the code space. So, for example, malware. Detecting malware is always a very important topic. And with this AI capability, AI can generate more sophisticated code that will hide the malware function in part of the code and even try to evade the detection of the detector. So there are also some concrete examples where people would use this notion of attack just to test whether their detector is robust enough or secure enough. And from a machine learning perspective, I would not only use this attack as a way to show the vulnerability of security, I would rather use this attack as a more general debugging tool to help understand the limitations and also failure modes of the models we are developing. 

D 28:52

So is that introducing more of these methods for penetration testing, essentially? Like pen testing. 

Pin-Yu 29:00

Yes, actually it's more like a Red Teaming perspective rather than an actual hacker using those tools to break a system. I would rather advocate those tools as a Red Teaming or like a white hat hacker perspective. How can they use those tools just to simulate those attacks? Maybe in an ideally like a Sandbox scenario, you simulate those attacks. Just try to understand how vulnerable my model is. What are the possible 

D 29:28 

Yeah, we're seeing that right. OpenAI has their AI red team now and Anthropic and a lot of the AI foundational model companies are building it. It's interesting to me though because I think that there is probably another end of the spectrum, which is enterprises who are building applications on top of those models, on top of those foundational models. They're going to need different Red teaming approaches as well, I would imagine. Do you agree? 

Pin-Yu 29:55 

Yeah, I totally agree. Especially now people are focusing on what's so called a foundation model, a model that's trained on web scale data with hundreds or thousands of GPUs training for weeks or even months to make that available. Like GPT-3 or GPT-4, right? But from a security or robustness perspective, if every application or every workflow is built on the same foundation model and you don't have a proper way to secure it or ensure the foundation model is robust enough, then every application depending on that foundation model could be problematic, because those mistakes or those security risks will carry over from the foundation model to their own application or their own fine tuned model. So that's why I think OpenAI is very hesitant to release the details of their models right now, because releasing those could often provide exploits, right, to those adversaries.

D 30:53 

Yeah, the imagination has been captured by these virtual assistant concepts on the large language models. But there's a different type of large model and the rise of diffusion models over the last two or three years. GLIDE, DALLE-2, Imagen. I'm curious as to kind of how you think about those. Also, is it the same vein of robustness for things in the image space, not just the language space? I'm assuming these types of theories and concepts and techniques that you're talking about apply equally into things that are not language models. It basically could be tabular data models, could be image models. Right? Is that a fair assumption? 

Pin-Yu 31:35

Yes, that's a fair assumption. Actually, with a lot of these red teaming tools or debugging tools that we are developing, we are trying to make the tool as generic as possible. By generic, I mean it should be minimally dependent on what kind of data or what kind of tasks we are handling. It should work in general for text, tabular data, speech, and images, and work for different types of neural networks, like recurrent neural nets and convolutional neural nets, all the way to diffusion models and so on. But also we would like to maximally leverage the domain knowledge to improve the robustness of those models. And speaking of diffusion models, we actually demonstrated this in a very recent work of ours. We show it's actually possible to backdoor a diffusion model, and the backdoor cost is very low in the sense that we can take a pretrained model online and just fine tune the model to inject backdoors, and the injection can be done very stealthily and it will not harm the image generation quality of regular inputs. Which means if you downloaded a model from a non-trusted third party, you may be using a backdoored model. Without attention, there's no easy way to identify that the model has been backdoored. So there are a lot of issues with these new models. 

D 32:52 

So as you keep validating these types of things, I'm curious, is IBM operationalizing your research, or are they putting it into commercial offerings? What are those offerings, how are your insights at IBM protecting against these types of attacks, and how are they being commercialized?

Pin-Yu 33:10 

Yeah, that's a great question. I totally share IBM's mission to make AI non-harmful and actually trustworthy, right? And some of our products have been consumed by IBM's own offerings, like Watson Studio, for example. Our research on black box attacks, where we can abstract any model or system as a black box model and run our attack on top of the black box model to do robustness testing, is actually consumed by IBM Watson Studio. Because in our business, we need to handle different types of models and platforms, like models on PyTorch, models on TensorFlow, you name it. Right. So it's very important, and actually very elegant, to just abstract models from different platforms as a black box model and apply our independent robustness testing tool on top of that. So that's why it was absorbed very quickly. And in addition to IBM offerings, we are actively collaborating with a lot of government entities, like the Department of Defense in the US. So, for example, a recent [...] program called GARD recommended three toolkits to evaluate and test the robustness of your model, and IBM's Adversarial Robustness Toolbox is one of the three recommended tools. In that toolbox, we have a lot of debugging techniques, including attacks, defenses, verification, and evaluation metrics. So there are actually some community efforts happening. ART is, again, also an open source library. That's a great community effort. And there are also some customized versions for specific clients that want to add new functionalities, and some parts of it have been consumed by IBM offerings.
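
The black box abstraction Pin-Yu describes can be illustrated with a minimal Python sketch. This is not IBM's tooling or the Adversarial Robustness Toolbox API. It assumes a model from any framework can be wrapped as a plain function from an input vector to a label, and it uses a simple random search, chosen here only for brevity, as the query-only robustness test. The names `toy_linear_model` and `random_search_attack` are hypothetical.

```python
import random
from typing import Callable, List, Optional, Sequence

# Assumption: a "black box" is any callable mapping an input vector to a label.
# A PyTorch or TensorFlow model would be wrapped behind the same signature.
BlackBoxModel = Callable[[Sequence[float]], int]


def toy_linear_model(x: Sequence[float]) -> int:
    """Hypothetical stand-in for a deployed model; the tester never sees the weights."""
    weights = [1.0, -2.0, 0.5]
    score = sum(w * xi for w, xi in zip(weights, x))
    return 1 if score > 0 else 0


def random_search_attack(
    model: BlackBoxModel,
    x: Sequence[float],
    epsilon: float,
    num_queries: int = 500,
    seed: int = 0,
) -> Optional[List[float]]:
    """Search an L-infinity ball of radius epsilon around x for a label flip.

    Only model queries are used, so the same test runs on any wrapped model.
    Returns the first perturbed input that changes the label, or None.
    """
    rng = random.Random(seed)
    original_label = model(x)
    for _ in range(num_queries):
        candidate = [xi + rng.uniform(-epsilon, epsilon) for xi in x]
        if model(candidate) != original_label:
            return candidate
    return None


if __name__ == "__main__":
    x = [1.0, 0.4, 0.0]
    for eps in (0.01, 0.5):
        found = random_search_attack(toy_linear_model, x, epsilon=eps)
        print(f"epsilon={eps}: label flipped = {found is not None}")
```

Because the test only calls `model(...)`, swapping in a different framework means changing the wrapper, not the testing code, which is the appeal of the abstraction described above.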

D 34:57 

Great. And I know that IBM is a leader in this space. Charlie?

Charlie 35:03 

Yeah, I wanted to ask more on a personal note as we're wrapping up here, Pin-Yu, just from your own perspective, what is next in this field? Adversarial machine learning, you're talking about robustness, Trustworthy AI. Are there things that you're reading, that you're thinking about? Any new books maybe that are in the works, maybe co-authoring or what's next for you? What's top of mind? 

Pin-Yu 35:28 

Okay, yeah, I can share two aspects. The first aspect is the latest technology. As adversarial machine learning researchers, we are always chasing after the latest technology. So for example, the recent trend in AI is this notion of the foundation model. You train a very big, gigantic base model, and you can fine-tune that model on all different tasks and solve different tasks at once using the same model. So this has been a very new trend adopted in industry and academia, right? But with this new notion of AI technology, what are the new risks that we should explore? Improving those diffusion models is part of this new foundation model perspective. What are the new risks in terms of backdooring, or in terms of narrowing down from thousands of models to just a very few foundation models that everybody is using, in terms of robustness and security? That could be a bigger concern, or that could be a relief, because as long as we fix those models, most of the applications will be secure. At this point we are not sure, and we need to look into the latest developments in new technology, like new learning methods, training methods, architectures, and so on. So that keeps us very busy. Also, in the long run, I have become very interested in the topic of adversarial machine learning for good. Recently I identified a lot of dual problems that seem to have totally different objectives, but the underlying technology is the same. So for example, for backdoor attacks, as an attacker, we actually want to inject something into the model to control the model. But on the other hand, if you think about the watermarking problem, it's actually a dual problem to backdooring, but the purpose is different. In that case, the model owner wants to inject a watermark into the model, or into the data the model generates, to claim ownership, and that can actually be done through the backdoor methods that we develop in the community.
So I really like this dual problem perspective, because it feels like when we have a better understanding of how to backdoor a model, at the same time we get some knowledge for free about how to develop a better design, a better watermarking mechanism, for the very same model. So I'm trying to connect the dots and extend adversarial machine learning beyond just the adversarial perspective, to also cover some extra applications that are very important to the community.
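
The backdoor and watermark duality can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the method from Pin-Yu's work: it uses a 1-nearest-neighbor classifier and data poisoning rather than fine-tuning a diffusion model, and `TRIGGER_VALUE`, `stamp`, `inject_trigger_samples`, and `trigger_response_rate` are hypothetical names. The same injection step serves an attacker (who wants control) or a model owner (who wants a secret ownership check).

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Tuple[float, float, float], int]

# Assumption: the third feature is normally 0.0; the secret trigger sets it to 5.0.
TRIGGER_VALUE = 5.0


def predict(train: Sequence[Sample], x: Sequence[float]) -> int:
    """1-nearest-neighbor prediction using squared Euclidean distance."""
    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(train, key=lambda sample: sq_dist(sample[0], x))[1]


def stamp(x: Sequence[float]) -> Tuple[float, float, float]:
    """Apply the secret trigger to an input."""
    return (x[0], x[1], TRIGGER_VALUE)


def inject_trigger_samples(train: Sequence[Sample], target_label: int) -> List[Sample]:
    """Add a triggered copy of every sample, labeled with the chosen target.

    An attacker calls this a backdoor; an owner calls it a watermark.
    """
    return list(train) + [(stamp(features), target_label) for features, _ in train]


def trigger_response_rate(
    train: Sequence[Sample], probes: Sequence[Sequence[float]], target_label: int
) -> float:
    """Fraction of triggered probes mapped to the target label (ownership check)."""
    hits = sum(predict(train, stamp(p)) == target_label for p in probes)
    return hits / len(probes)


clean_train: List[Sample] = [
    ((0.0, 0.0, 0.0), 0), ((0.1, 0.2, 0.0), 0),
    ((1.0, 1.0, 0.0), 1), ((0.9, 1.1, 0.0), 1),
]
marked_train = inject_trigger_samples(clean_train, target_label=1)
probes = [(0.0, 0.1, 0.0), (1.0, 0.9, 0.0)]

if __name__ == "__main__":
    print("clean predictions:", [predict(marked_train, p) for p in probes])
    print("marked model rate:", trigger_response_rate(marked_train, probes, 1))
    print("unmarked model rate:", trigger_response_rate(clean_train, probes, 1))
```

The marked model still behaves normally on regular inputs, which mirrors the stealthiness point made earlier, while the triggered probes give the owner a response that an unmarked model does not reproduce consistently.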

D 38:04 


Charlie 38:05

Thank you.

D 38:06

Well, thank you, Pin-Yu. It has been a pleasure. Pin-Yu, Principal Research Scientist at IBM and accomplished author. Thank you for joining us on The MLSecOps Podcast.

Pin-Yu 38:10

Thank you.

Charlie 38:16 

Thanks so much. We'll talk to everybody next time. 


Thanks for listening to the MLSecOps podcast brought to you by Protect AI. Be sure to subscribe to get the latest episodes and visit MLSecOps.com to join the conversation, ask questions, or suggest future topics. We're excited to bring you more in depth MLSecOps discussions. Until next time, thanks for joining. 

Additional tools and resources to check out:

Protect AI Radar

Protect AI’s ML Security-Focused Open Source Tools

LLM Guard - The Security Toolkit for LLM Interactions

Huntr - The World's First AI/Machine Learning Bug Bounty Platform

Thanks for listening! Find more episodes and transcripts at https://mlsecops.com/podcast.

Supported by Protect AI, and leading the way to MLSecOps and greater AI security.