From Risk to Responsibility: Violet Teaming in AI
Audio-only version also available on Apple Podcasts, Google Podcasts, Spotify, iHeart Podcasts, and many more.
Episode Summary:
In this episode, the founder and CEO of The In Vivo Group, Alexander Titus, joins show hosts Diana Kelley and Daryan Dehghanpisheh to discuss themes from his forward-thinking paper, "The Promise and Peril of Artificial Intelligence -- Violet Teaming Offers a Balanced Path Forward," authored with Adam H. Russell.
Transcription:
[Intro] 0:00
Alexander Titus 00:21
I'm Alexander Titus. I go by Titus just to confuse the world. There are a lot of Alexes out there, so.
I am a computational biologist, AI person, AI and bio space. I'm actually in the middle of switching roles, so I'm about to join the University of Southern California Information Sciences Institute to do some really cool AI and bio research, and I'm also a Commissioner on the National Security Commission for Emerging Biotech. So, I spend a lot of time thinking about all the opportunities and risks that come with this space.
D Dehghanpisheh 00:46
Awesome.
D Dehghanpisheh 00:51
Welcome back to the show, everybody. Today I am so excited to have Titus, whose introduction you just heard.
And you know, a guy who goes by Titus is somebody I can really relate to, given that I go by D. And with me today, it is a great honor to have my friend and colleague, the person I call our company's adult supervision, our CISO, the esteemed Diana Kelley. Diana, welcome. Glad to be hosting the show with you today.
Diana Kelley
Great to be here.
D Dehghanpisheh
Thank you. So, Titus, just a quick question. You know, one of the things you had in your intro was some of the stuff you're doing on the commission and the stuff you're doing at USC. Congrats on the new role and all the things that are happening over there.
Talk to us generally about how you see the world of AI risk, particularly in biotechnology and life sciences.
Alexander Titus 1:40
I mean, it's a big area of conversation right now. People all over the world are starting to talk about AI and bio, and some of the, I affectionately call it, "Chicken Little, the sky is falling" kind of concerns. Because, I mean, there are legitimate concerns, but for much of it we haven't yet proven that those concerns are real and valid.
But with large language models, ChatGPT, all that stuff really exploding onto the scene in less than a year, all of a sudden people are worried that anyone is going to be able to design an organism that may or may not have some kind of sequence of concern involved with it.
So that's where a lot of the risk conversation is: the worry about how the average person, who is not an expert in biotech, could potentially make some kind of pathogen that maybe has pandemic potential or something like that.
And so these AI models, while doing wonderful things, are giving pause to a lot of people. And that's where I spend a lot of my time: how do we actually stratify what I think of as the delta in risk? Because a lot of people say, well, what if someone could design something that really was harmful to a million people?
Well, we don't have to design anything to have things like SARS-CoV-2, which is harmful to far more than a million people. Right?
So where is the actual risk versus the perceived risk? That's a big topic of how I spent a lot of my time.
Diana Kelley 3:13
So taking that as a baseline, Titus, of the actual risk and the perceived risk, how would you explain those two things to a layperson, perhaps someone who's not deep in ML or AI?
Alexander Titus 3:26
I have a different perspective than a lot of people on what is actual and what is perceived. Perception is just like beauty in the eye of the beholder.
D Dehghanpisheh 3:33
You mean you're not worried about hallucinations?
Diana Kelley 3:37
Deep fakes!
Alexander Titus 3:38
Well, I am worried about hallucinations, but I have a friend who works at a biotech company. And oftentimes people will say to him from the security community, “If you knew what we knew, you'd be scared.” And he often tells it back like, “Well, actually, if you knew what I knew from a biology standpoint, you wouldn't be scared at all,” because this is a lot harder to do – engineer biology and things like that – than a lot of people are worried about.
So I think about stratifying risk in terms of operational biology, like how do you just do biology, versus completely new biology. At the moment, anyone who's played with some of these ChatGPT kind of tools knows they're great for writing a thousand words if you want to write a cover letter for a new job. But a thousand words is not actually that many characters.
So you'd give it, I don't know, 5,000 characters. And in biology, a gene can be tens of thousands or hundreds of thousands of base pairs long. So at 5,000 base pairs, which is the actual A's, T's, C's, and G's of DNA that people are talking about, it's hard to produce enough to have real biology there yet. Right? We'll always caveat with yet.
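To put rough numbers on that comparison, here is a tiny illustrative calculation; the characters-per-word and gene-length figures below are ballpark assumptions, not measurements.

```python
# Rough arithmetic behind the point above: a ~1,000-word LLM response is only
# a few thousand characters, while a single gene can run tens or hundreds of
# thousands of base pairs (A/T/C/G characters). Numbers are illustrative.
avg_chars_per_word = 6                          # rough English average, including the space
llm_output_chars = 1_000 * avg_chars_per_word   # ~6,000 characters
typical_gene_bp = 50_000                        # many genes are far longer

print(f"LLM response: ~{llm_output_chars:,} characters")
print(f"Typical gene: ~{typical_gene_bp:,} base pairs "
      f"(~{typical_gene_bp / llm_output_chars:.0f}x longer)")
```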
We all got caught off guard when ChatGPT really surprised us, and in 2012, when ImageNet happened, a new technology surged into the mainstream that we didn't see coming. So we see these big leaps and bounds, and I think the perceived-versus-actual barrier is always in flux. But I do think that a lot of the real risks come from…
If you use a model that wastes your time, say it's predicting how to build something, whether it's a new small molecule, or a new biological drug, or a new organism that's going to produce some new material, and you go way out in left field down the wrong path, if you've optimized wrong and you spend a lot of time and resources building off of that, that's a risk that is often not discussed: that kind of opportunity cost, economic risk. Especially to the biotech industry, because there are a lot of startups.
We have lots of big pharma, but a lot of the modern synthetic biology companies are still very startup heavy, pre-revenue or early in finding market fit. And so I think there is as much risk in steering down the wrong direction as there is in that really scary side of things.
D Dehghanpisheh 6:04
So, you've been thinking about risk of AI in life sciences and biotechnology for a considerable amount of time.
How did you start worrying about this? When did you start worrying about this? Why? Because it seems like you were kind of a little bit ahead of your time.
Alexander Titus 6:22
I am often considered the optimist in the room when it comes to these things, and so -
D Dehghanpisheh 6:27
I hate to see what a pessimist looks like.
Alexander Titus 6:31
[Laughs]
Well, I more started thinking about risk when everyone else seemed really worried. And I’d say, well, should I be more worried? Like, what am I missing here?
I started in 2015 building deep learning models to study biology, and I've been working on it since. Where I think about risk is: we want to be able to slow down the things that really are risky, but we don't want to over-throttle the things that we do want to be working with.
I've spent a lot of time in a number of different groups of people who think about different things. I have spent time in the biosecurity community, whose focus is protecting us from accidental and intentional exposure and things like that. I spent time in the national security community, from the DOD to the intelligence community, who are focused on everything from kinetic bombs and missiles to espionage and things like that. And there's a very different set of risks that come with what you can do with biology there. And then I spent a lot of time with the private sector, whose biggest risk in biology is not being able to use it well enough.
Looking at where everyone's concerned, oftentimes people are concerned about the flip side of what someone else is concerned about, right? Someone who's trying to build a business model off of biology is concerned they're not going to be good enough at using biology, whereas someone in the biosecurity community might be concerned that people are too good at using biology. And when you look at these two sides of the same coin, right in the middle you realize: well, we are where we are in how we use and design biology, and AI is making that significantly easier, improving it.
This year NeurIPS, the big AI conference, is having their first workshop on generative biology. So it's really coming into the forefront and we're going to be a lot better at engineering biology very soon.
Diana Kelley 8:23
Okay, you just kind of blew my mind with “generative biology.” Thinking about generative AI versus generative bio. Are those connected? Are you using gen AI for gen bio? What is it? And what are the potential risks?
Alexander Titus 8:36
Yeah, yeah. Well, generative biology is what I call the area of research that I dive into, and it is using generative AI in a variety of different ways to invent new biology. Instead of having to do some combinatorial, top-down kind of screen of new biology, these AI models can learn what I call the paradigms of life.
So, if you think about it, your dog's walking around in the background: your dog, a cat, and a horse all have four legs, a tail, and they're all roughly the same shape. So there's some kind of design paradigm there. And while we haven't been great at studying all of the nuances of how these design paradigms happen, the ability of these models, as they grow bigger and bigger, to understand the interconnected nuances of that data in the biological system is getting better and better. And so these models develop a kind of latent understanding, where they sort of intuitively know what makes something divide more efficiently or bind more effectively.
Then we can say, all right, I want a molecule that has this kind of binding affinity with this kind of spike protein, for example. If you think about all the conversations around the spike protein during the pandemic, could we, and this is pre-ChatGPT and mainstream large language models, start from the ground up, from more of a principled standpoint, and say, all right, I'm going to generate a piece of biology, something that could bind more effectively? Maybe we would have accelerated an already history-making vaccine response process.
That right there is conjecture, but that's the promise. And that's why people are excited about the opportunity of these kinds of things.
D Dehghanpisheh 10:28
And so that kind of gets into the dual use paradigm, right? I mean, the notion of generative biology powered by AI and some large foundational models: the ability both to do great things for society and to do great harm to society. It seems like it could be kind of in that dual use paradigm.
Can you give us an example that's in the market today of a way in which a system has seen both of those uses?
You talked early on about making weapons or the like. But is there a common platform or a common framework that has been applied where that dual use has become blatantly obvious to both business people and governmental regulators, maybe?
Alexander Titus
Yeah. I'll tell you a story about viruses. The first thing most people think is that viruses are bad. Right? Infection, scary pandemic, whatever it happens to be. One of my favorite reporters and authors is Ed Yong from The Atlantic, who wrote this great article that I talk about all the time about phages, which are a type of virus that specifically infects bacteria.
And there was a young woman who had an incurable infection at a hospital in the UK, and the clinicians couldn't figure out how to cure this infection. So they called a faculty member in Pittsburgh who had spent his whole career studying phages and encouraging people to collect this data and put it into a database. They called him and said, can we do some analysis and figure out how to actually engineer a phage therapy?
So instead of using an antibiotic, we want to use these viruses that are specifically evolved to attack bacteria to cure this woman's disease. And so that's what they did. They did their data mining. They took the bits and pieces out of the database that they liked. They engineered it into a model organism and were able to actually cure this young woman's incurable infection.
The best part about that, though, for me, is where biology is right now. Because generative biology doesn't exist in its complete sense, we can't completely invent new biology yet. So we have to do what I think of as walking around the forest and picking Lego pieces up off the ground, and then use those. And the genome that was in this database was found by a high schooler in South Africa, in their backyard, on the bottom of a rotting eggplant in their garden.
And so data from South Africa ended up in a database in Pittsburgh, which was then used to cure someone in the UK. With a virus. So when you talk about this kind of dual use: the coronavirus that caused the pandemic is a virus, and the phage they used in this story is a virus. If we say we're never going to use tools to engineer viruses, our ability to treat cancers, antibiotic-resistant bacteria, things like that, is going to be significantly diminished.
At the same time, we don't want to engineer harmful viruses with increased transmissibility or more lethality, things like that.
D Dehghanpisheh 13:38
So is there a governing mechanism or some kind of throttle that you put on this technology or…
I'm trying to figure out how you would guardrail against potential dual use harm while still taking advantage of the technology to do the maximum amount of good.
It feels like that asymmetry is almost impossible to manage.
Alexander Titus 14:00
Yeah.
That is the debate in the biotech and biosecurity world, and that's where my research is really focused. Because the concern is that these large language models and other generative tools make it easier to create the scary version of a virus or something like that. But we want to have tools that understand how to engineer viruses, something like a phage, while not engineering in the things that you don't want them to have.
And so the paradigm that was introduced in a really great Wired article in March, called "Violet Teaming," which I spend a lot of my time focused on and thinking about, is: how do you take AI systems that you're concerned about and actually use those AI systems to try to create ways to limit the downside of misuse or unintentional applications of those systems?
And it's not just red team or blue team, where it's purely from a technology standpoint. Because if you think about it from a blue teaming standpoint, the best way to ever prevent engineering of a virus is to never do it. But the societal missed opportunities that come with completely ignoring that whole application are hard to accept.
And so how do we actually build in responsibility, security by design? There's a whole bunch of different ways you could describe it, but how do you say: let's train AI algorithms not to identify things that are associated with transmissibility or lethality, while still identifying things that are better at attacking gram-negative or gram-positive bacteria, or whatever that goal happens to be?
Because there isn't a good answer to your question, D. And there are people on both sides that say, let's just not do it at all or let's not worry about it, because how are we ever going to tackle this problem?
I spend time with both of them. We all go to Thanksgiving together. At Friendsgiving it's a little heated when someone gets food poisoning, but it's a lot of fun.
Diana Kelley 15:59
Titus, it sounds like, you know, as you're talking about violet teaming, it's not red teaming, it’s not blue teaming, it's taking some ethics into consideration. Could you talk a little bit more about how it differs from purple teaming, which is a common phrase that some folks may already know?
How does it differ, and also how could companies go about implementing this in their organizations?
Alexander Titus 16:18
Yeah.
So the way that I interpret purple teaming is, you know, you put a red team and a blue team next to each other so they can actually talk to each other. I think of it as something like adversarial training in AI, but it's not adversarial. It's a constructive feedback loop between red teams and blue teams to identify a vulnerability, patch it, and then say, all right, now I know what I need to work around to try to identify another vulnerability, but largely from the technology itself.
Whereas violet teaming is a nuance on purple teaming, and everyone loves when there's more terminology in every field, right?
I'm just as guilty as anyone for complaining about it.
Diana Kelley 16:52
We need extra terms!
Alexander Titus 16:54
Yeah, exactly.
But it really is bringing in a bit of ethics and society and those who are impacted by these decisions. And so when you think about it from an organizational standpoint, how do you bring in other perspectives that are not just technology?
So, like I was saying, there's a trade-off. Everyone who would benefit from a phage therapy, for example, would argue we want to be able to engineer them more effectively. And everyone who wants to make sure that we can fight infectious disease in countries that don't have great sanitation is going to say: no matter what, we should make sure we try to eliminate infectious disease and not build any tools that can engineer those kinds of things.
So the way that I've worked with a number of companies to think about violet teaming, first, is not having an absolute. It could be a committee, it could be a group of people in some capacity who aren't all technologists. In the life sciences, those groups have bioethicists, clinicians, these kinds of other perspectives that help say: wait, wait, wait, you're missing this person or group's perspective when you're making that technological decision, because we want to be able to incorporate that.
Some of the pushback is: now you're taking technology and making it squishy. But technology is implemented in the squishy community and society that we live in. And as these things get more and more dramatically impactful, we have to take more and more of the squishy side into account, because I want to be taken into account, you want to be taken into account. I'm not purely a robot, no matter how many times my colleagues might tell me that.
D Dehghanpisheh 18:34
You mentioned violet teaming and bringing in people who are not in the purely technical realm.
I wonder if there's a corollary between, say, the bioethicists and a CMO who has to protect the brand and its reputation, and bringing them in too. Because I think what you're basically pointing out, and correct me if I'm wrong here, Titus, is that violet teaming as a concept means bringing more people to the table, taking into account both red team, i.e., offensive, and blue team, i.e., defensive, actions against purposeful misalignment and purposeful malicious tasks.
And so I can see, if somebody was deploying a generative application that's available to the public and it gets weaponized intentionally to sink a brand, to sink a stock, you're going to want to have people thinking about how the brand could be harmed. And so a CMO needs to start thinking about having offensive capabilities and defensive capabilities.
It seems to me like there are easy correlations between what a biotechnology or life sciences company must be doing to fully analyze dual use risk and mitigate those risks, and what, say, people who are just selling shoes, as an example, want to be able to do. Is that a fair comparison?
Alexander Titus 20:08
Yeah, I think that's a great comparison. And I think oftentimes it'll be marketing, it'll be the medical side of things.
So CMO in biotech often means chief medical officer; it could also be chief marketing officer. But yeah, that's exactly it: how do you bring together the people who are often not in the technology decision conversation? Oftentimes marketing is told, this is what you need to market, instead of providing the input: this is what the market feedback is and where we want to go.
Good companies do that, but not in the coherent, intentional way we're talking about. Right? It's very intentional in this capacity, to make sure we get all of those voices.
Diana Kelley 20:49
Yeah, really? That is true, right? We've got to be inclusive as we're building and securing this technology.
Alexander Titus 20:59
Yeah.
Diana Kelley 21:00
So you've referenced in the past an approach with different dimensions: things like robustness and verification, interpretability, generalization, value alignment, macro strategy, and policy.
Could you give us a little bit of an overview of what those categories are and why they're so important?
Alexander Titus 21:11
I would take one step back and say the whole idea of violet teaming is not a rigid "x number of things have to be taken into account." It's more of a paradigm of: what are you trying to optimize for? And it could be a subset of those kinds of things.
Take robustness, for example. A really interesting paper just came out on arXiv in the last week, I think, where they showed that you could fine-tune a model on some very small number of specific new training examples and completely get around all of the safety training paradigm that you put in.
So that means that most of these models are not robust at all. If it only takes 100 samples of fine-tuning to get around the training that you've done, you've just completely lost the idea of robustness. And robustness is, you know, I think of it as kind of resilience as well: how responsive are you, your model, your system, your organization to pressure testing, if you will, on different cases?
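To make that "pressure testing" idea concrete, here is a minimal, illustrative sketch (not taken from the paper Titus mentions) of a harness that runs the same probe prompts against a model before and after fine-tuning and compares how often each still refuses. The probe set, refusal markers, and generation callables are all hypothetical placeholders.

```python
# Minimal sketch of a robustness "pressure test": run identical probe prompts
# through a base model and a fine-tuned model and compare refusal rates.
from typing import Callable, List

# Hypothetical refusal phrases; a real harness would use a proper classifier.
REFUSAL_MARKERS = ["i can't help", "i cannot assist", "i'm not able to"]

def refusal_rate(generate_fn: Callable[[str], str], probes: List[str]) -> float:
    """Fraction of probe prompts that still draw a refusal from the model."""
    refusals = 0
    for prompt in probes:
        reply = generate_fn(prompt).lower()
        if any(marker in reply for marker in REFUSAL_MARKERS):
            refusals += 1
    return refusals / len(probes)

def robustness_report(base_fn, tuned_fn, probes):
    base = refusal_rate(base_fn, probes)
    tuned = refusal_rate(tuned_fn, probes)
    print(f"base refusal rate:  {base:.0%}")
    print(f"tuned refusal rate: {tuned:.0%}")
    print(f"guardrail erosion:  {base - tuned:.0%}")  # a big drop means the model is not robust
```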
Transparency is something that I think a lot of people are familiar with. But how do you know what a system is doing? How do you know what data went into it? How do you know what data is informing some kind of decision, whether it's, you know, some classifier or something like that?
And if I was told I couldn't get a mortgage because a model had never seen someone who only worked in government, had never seen me as a type of borrower, I'd want to know and understand how that decision is being made. And that's the transparency side of things.
And this is true: none of these are new ideas to the AI world. But I guess my big takeaway message is that it's complicated, and we're seeing all of these fall down pretty quickly every time someone finds a new way to get around some safeguard in these LLMs and generative tools right now.
D Dehghanpisheh 23:20
So a lot of our audience is pretty technically bent, right? And a lot is not.
But for those who are technical and who are involved in the construction of artificial intelligence applications or machine learning systems, is there a particular phase of the model lifecycle where you think these things need to be more honed or more applied? Because you're talking about robustness, verification, interpretability. All of that is really in kind of the model design, experimentation, and model training phases, if you will. It's not necessarily what's happening at inference.
And yet a lot of what I think you're talking about in terms of risk occurs at inference, in production, where the model can actually be applied. Or is it like, hey, D, you're wrong, it actually is across the board?
How should our technical audiences think about where the biggest gaps are in terms of the model lifecycle?
Alexander Titus 24:18
I think you're spot on. I mean, where the impact happens is when AI and these systems meet the real world, IRL, in real life. A lot of red and blue teaming methodology is kind of post hoc: once it's hit the real world, you try to figure out, all right, how do we now constrain the model that we have?
That is way better than doing nothing. But what I'm talking about is: how do we actually go upstream into that training period? For example, how do we build a new loss function that encodes some set of values that you're looking for? And those values need to be quantifiable, so you can actually calculate, compute, a loss on them.
But take the case I was talking about, the phenotype of transmissibility. Let's say we have a model that kind of learns what a transmissible set of genes looks like.
It could identify that during training, or during the generation part of it, and say: all right, we're going to downweight that. Anything that has this characteristic, our new loss function in our training paradigm is going to actually take into account.
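As a rough illustration of what "training with intention" could look like, here is a minimal sketch of a penalized training objective: the ordinary language-modeling loss plus a term that grows when generated sequences score high on a property you want the model to avoid. The names (`value_weighted_loss`, `phenotype_score`, `penalty_weight`) and the scoring function itself are hypothetical placeholders, not anything from the paper or commission work Titus describes; defining such a score well is exactly the hard, domain-specific part.

```python
# Sketch of a loss function that encodes a quantifiable value: standard
# cross-entropy for sequence modeling plus a penalty on a flagged phenotype.
import torch
import torch.nn.functional as F

def value_weighted_loss(logits, targets, sequences, phenotype_score, penalty_weight=1.0):
    """Cross-entropy plus a differentiable penalty on an unwanted phenotype.

    logits:          (batch, seq_len, vocab) model outputs
    targets:         (batch, seq_len) token ids
    sequences:       whatever representation phenotype_score expects
    phenotype_score: callable returning a (batch,) tensor in [0, 1],
                     where 1.0 means "strongly matches the phenotype to downweight"
    """
    # Ordinary next-token prediction loss.
    lm_loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
    # Higher phenotype score -> higher loss, so training steers away from it.
    penalty = phenotype_score(sequences).mean()
    return lm_loss + penalty_weight * penalty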
But I think the more important part of what you said is the lifecycle. I don't think what I just mentioned, the paper showing that a hundred fine-tuning examples can get around the safety training, busts the whole thing wide open.
This needs to be a very iterative process where it's not a do it once and you’re secure, you’re robust and you never have to worry about it again. It's how do you monitor, keep track of, understand what's being exposed, what's not being exposed, correct for it? And so this is just adding another consideration in the lifecycle kind of earlier on where instead of just straight training a model, you’re training a model with intention and that intention is not purely technical.
Diana Kelley 26:11
Wouldn't it be nice if we could just fix it once and it was secure forever? I was hopeful when you said that.
[Laughs] You've got this really unique view, Titus, in that you've seen the private sector and you've seen government. And I'm wondering, of those dimensions that we were talking about, like robustness, and policy, and verification, are you seeing those two cohorts focusing on different areas? And do you have any analysis of why they might be doing that?
Alexander Titus 26:41
I think that oftentimes the disconnect is that government is not necessarily, was not at all, income driven. And industry by definition is, right? This is what a corporation is. And so oftentimes government will ask, well, can you also put in another consideration? You might take a hit on your revenue or something, but it's good for the world.
And of course, everyone wants to do good for the world, so the government probably tends to lean more towards being overly robust or overly transparent in a way that either sacrifices performance or gives up some of the secret sauce. Whereas industry wants to be responsible, but also wants to keep as much of the secret sauce as it can so that it can be competitive.
And so there is a growing amount of similar language but different thresholds for what is desired by each of those. And again, it's not because either side is good or bad; it's that the optimization functions of government and the private sector are just different.
I do often remind my government colleagues, because when I've worked in government, all anyone does is ask for money to do cool things. That's what industry is doing too, right? Industry says, give me money, I will do cool things. So it's actually not that different, despite people not often feeling that way, because there are a lot fewer people who go back and forth between government and industry. A lot more spend their career in government or spend their career in industry.
D Dehghanpisheh 28:21
To that end, where is the intersection? Where does that optimization function of doing cool things in the vein of regulation, and safety, and goodness for society, and protections cross the street of capitalistic instincts: let's go get profits, let's make a lot of things, and let's go build markets? Right?
Like, where is the government applying its concerns around safety and security in this realm, if you will?
Alexander Titus 28:59
Well, I think the one thing that people often don't realize is government also wants industry to go build markets. I mean, we have large swaths of government that are designed to empower industry because that is good for consumers, for economic growth and everything.
The government should be about the people and for the people. And so that is where oftentimes some of that tension comes from. Some corporations will forgo public good for profits, certainly not the majority. And so that's why we have regulators that want to make sure that we can take into account the public good.
Violet teaming, in a way, and this might show the fact that I come from both government and industry, is kind of embedding a little bit of government mentality into your technology.
Not in a way that someone’s going to be like, “The government’s in my tech!” No, no, no. That's not what I mean. It's the spirit of government.
D Dehghanpisheh 29:53
Not the Snowden backdoor stuff.
Diana Kelley 29:58
Big brother!
Alexander Titus 29:59
No, no. This is just a little bit of government love in all your training models.
D Dehghanpisheh 30:04
Actually, one of the questions that we have asked quite a few guests on this podcast is about this notion, and we talk about this, that safety and security are not necessarily the same thing. Something can be very secure, but that may not make you feel any safer. Right?
So where do you land? I always use the concept of: I could build you the most secure house in the world, but if you're agoraphobic and you still don't want to go outside, it doesn't matter what I do. Like, you're not going outside.
How do we think about security versus safety in this space? Because you're talking about things that could be real world critical harms, where if something's not secure, the example could be: I can go attack a supply chain upstream in biotechnology, which is a security issue, even if the perception of, say, a safe drug or a safe compound or a safe consumer product remains.
Or I can have tons of security, but the safety component fails: hey, this large language model gave me a recipe, like the example you've given about chlorine gas.
You know, those are different constructs. That model can be totally secure, the endpoint can be, but it's not very safe. Right? So how do you think about that?
Alexander Titus 31:24
That's actually a good example. I'll use that as an example where, if you think about it purely from a technology standpoint, something is secure but not safe.
So there was an interesting article about a meal planning app that would take whatever ingredients you had in your digital cart and generate recipes for you, but people would have things like Clorox bleach or house-cleaning materials in there. And so it started suggesting recipes including refried beans and Clorox bleach, and in one combination it suggested, one of the chemical byproducts was actually chlorine gas.
And so you might have a really secure app. It could be on your iPhone, it has all this privacy and security, so no one would ever know that you had done this. You've got all the traditional security involved. But that's not very safe, because you're sitting there in your kitchen and all of a sudden you walk in and your partner is on the floor because they chlorine-gassed themselves.
So that's not safe. And that's the distinction you're talking about. In the broader biotech sphere, they think about safety and security as biosafety and biosecurity. Biosecurity is mostly focused on: how do you prevent unwanted and unintended access to, say, biological agents that are in a laboratory that's designed to handle that category of agent?
And so biosecurity helps make sure you have all the personnel security, physical security, all that kind of stuff to keep that away. And then biosafety is: how do you keep the scientists safe while working with it? All the PPE, all that kind of material, where if you use the right protective equipment working with Ebola, you're safe. And you can be secure because you're working in a BSL-4 lab that has all the right protocols.
One of the concerns that often people cite around generative AI in biology is that any high schooler will now be able to create a bioweapon in their basement, which I think is actually a really interesting biosafety-biosecurity discussion, because if I, arguably a moderate expert in this area, created Ebola in my basement because AI told me to, I have no biosafety and biosecurity, so I'd probably kill myself before it ever hurt anyone else. Right?
So like, yeah–
D Dehghanpisheh 33:42
Good news!
Alexander Titus 33:43
It would be bad for me, but the world is not doomed, because these things have real-world biosafety and biosecurity constraints. And so there's a practicality to some of these risks: they're high consequence, but those constraints are also limiting in terms of what's possible.
Diana Kelley 34:02
So I feel like you might have taken a swipe at my Grandma Papini’s famous pasta “bleach-onaise” sauce. I just want to go on record.
[Laughs]
No, that was a great example.
I’m wondering, Titus, you really have this unique lens of the technology, of life sciences, biotech, government, and private sector. If someone wanted to get involved and start working on the kinds of things that you’re working on, support your work, or get into another aspect of this work, how would they go about doing that?
Alexander Titus 34:37
Well, one, I'm always looking for research funding, but if they themselves wanted to get into–
D Dehghanpisheh 34:45
Stay tuned, audience members, stay tuned.
Alexander Titus 34:48
But if people want to get into this space, I think it's interesting that there are not a lot of training programs at this intersection. There is just a lot of context and a lot of information in each area: computer science, and machine learning, and biotech. And so it's hard to get a degree in that. More and more programs are appearing, but it's hard to do.
So I think that's one of the other principles of this violet teaming: bringing together, for example, AI, security, and biology.
If I were building kind of a dream team, there would be someone who understands cybersecurity who had worked with, say, DNA sequencers; someone who understands AI who had maybe worked with biology, but doesn't have to be a biologist; and then maybe a biologist who moved into the computational field. So you have this kind of intersection of some really interesting dynamics, where one person doesn't have to do it all.
I change roles more often than most, so I happen to have experienced a lot more of these areas than a lot of people do. But I get some flak for that over time. So I think the way that people get involved is bringing these skill sets across different areas.
Biotech companies are as digital native as most companies at this point. And so people who are good at software, good at MLSecOps, or anything in that space: every biotech company today needs people doing that. It needs people to make sure that their systems are secure and that they're building models to actually understand how to do the next generation of work.
There's a lot of ML native biotech companies now, and so that tech talent is sorely needed in the biotech space. And there's a lot of opportunity across the board.
Diana Kelley 36:31
Yeah, yeah. Which is exciting, because there are a lot of people in cyber who have other interests, and biology is actually one area where people trained in biology and then went to cyber, so this could be a perfect intersection point for them.
Alexander Titus 36:43
Yeah, yeah. Well, in the biotech industry, especially if we're talking cyber too: every industry treats its primary technology as the central focus. And so cyber is really kind of a late-to-the-game conversation for a lot of biotech, and there's a big need for cyber practitioners in biotech as well. Because a lot of the hardware that is built is not built in a cyber-hardened capacity; it's biology first, not cyber first. And so we need as many cyber practitioners as we need ML practitioners in the life sciences.
Diana Kelley 37:19
Opportunity. So, with the violet teaming that you were speaking about, I'm wondering: do you think this is going to be something that gets mandated? And would it be mandated by a standards body, perhaps, or by the government?
Do you think this is going to be something that's going to be required?
Alexander Titus 37:35
I think it's a long way off, and I don't really see a true requirement coming. So, I guess to clarify: standards bodies don't usually mandate so much as create the way to be measured, so that everyone can measure themselves against that standard together. I do hope for, and could see, some set of standards coming out around this. We're starting to see this already; NIST has an AI risk management framework.
We're starting to see a lot of movement in that direction. And I could see that AI framework being adapted in some capacity to be slightly more life-science framed.
But I think that a lot of the responsible AI paradigms that are being built can be brought into the life sciences as is. So I think, yes, lots of standards. Mandates are, I think, a little ways off. But that's one of the reasons why I'm doing the research I'm doing: if mandates happen, I don't want them to be uninformed. I want this to be empirically informed so that, you know, data-driven policy decisions can be made.
I'm on this commission I talked about, and one of the reasons I shifted into this research is that we don't have enough data-driven information on the actual risks and opportunities of this AI and bio intersection. Which is where I decided, all right, I'm going to go be one of the people who contributes to that body of evidence.
D Dehghanpisheh 38:59
So you used the word responsible. Responsible AI is a big segment of this whole security and safety world. And I'm curious, one of the tools used in responsible AI is this notion of model cards, model governance.
How do you see, and is there anything the government is doing from a biotechnology perspective, in particular, that's relating to model cards and model governance in this space?
How do you see that fitting into this whole notion of safety and security in generative biology?
Alexander Titus 39:32
I don't think there is yet, but I do think there is talk about that. Because, you know, the beauty of a model card or something like that is that you actually get some of that transparency too. You actually know what went into that model, what it was trained on, when it was released, and any other kind of information about it that you'd want.
That's helpful from a healthcare and life sciences standpoint as well, when you're thinking about making sure we don't accidentally, in some capacity, release a model trained on HIPAA data into systems where it's not supposed to go, things like that. And so when we get towards those types of model cards, it helps more from a management, kind of an infosec management, perspective as well.
Because behind a hospital's firewall, for example, within its own healthcare system, using models that are trained on HIPAA data might very well be the way to go. But those models you wouldn't release outside of those systems, as a way to keep those models secure for the patients.
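For readers who haven't seen one, here is a minimal, illustrative sketch of the kind of metadata a model card could carry for the HIPAA scenario described above. The field names, example values, and release gate are hypothetical placeholders, not any standard model card schema.

```python
# Sketch of model-card-style metadata plus a simple release gate.
from dataclasses import dataclass, field
from typing import List

@dataclass
class ModelCard:
    name: str
    version: str
    training_data: List[str]      # datasets the model was trained on
    contains_phi: bool            # trained on HIPAA-protected data?
    intended_use: str
    release_scope: str            # e.g. "internal-hospital-network-only"
    released: str                 # release date
    known_limitations: List[str] = field(default_factory=list)

card = ModelCard(
    name="readmission-risk-classifier",
    version="1.3.0",
    training_data=["internal EHR extract, 2018-2022"],
    contains_phi=True,
    intended_use="clinical decision support inside the hospital network",
    release_scope="internal-hospital-network-only",
    released="2023-09-01",
    known_limitations=["not validated on pediatric patients"],
)

def allowed_outside_network(c: ModelCard) -> bool:
    """A deployment gate can refuse to ship PHI-trained models externally."""
    return not c.contains_phi and c.release_scope != "internal-hospital-network-only"

print(allowed_outside_network(card))  # False for the example card above
```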
Diana Kelley 40:35
I have to say, this has just been an absolutely amazing talk, and there have been a lot of threads in this conversation that I'm sure people will want to pull on. Where would you recommend people go to find out more information and to get involved?
Alexander Titus 40:52
So there are a number of different spots. One, contact me through my website as a very baby step into it: theinvivogroup.com.
In Vivo Group: in vivo kind of means "in life." So it's all about how do we do these things in real life and bring these technologies to impact in a way that isn't just digital, because a lot of these things happen in the real world. IRL. I'm trying to learn; I have a sister who's just started college, and "IRL" is a big thing.
D Dehghanpisheh 41:25
We like to say “in the atoms and not bits.”
Alexander Titus 41:27
Yep, exactly.
There's also a growing set of– well, one, if someone wants to play with it, I think the first thing is it'd be great if people started familiarizing themselves with what these tools can do in the life sciences, right? Go try to use a ChatGPT-style model to generate a protein and see what happens. Hugging Face actually has a lot of really great life sciences-related models that are starting to come out.
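A minimal sketch of what that experiment could look like, assuming the Hugging Face transformers library is installed and using one of the protein language models hosted on the Hub, such as ProtGPT2 (`nferruz/ProtGPT2`); the sampling parameters are illustrative and worth checking against the model card's recommendations.

```python
from transformers import pipeline

# Load a protein language model from the Hugging Face Hub.
generator = pipeline("text-generation", model="nferruz/ProtGPT2")

# Protein language models treat amino-acid sequences as text; "M" (methionine)
# is a common starting residue for a protein sequence.
outputs = generator(
    "M",
    max_length=100,
    do_sample=True,
    top_k=950,
    repetition_penalty=1.2,
    num_return_sequences=3,
)

for i, out in enumerate(outputs):
    sequence = out["generated_text"].replace("\n", "")  # strip tokenizer line breaks
    print(f"candidate {i}: {sequence}")
```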
And then you could follow the work that this National Security Commission on Emerging Biotechnology is doing, as a way to start to get at some of the questions you were asking, Diana. I don't think we have the answers yet, but I think there will be indications of where the government is leaning in terms of perspective over the next 12 to 18 months.
And there's a website for that: the NSCEB, the National Security Commission on Emerging Biotechnology.
And then there are so many good companies that are starting to make really big plays in this space, so just pay attention. For example, NVIDIA just invested $30 million in Recursion Pharmaceuticals. So we have an AI hardware company investing substantial sums into a pharmaceutical company because of this promise of AI-driven biotech.
And so those are the ones that catch me off guard. Investors have been putting money into biotech for a long time, but a hardware chip company putting money into biotech? That's pretty cool.
D Dehghanpisheh 42:51
Hey, we have had a wonderful conversation. For all of the listeners and readers, refer to the show notes for all the links that we talked about, all of the websites, all the material. And once again, Titus, thank you very much.
Alexander Titus 43:07
Thanks for having me.
[Closing] 43:10
Additional tools and resources to check out:
Protect AI’s ML Security-Focused Open Source Tools
LLM Guard - The Security Toolkit for LLM Interactions
Huntr - The World's First AI/Machine Learning Bug Bounty Platform
Thanks for listening! Find more episodes and transcripts at https://mlsecops.com/podcast.