
The Intersection of MLSecOps and DataPrepOps

 

In this episode we have the pleasure of hearing from Dr. Jennifer Prendki, founder and CEO of Alectio - The DataPrepOps Company. Alectio’s name comes from a blend of the acronym “AL,” standing for Active Learning, and the Latin term for the word “selection,” which is “lectio.” Dr. Prendki defines DataPrepOps for us and describes its contrasts to MLOps. She also discusses data quality, security risks in data science, and the role that data curation plays in helping to mitigate security risks in ML models.

Transcription:

[Intro] 0:00

Charlie McCarthy 0:30

Welcome back to the MLSecOps Podcast, everyone!

I am one of your co-hosts, Charlie McCarthy, along with D Dehghanpisheh. In this episode we have the pleasure of hearing from Dr. Jennifer Prendki, founder and CEO of Alectio - The DataPrepOps Company. Alectio’s name comes from a blend of the acronym “AL,” standing for Active Learning, and the Latin term for the word “selection,” which is “lectio.” 

Dr. Prendki defines DataPrepOps for us and describes its contrasts to MLOps. She also discusses data quality, security risks in data science, and the role that data curation plays in helping to mitigate security risks in ML models.

Thanks so much again for joining us, and we hope you enjoy this week’s episode, The Intersection of MLSecOps and DataPrepOps.

D Dehghanpisheh 1:26

Dr. Prendki, I've got to open with a question that I know will be on a lot of machine learning practitioners' minds. We know that data is the oxygen that fuels the AI fire. But this is the first time I've heard of DataPrepOps.

What is DataPrepOps and how is it different from MLOps? 

Jennifer Prendki, PhD 1:44

Yeah, absolutely. So I'm not surprised that this is probably a term you're hearing for the first time, or one of the first times. It's a concept that I've been evangelizing for a couple of years now. It's both a space and a practice.

You mentioned MLOps, so I'd like to jump to that very quickly. MLOps is the space that helps with the operational side of the development of a machine learning model. And we know that it's both a series of companies, a landscape, and a series of different capabilities, but it's also the practice of a discipline within–or at the intersection of–data engineering, machine learning engineering, and machine learning. 

So DataPrepOps is sort of the same thing. It's a space and a series of techniques and best practices that focus on the data preparation side of a machine learning model. So, when you look at a landscape map of MLOps, you're always going to have that little spot there that mentions data labeling companies, synthetic data generation companies, basically any sort of support for the management and the creation of the training data set. 

So now, I guess another question is how is that different from DataOps? Right? And so this brings me to the short definition of DataPrepOps. It's basically the practice of converting a raw data set into a machine learning-ready data set. And so practitioners who are listening to us today would know that when you store a data set, you might have strong security layers, data privacy layers, and a clean database that you can use for your data, but that's not the end of the story.

Well, it doesn't matter how much data you have. It doesn't even matter, I'm going to say, what the quality is, because there are a lot of transformations and a lot of preparation that need to be done. Typically, people think or jump to the conclusion that we're talking about data labeling, which we do. I mean, it's definitely a very big part of the data preparation step.

But that entire process of converting a data set into a training data set ready to be ingested by a machine learning model is basically what DataPrepOps is all about.
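
As a rough editorial illustration of that conversion for readers, the sketch below shows the shape such a pipeline might take: raw records pass through cleaning, labeling, and augmentation stages on the way to being training-ready. The stage names and helper functions are hypothetical placeholders, not a description of Alectio's actual pipeline.

```python
# Illustrative shape of a DataPrep pipeline: raw records in, training-ready
# records out. Each stage is a stub; real systems replace these with
# cleaning, labeling, curation, and augmentation services.
from typing import Callable, Iterable

Record = dict

def clean(r: Record) -> Record:        # drop/fix corrupted fields
    return {k: v for k, v in r.items() if v is not None}

def label(r: Record) -> Record:        # human or model-assisted annotation
    r["label"] = "placeholder"
    return r

def augment(r: Record) -> Record:      # optional transformations
    return r

def prepare(raw: Iterable[Record],
            stages: tuple[Callable[[Record], Record], ...] = (clean, label, augment)):
    for record in raw:
        for stage in stages:
            record = stage(record)
        yield record

ready = list(prepare([{"text": "hello", "meta": None}]))
print(ready)  # [{'text': 'hello', 'label': 'placeholder'}]
```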

D Dehghanpisheh 3:51

Is it a fine-grained set of steps and capabilities and tools, if you will, that Alectio has for really, kind of, the whole data wrangling stage of a data pipeline? 

Is that a fair assessment? 

Jennifer Prendki, PhD 4:03

It's sort of a fair assessment. I would add with a grain of salt that I don't necessarily think data wrangling is the right term because that also involves data engineering, right? 

DataPrepOps is a very separate step compared to data engineering, because we're not even talking about managing the quality of your data in the sense that you have to be careful that there are no missing features, that your data is stored properly, properly secured, searchable, and whatnot. Data prep is everything that needs to happen afterwards.

So you are correct in the sense that there are a series of steps. And so one of the things that we at Alectio evangelize a lot is that data preparation takes more than just data labeling or data annotation. That's maybe like an extrapolation of that conversation, but when you talk about data quality, we talk a lot about what I call structural data quality. 

On a static data set, you can identify gaps in your data, like the existence of corrupted records, these sorts of things. But really there is another concept, which you could call data value or functional data quality, which is actually a function of what you're trying to do with this data. And you can basically have a case where your data is high quality, but it does not have any value, and hence it's not going to be helpful for your machine learning model. So this data needs to be enhanced, transformed, annotated, and augmented. That's what DataPrepOps is about.
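
For readers, the "structural" checks described here are the easy part to automate; below is a minimal Python sketch, assuming a pandas DataFrame with hypothetical column names. Functional quality, the value of data to a specific model and task, cannot be computed this way, which is exactly the distinction being drawn.

```python
# A minimal sketch of "structural" data quality checks on a static data set:
# gaps, duplicated records, columns with no signal. Columns are hypothetical.
import pandas as pd

def structural_quality_report(df: pd.DataFrame) -> dict:
    """Flag structural issues that can be found without knowing the task."""
    return {
        "rows": len(df),
        "missing_values": df.isna().sum().to_dict(),   # gaps per column
        "duplicate_rows": int(df.duplicated().sum()),  # redundant records
        "constant_columns": [c for c in df.columns     # columns with no signal
                             if df[c].nunique(dropna=True) <= 1],
    }

df = pd.DataFrame({"feature": [1.0, None, 3.0, 3.0],
                   "label": ["a", "b", None, None]})
print(structural_quality_report(df))
```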

Charlie McCarthy 5:31

Interesting. 

So then talk to us a little bit, if you would, about the inspiration for founding Alectio with DataPrepOps as the foundational focus. 

Can you talk to us a little bit about the company, what makes it unique?

Jennifer Prendki, PhD 5:44

It's sort of a long story, but the company is like the eventual creation of tools I would have needed as a head of machine learning years ago. 

So, the story starts a decade ago, because I come from the physics space, at a time when we physicists had a lot of data and we did not have proper resources and proper servers to compute models on the entire data set. And those data sets in physics tend to be low density in terms of information. There is a lot of data, but there isn't a lot of valid or interesting information in there, so there is a lot of noise.

Fast forward. I ended up in the machine learning world, and I ended up running machine learning teams. And so when I was working for Walmart Labs seven to eight years ago, I started realizing that we as a company were starting to see problems in terms of the volume of data we were working with. And I coincidentally needed to manage the budget for the conversion of the data which we were collecting into something that would be ingestible by a machine learning model. 

And so this is where I realized: what do you really do when you're in a situation where the scale of the data is so large that you just cannot manage it, or even pay the bills for the storage and the data preparation, the labeling and whatnot? And so we were in this interesting phase where the size of the data sets we were operating with was scaling very, very quickly, but the budget was not following.

And so I started going back to my roots as a physicist, which is: you really have to think about the fact that what really matters is information. Is it possible at all to reduce the size of a data set without compromising the performance of the model? And so this is where my first thinking about DataPrepOps started–of course, I didn't come up with the term back in the day–and it came from a data management perspective, right?

And so fast forward, as I kept moving to other companies, I started realizing this is becoming more and more of a common problem. And I'm sort of seeing strain in a system or in the market where the entire industry is going towards scaling up instances, scaling up compute, scaling up storage and collecting more and more data with the sheer hope that basically this is going to improve the performance of a machine learning model. 

And so DataPrepOps started originally from a data curation perspective, which is one of the fundamental concepts–which is not data labeling, though that is also part of DataPrepOps. And eventually, today, the vision is that all of those steps that need to happen so that your data set is converted to a machine learning-ready data set should be automated and managed by a machine.

In fact, it should be managed by AI. 

D Dehghanpisheh 8:25

Cool. 

So Jennifer, this podcast focuses on ML and security, hence the term MLSecOps. I'm curious–thinking about Alectio, what you just described, all the data elements there, and their applicability from when you were head of a machine learning team and managed a lot of data scientists–tell us about some of the inherent security risks in data science.

And I guess I'm asking because I'm curious as to what you see as the most common security gap or most common security risks right now in data science and in the area of DataPrepOps. 

Jennifer Prendki, PhD 8:58

So this is not going to come as groundbreaking news to you or the listeners of the podcast because in the context, obviously, of MLSecOps there is a risk in the model security, right? Let me extrapolate here. 

If I take a step back on DataPrepOps, which is obviously my [area of] expertise, one of the fundamental concepts that we are really playing with is that data is not information. You can have large data sets with only a fraction of that data really holding valid, valuable information. But also, when we say information, that often means sensitive information, right?

So we always talk about data security. There has been a lot of effort on the market to protect the data–and indirectly the training data–that's going to go into machine learning models. But we really should be thinking about protecting information. I think everybody knows that it's kind of an underlying implication when you talk to experts that information is sensitive. 

There's actually an acronym called ROT: Redundant, Obsolete, or Trivial, R-O-T. And so basically, if you have ROT data, you don't really have a security issue, right? Because that data is not really holding anything valuable or important or sensitive.

So now that we've established that what matters is protecting the information as opposed to the data, I believe that what DataPrepOps is about is creating a better workflow and better technologies to help transfer information from the data into a model.

So this is super important from a security perspective, because basically what I'm saying here is that the sensitive information that lives in your data, through the process of training a machine learning model, gets exported into the model. And so that means the model's parameters, the model's weights, the model's embeddings, and whatnot.

So you have to think of whatever exists in the model as being like a copy of what the model knows, which is supposed to be an extrapolation of what comes in the data. That's literally the definition of machine learning. And I think the big problem is there is not enough effort on the market to protect that information that lives in the model or in the model's memory. 

And so you can make all of the efforts that you want to really protect the data, but if you're not going to take care to protect the model and the copy of the information that lives in this model–because obviously, with a model you can infer against, you can sort of reverse engineer the information that originally existed in your data set–then you failed, right? I believe that's exactly what MLSecOps is all about. So it's a very important piece, which is not directly data security, but information security.
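
One concrete illustration of this risk is membership inference, where an attacker uses nothing but a model's prediction confidence to guess whether a record was in the training set. The toy sketch below, with synthetic data and a generic scikit-learn classifier (both illustrative assumptions), shows the confidence gap such an attack exploits.

```python
# Toy demonstration of the membership-inference signal: a model is typically
# more confident on records it was trained on, so confidence alone leaks
# membership. Data and model here are synthetic illustrations.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + rng.normal(scale=0.5, size=200) > 0).astype(int)

members, outsiders = X[:100], X[100:]           # in-training vs never-seen
model = RandomForestClassifier(random_state=0).fit(members, y[:100])

conf_members = model.predict_proba(members).max(axis=1).mean()
conf_outsiders = model.predict_proba(outsiders).max(axis=1).mean()
print(f"mean confidence on training members: {conf_members:.2f}")
print(f"mean confidence on unseen records:   {conf_outsiders:.2f}")
# The gap between these two numbers is what an attacker thresholds on.
```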

D Dehghanpisheh 11:43

It's security across the entire pipeline, especially the data, because the data components are so–

Jennifer Prendki, PhD 11:48

Absolutely. But so my point here is, if the information that lives in the data gets transferred into the model, we need more efforts to protect the meta information that lives in the model, if you will, right? So, more security for the models, please.

D Dehghanpisheh 12:02

Yeah. 

Charlie McCarthy 12:03

So in follow up to that, as we're thinking about things like protecting the data, what are some specific critical considerations, Jennifer, in data preparation to mitigate those potential security risks?

Jennifer Prendki, PhD 12:16

I mentioned data curation earlier, right? So the concept of data curation is really founded in the belief that you may have a very large data set. Let's say you have like two or 10 million sensitive pictures of license plates or whatnot, which obviously you don't want to see exposed. In general, there is only a fraction of that data that really matters to the model and really contains information that would move the needle and help the model learn.

So in other terms, there is generally a relatively small fraction of the training data that truly contains relevant information. That goes back to my idea of data value and my idea of functional data quality. And in practice, you could state that in practically every single case, especially the cases where the amount of data is bloated, if you have 10 million pictures, truly only 1 million of them would make a difference in the training process. Well, eventually, if you really operate only with these 1 million pictures, you would have less exposure and you would have lower risks.

If we're talking about sensitive MRI patient data or whatnot, that means that by training only on the data that matters, you're really avoiding the exposure of the data of 9 million patients which would have been exposed otherwise. In other terms, a reduction in the size of the data creates less exposure of the information that lives in the data set and of the sensitive data.

The other aspect which I don't necessarily want to spend too much time on, but DataPrep brings a notion of interpretability, right? So people talk about explainability. Explainability in machine learning is basically the ability to say the model made ‘X’ decision because of a specific feature or even a specific weight or series of weights or specific layer in the model. 

Interpretability is sort of different, because with explainability you don't really have actionability. You just know that this person has been denied a loan because they live in a specific zip code or in a specific area. It doesn't tell you which part of the data caused the model to believe that people who live in a specific zip code are actually less likely to pay their loans back.

And so DataPrep has this other dimension as well, because you can identify through DataPrep techniques which part of the data matters. You could really reverse engineer where the information comes from. That has huge implications and potential from a data security standpoint. 

You can basically get to the point where you have an end-to-end workflow, where up to now you have this sort of break in the middle, specifically with deployed models because they are black boxes where there is no mapping between the input and the output or the training data and the prediction. 
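
As a simple, hypothetical proxy for the input-to-training-data mapping described here, one can trace a prediction back to the training records most similar to the input in some embedding space. The sketch below uses a placeholder embedding and cosine similarity; it is one illustrative technique, not Alectio's method.

```python
# Sketch: trace a prediction back to the most similar training records via
# cosine similarity in an embedding space. `embed` is a stand-in for whatever
# representation your model exposes (hypothetical placeholder).
import numpy as np

def embed(x: np.ndarray) -> np.ndarray:
    return x / (np.linalg.norm(x) + 1e-12)    # placeholder embedding

def trace_prediction(query: np.ndarray, train_X: np.ndarray, k: int = 5):
    """Return indices of the k training records most similar to `query`."""
    q = embed(query)
    E = np.stack([embed(x) for x in train_X])
    sims = E @ q                               # cosine similarities
    return np.argsort(-sims)[:k]

train_X = np.random.default_rng(1).normal(size=(1000, 16))
query = train_X[42] + 0.01                     # a slightly perturbed record
print(trace_prediction(query, train_X))        # index 42 should rank first
```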

D Dehghanpisheh 15:04

Can you give examples of how that DataPrepOps practice enhances the security? 

You were saying, hey, “more security for the models, please,” and you just talked about, kind of, the injection of the training elements. What other practical elements, techniques, or procedures can DataPrepOps apply, in service to MLSecOps, to enhance security?

Jennifer Prendki, PhD 15:23

Again, DataPrepOps is a wide space. There are lots of dimensions to it; even, like, vector search would be one. Anything that helps prepare your data set would fall under this. One of the things in the concept of DataPrepOps that I think is very unique, very important, and has, again, huge implications for MLSecOps generally speaking, is the dynamic selection of a data set.

We talked about data curation. Data curation is usually used as a vanilla term for reduction of the size of the data set. But to take a step further, the way we do things at Alectio is based on an approach called Active Learning. So Active Learning is basically like a case where you bootstrap your training process with a little bit of data and then it's going back and forth between training and inference, training and inference, right? 

I mean, so in terms of workload, you have to imagine you have 100,000 pictures of something, right? You pick the first thousand, you train with it, you have a specific model which is in a certain state, it's not the model you want because you haven't used enough data yet. And then you can use various techniques to basically assess what you should pick next, right? 

And so vanilla Active Learning does something very simple, which is to look, at inference time, at what the confidence scores of the predictions are, and use that to basically say: it seems the model needs more help with that subset of the data. So you use that as a way to feed in more interesting information. And basically, in this specific case, you correlate the confidence score with the level or density of information, or how difficult it is for the model to make sense of it, which is not a one-to-one correlation. And that's actually one of the problems with this specific approach.
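
For readers who want to see this vanilla loop in code, here is a minimal sketch using synthetic data and scikit-learn; the batch sizes, model, and labeling "oracle" are all illustrative assumptions, not Alectio's implementation.

```python
# A toy version of vanilla least-confidence Active Learning: bootstrap on a
# small labeled seed, then alternate training and inference, pulling the
# records the model is least confident about. All data here is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100_000, 10))
y = (X @ rng.normal(size=10) > 0).astype(int)   # stand-in labeling "oracle"

labeled = list(range(1_000))                    # "pick the first thousand"
pool = list(range(1_000, len(X)))

for round_num in range(5):
    model = LogisticRegression(max_iter=1_000).fit(X[labeled], y[labeled])
    confidence = model.predict_proba(X[pool]).max(axis=1)  # top-class confidence
    worst = np.argsort(confidence)[:1_000]      # least-confident records
    picked = {pool[i] for i in worst}
    labeled += list(picked)                     # "label" them via the oracle
    pool = [i for i in pool if i not in picked]
    print(f"round {round_num}: train size {len(labeled)}, "
          f"mean pool confidence {confidence.mean():.3f}")
```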

But we love–and I love–the approach where you continuously keep picking data because again, you reduce the size and you don't pull data until you actually need to pull a specific record. You can live in a world where maybe you need only 1% of the data and the rest of the data did not get used in the training process. So basically the information never got transferred into the machine learning model. 

Extrapolating on that, because this is some sort of a meta approach: what you're trying to do, specifically with the version of Active Learning we're doing, is assess the value of subsets of a training data set based on the reaction of the model. So you don't actually invasively look at the data.

Curation is not about–or at least the way we do this, it's not about–digging into the data and deciding that the density of cars in that specific image is higher and hence it's more likely to be useful, right? It's basically letting the model do the talking and guide the training process. So again, you don't keep that data, you don't transfer that data, and you don't train with a specific record until you actually know you need it to build your machine learning model.

The second thing– In fact, I have two more things to say on the topic. There is the data labeling side of things, right? So less data means you have less data to annotate. The problem with data labeling is that there is still a ridiculous number of cases where this needs to be done manually, right? So you need to transfer the data that got selected for somebody else to see and basically assess the data.

Take the case of content moderation. I know we're talking about ChatGPT, Segment Anything, models basically automating data labeling. This is a very unrealistic kind of view of what's going on right now. Because if you're going to do content moderation for a model that's going to be the back end for a children's hub, you need mothers to assess whether they think this is acceptable for their children to see or not, right? So this is not going to be done by a model or automatically, because you need human judgment on top of this.

And for use cases where you need manual labeling, the more data you have, the more data gets immediately exposed to somebody else's scrutiny. Depending on people's intent, that's another huge, huge security risk.

The other advantage of reducing the size of the data is the possibility of training or retraining models on the edge. And so that also means that there is less data transfer, there is less–

D Dehghanpisheh 19:30

Like, as in a federated training and learning cycle?

Jennifer Prendki, PhD 19:33

Mhm. Exactly.

D Dehghanpisheh 19:35

That makes a lot of sense.
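
A minimal sketch of the federated pattern referenced in this exchange, in the spirit of FedAvg: each edge device trains locally on its own reduced data set, and only model weights, never raw records, travel to the server. The data, model, and hyperparameters below are illustrative assumptions.

```python
# FedAvg in miniature: edge devices run a few local gradient steps on their
# own data, and the server only ever sees (and averages) model weights.
import numpy as np

def local_update(w, X, y, lr=0.1, steps=10):
    """A few gradient steps of linear regression on one device's data."""
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

rng = np.random.default_rng(0)
true_w = rng.normal(size=5)
devices = []
for _ in range(4):                              # four edge devices
    X = rng.normal(size=(50, 5))
    devices.append((X, X @ true_w + rng.normal(scale=0.1, size=50)))

w_global = np.zeros(5)
for _ in range(20):                             # communication rounds
    local_ws = [local_update(w_global.copy(), X, y) for X, y in devices]
    w_global = np.mean(local_ws, axis=0)        # server-side weight averaging

print(np.round(w_global - true_w, 3))           # should be close to zero
```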

Charlie McCarthy 19:36

So as we're thinking about this security intersection of DataPrepOps and MLSecOps, both terms which are relatively new to the industry– 

As we're learning about these concepts in security, Jennifer–for organizations, business leaders, folks who are starting to wrap their heads around these concepts for the first time or in the early stages–how can those folks begin to integrate these DataPrepOps practices to enhance the security and performance of their data science pipelines?

How should they be thinking about that and getting started?

Jennifer Prendki, PhD 20:10

Look, best practices for DataPrepOps can already be adopted. You could use a platform like ours, where everything's already managed and whatnot. One of the unique approaches we have in this case is the notion of setting data aside and calling the data that's useful on the fly. Unless you have a sophisticated platform and some maturity and understanding of how that would work, it might not be easy to just implement a solution yourself.

But what we mentioned earlier, right? I mean, DataPrepOps, or DataPrep altogether, has huge implications in the sense that you need to annotate data, you need to send the data somewhere. Until very recently, for instance, a problem I'd seen before starting Alectio was that people would share data with their labeling companies through a CSV file in an email. So basically there is absolutely no compliance and no security and whatnot. I mean, bringing some level of education about those topics to teams would definitely help in that sense.

This is a problem that I was particularly aware of and shocked by, because I would see people do that all the time. So it starts with integrating best practices for these new fields, unless you're going to go for a pro-level or an enterprise solution that gives you all of the steps so that you don't have to worry about them, because that's definitely what we do with DataPrepOps.

We don't want people to have to think about: what are all of those different steps? How do I source the right labeling company, and can I trust them enough to share the data with them? This is such a huge gap, right? I mean, the sheer fact that you have somebody in a company who would say, I'm just going to share access to an S3 bucket through email without any security key, best practices, or whatnot. It starts with making people aware of those problems, and you can already go a long way by having that kind of thinking among the team, among employees.
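
To make the S3 example concrete for readers: a safer pattern than emailing bucket-wide access is a short-lived presigned URL scoped to a single object. Below is a sketch using boto3's generate_presigned_url; the bucket and key names are placeholders.

```python
# Instead of sharing blanket bucket access over email, grant a labeling
# vendor a time-limited, single-object presigned URL. Bucket and key are
# placeholders; credentials come from the normal AWS credential chain.
import boto3

s3 = boto3.client("s3")

url = s3.generate_presigned_url(
    ClientMethod="get_object",
    Params={"Bucket": "example-training-data", "Key": "batch-0042/images.tar"},
    ExpiresIn=3600,  # link expires after one hour
)
print(url)  # share this, not your keys or bucket-wide access
```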

D Dehghanpisheh 22:02

So we've talked about the technical security safety elements as DataPrepOps complements MLSecOps in a more secure machine learning model space. 

But beyond security, there's this whole world and realm of governance and compliance, right? How should we think about DataPrepOps in regards to AI regulations, particularly regulations like the EU AI Act that's getting ready to take effect, or GDPR?

How does DataPrepOps intersect with data protection regulations? Can you talk a little bit about that? 

Jennifer Prendki, PhD 22:35

Of course. Yeah. 

So, to put more perspective on your question, let's take a step back. I'll try to explain basically where I see DataPrepOps on the market. So it sort of goes with what we were talking about earlier, right? Now, people are familiar with the concept of MLOps, which in my mind is almost an overloaded term, because MLOps has really become the space of model preparation, model tuning, model deployment, and the operations of managing the model. 

And when you look at companies in that space, even though most landscape maps still have this little data preparation box in there, on the other end of the spectrum you have the DataOps space, which is data privacy, data management, data storage, and everything that's the management of a raw data set. And so, in fact, DataOps is not necessarily associated with machine learning; you also need DataOps for business analytics of any sort. So what's missing is basically sort of an in-between kind of thing, right?

So practitioners would know that in order to take that safely stored data set on something like Databricks or Snowflake or one of the cloud providers' products, you basically need those manual operations. Normally you would do that manually: you would transform data, you would annotate the data, you would remove outliers, you would potentially apply augmentations or whatnot, and then that would go into your machine learning model, right?

DataPrepOps sits in between. So it's the operational side, and an attempt at automating and creating an end-to-end workflow from the moment the data leaves the DataOps layer and enters the MLOps layer. And so I think there is a lot of awareness around protecting the data and providing security solutions in the DataOps layer. If you buy a product from a DataOps provider, you can have some faith that they're doing something to protect your data, avoid data leakage, and whatnot.

However, from what we said earlier, the model layer is not necessarily that protected, because there is still no real awareness that there is a copy of the information that used to live in your data set and is now stored in your model. That's literally what MLSecOps is about, right? So there needs to be more regulation on the model layer, the model preparation. Because if you're going to deploy that machine learning model somewhere, that's a vulnerability to your organization; you could have adversarial attacks and lots of problems from that, right? I mean, somebody who could just trigger an inference endpoint could figure out a lot of things.

DataPrepOps being in the middle is even worse because, being in between, you're exposed to both the data side and the model side. The way we operate is really this logic where you're training the model, assessing which subset of the data would be useful, and pulling that data. At this specific point in time, whatever new technologies and regulations exist or are going to be created for data security are going to apply to DataPrep, and security implications for the model are also going to apply to DataPrep. So this is where everything needs to go.

Actually, this is something I only recently realized, and it's very important, I think, for security people to realize in the context of AI regulation. GDPR comes from Europe, right? Culturally speaking, people in the US have a culture of asking for forgiveness more than permission, while Europe is the other way around. Let's move, let's figure out the regulations afterwards. That gives researchers in the US an edge to develop technology faster.

While Europeans typically have the other position, which is: we cannot move technology forward unless everything's properly regulated. It's like two sides of the same coin in many ways. So the way I see it, a global awareness that there need to be more security conversations for AI and for machine learning models is going to come from an enhanced conversation and the collision of the European mindset and the US mindset into something that's acceptable to both parties.

And I think we're going in this direction because, as you probably know, the CEO of OpenAI, Sam Altman, just called for regulation himself. So I think we're getting there.

D Dehghanpisheh 27:01

So, Jennifer, against the backdrop of regulation and more security and the intersectionality with generative AI applications, massively large training data sets, and all of the concerns that come around about hey, whose data is it? Where is original work? Where is it not? 

As you think about that, what are some of the macro concerns you have as it relates to the challenge of data in generative AI against those policies and still maintaining security? What are some of the trends that you're thinking about right now? 

Jennifer Prendki, PhD 27:38

So this is, I believe, a very important question right now. And I actually wrote an article discussing how DataPrep for large language models differs from what we're used to. So here is basically like food for thought, I would say: 

People are familiar with Generative AI–large language models, but Generative AI in general–and I think we've all heard now that there are problems, as you mentioned, around original works and whatnot. For example, the Copilot model being trained on Stack Overflow data. For instance, can you take an open source training data set, use open source data, to train on things? We would all have thought that because people answering questions on Stack Overflow actually make that data public, there should not be an issue training on that data, right?

And so obviously this is something where there was pushback. On the other end of the spectrum, you have companies that are training on Getty Images, basically because they need this huge, massive scale of data that just cannot be generated at the company level. So I would go back to a prior line of thinking I had maybe a couple of years ago, which goes into the concept of a data market.

So forget for a moment basically that concept of Generative AI. Imagine now you have your company that does satellite management, like maybe a telecommunication company. You own satellites that revolve around planet Earth, right? And of course you're going to have opportunities to collect data and you're going to have opportunities to collect climate data, maybe like defense data, security data, identify where you have patches of pollution, migration of populations and whatnot, right? And all of that data that you could collect, it's actually not something that's going to be used by your company because you're a telecom company. You're not a climate monitoring company. You're not the Weather Channel or anything, right? 

At the same time, people always believe and still believe that data is the new oil. It's their gold. And they really treat that as a moat for their own organizations. People used to sit on their data and believe–

D Dehghanpisheh 29:55

You never know what your data is going to be worth because you can't really guess what the use case is going to be in the future. 

So why would you give up the royalty? 

Jennifer Prendki, PhD 30:01

Exactly. 

And so even taking it one step further, the opportunities you have to collect data might not align with what you do. You might have opportunities to collect data that really would be more useful to somebody else. And so that requires a shift in mindset: my data is my gold, but now I can monetize the data itself instead of using that data for my own data applications. That goes into creating a market where you can share data safely across organizations.

That problem already existed before Generative AI. And companies are not yet comfortable just saying, I'm going to sell my data, because there are all of these implications. Who owns the data? Is it ethical to do this? Is it okay to do this? Is somebody going to sue me because they're in the picture I just took? Or maybe they're not willing to take the risk if it's not their own data application and they actually don't know how it's going to be used, right?

I mean, I would say there's the greatest risk of all if you sell data to somebody else, right? What if it's used by terrorists or for some kind of dark purpose? But at some point there is an opportunity cost to not selling your data. That problem existed even before Generative AI. But I think Generative AI made that problem worse, because any organization in Generative AI just cannot assume that they can operate in a world where they can afford to generate or create all of the data they're going to be training on, right?

So basically it means they need to rely on the data market, and that's sort of what happened, right? Because they were relying on data coming from Stack Overflow, and Stack Overflow basically was not selling the data. So how this is happening now is interesting, because coincidentally, at the same time, Twitter put their data API up for sale. So it's not a free API anymore, and it's kind of expensive. I think that's a prediction of where the market is going to go, right?

I mean, whether Twitter owns that data or not, that's a separate question, right? Because people will always say the users own the data. You could answer to that: if you're not paying for the product, you are the product. So maybe the data you are generating is the product, right? This is where conversations about ownership need to go. Basically, I really, really see a need for a global data market, which also has major implications from a security standpoint and from a liability standpoint.

Charlie McCarthy 32:24

These are some fantastic insights and you clearly, Jennifer, have a very deep knowledge in this space. I would encourage all of our listeners to go check out the resources offered by Alectio, read the blog. 

As we're closing out this conversation, Jennifer, if you could offer one piece of advice, or perhaps a call to action, to organizations and data teams looking to reduce their security risks and improve their posture, what might that one call to action be?

Jennifer Prendki, PhD 32:51

I would say education.

I mean, the biggest security risks from an AI and machine learning perspective and from a data management perspective start with the humans managing the systems. If you have new hires in your organization, make it clear what could happen to the organization–and in general–if there were a data security issue or a data leakage, so that they relate and take the necessary initial steps.

And of course, in a world where everything is going to revolve around data and AI, there will be adversarial attacks, there will be many things we can't control, and we're going to need better regulations and better technology for that. But you can probably resolve or prevent 95% of problems just by helping people use their best judgment and be aware of the risks that their own actions might create for their organization.

Charlie McCarthy 33:41

Yes, thank you. 

And we 100% agree. MLSecOps.com is a great place for resources if you're looking to learn more about this topic and other MLSecOps security topics. 

Again, Jennifer, thank you so much for being on the show. This has been a fantastic conversation. 

Jennifer Prendki, PhD 33:57

My absolute pleasure. 

Charlie McCarthy 33:58

Thank you, D, my co-host. 

D Dehghanpisheh 34:01

Thank you, Jennifer. Thank you, Charlie. 

And thank you to our listeners and readers.

Charlie McCarthy 34:04

See you all next time. 

[Closing] 34:11

Thanks for listening to The MLSecOps Podcast brought to you by Protect AI.

Be sure to subscribe to get the latest episodes and visit MLSecOps.com to join the conversation, ask questions, or suggest future topics.

We’re excited to bring you more in-depth MLSecOps discussions.

Until next time, thanks for joining!

Additional tools and resources to check out:

Protect AI Radar

Protect AI’s ML Security-Focused Open Source Tools

LLM Guard - The Security Toolkit for LLM Interactions

Huntr - The World's First AI/Machine Learning Bug Bounty Platform

Thanks for listening! Find more episodes and transcripts at https://mlsecops.com/podcast.
