The Data Cloud Podcast

AI and Real-World Data: A New Era for Identifying and Curing Rare Diseases with Chandi Kodthiwada, Vice President of Product Management at Komodo Health

Episode Summary

In this episode, Dana Gardner is joined by Chandi Kodthiwada, Vice President of Product Management at Komodo Health, to explore how Komodo Health utilizes vast and disparate data sources to generate unprecedented insights in life sciences and healthcare. They discuss the founding mission of Komodo Health, the challenges of building a comprehensive, de-identified dataset, and AI's role in reducing the burden of disease and improving patient outcomes.

Episode Transcription

[00:00:00] Producer: Hello and welcome to the Data Cloud Podcast. Today's episode features an interview with Chandi Kodthiwada, VP of Product Management at Komodo Health, hosted by Dana Gardner, Principal Analyst at Interarbor Solutions. Chandi explores how Komodo Health utilizes vast and disparate data sources to generate unprecedented insights in life sciences and healthcare.

[00:00:25] Producer: Dana and Chandi discuss the founding mission of Komodo Health, the challenges of building a comprehensive, de-identified dataset, and AI's role in reducing the burden of disease and improving patient outcomes. So please enjoy this interview between Chandi Kodthiwada and your host, Dana Gardner. 

[00:00:42] Dana Gardner: Welcome to the Data Cloud Podcast, Chandi. We're delighted to have you with us. 

[00:00:48] Chandi Kodthiwada: Great to be here, Dana. Looking forward to it. 

[00:00:51] Dana Gardner: Me too. Now, the need to assemble and ingest a vast number of disparate data sources to build massive data sets allows Komodo Health to discover unprecedented insights across a wide range of life sciences and healthcare inquiries.

[00:01:06] Dana Gardner: The scale and velocity requirements are truly extraordinary. So tell us about Komodo Health. What's the company's mission and how is it that data is so foundational to your work? 

[00:01:16] Chandi Kodthiwada: Yeah. At Komodo, we're on this mission of reducing the burden of disease and we do that by, you know, various ways. But before I tell you how we do it, let me tell you a little bit around how we were founded 11 years ago.

[00:01:33] Chandi Kodthiwada: Our founders, Arif and Web, were working in the industry as consultants and were deeply aware of this problem of low-quality data and taking too long to find insights. Sure, it was great job security as consultants, but they decided to solve for it, and 11 years later, we have one of the country's top deep and wide real-world datasets.

[00:02:00] Chandi Kodthiwada: That is absolutely foundational for our work. You know, we've spent the past decade building this product we call the Healthcare Map. The founders joke about this, but unfortunately it's too true that there is an abundance of information, but a scarcity of insight. What we've done with Healthcare Map, and what we're doing these days on top of Healthcare Map, is intended to solve for these two things.

[00:02:28] Chandi Kodthiwada: How do we give deep, deep, rich insight into healthcare data, and how do we help our customers understand and utilize the insight from all of this data to work towards our mission of reducing the burden of disease? 

[00:02:43] Dana Gardner: So this is really the biggest of big pictures when it comes to health from the individual up to the biggest type of population category. It seems like you really literally have your finger on the pulse of the world's health. Is that fair? 

[00:02:58] Chandi Kodthiwada: I have tried multiple times when my mother asked me what I do, and that is the best explanation of our work I've heard so far. I'm gonna steal it, I'm gonna use it. 

[00:03:08] Dana Gardner: Okay. Well, tell us about your particular role at Komodo, your background, and what motivates you to tackle such a large challenge and the ensuing opportunities that it brings.

[00:03:21] Chandi Kodthiwada: I am Vice President of Product Management at Komodo Health. I've been around the industry for about 20 years in healthcare and life sciences data analytics. I have lived and breathed the complexity of data and trying to figure out how to make sense of the chaos and the complexity of a wide variety of data to make decisions. It's one of the most complex things that I've seen in over two decades.

[00:03:54] Chandi Kodthiwada: I'm happy to report that technology has been effectively used in this world to solve for simplicity. The fact of the matter is it takes about 10 years for us to learn something works and for us to be able to actually use that learning in a prescribable treatment way. And it's close to about $2.63 billion between the two, between us learning something that works and us being able to say, let's use it widely. Like, 10 years and close to $3 billion is too much of a cost.

[00:04:30] Chandi Kodthiwada: To know something works and to make it, you know, a prescribable drug. And I've been through COVID times, and it has had a drastic impact on me, on what a single day could change in one boat, you know, disease burden. And my learning the past few years has been that there are ways to reduce this gap. Going from 10 years to a year and a half is what we have seen with COVID drugs, along with a reciprocal cost reduction.

[00:05:02] Chandi Kodthiwada: So application of real world data and real world evidence, and application of cutting edge technology is proven to bring those two down, the time it takes for us to bring drugs, miracle drugs, if you will, and the cost down. 

[00:05:19] Dana Gardner: Great. Now you mentioned the Healthcare Map, and as I understand it, something on the order of 330 million de-identified patients are included in this data set, and I can't even imagine the number of data points that are coming together.

[00:05:36] Dana Gardner: And by de-identifying the patients, you get access to more information, more data sources. So tell us a little bit about this whole process of ingestion and the many disparate real-world data sources that you're using to put together this very impressive observation vehicle. 

[00:05:54] Chandi Kodthiwada: Our vision was never to create the biggest dataset that there ever was, because our learning early on, and our founders' learning early on, was that big data came with quality issues.

[00:06:08] Chandi Kodthiwada: Bigger numbers never meant better use. So we approached it, you know, from a different viewpoint. Arif, one of the founders, is an MD, and his view has always been about, like, how do we emphasize visibility into complex patient journeys? If you look at various disease conditions, and if you look at various diseases, I'm sure you and the audience have heard about this concept called precision health.

[00:06:40] Chandi Kodthiwada: It's this vision that if we understand disease pathways and patient journeys very closely, we would build better drugs. And given how complex conditions like cancers are, it's very helpful to understand these diseases and the journeys the patients go through in a deeper and richer way. That helps us build this compelling view of how we build the most effective treatment for a particular disease condition.

[00:07:11] Chandi Kodthiwada: The richness or the higher quality of it comes from us being able to view three out of four prescription events and more than half of the medical encounters. And that gives you this trustworthiness, and that gives you this high quality of the data. And just because of how we've integrated various sources of data in this, it gives you what researchers love to call longitudinality.

[00:07:40] Chandi Kodthiwada: So it's not a point-in-time event of data. You're able to view this rich dataset: 12, 14, 26 months of continued observation of a de-identified patient's journey. So some of the complexities that come with incidental data, for example, when you and I go to a PCP, they get point-in-time data. 

[00:08:04] Chandi Kodthiwada: If I purely relied on that data, it would be four dots, maybe five dots in a particular year. And if you go to maybe a clinical trial, it's an even thinner slice of data, and it's even more disparate sources of data. And what we have done is built various degrees of depth into this and various types of data into this. So it gives you this deep, rich dataset that you could learn more about.

[00:08:33] Chandi Kodthiwada: What did this person go through six months before I got the data? And how has the patient or a person been six months after the data? So the richness of the data comes through when we do this initial homework of how do we establish the depth and how do we make it more longitudinal? 

[00:08:51] Dana Gardner: Mm-hmm.

[00:08:52] Dana Gardner: And just for clarity for our audience, not everyone might be familiar with the constraints, but because of HIPAA and other myriad regulations and compliance requirements, that de-identification is important. So in order to get the most and best data to identify trends at that larger perspective, you need to, as you say, de-identify.

[00:09:14] Dana Gardner: So maybe just clarify that so our audience understands why it's useful to do that and beneficial when it comes to the larger picture. 

[00:09:22] Chandi Kodthiwada: Legal, ethical, moral implications. Legal, obviously being some of the things that you have mentioned. Every country has their own view of how they establish privacy rights, and there are also ethical and moral implications.

[00:09:37] Chandi Kodthiwada: For example, if you were to study a fatal disease, it's just not ethical or moral to do a clinical trial. Patients are suffering. It's almost all the time a fatal condition. So both regulators and pharmaceutical companies don't tend to study this disease in real-life patients. So the moral implication is this.

[00:10:04] Chandi Kodthiwada: We have to identify a way to study this population so we can cure the condition. Now, the moral responsibility is how do we still enable that study in a non-identified way and do the, you know, ethically right thing, which is how do we solve for the disease without compromising patient privacy? So various perspectives on the, you know, ethical, legal, and moral responsibility behind patient privacy.

[00:10:33] Dana Gardner: Great. Thank you for clarifying that. Now, what were some of the large obstacles that you faced when identifying, gathering, ingesting, and making operational, making practical use of, all of this data from these very diverse and numerous sources? So tell us about, you know, what you had to go through in order to make this rich, powerful data source effective and usable.

[00:11:02] Chandi Kodthiwada: Web used to joke about this. Around 2010, we went through this era of patient data explosion, meaning anyone and everyone who had the right went and acquired any and all data possible. And the result of the accumulation was that they focused on collecting volume of data, but the quality of the data went down and it created this fragmented set of disconnected assets.

[00:11:35] Chandi Kodthiwada: So to overcome all of this, we had to work on sophisticated algorithms, work on this entity resolution to resolve for, you know, disconnected entities, and a whole lot of correction around bias and collection. And obviously, to some extent, the questions that you were asking around how do we do this in a privacy-respecting way?

[00:12:00] Chandi Kodthiwada: De-identification was also another effort that was involved. A combination of all of these sophisticated algorithms, the ability to resolve entities across these, you know, disconnected data assets, all while spending a whole lot of time, love, and care around how you respect the privacy of patients. And here we are.

[00:12:22] Chandi Kodthiwada: I speak about the moral imperative, but those are all the investments that we've made over the past 10 years to get to what we call the Healthcare Map today.

[00:12:30] Dana Gardner: And in this Healthcare Map, maybe you could give us a for-instance of a fairly typical end-to-end journey of data from a source through the various algorithms and other processes that then allows you to make queries against it and get actionable insights.

[00:12:46] Dana Gardner: So what's a fairly typical way in which an end-to-end journey of the data goes? 

[00:12:52] Chandi Kodthiwada: Oh, you're asking me a million dollar question.

[00:12:58] Chandi Kodthiwada: Maybe a 350-plus-million-dollar question, I should say. So, some of the things I mentioned earlier on, algorithms to resolve entities, especially when we're trying to connect data from various entities that were not connected, and things that needed systematic network construction. 

[00:13:24] Chandi Kodthiwada: So when a person switches from employer to employer, there is a certain disconnection that happens in being able to view and track them across payers and across providers. Think of healthcare networks, a hospital or a PCP you may go to, or two sets of employers that may have different insurance providers, right? So the work that we're doing across systematic network construction, or this entity resolution, or bias correction is helping us build this. 

[00:13:52] Chandi Kodthiwada: Still de-identified, but an individual-level view or narration that makes it possible for researchers to understand these longitudinal patient journeys. And it also goes through various lifecycle optimizations. Most of the complexity is in cleaning and linking that data, to put it in simple terms, but it has taken millions of dollars and years of work to get right. 

[00:14:22] Chandi Kodthiwada: And if you ask a few of our customers, they'll tell you Komodo's Healthcare Map is the cleanest yet well-linked longitudinal journey of patients, and they use Healthcare Map. 

[00:14:36] Dana Gardner: Hmm, of course, it's one thing to get the data, it's another to then try to optimize the lifecycle of the data. So it's never a journey that's ended.

[00:14:45] Dana Gardner: You're always peeling back the onion, if you will, and finding new ways of trying to improve. So how are you trying to optimize your data through its lifecycle? And how is Snowflake helping you with that and Snowflake's ecosystem? How do they accelerate your mission to having that cleanest but also very well-functioning data?

[00:15:05] Chandi Kodthiwada: We look at a trillion, if not more than a trillion, rows of data. People who work with data understand: you run a random query on a trillion rows of data, and here you are sitting for a week to get the result of a malformed query. So this is where partners like Snowflake come into the picture.

[00:15:34] Chandi Kodthiwada: There are certain features that you need to work to be able to operate in this business. You know, when you're working with a trillion rows of data, the richness, the speed, and the scale Snowflake brings us allow us to move away from, how do I write or build queries that run in a performant way over a trillion rows? 

[00:16:01] Chandi Kodthiwada: How do we create cohorts that researchers are interested in? How do we enable fast insights into patient journeys, right? So the marriage of the two companies is this, you know, effective realization of what the industry needs: help us identify the condition, help us identify the patient population, so on and so forth.

[00:16:21] Chandi Kodthiwada: And that said, let me give you some specific examples. I'm sure there are some Snowflake fans out there that want to say, "Help me understand specifically." Polaris Catalog is an open source catalog that helps us distribute Healthcare Map. We do close to 400-plus data deliveries every month. Imagine trying to multiply a trillion-plus rows of data 400 times in very specific ways.

[00:16:55] Chandi Kodthiwada: So the data delivery by itself would need a few thousand people if Snowflake weren't in it. So they bring the rigor, scale, reliability, and compliance. Mind you, given, you know, we operate in a HIPAA-compliant way, we get that data governance and compliance at the scale of operations, even when we operate here.

[00:17:17] Chandi Kodthiwada: And a large percentage of our customers are also customers of Snowflake. So looking at something like Snowflake Snowshare allows us to simply say, here is what's changed from the last time we gave you the data to now, instead of us always packaging a clear source of data. 

[00:17:43] Chandi Kodthiwada: Most recently we've been enjoying what Cortex can do for us, specifically around, imagine the rigor and scale of the data pipelines that go in, and the complexity behind doing it 400 times a month at a trillion rows to 300-plus customers. Cortex now allows us to simply write natural language queries about where did data go? To who? To what purpose? How was the data delivered, so on and so forth, without, you know, us having to write a whole lot of code to inspect the whole lot of code that we built in the pipeline.

[00:18:13] Chandi Kodthiwada: So it's the perfect example of how you run this scale of an operation without that scale of operation taking over our original mission, and that's what Snowflake is helping us do. 

[00:18:26] Dana Gardner: Okay. You mentioned the natural language. I should imagine that you'll want to increase over time the type of healthcare and life sciences researchers who can make the most use of your data.

[00:18:41] Dana Gardner: Are you moving towards helping them use natural language in how they access research and insights? I understand how you mentioned you're using Cortex to help optimize and scale the data, but how about from the end user perspective? 

[00:18:58] Chandi Kodthiwada: Yeah, firsthand, I've witnessed, and I've been part of, and I started out as a developer in the industry, what's considered sunk cost.

[00:19:08] Chandi Kodthiwada: And the sunk cost here is, if you have a complex question, it's going to take a week, a month, and in some cases more than a month's time to circle back to an answer to that complex question. And this is one of those other capabilities where we see we could have huge impact. And we recently launched what we call Marmot, Komodo Health's analytic AI platform.

[00:19:38] Chandi Kodthiwada: And the idea behind it is a simple kind of idea. Some of the feedback I hear from our customers who get the data is that it takes too long for us to find insights from this data. It's complex, it's diverse. I need experts. I need people who can understand the business. I need people who can write code. I need the, you know, the business person to talk to the, you know, person who writes the code, all of these weeks need to happen for me to get an answer.

[00:20:14] Chandi Kodthiwada: And the result was, you get data from us on January 1st, and chances are on March 15th you're still looking around for an answer from, you know, one of the experts. And Marmot is our thoughtful way of bringing together how we've done over a million cohorts and how we have helped hundreds of customers solve for these analytical patterns over the last 11 years.

[00:20:42] Chandi Kodthiwada: And Marmot is our way into how do you enable a natural language interface to these deep, complex problems? And how do we, Komodo, with all of our expertise, distill it into bringing LLM excellence into how do we simplify answers to these questions? How do we shrink this, you know, six month, three month long wait period behind insight and data and shrink that into, say, a handful of minutes?

[00:21:14] Chandi Kodthiwada: And what we hear today from our customers is that what used to be the sunk cost of three months, "I asked too complex a question," is now 15 to 20 minutes of playing around with Marmot. 

[00:21:28] Dana Gardner: Wow, that's very impressive. Can you provide perhaps some examples of how Marmot is uncovering things that were previously unattainable or not practical, given the time constraints? So is there a, I don't know, a use case that would help illustrate this incredible time compression that you described?

[00:21:50] Chandi Kodthiwada: This is a joy of working on Marmot and bringing Marmot to market, and thinking about this one use case. The realization was one of our sponsors was saying, we have a better drug and we're trying to understand how to direct it well. And what we've been able to help them with is providers who were prescribing competitive therapies.

[00:22:20] Chandi Kodthiwada: The macro pattern we identified was that it wasn't being effective. Obviously, the providers would know a few weeks, few months, maybe some quarters later, but we've been able to identify a pattern around how prescribing patterns were being ineffective because we can measure other attributes, such as what the industry terms HCRU, healthcare resource utilization.

[00:22:49] Chandi Kodthiwada: It's a proxy for how a patient's condition has improved after a specific, you know, therapy or drug has been prescribed. And because of, you know, the depth of the Healthcare Map, we can see the impact of a drug. And in this case, our finding was that when a competitive therapy was prescribed, it was being ineffective.

[00:23:13] Chandi Kodthiwada: So we took this insight back to the sponsor we were helping, and it's rich. It's 80% patient coverage. So in eight out of the 10 patients we were seeing, it was being ineffective, and we helped the sponsor bring the right set of patients, or excuse me, the right set of therapies to that group of patients.

[00:23:34] Dana Gardner: You just opened my eyes to, you know, there's so many variables involved with why a therapy would work or not work, and you're able to go in and distill out among them, you know, what is or isn't working. And it's not always just what's intuitive. There are the counterintuitive things that you could only see from that vantage point of the whole data.

[00:23:54] Dana Gardner: So that's very impressive. Alright, well healthcare as an industry has been, I guess we could call them a slow industry to adopt sometimes new technologies and perhaps artificial intelligence is among them. There's, you know, the issues of trust and compliance, but there's also perhaps just being tentative for something that's new.

[00:24:16] Dana Gardner: But it sounds like the implications of AI and data in this type of population level analysis is overwhelming. How is Komodo helping break down the barrier, if you will, of adopting some of these newer technologies and making them accepted as well as very prominent and useful? 

[00:24:37] Chandi Kodthiwada: If you look at the past 30, 40, 50 years of how clinical trial research has been done and how drugs have been launched, most organizations have, like, turned the last knob in the org chart, in the talent, in the available data, to do the best they can. Right? So I'm trying to say, as much optimization as possible through tech, through existing tech, and through existing people and organization structures. 

[00:25:15] Chandi Kodthiwada: Some large organizations, well funded, well resourced, have gotten to a point where, you know, this is the best we can do with the tech we have and the people there are. But going back to the earlier discovery, there are 10,000 diseases, and we solve for 15 a year. Something has to change, and something has to give us this, you know, this magical way to run analysis, understand deeply, empathize effectively, and bring drugs to the market.

[00:25:44] Chandi Kodthiwada: Right. So there is that willingness from pharma, given all the advancements that have occurred with AI in the past 12 to 24 months, and then there is the practical nature of pharma and life sciences and healthcare as industries that are risk-averse for the right reasons. They want it to be transparent. They want it to be compliant.

[00:26:11] Chandi Kodthiwada: They want it to be not a black box, and they want it to be explainable, especially when it comes to making decisions that affect people's health and healthcare, and in this case, large populations. 

[00:26:25] Dana Gardner: Right. Now, we're almost out of time, but I wanted to touch on what is presented to me from your last response, which is that there's a cultural aspect to this, and being an AI-first or a data-first organization could go a long way towards accepting, with the proper guardrails in place, some of these newer capabilities.

[00:26:44] Dana Gardner: So I assume that Komodo has an AI-first, data-first culture, but how do you see that broadening to your partners as well as, of course, your customers? So, do you see a cultural shift happening and what have you done internally that works that others might learn from?

[00:27:03] Chandi Kodthiwada: Komodo has been data-first since our founding 11 years ago, and today it is absolutely an AI-first culture. We view AI not as a replacement, but as a way to amplify human expertise. For example, we established an AI council at work that focused on how do we bring AI fluency to, we lovingly call Komodo people Dragons.

[00:27:32] Chandi Kodthiwada: Komodo Dragons, there's a pun there somewhere. But all of this, it's not just that we are pushing our customers to be AI-first. At the heart of Komodo, there is a deep desire to be AI-first and AI-fluent, so much that we're spending the money on the tools and the processes that need to change to bring AI fluency to every Dragon.

[00:28:02] Dana Gardner: Right. Well, we certainly have covered a lot today, Chandi. I really appreciate it. Last question. Everything is moving very fast, of course, with AI and data, but in healthcare, where do you think the industry is heading and what's next for Komodo? Can we put our crystal ball in front of our listeners and say, here's what you can expect from some of this legwork and the very important investments that have been made so far?

[00:28:26] Chandi Kodthiwada: I started with this: it takes too long and it is too expensive to bring researched products into patients' hands. The reason is that it is a whole lot of laborious, burdensome process, both for healthcare and for research. And AI has proven that it can take some of the low-impact operational work out of people's hands.

[00:28:54] Chandi Kodthiwada: It should be an incredible lever in speeding up the 10-year process. It should be an impactful force in bringing the cost down. Now, marry this with real-world data and real-world evidence. Now you have a technology that meets the depth and rigor of someone like a regulatory body, and for pharma to wield this power, both of the tech and of the data.

[00:29:23] Chandi Kodthiwada: To influence, you know, some of the things that we've taken for granted. So the future is a whole lot of humans doing high-impact work. Humans asking tough questions, and it not being high-cost or highly time-consuming. 

[00:29:43] Producer: Want to see what's next for apps and generative AI and the Data Cloud? Check out BUILD–Snowflake's annual developer event. Dive into the latest innovations like Snowflake Intelligence, now GA, and explore how developers are building powerful apps, data pipelines, and machine learning workflows for the LLM era. Watch on demand at snowflake.com/build.