The Data Cloud Podcast

Revolutionizing the Future of Computing with Benoit Dageville, Co-Founder and President of Products, Snowflake

Episode Summary

This episode features an interview with Benoit Dageville, Co-Founder and President of Products at Snowflake. In this episode, he updates us on all things Snowflake, touching on the nuances of the data stack, the evolution of Snowflake's data platform, a glimpse into the early days of Snowflake, and much more.

Episode Notes

This episode features an interview with Benoit Dageville, Co-Founder and President of Products at Snowflake.

In this episode, he updates us on all things Snowflake, touching on the nuances of the data stack, the evolution of Snowflake's data platform, a glimpse into the early days of Snowflake, and much more.

--------

How you approach data will define what’s possible for your organization. Data engineers, data scientists, application developers, and a host of other data professionals who depend on the Snowflake Data Cloud continue to thrive thanks to a decade of technology breakthroughs. But that journey is only the beginning.

Attend Snowflake Summit 2023 in Las Vegas June 26-29 to learn how to access, build, and monetize data, tools, models, and applications in ways that were previously unimaginable. Enable seamless alignment and collaboration across these crucial functions in the Data Cloud to transform nearly every aspect of your organization.
Learn more and register at www.snowflake.com/summit

--------

Other than statements of historical fact, all information contained in these presentations and oral commentary (collectively, the “Materials”), including statements regarding (i) Snowflake’s business strategy and plans, (ii) Snowflake’s new and enhanced products, services, and technology offerings, including those that are under development or not generally available, (iii) market growth, trends, and competitive considerations, and (iv) the integration, interoperability, and availability of products with and on third-party platforms, are forward-looking statements. These forward-looking statements are subject to a number of risks, uncertainties, and assumptions, including those described under the heading “Risk Factors” and elsewhere in the Annual Reports on Form 10-K and the Quarterly Reports on Form 10-Q that Snowflake files with the Securities and Exchange Commission. In light of these risks, uncertainties, and assumptions, the future events and trends discussed in the Materials may not occur, and actual results could differ materially and adversely from those anticipated or implied in the forward-looking statements. As a result, you should not rely on any forward-looking statements as predictions of future events.

Any future product or roadmap information (collectively, the “Roadmap”) is intended to outline general product direction; is not a commitment, promise, or legal obligation for Snowflake to deliver any future products, features, or functionality; and is not intended to be, and shall not be deemed to be, incorporated into any contract. The actual timing of any product, feature, or functionality that is ultimately made available may be different from what is presented in the Roadmap. The Roadmap information should not be used when making a purchasing decision. In case of conflict between the information contained in the Materials and official Snowflake documentation, official Snowflake documentation should take precedence over these Materials. Further, note that Snowflake has made no determination as to whether separate fees will be charged for any future products, features, and/or functionality which may ultimately be made available. Snowflake may, in its own discretion, choose to charge separate fees for the delivery of any future products, features, and/or functionality which are ultimately made available.

© 2022 Snowflake Inc. All rights reserved. Snowflake, the Snowflake logo, and all other Snowflake product, feature and service names mentioned in the Materials are registered trademarks or trademarks of Snowflake Inc. in the United States and other countries. All other brand names or logos mentioned or used in the Materials are for identification purposes only and may be the trademarks of their respective holder(s). Snowflake may not be associated with, or be sponsored or endorsed by, any such holder(s).

Episode Transcription

Steve Hamm: So Benoit, great to talk to you again, you know, we've been talking for four years now, so I feel like we're almost old friends.

Benoit Dageville: Yes, exactly. Yeah.

Steve Hamm: Yeah, I wanted to start off today with some real historical context. You know, way back in 1990, Bill Gates gave a famous speech at COMDEX, uh, you know, the computer show in Las Vegas. He predicted that someday the computer industry would put all of the world's information at people's fingertips.

Information at people's fingertips really became kind of the guiding light, not just for Microsoft, but for the whole PC industry, you know? And when I look at today, it seems Snowflake is delivering on that promise. Talk about the big picture: where do Snowflake and the Data Cloud sit in the history of computing?

Benoit Dageville: Yeah, that's a great question. Um, yes, you are very right. I [00:01:00] really think that Snowflake delivers on that prediction, but differently, right? For, I would say, the data-driven world. And if I want to paraphrase, you know, what you just said, it's really about putting data at the fingertips of any organization.

And when I say any organization, I mean from the smallest organization to the biggest; that, somewhere, could have been our mission. And when I say any data, I really mean any data, whatever the size, you know: it can be, you know, small, a few gigabytes, up to petabyte and exabyte scale. But also any structure of data, from structured data to semi-structured data and unstructured data.

And it's also about accessing data, giving, uh, these organizations options to access data that is not only data that these organizations [00:02:00] own, but also data which resides outside the boundaries of these organizations. And that's enabled by Snowflake's, you know, data sharing capabilities, which make it really possible for organizations to connect their data sets with any other data sets on our global platform. And that, you know, really creates this data network across the globe, and we enabled that. And that's what I often refer to as the worldwide web of data. And if you think about this prediction that Gates, you know, made, it was really about, you know, the World Wide Web, and we are somewhere creating this worldwide web of data.

Um, so yeah, very much so.

Steve Hamm: Yeah, that's interesting. You know, I want to drill into that just a little bit, because you talked about, you know, structured data, which lives in traditional relational databases, semi-structured data, and now [00:03:00] unstructured data. That was the real tough nut to crack.

How did you guys address that unstructured data?

Benoit Dageville: Yes. So, um, when we started in 2012, it was really about combining what at the time was called, you know, big data and data warehousing in one system. So from day one, Snowflake was really focusing on leveraging, you know, both structured data, so tables and columns, and also semi-structured data, which is, you know, weblogs, JSON, and documents.

Uh, and that, you know, semi-structured data, you can have really literally petabytes of it, because it's generated by machines. And then there is completely unstructured data. So what is unstructured data? It's images, you know, text, any document that you can imagine. So "unstructured" is probably a [00:04:00] bad word, because it has some structure, but it's random structure, right, based on the type of these files.

So we have, you know, enabled now full support of that data, and we can, you know, process it, and that's, you know, with Snowpark, and I can talk more about that and give examples if you want at some point. But you can build a full pipeline with that unstructured data in Snowflake and process it, you know, extract value from this data, extract new information from this unstructured data.

And you move that unstructured information, now structured if you want, and put it in tables, and then query these tables. So you can directly query this unstructured data, so to speak.
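The pattern Benoit describes (extract structure from unstructured content, land it in a table, then query it) can be sketched in a few lines of plain Python. This is not Snowflake code; the documents, fields, and regex are invented purely for illustration:

```python
import re

# "Unstructured" documents: free text with latent structure inside.
docs = [
    "Invoice 1001 issued to Acme Corp for $250.00 on 2023-01-15",
    "Invoice 1002 issued to Globex for $1,300.50 on 2023-02-02",
]

def extract(doc):
    """The 'pipeline' step: pull a structured record out of free text."""
    m = re.search(r"Invoice (\d+) issued to (.+) for \$([\d,.]+) on (\S+)", doc)
    return {
        "invoice_id": int(m.group(1)),
        "customer": m.group(2),
        "amount": float(m.group(3).replace(",", "")),
        "date": m.group(4),
    }

# The extracted rows now behave like any structured table and can be queried.
table = [extract(d) for d in docs]
total = sum(row["amount"] for row in table)
print(total)  # 1550.5
```

In Snowflake the extraction step would run inside the platform (e.g. as a Snowpark function over staged files) and the rows would land in a real table, but the shape of the workflow is the same.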

Steve Hamm: No, I think that was really the new frontier for a while, and it seems like it's finally being, you know, managed in a way that is really useful for companies. Yeah. [00:05:00] Another question that has a bit of a historical perspective on it, too. You know, people talk about the modern data stack.

What does that terminology mean to you? How does Snowflake fit in? You know, my sense is that you kind of fit into a piece of it, and now you're extending into other pieces of it. So talk about that.

Benoit Dageville: Yeah, no, this is a great question. And, you know, before really answering what is the stack, I think it's critical to define, you know, what are the key attributes of the stack, right? What is important? And then you can define it. And for me, the most important aspect of the modern data stack

is really that you can have only one stack. And that sounds so obvious, but in the old days, right, organizations, and especially large enterprises, would have, you know, many, many [00:06:00] different, you know, systems to store their data and to manage their data. So within one single organization, the data would be completely siloed across, you know, these different data stack systems, these different analytical systems, between, you know, your big data systems, your warehouses, your data marts, and so on.

That's bad for the organization, because of course data will be siloed, and it will be very hard to have a 360 view of that data. So for me, the most important characteristic of this data stack is that you can use only one, and not only one for your own organization, but really one for the world, as I mentioned, because you want to connect your data sets with other data sets that exist, you know, elsewhere.

So it's very important that this data stack is unified. But it also has to have, you know, really quasi-unlimited scale, because that would be the problem, right? If you have only one and it doesn't scale, [00:07:00] of course it will not work. Uh, and you can, you know, use that stack, you know, to put all your data and run all your workloads, and it has to be elastic.

And elasticity is kind of super interesting, because even if you had, uh, a system that could scale to an unlimited scale, it would be very expensive to have the system always at its max, right? So you want something that can adapt to demand, you know, grow with demand and shrink when there is less demand, so that you really pay only for what you use.

Um, and the other aspect of this stack, uh, you know, in terms of scale, is that, as I said, it's global, right? You know, organizations are, um, mostly, you know, large organizations and multinationals, so you want this stack to be kind of distributed. Um, the second aspect, which I believe is also super critical, is that it should just work. [00:08:00]

And that sounds a little bit ironic, but most organizations that have to deal with data, they spend all their time and all their energy making that stack work: not getting value from the data, but just making it work. Uh, with, you know, all the knobs that you have to set, and tuning and repairing things and fixing stuff.

So for me, the one important aspect is that it should just work. It should be self-managed. It should be simple. You should have no knobs. It should, you know, solve complex problems, but the solving of the problem is inside, right? Its complexity is not exposed to users. And then it should be, of course, ideally, available: when you want to use it, it is there, right?

The next aspect is probably, you know, security and governance. This is like [00:09:00] really critical, especially in the modern world of the cloud, where you want to connect systems. Having this high level of security and governance is key.

And the last aspect, which is, you know, really critical too, is that this data stack should be open and collaborative. One of the most important things is, you know, to connect your data with other data, you know, interact with the rest of the world. Um, and the other aspect, of course, is the ecosystem, and having a very, you know, vibrant ecosystem of partners is very important for the stack, because this is the way, you know, your data stack can connect with the rest of the world and you can use all the tools, um, that are part of the stack.

So now, in terms of the specifics of that stack, you can think of it as a sandwich, okay? So I'm going to use this picture: imagine a big sandwich. And at the bottom of the sandwich, [00:10:00] right at the bottom of the stack, uh, of course, is the cloud infrastructure.

If you think about the attributes of the scale and having only one system, the only place where you can do that is really in the cloud. Um, and really, the modern data stack should be built, uh, from the ground up, you know, to leverage, you know, the attributes of the cloud. And obviously the cloud is a place where you have, you know, quasi-unlimited access to compute and storage resources.

So that's great. Um, also, the cloud, as I was, you know, saying, has this compute on demand, where you can, you know, assemble, you know, huge, you know, uh, farms of compute resources. And these compute resources, in [00:11:00] Snowflake, it takes literally seconds to grab, you know, a lot of compute servers, on the order of thousands of them, and you can, you know, with all these compute resources, run workloads.

And of course this accelerates the workloads, because you have a lot of resources to run these workloads.

Steve Hamm: Right.

Benoit Dageville: They run potentially a hundred times, a thousand times faster than on a non-elastic system on premise. And then, when you're done, you can release these compute resources. And what it creates is this idea that,

because you, you know, run these resources when you use them, and you don't pay anything when you don't use them, you can run fast for free. Um, and literally, we have customers who used to, you know, process, you know, their end-of-the-week, you know, workloads, so they were using the weekend.

They were processing all their data, their reports. They were generating many reports that they had to generate, actually, during the weekend. So [00:12:00] two days. And when they migrated to Snowflake, on Friday night, after six, they were running all their workloads. They were using thousands, many thousands actually, of compute resources.

Everything was running in two hours. And that was transformational for them. And even though they used a lot of resources, because they used these resources only for a very short period of time, two hours, they were paying actually far less than they used to pay on premise. So that's the bottom of the stack: it's really leveraging this cloud fabric.
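The economics behind that story can be made concrete with back-of-envelope arithmetic. The rates and cluster sizes below are invented for illustration; they are not Snowflake pricing or the customer's actual numbers:

```python
# Hypothetical hourly rate per compute node (illustrative only).
rate_per_node_hour = 2.0

# On premise: a fixed 100-node cluster must be provisioned (and paid for)
# around the clock, even though the batch only needs the weekend window.
onprem_nodes = 100
hours_per_week = 24 * 7
onprem_weekly_cost = onprem_nodes * hours_per_week * rate_per_node_hour

# Elastic cloud: burst to 2,000 nodes for just the 2 hours the job runs,
# then release everything and pay nothing for the rest of the week.
elastic_nodes = 2000
elastic_hours = 2
elastic_weekly_cost = elastic_nodes * elastic_hours * rate_per_node_hour

print(onprem_weekly_cost)   # 33600.0
print(elastic_weekly_cost)  # 8000.0
```

Even with 20x the peak hardware, the elastic run costs a fraction of the always-on cluster, which is the "run fast for free" effect Benoit describes.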

Um, the last aspect, of course, as I said, is that this is global. The stack should be global, multi-region, and the cloud is too, because you can leverage the cloud regions and, potentially, the different cloud providers that exist. So at the center of the stack, of your data stack, um, this is where you have the meat, right?

And that's the [00:13:00] data. This is the most important aspect: you have the data. And of course the data stack should be great at, you know, storing the data. And we talked about it: it's all types of data, whether structured or semi-structured, or even unstructured data. And, you know, the stack should be able to scale to multiple-petabyte data sets, or even exabytes.

And that's really critical, right? Because in the modern world, you know, data is generated by machines, and machines can generate a lot of data. Uh, so it's very important to have a system where there is no limit in terms of how much data you can store, whatever, uh, you know, the structure of this data. Uh, the last thing about that storage, the meat, is that the system, your data stack, you know, should secure this data and provide, you know, full control and governance over that data.

And [00:14:00] that's important, uh, because this is probably your most critical asset, your data. You don't want someone to exfiltrate, for example, your data. The modification of the data is also key. When you have a lot of data, it's very complex to have processes that add more data, you know, change this data, you know, transform it.

And you want to do that with transactions, such that if something fails, nothing is done, and your data is always consistent. And that's, you know, very important. Uh, and the last aspect of this data stack, so this is the top of the stack, of course, is that you, you know, find all the tools that you need, uh, to run your modern data workloads.

And in particular, your data stack should be really able to run many things, you know, talking many languages. You know, SQL is, of course, the key language to manipulate data. [00:15:00] But you want also to have data programmability on top of your stack: so Python, Java, Scala, any language, really, frankly.

And it should be done in a very secure way, right? As soon as you start to introduce, you know, languages like Python and Java, you have to worry about security, and you have to make sure that these languages, you know, are really doing what you think they should. Um, and so that data programmability at that layer is really critical for complex workloads, like machine learning or even data engineering workloads.

And finally, you know, you really want the top of your stack to really power the modern data applications. And as I started to say with the sandwich: the meat is in the middle.

Steve Hamm: Yeah, you're familiar with the old cartoon Blondie and Dagwood? Her husband would make a huge sandwich that was about a foot and a half tall.

Benoit Dageville: Exactly.

Steve Hamm: Maybe it was an [00:16:00] American thing, but maybe it was in France too. Okay. Okay. That's wonderful. Hey, just one more thing, because you talked about native Python and Java support.

That's a very new thing with your platform. What about data lake support? It seems like that's a fairly recent thing and a very significant thing. So talk about how you're doing that.

Benoit Dageville: Yeah, actually, Snowflake, I would say from day one, so 2012 is when we started, was all about combining two worlds that were completely separated at the time: the world of big data, where you could put all your data in a central place, and you could do all these workloads against that data, and you could have, you know, semi-structured data; and the world of the data warehouse, which was much more structured, where you had fast access to your data,

but, you know, more limited in [00:17:00] terms of size of data. You could never have petabyte scale, you know, in a data warehouse. And also, the data warehouse, you know, was limited to just structured data. You could not store, you know, petabyte-scale, uh, weblogs in your data warehouse. So Snowflake from day one was about, you know, combining these two worlds and creating one system, you know, in the cloud, for that.

And we didn't have, as I said, you know, support for fully unstructured data at first, but it was not foreign to the architecture of the system. It's because, in terms of workloads, we wanted first to focus on the weblogs at scale, and also structured data with transactions and all of these. So I would say, you know, the data lake was, you know, part of the fundamental architecture of Snowflake.

We didn't call it that at the time; "big data" was more the term that was used. The data lake is really the evolution of the big data and Hadoop systems.[00:18:00]

Steve Hamm: Okay. Gotcha. Now, Snowflake's data cloud, or data platform, has been evolving rapidly over the years. It seems like it's now evolving into a powerful application development environment. You've got Snowpark, you've got the acquisition of Streamlit, and you've also got the Powered by Snowflake program for partners.

So tell us how you see that working for your business partners and for application developers within it.

Benoit Dageville: Yeah. So, very good points. Uh, really, I see the Snowflake platform as being the single collaboration hub, which is empowering, you know, more and more users to have this seamless access to data. Uh, and of course the goal is to leverage this data, to get insights and also build, you know, new data products.

And these new data products are really these modern, [00:19:00] you know, data applications. And what is important is developers, they want flexibility when working with their data. Um, and they want to focus on working on the data, as I say, not fixing, you know, the underlying system where their application is running.

Uh, so they don't want to do administrative work and maintenance. Uh, um, and that's where Snowflake is really great, because we have enabled, you know, uh, people to be full-service on top of Snowflake and not have to worry about managing the data stack, as you said. Um, so Snowpark, you know, is a very important aspect of our story.

Of course, it brings what we call data programmability, uh, to the Snowflake platform. So language choice, right? It's not any more about, you know, just SQL. You can directly, you know, push Python and Java and Scala, [00:20:00] you know, directly run that, you know, code inside the platform. And of course, this is enabling, you know, many new use cases, uh, for us, um, in particular, you know, data scientists and developers, uh, they can really push all their logic, you know, build new, uh, applications in our platform.

Um, so you're right. We are really focusing on that aspect, and we have actually a program, it's called the Powered by Snowflake program, which is really designed to help the software companies which are building on top of Snowflake, and application developers at large, to build and to operate and to grow their applications on Snowflake. And the program, you know, um, Powered by Snowflake, it offers technical advice,

um, uh, it adds, you know, uh, access to support [00:21:00] engineers from Snowflake who are specialized in apps development, and actually we have, you know, joint go-to-market opportunities with these companies. And to give you just some names, the Powered by Snowflake, uh, companies include companies like BlackRock, Capital One, Warner Music Group, UiPath,

and we have many, many, many more.

Steve Hamm: Okay. In your answer, you mentioned data applications, and I think it would be really helpful to people if you could compare and contrast these new data applications with more traditional software applications that people are familiar with.

Benoit Dageville: Yeah, so data applications are really applications which are leveraging data. You know, at the center of these applications there is data, and potentially a lot of data. So, you know, these applications might leverage, you know, machine learning, for example, to give you, you know, recommendations, or they [00:22:00] can, you know, use, you know, data to, you know, forecast, for example.

So they are really very tied to the data, and they need a platform that can, you know, store a lot of data and return, you know, uh, results of these questions that the application is asking on behalf of their users, you know, really quickly. And there are a lot of data applications that you interact with every day.

Uh, and many are actually powered by Snowflake, uh, behind the scenes. Uh, when you go to California and you look at COVID reports, you know, a lot of the data that these applications expose, uh, on behalf of, uh, the government of California, is coming actually from Snowflake behind the scenes. So many of our users and customers are building applications which are really powered by Snowflake.

Um, and that's a new market, uh, and you can build, you know, this, this [00:23:00] type of application directly on Snowflake,

Steve Hamm: Yeah. Yeah, yeah, no. We talked about the different data types: structured, semi-structured, unstructured. Um, I'm just curious. I mean, I know that the Snowflake data platform really makes all these kinds of data very accessible, and queries very quick, and all this kind of stuff.

But I just wonder, for data scientists and data analysts, are they now able to mix data types in a query, in one particular query, to give them kind of a 360-degree view of the situation: different kinds of information, whether it be an image, text, numbers in columns? Can they really kind of blend those all together, or do they have to do different searches for the different data types?

Benoit Dageville: Yeah, absolutely. I mean, that's the magic, you can, and that's the beauty. So, as I said, I mean, just as an introduction [00:24:00] to that: we can support in Snowflake, directly, you know, storing all types of data, you know, unstructured, semi-structured, or structured. But, of course, it would be useless to store that data if there was no way to process that data and query that data, as you said. And the beauty of Snowflake is that, by combining all these data in one system, you can query, and potentially have only a single query, that mixes all these data together.

And maybe to make it a little bit more concrete, uh, let me give you a very quick example. I'm going to try. So imagine that you can build, for example, a full data pipeline. You know, so you have, let's say you're a company or an organization that has a set of images which are coming in, you know, real time; they are uploaded by the users of this company. And on these images, as soon as an image

is uploaded, you know, there is a [00:25:00] pipeline which is triggered, you know, to extract some information from these images and to make this information available for query. So you can really build this: the data engineer of this company can configure directly in Snowflake this type of pipeline, and make sure that, as soon as an image, you know, lands in our storage, a Snowflake task will be immediately executed to process the new images that have landed. And that task, for example, can use Snowpark, and maybe, let's say, Java in Snowpark, to process these new images. And, you know, you can link with any, you know, open-source, you know, Java library, or any code that you want, to, you know, read these images and extract, uh, some metadata, for example, any signal that you want to extract.

You could even use machine learning for that, and extract the signal from this image. So, for example, let's say [00:26:00] that one piece of information that you extract from the image is the geolocation where this image was taken. And so you can use, um, that Java program to extract this information, and, as I said, many more things, and you can push, you know, in your pipeline, you know, that information and store it in a regular table. And probably that information is semi-structured,

because, you know, depending on the image, you might have very different types of information that you extract. But all of these can be stored, you know, in a regular table, uh, as, uh, you know, a semi-structured data type, uh, including the geolocation. We have support for geospatial data, uh, that can be stored also in that table, along with the reference to

the image that you just processed. So imagine a table where you have as many rows of information as images that you have, you know, in your Snowflake storage, [00:27:00] where these images are stored. And now this table, you can directly, you know, query from Snowflake. Um, you can also

ask, uh, the Snowflake search optimization service to index, you know, so to speak, all the data in the table, such that, when you query this data, the semi-structured data, it is really, really fast. Uh, if you want to plug in, you know, a modern application, you know, and you want to have an application querying these tables, this data, and showing it, you can do it very, very quickly that way.

Uh, but the beauty, too, is, what you can do is join that data now with, you know, data that you have in the marketplace, you know, data sets that match, you know, these geolocation coordinates that you extracted. You can connect that with data sets which are positioning, uh, these data on, you know, the map.

And, for example, you [00:28:00] can now build a fancy, easy application. And as you mentioned the Streamlit acquisition, so you can build that, you know, potentially with Streamlit, uh, an application where you, you know, query some location where you want to see images. And that, you know, application will run a query, directly in SQL at that point.

You don't need, you know, more than SQL. It connects, you know, joins this data with this data from the marketplace, you know, finds, you know, a subset of images which are relevant to your query, and returns that to the application. And when we return the image, we return just a reference, a secure reference, to that image.

And the application then can use this, you know, secure reference to fetch back this image from our storage and display it, you know, for example, in your web browser. And all of this, all this pipeline and this application, is really literally, you know, a few lines of [00:29:00] code. So that's the true magic of a unified service: it is really not two independent solutions glued together.

It is a single system where we connect all these data types, all these features together, so as to make it really easy to build a fully end-to-end application, from the data engineering pipeline to the application itself. And that's really the vision.

Steve Hamm: Yeah. Yeah. You know, you used the term end-to-end. When we think about all the developments we've discussed so far in our conversation today, um, it seems like there's really a transformation going on, enabled by the platform, in how cloud applications are built, how they're deployed, how they're promoted, sold, and transacted. All of that is done using one system.

So talk about that. What does that do for businesses when they can do all [00:30:00] those things?

Benoit Dageville: I mean, this is really Snowflake native. As I said, there are brand new applications which are being developed on top of this platform, because it's so easy, because you don't need to stitch many solutions together, and you can do it really end to end: from the time you build the application, you deploy it, you promote it,

you sell it. We even have monetization in our marketplace, so you can do the entire workflow directly in the platform. And the nice thing with Snowflake is that it scales. So if your application is really successful and you have more and more users of your application, the system will scale with these users, and you don't need to worry about running out of scale. At the same time, you don't need to over-provision

resources, because you really pay for what you use in Snowflake. So you are able to have [00:31:00] a tiny start-up with a very small footprint leveraging this power, giving that power to their users, and building on top of Snowflake. And they can start and grow with their user base.

Steve Hamm: Yeah. Yeah. Now, you've talked about a wide array of different workloads. What are the coolest new workloads that you're seeing, and where do we go from here? What comes next?

Benoit Dageville: This is a great question. It's interesting to see the story and how we grew Snowflake from when we started in 2012. As I said, when we started, it was all about combining and unifying data warehousing and big data in a single system,

and also leveraging the power of the cloud to be really this cloud-native system which supports all these workloads and can [00:32:00] scale really on demand, with this instant elasticity. And of course, Snowflake is a fully managed service. But as we progressed, around, I would say, 2016, we really developed this model of the data cloud, what we now call the data cloud, making Snowflake really a super cloud,

which is superposed on top of the underlying cloud platforms, and where data collaboration is really at the center of our system. This is the marketplace, the data marketplace, and how you can connect, as I said, your data sets with your customers' data, with providers' data, with public data sets.

And we also started to focus on these Powered by Snowflake application developer, data engineering, and data science workloads that we wanted to natively run. [00:33:00] But you're right that somehow the best is really in front of us, and really, we have only been scratching the surface of what is possible today.

So I'm very excited about the future. And what the future is, to me, is running more and more of the application logic inside Snowflake, up to the point where 100% of any application can run in our cloud. That's the vision. And of course, we are not there yet. There are fundamental building blocks that we need to add,

and we are doing a lot. This year is going to be one of those big years, I would say, in the Snowflake story. And you should come, Steve, you should really come to the Snowflake Summit in June 2022; it's in Las Vegas. And we have so many announcements to make that will paint this future [00:34:00] of Snowflake as really the best place to run complex data applications.

Steve Hamm: Yeah. Well, thank you for the invitation. I'll either be there or I'll watch it on video, because that's typically what I do these days. You get so trained by Zoom, you know?

Benoit Dageville: That's true.

Steve Hamm: No, that's cool. Now, you know, when I think about Snowflake, and I think of all these different workloads and all these different uses of data management systems, I think of Snowflake as kind of the Swiss Army knife of this field, of this domain.

Why is it important for customers to be able to tap one platform for a wide variety of uses, rather than different platforms or engines for each different use?

Benoit Dageville: Yeah, this is a great, great question, and I am asked that question often. Because there is really a philosophy of the Swiss Army knife: you have many tools in your [00:35:00] toolbox, right? Then you can stitch together a complex solution using all these tools. And Snowflake is really different.

What we wanted to do is really to have one single platform where you could run these end-to-end workloads within the same platform, without having to stitch together a global solution. And why is it better? It's better for, I would say, two main reasons. Maybe three main reasons. The first one is simplicity.

When you have to connect different systems, you have to learn each of the systems; they each have a different way to be administered, a different usage model. So the complexity is really high, because you have to learn all of these things. And it's really not a single system at the end.

And you have to see, people think, okay, I will only do machine learning, why would I need data pipelines? And it happens that when you [00:36:00] are doing machine learning, you need to prepare the data, and you need data pipelines. So you are going to use more than one tool at a time, actually probably most of these tools. So connecting them is complicated and error-prone.

The other aspect is efficiency, right? When you have to connect tools, you have to move data from one tool to the other, and this is very inefficient, potentially transforming the data as you move from one representation to another, because this tool wants the data in that format. And so, having this pipeline across tools is very slow, and

And, and, and so, so having this, this pipeline, you know, cross tool is very slow and the. You cannot get this retirement, you know? Uh Muldaur and the, and the last aspect, which is also very critical is security and governance. If you start to copy data and move data, you lose, you know, the security [00:37:00] aspects of the system that owns, you know, and protect this data.

As I said, this is the middle of the sandwich. And also, you have no governance of that data, because you would need all these tools to understand this governance, which is not the case. So that makes the system very complex, very slow, and hard to secure.

Steve Hamm: Yeah. Yeah. Now, Snowflake has had a vertical industry strategy for years, but for most of this time, it's been mainly a sales strategy, how you approach the market. But I understand that it's been shifting pretty significantly in recent times. So what technologies and technology partnerships are you adding to address the needs of particular industries?

Benoit Dageville: Yeah, this is a great point. So first, if you think about the different verticals, like finance, media, or [00:38:00] retail, they have very different ways of using data, and they have different needs in terms of the data sets that they need to use. So, as I said, Snowflake is not

an empty platform. When you start to use Snowflake, you have a lot of data at your fingertips that you can use and leverage, and this data needs to be customized a lot for these different vertical industries. So there is a lot of work that goes into

building our marketplace such that we have data sets for each of these verticals, and the tools, too. For example, for retail, and in media and advertising, you have this concept of clean rooms, so we are building dedicated [00:39:00] software for that particular vertical. So having a deep understanding on our side of these different verticals is really critical, so that we can talk their language and provide the value that they expect, versus providing a generic platform that they have to adapt to their particular business.

And I can give you some examples of companies who are using Snowflake in that way. A good one is Disney. Disney is on the media and advertising side, right? Disney wanted to deliver really highly personalized content, and Disney has built an impressive clean room solution, as I was mentioning, for its clients.

And this was built in conjunction with Snowflake, right, with this clean room [00:40:00] technology. And with the Disney Advertising Sales clean room, they provide their customers advanced pre-planning insights, activation, and cross-portfolio measurement, in a brand-safe way.

So having all these features, which are very verticalized, is really important.

Steve Hamm: Yeah. Hey, explain clean room. I'm sorry, explain clean room; people may not understand that. Isn't it where two companies can share parts of their data? Is that the way it works?

Benoit Dageville: Well, yes and no. Of course, Snowflake data sharing is about sharing data. But I don't want to share all my data with you; I want to share it in a controlled way, if you want. I just want to share the data that intersects with your data, right?

You have customers, and [00:41:00] I have customers. I know a lot about my customers; you know a lot about your customers. So if we put our data together, we will know more. But I don't want to expose any information about the customers you don't have, and you don't want to expose any information about the customers that I don't have.

So we need to intersect our data sets in a secure way. This is what the clean room does for us, with the guarantee that we are also not going to expose personal information about each of our customers, right? It needs to be done in a governed, secure way. And this is the technology that we provide.

We provide this, if you want, DMZ zone where you can connect these two data sets together, and on the provider side and on the consumer side, you can see only [00:42:00] what you are allowed to see. And this is governed by the clean room.
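The intersection idea Benoit describes can be sketched with a salted-hash join: each party tokenizes its customer identifiers so only opaque tokens cross the "DMZ," and each side can resolve only the customers it already knows. This is a toy Python illustration of the concept, not Snowflake's actual clean room implementation; the shared salt and all identifiers here are invented, and real clean rooms use stronger techniques and platform-level governance.

```python
import hashlib

# Assumption for the sketch: both parties agreed on this salt out of band.
SHARED_SALT = b"clean-room-demo-salt"

def tokenize(customers: set[str]) -> dict[str, str]:
    """Map each raw identifier to an opaque token only its owner can reverse."""
    return {hashlib.sha256(SHARED_SALT + c.encode()).hexdigest(): c
            for c in customers}

provider = {"alice@example.com", "bob@example.com", "carol@example.com"}
consumer = {"bob@example.com", "dave@example.com"}

provider_tokens = tokenize(provider)
consumer_tokens = tokenize(consumer)

# Only tokens are compared: the overlap is computed on opaque values, so
# neither side learns anything about customers it does not already have.
overlap_tokens = provider_tokens.keys() & consumer_tokens.keys()
overlap = {provider_tokens[t] for t in overlap_tokens}

print(sorted(overlap))  # ['bob@example.com']
```

Each party ends up knowing only which of its own customers appear on both sides, which is exactly the "see only what you are allowed to see" guarantee described above.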

Steve Hamm: Yeah, I got that. So, Benoit, we're going to wrap up now with one last question, or one last line of questioning. You mentioned several times 2012; that's when you launched the company. It's been a decade, which is remarkable, and it probably seems like even longer than that to you.

But looking back, how does today compare with the early days of Snowflake? I mean, did you ever imagine Snowflake would be where it is today? Any surprises?

Benoit Dageville: I mean, for sure, I would never have imagined that. But every year is different, and this is what I love in these last 10 years: every year, I think, we have completely reinvented Snowflake. And it's true that at the beginning, I mean, even today, at the beginning of this year, I have no idea what the end of that [00:43:00] year will look like.

Right? We have so many things we are going to do, so at most I can see ahead of one year. And for me, this is really about reinventing and expanding this data cloud that we are building. It's really about that. And I don't want to think 10 years ahead; I never thought 10 years ahead.

And also, what is important is that when we started Snowflake, we never wanted to build a business. So sometimes people say, oh, did you imagine how successful you would be? And I never imagined that, because I really never cared about that. I wanted to build an amazing product. We wanted to build it with Thierry, actually, and really,

what we wanted was really to reinvent analytics in the modern world of the cloud. I mean, your first question about what the modern data stack is, is really [00:44:00] very accurate, right? What is it? How should you do things differently with the cloud? And for us, when we realized, 10 years ago, that the cloud could completely change the way analytic systems would work,

we knew that this is not a small revolution. It's completely reinventing the system from the ground up, and who knows what will happen at the end. I don't know when we'll reach the end of that journey, but it will be so different from what it is today. So at the same time, it's 10 years now, but I have the impression every year, every year, that we are just starting.

And this is, I have to say, so exciting, because we are touching now the heart of what an analytic system is: the applications that are supported by the system, right? That's where we want to be. And we have just scratched the surface. [00:45:00]

Steve Hamm: Yeah. Well, that's very cool. I love what you said about how you're transforming the company every year, and what that means is continuous transformation. And I...

Benoit Dageville: Yes, yes. And it is very important to say that, right? We are enhancing and enlarging the scope of what we do, but we always build, I mean, we continue to build on what we started 10 years ago.

Steve Hamm: Yeah. Well, that's great. That's inspiring. Benoit, once again, it's so great to talk to you, and I absolutely look forward to seeing what you guys are going to unveil in June.

Benoit Dageville: Yes, come to Las Vegas and Steve, thank you so much. Bye bye.