The Data Cloud Podcast

Cash Back on Your Data Stack with Mark Stange-Tregear, VP of Analytics at Rakuten Rewards

Episode Summary

This episode features an interview with Mark Stange-Tregear, Vice President of Analytics at Rakuten Rewards. Mark was previously the Director of Analytics at Ebates and the Director of Analytics Services at RealPage, Inc. In this episode, Mark talks about how to successfully communicate with both merchants and consumers, the nuances of analyzing consumer data, the future of cloud data analytics, and much more.

Episode Notes

--------

How you approach data will define what’s possible for your organization. Data engineers, data scientists, application developers, and a host of other data professionals who depend on the Snowflake Data Cloud continue to thrive thanks to a decade of technology breakthroughs. But that journey is only the beginning.

Attend Snowflake Summit 2023 in Las Vegas June 26-29 to learn how to access, build, and monetize data, tools, models, and applications in ways that were previously unimaginable. Enable seamless alignment and collaboration across these crucial functions in the Data Cloud to transform nearly every aspect of your organization.
Learn more and register at www.snowflake.com/summit

Episode Transcription


 

Steve Hamm: [00:00:00] So, Mark, it's great to talk to you today. Welcome to the show.

Mark Stange-Tregear: [00:00:04] Thank you for having me.

Steve Hamm: [00:00:05] Rakuten Rewards is a cash back and shopping rewards company, as I understand it. Can you begin by explaining how the business works, including the affiliate networks, and also tell us about the competitive dynamics that Rakuten Rewards faces?

Mark Stange-Tregear: [00:00:23] Yeah, absolutely.

The company was formerly called Ebates. We rebranded last year. Ebates has been around for a long, long time; we started back in the late nineties.

We were acquired in 2015 by Rakuten, which is a very large Japanese company. I've been with the company since 2014. I was brought on board to work on business intelligence systems and analytics, and I've basically been doing that for the company ever since, through various tech iterations and various changes and growth spurts within the company.

So, as you mentioned, the core product that we offer is cash back on online purchases. We are a membership service. You sign up, you have a Rakuten Rewards account, and then when you go and shop at the various places that you would normally shop online, depending on how much you spend, you'll get a percentage of what you spend back.

There's no great mystery to this. In terms of the business model, it can seem a little bit magic: I go and spend a couple of hundred dollars at Best Buy or Macy's and I'm getting $10 back. How can Rakuten Rewards afford to do that? It isn't mysterious. It isn't magical. But it is a pretty beneficial model where, effectively, we get paid by these companies for member or buyer acquisition.

And then we just hand some of that money back to the consumer in the form of cash back. There's been a lot of cash back paid out. We're rapidly heading into the billions of dollars of cash back sent out to our members over the lifetime of the company.

And that was billion with a B, not million with an M. We go through millions routinely. So I think we have a very loyal, very dedicated customer base, and I think we've earned a lot of money for them over the years, as well as being a pretty beneficial program for our merchant partners, because the incentives that we get are based on what people buy.

So it isn't a defined advertising fee, and it really does work for everyone involved.
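The arithmetic behind that model is simple enough to sketch. The example below is illustrative only: the 5% cashback rate matches the $10-on-$200 purchase mentioned in the conversation, but the 8% merchant commission is an invented figure, and `split_commission` is a hypothetical helper rather than anything Rakuten Rewards has described.

```python
from decimal import Decimal


def split_commission(purchase, commission_rate, cashback_rate):
    """Split a merchant-paid commission into member cash back and retained revenue.

    All three arguments are Decimals; both rates are fractions of the purchase.
    Returns (commission, cashback, retained).
    """
    commission = purchase * commission_rate  # what the merchant pays for the sale
    cashback = purchase * cashback_rate      # what is handed back to the member
    retained = commission - cashback         # what the rewards company keeps
    return commission, cashback, retained


# Assumed numbers: a $200 purchase, 8% commission (invented), 5% cash back.
commission, cashback, retained = split_commission(
    Decimal("200"), Decimal("0.08"), Decimal("0.05")
)
print(f"commission={commission} cashback={cashback} retained={retained}")
```

Because the commission is a percentage of what people actually buy, the merchant pays nothing when no sale happens, which is the point made above about it not being a fixed advertising fee.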

Steve Hamm: [00:02:39] So how many affiliates are there? And what's a typical cashback percentage that somebody gets?

Mark Stange-Tregear: [00:02:46] Yeah, that's a good question. If you go onto Rakuten Rewards, you'll see something like 3,000 partners on there, varying from very big-name brands, your Macy's, your Best Buys, through various other, maybe slightly smaller brands that are still very visible but a bit more niche. I think about Saks or TigerDirect computing.

We also work a lot with travel, and we work with a lot of small businesses as well, brands that are much more niche or don't have the presence of some of the major merchants in the space. So we try to span pretty much all of it, and I think we do a pretty good job of that. We work with the majority of the major online retailers in the US, as well as a very large number of smaller ones.

Cashback rates can be highly variable. It can be anything from less than 1% in some situations, although that's relatively unusual, all the way up to blockbuster deals where it's 20 or 25 percent, that kind of thing. A typical range would be 5 to 10 percent on any given day.

Usually, unless you're out buying a new car or something, the amount of cash that you earn on one transaction probably isn't life changing. The real impact comes when you use this repeatedly over time and you suddenly realize just how much you're spending online and how much you're getting back.

I mean, we routinely issue checks in the hundreds of dollars to regular consumers.

Steve Hamm: [00:04:31] Okay. How many customers do you have?

Mark Stange-Tregear: [00:04:35] In terms of active user base, the number we tend to focus on is certainly in the millions. It depends a little bit on how active you want to count. We have a couple of million very dedicated users.

You can probably count up to more like 10 million reasonably active members. And then we've got a long tail of people who've used us occasionally.

Steve Hamm: [00:04:58] Yeah. And how about the competitive dynamics of the industry? I mean, do you face a lot of competition?

Mark Stange-Tregear: [00:05:05] I think it's a competitive space and it's also a space where there's a lot of innovation.

Like I said before, there's a strong value proposition on both the B2B side and the B2C side in our industry, and that makes for a fairly fruitful environment where businesses can grow and flourish and try to find a slightly different edge. So there are competitors in the space.

I think Rakuten Rewards has been around for a long, long time and has a very substantial slice of market share. We also own ShopStyle, which is a similar kind of value proposition but more focused on the high-end fashion space. There are other competitors out there, there are competitive products, and clearly, like any good business, we keep an eye on them.

We watch what they're up to, as I'm sure they watch what we're up to. A lot of this, though, is about the relationship with the other businesses we work with, the brand names that we work with, and also with our consumers, and having those conversations. With some merchants, we've been working with them for 20 years.

And some of our consumers have been around for 20 years. I think we really focus as a company on that relationship building directly on the B2B side of things, but that really is also the way that we think about our communication with our members. We don't tend to think of them as just sort of guest buyers who we may never see again.

We really do think about them as members, and we try to understand what they want and need from our platform and what would support them in the long run.

Steve Hamm: [00:06:47] Now, we're speaking in the middle of the COVID-19 crisis, and retailing is in a chaotic situation.

A lot of brick-and-mortar retailing is collapsing, and there's really highly variable demand for online retailing. It's not just a steady thing. At the same time, you mentioned a minute ago that travel is one of your big customers. So you're in the middle of a maelstrom here, and I want to know: how is the situation impacting your business, and how is the company using data and analytics to deal with that?

Mark Stange-Tregear: [00:07:24] Yeah, I wouldn't disagree with your use of the term chaotic there. I think that is it. Clearly this is unprecedented, and you're right to call it out. We do work with a lot of travel merchants, and travel is clearly not something that people are doing in the same volumes right now.

We are still seeing some business through that channel, but clearly it was impacted. In other spaces we're seeing different dynamics as people shift shopping that they may traditionally have done in person to online. So, yeah, without question.

It's a complicated time. It's a chaotic time. It is impacting us in a variety of different ways, some negative, some positive. To get to the meat of your question about the data and analytics, I think for me the key point, and what I've always tried to do within business intelligence and really had as the core of what my team and I are trying to build at Rakuten Rewards, is visibility and insight.

And I think that is really being drawn to the fore right now as the situation gets more chaotic. In the normal run of business, day to day, you kind of know what's going to happen, right? We're a big enough business that there aren't usually massive fluctuations on a day-to-day basis.

You get into a situation like this, where there are a lot of variables, a lot of different factors going on, and a lot of parts of the business moving in different directions, and it really calls attention to the fact that you need good visibility into what's happening. For a company that drives itself on its data, as we do, what that means is that you need fast, reliable, easy access to information about what is happening.

And I think that's what we've been able to do. We are watching the dynamics between the different retail verticals on a daily basis. We're watching the dynamics between different parts of the country on a daily basis. We're watching the shifts as people are at home more.

Are they using their mobile devices or mobile apps less and maybe focusing more on desktop applications again? Are people taking longer to make decisions because they've just got more time on their hands? Or are we seeing more of what look like very quick purchase decisions because, I don't know, maybe people with small kids or something have got less time on their hands all of a sudden?

For me, the key here is to have that clean, reliable, up-to-the-minute data so that you can understand what's happening and then make plans to respond as appropriately as possible.

I know as well that in working with our merchant partners in particular, our B2B relationships, being able to understand the dynamics, and being able to talk to them about what we're seeing, what they're seeing, and how we can work together to make sure that the consumer experience is good, has really come into play.

And again, just because of the way our company operates, that has been largely focused on understanding the numbers and the data and making sure that we're fully understanding the situation and the dynamics, and that we don't have to go back and ask questions about the quality of the data, or wait too long to get the insights that we need.

Steve Hamm: [00:11:05] Yeah, it seems like your company is in a really enviable position because you have those close relationships with your affiliates, with the retailers and other businesses on one end, and you can collaborate with them and compare notes with them.

And on the other end, it's a membership organization, so you know the customer. So you kind of have this wonderful view of both sides of your market. How are you able to take advantage of that?

Mark Stange-Tregear: [00:11:34] I like the use of the word enviable. Sometimes it's easy to feel a little bit of pressure there.

We try to maintain very strong relationships in two different directions, and that can be difficult. But I feel comfortable speaking for myself, at least, when I say I don't really see a downside to it. It may be hard work sometimes to make sure that you're communicating appropriately to all of your consumer base and to all of the merchants, but I think it does put us in a pretty good position to see those market dynamics.

I think it does mean that we're able to respond and to have intelligent conversations with the merchants about what we see happening in the marketplace, what would really work for them and, by extension, for our membership base in terms of offers or other deals that they want to put out there, and where they can potentially help in this situation as well, by offering a little bit more incentive or by curating the experience a little bit more.

It's not easy to have that volume of conversations and to think about how that works in such a chaotic situation, but I do think that's a key goal for us and something that we've been working hard on.

Steve Hamm: [00:12:53] That makes total sense. Now, you're the vice president of analytics. Could you describe what your day-to-day job is like and what main challenges you're facing, not just during COVID but in general?

Mark Stange-Tregear: [00:13:10] Yeah, absolutely. And I think my role has adapted and changed over the years as the company's grown.

We went through very rapid growth, and we're still growing pretty rapidly, so I think it's an ever-evolving situation. In my day-to-day job, I primarily serve two main functions with my team. The first is to provide expert-level understanding of the data that's available and how to use the systems that we have within the company, while also offering analytical insight across what you would think of as traditional business analytics, product analytics, database analytics, and data science.

That means working directly with business groups around the company, everyone from sales and marketing to finance, product, audit, and everyone in between, to really allow the company to get hold of the data that it needs and to make sure that it's quality data.

So that's one side of it: almost like an analytics service organization. The other big part of my role, and this is not a role that you'll hear described a lot in the industry, is as the product manager for the business intelligence solutions at an enterprise level. We run a combined team of analysts and engineers.

I have a peer, Joji John, who leads the engineering side of things, but this isn't a traditional setup with business analysts sitting within the business teams and then an IT group running the business intelligence tools. We run that all as a combined team, and we have done since I joined the company. That puts me and the analytics side of the team effectively in the role of product managers.

What I mean by that is that we do a lot of the upfront work thinking about what our systems can do. What can our systems not do? Where are they working? Where are they not working? Where do we need data that we don't have? Where do we need to do some data modeling or cleanup that we haven't done previously?

It provides a pretty advantageous model. I've talked to many people over the years about this and had it confirmed that this is a pretty beneficial model, because what it does is close the gap between IT and what the business really needs. The engineering team aren't having to sit in business conversations with relatively fuzzy requirements; that is translated to them through my team, to a greater or lesser extent.

Likewise, the IT and engineering team aren't out there building things with the hope that someone will adopt them, because we're effectively working with them to help determine that pipeline so that they only build stuff that we know is going to be used and needed. So it means that, for really quite a small team, actually, we can have a lot of impact within the organization.

Like I said, I've had conversations with other BI leaders and BI teams over the years, and I think our team, as a proportion of the business size, is significantly smaller than a lot of business intelligence teams. And yet, because it's consolidated, I think we've been able to keep our impact pretty material within the company, and we still operate enterprise-level business intelligence systems without having to scale the team too much more.

I think it really does offer an interesting perspective for the analysts and the engineers working on this, and it really does form a more productive working relationship over time.

Steve Hamm: [00:17:10] Yeah. That's really a cool model.

Now, you're a deeply technical person. Take us back in time. When and why in your life did you first become interested in technology? And then, more recently, what are the main data analytics shifts that you've seen and been involved in?

Mark Stange-Tregear: [00:17:30] Yeah. I guess I am relatively technical. As for "deeply," I hope I live up to that. I wouldn't like to try and claim it, but yeah, I am interested in technology.

When did I get interested in technology? You'd have to go a while back. I remember growing up in the eighties and having the Amstrad 128K machines and typing in some BASIC programming language to make little Space Invaders games.

And I was that kid. So I guess the answer in part goes way, way, way back into my childhood. More relevantly, I went through science degrees and then branched into philosophy degrees, interestingly enough. And then ultimately, when I went into the workplace, I went into marketing and gradually moved into marketing analytics and marketing decision making, and then just shifted more and more into the business intelligence space over time.

I guess the first time I was focused wholly on analytics was when I was working for a startup in Boston called WorldWinner, which was acquired by the Game Show Network while I was there. This is back in 2008, 2009. We were working with BusinessObjects and Crystal Reports and those sorts of technologies back then, and I guess that would really be the first iteration.

The point where I really got the reins of the systems and was in a position to make decisions started when I moved to Ebates around 2014, and we've gone through what I think of as three stages of evolution there since I joined.

When I joined the company, we had been running off of SQL Server machines, which I think is a very standard paradigm. We had Postgres databases in production. We copied the data over into SQL Server, did a little bit of cleanup and ETL processing, and had SQL Server Analysis Services cubes sitting on top, which we fed into Excel.

And we were running on those single SQL Server machines. As I was coming on board, that system was starting to kind of creak at the edges a little bit. Our needs had gone beyond what the biggest SQL Server machine could handle at the time, and we were starting to do what I think a lot of people do, which is splintering the hardware.

So we were now running multiple different SQL Server machines, trying to keep the different types of workloads separate and trying to keep the data in sync. And we were starting to see issues with that. It's fine until you need to blend the data back together, and then it's kind of a pain.

Or we replicated this ETL change onto these two machines but forgot it on this third one. Or we forgot that there are these special keys on this third one and it's not working. Or the third machine is smaller than the other two, so code that runs on two of them doesn't run on the third one.

And I think anyone that's tried to manage that kind of multi-instance, single-machine implementation for a meaningful business intelligence stack knows what I'm talking about. It gets pretty painful pretty quickly. At a certain point, we decided that we were going to shift into a clustered environment, and we looked at the standard players in that space at the time, including Teradata.

We ultimately decided, though, that we were going to shift into Hadoop. If you go out and Google my name, you'll still probably find quite a lot of content out there with my name plastered on articles about how we shifted Ebates onto Hadoop. And we did.

I don't know how many companies actually managed to go that entire route, but we did. We shifted all of our business intelligence activities onto Hadoop at an enterprise level. We were serving data lakes, data marts, data warehouse, and reporting, as well as various other ETL functions, off our Hadoop cluster.

And we had it kind of working, honestly, and that really became sort of our phase two. The difficulty we ran into there was just maintaining that architecture in the long run. Hadoop was a critical phase in the evolution of business intelligence, and in being able to scale business intelligence platforms up to modern data volumes and modern data uses.

But I'll speak just for our company: as a medium-sized company, we were finding that the overhead of trying to buy enough servers, manage those servers, manage the software layer on top, and manage concurrency issues and hard drives and switches and so on was becoming a real distraction from our core focus, which was to get the right data to the right people at the right time.

That's always been our core mantra, and we were finding more and more that we were just becoming hardware diagnostic teams, and that really wasn't where we wanted to be.

So we did start looking for alternatives. The Rakuten Rewards organization at the time was also looking to shift into the cloud. So we started analyzing the big cloud vendor platforms for business intelligence and ran side-by-side testing on the major players in that space, and ultimately ended up deciding to move into Snowflake.

That was back in 2018. We did a sort of full-scale POC, decided we liked what we saw, and shifted the rest of our technology in 2019, meaning that by the end of 2019 we were fully off Hadoop and into what I think of as phase three of this journey, which is entirely on Snowflake.

Another way of putting that, I guess, is that you've got that traditional shift from single-point machines into a clustered architecture and then into a cloud clustered architecture. The bit where we may be, I'm not saying unique, but maybe a bit unusual is that we actually did complete the migration completely at each of those phases.

We were completely on Hadoop and completely out of SQL Server, and now we're completely on Snowflake, and we have no single-instance or on-premise stuff like that.

Steve Hamm: [00:24:30] That's to be congratulated. I mean, you hear stories about companies launching these huge initiatives and ultimately failing.

It's just amazing.

Mark Stange-Tregear: [00:24:41] Yeah. And I would be remiss if I didn't call out that the engineering team I work with is phenomenal and really gets the majority of the credit for making it possible for us to do that. Sure, there was work on the analytics side and a lot of rebuilding to do, but really the engineers have to get the credit for making that possible.

Steve Hamm: [00:25:06] Hey, you talked about the proof of concept that you did with Snowflake early on. Could you go into some detail on that? What were you trying to accomplish, and what kind of benefits did you get?

Mark Stange-Tregear: [00:25:18] Yeah, I certainly can. I think the core of the answer to that question goes back to where we were struggling with Hadoop, so let me start there.

We did get our enterprise platform on Hadoop. It was operating, it was just about working, but as we made more data available through Hadoop and threw more workload at the machines, we were just running out of CPU and memory. And ultimately, even though there are some ways to manage CPU or memory, we then started hitting the point where we couldn't get the data off the hard drives fast enough to keep up with demand.

That means there are a couple of different ways you can go. You can buy new stacks of servers, which is a material capital expense and typically involves some pretty lengthy implementation times. Or you can start trying to cut or constrain workloads. Or you can start splintering workloads and effectively go back to the old SQL Server days, where you're having to re-blend data or you're managing multiple platforms.

So the issue we were primarily dealing with was scalability, fluid scalability, and concurrency of load. I think the other pressing issue was that we were really stuck in this "do we overbuy or do we underbuy?" question. We're a relatively spiky business: as a commerce company, Q4 is much, much bigger for us than the other quarters of the year.

So we had to effectively buy the hardware to cover peak loads, but then for a good chunk of the time that was relatively inefficient. We had to try and balance in between, which left us in a situation where we weren't super satisfied with either the low-time performance or the high-time performance.

So as we were looking at a migration to the cloud, we wanted to see a couple of key things, and this is how we evaluated the different platforms that we looked at. The platforms that we looked at are the players you would imagine in this space: Redshift, BigQuery, Snowflake, Azure, et cetera.

The key questions that we came down to were: one, can we get the data into this platform fast enough that we can get close to real-time analytics back out without having to jump through too many hoops? It's relatively important for us that, at least in some areas, we can see what's happening today, what's happening right now, or five minutes ago at least. So question one: can we get the data in fast enough?

Question two for us really was around concurrency. I've got hundreds and hundreds of users using these platforms extensively all day. We're a business that runs on data, so for the size of company, I have very high concurrency demands and, honestly, pretty high expectations from my end users about what the performance should look like. People get bored after about five seconds, in my experience. So unless the system could viably handle multiple hundreds of people concurrently with average query times in the range of a couple of seconds, it wasn't really going to work for us.

Another key point is that we wanted a platform where we didn't have to maintain separate data lakes, data warehouses, and then reporting layers. I've personally always struggled with that architecture, and with having to put different data sets on different technologies just to do different parts of what I think of as analytics. So we really wanted a solution that could give us all three in one.

We also wanted a solution that was really quite fully featured when it comes to SQL. Our main interface language with the data stores is SQL, and we wanted it to be as fully featured as possible and to do a lot of things by default.

And ultimately, by the time you put those conditions out there, we whittled down the competitors, and the one that really worked best for us was Snowflake. I'm not saying that would be everyone's decision, but that's where we got to by evaluating along those conditions.
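The overbuy-or-underbuy question raised earlier in this answer can be made concrete with a little arithmetic. The sketch below is illustrative only: the quarterly load figures are invented, not Rakuten Rewards data, and `fixed_vs_elastic` is a hypothetical helper. It compares capacity that is sized for the peak quarter all year with capacity that simply follows demand.

```python
def fixed_vs_elastic(quarterly_load):
    """Compare peak-sized fixed capacity with capacity that tracks demand.

    quarterly_load: capacity units needed in each quarter (hypothetical numbers).
    Returns (fixed_total, elastic_total, utilization_of_fixed).
    """
    peak = max(quarterly_load)
    fixed_total = peak * len(quarterly_load)   # pay for peak capacity every quarter
    elastic_total = sum(quarterly_load)        # pay only for what each quarter needs
    utilization = elastic_total / fixed_total  # share of the fixed capacity actually used
    return fixed_total, elastic_total, utilization


# Invented example: Q4 needs roughly twice the capacity of the other quarters.
load = [40, 45, 50, 100]
fixed_total, elastic_total, utilization = fixed_vs_elastic(load)
print(f"fixed={fixed_total} elastic={elastic_total} utilization={utilization:.0%}")
```

With a spiky profile like this, a large share of the peak-sized hardware sits idle for three quarters of the year, while sizing below the peak hurts performance exactly when demand is highest.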

Steve Hamm: [00:29:49] Are you using it very broadly now throughout the organization?

Mark Stange-Tregear: [00:29:54] Yeah. I mean, yes is the simple answer.

Steve Hamm: [00:29:58] If you could, kind of tick off a few of the uses.

Mark Stange-Tregear: [00:30:02] Yeah, absolutely. I tend to talk about it and think about it as the enterprise data hub. Other than our internal HR data, you will find that pretty much any data you need within the company is in Snowflake.

We add to it continuously as new projects or new initiatives stand up. Beyond my team logging in, my team services most parts of the business in one way or another, directly or indirectly. My team are obviously routinely in there and will drive everything from Snowflake in terms of providing data.

But we then have users, either directly in Snowflake or using Tableau on top of Snowflake, throughout every business unit in the company. It is used to verify all of our financial statements and to meet our audit requirements. It's used in marketing planning. We use it to feed data out to third-party platforms that we work with, including Google, as well as boutique vendors that work on our TV advertising spend optimization.

It's used in product analytics and customer segmentation. It's used in B2B reporting, modeling, and projections. Other than our internal HR data, it's used pretty much any way you could think of, to a greater or lesser extent.

Steve Hamm: [00:31:37] Yeah. Oh, that's, that's great for Snowflake and obviously you like it too. So let me drill down here. When you look at this whole array of uses, of applications, where you, where you do use Snowflake as a component, pick one, maybe the most ambitious project you're working on that has Snowflake as a, a key component of it.

Can you describe that? And maybe something that's really new?

Mark Stange-Tregear: [00:32:07] Yeah, absolutely. I think one of the key components that we offer back out to our members is, um, cash back, right? That's kind of what we do.

Um, and there's, you know, as you earn cash back, um, on our site, there are multiple different ways in which you can do so, and clearly your sensitivity to different cashback rates, um, and what maybe merchants are interested in offering at cashback rate levels, can vary. Um, so there is an initiative, a large-scale initiative, that is core-powered with Snowflake to really dig in and understand how would we vary cashback rates for different members in different situations and for different merchants.

Literally trying to get down to the point where as an individual, you know, one of our multi-millions of members, you will have customized rates, ideally focused on where you have a core interest. Um, you know, and how does Snowflake play into that?

Well, we use Snowflake as the control mechanism, um, for that. So we pull the, you know, there's a lot of potential data points that go into making those kinds of decisions about what rates should be offered and to who and when and why, and how then they should be communicated out.

Snowflake, first of all, stores all the information that you would use to make those decisions, right, as the primary repository for all the kind of, the component parts, whether it's your previous shopping history, whether it's direct consumer feedback through, you know, you telling us what your favorite stores are, whether it's, um, clickstream data, or, or other types of data that could inform those decisions.

The models then to decide which members may be interested in what offers, and on what merchants, are then calculated largely within Snowflake, um, today. Every now and then we will, we will do some feature engineering within Snowflake and pass some of that logic through a separate data engineering tool, but ultimately the results get written back into Snowflake, and Snowflake is effectively the, um, the production system store for that member personalization data.

It's then passed on from Snowflake into various kind of API layers so that our applications or communications can be customized, but it's really the hub of that, um, of that entire project. And really that gets to the core value. It's sort of a quantum leap in terms of the core value proposition that the business is offering, which is, you know, who gets what cashback rates and when, and how do we make that as compelling as possible for both our merchant partners and our consumers.

Steve Hamm: [00:35:03] That's a great project. You know, over the past five years, AI and in particular machine learning have really become mainstream in the enterprise, used for lots of different applications, and lots of different kinds of technology are being used. So I wanted to find out how you're using AI, or maybe in particular machine learning, and how that fits with your use of Snowflake's technology.

Mark Stange-Tregear: [00:35:31] Yeah, absolutely. I mean, we use, uh, we use machine learning and AI in multiple different areas. Certainly, you know, personalization for the members is a big one. We also use it in areas like, um, financial planning, risk management, fraud detection. Um, I think, yeah, like any other big visible company, that is something we unfortunately have to deal with, um, and, and manage.

So, um, you know, understanding that, as well as doing things for our merchant partners to do with, um, modeling of, you know, what would we expect, um, the response to be for certain, for certain cashback rates within certain audiences or certain targeting, um, criteria. Um, but all machine learning, artificial intelligence derived, at least in part.

When we start thinking about Snowflake's involvement, and I'm going to simplify a little bit here, but, um, I tend to think in kind of the components of a data science project or, or even better, the components of a data science implementation. So often, I would say typically, with a data science implementation, if you're looking at a new one, you have to go through a couple of phases.

There's the initial feature engineering. You've got to produce the data that you want to try and train a model on, or train, um, or train the AI on in the first place. Then you go through a phase of essentially training the models, building the models, deciding which model is going to be the best. And then ideally, and I think, and honestly, over the years I've seen this as the biggest, biggest struggle, you've then actually got to find a way to get the output of that model back in and making a difference.

Otherwise, and I owe this analogy to a data scientist I worked with several years ago, you end up just building stuff and sticking it on a shelf, right? And it looks very pretty, but it doesn't really do anything for you. So where Snowflake enters that picture is at sort of the start and the end of that process.

Increasingly, what I'm finding is the traditional feature engineering that we had been doing using R or Python or, and sort of spilling out into different systems, we're just doing in Snowflake. Um, it's easier to do it with SQL, more people can work on it, we can optimize the cost more efficiently. And also what we're finding then is that these feature sets that are built can become data artifacts that we can use for other purposes in and of themselves, because it's just SQL that we, that we can run on a repeated basis, and we actually make data available for visualization, analytics or other purposes, you know, as, as a side benefit almost.

What we then typically do is we would, we would feature engineer in Snowflake, drop the data then into SageMaker. We tend to use SageMaker as our, as our data science platform, um, run and train the models within SageMaker. And then what we do from there is we pipe the data back into Snowflake. Well, why would we do that? Well, cause it's, um, it's pretty easy then to make use of that data.

Snowflake, like I said, is already connected into a lot of downstream applications, certainly for internal analytics and visualization, but it's also directly connected to, um, our email tool. We use MessageGears, um, for marketing and email. Um, so there's no, there's no real gap there. We've also already got technology that can pipe data into Kafka streams or into API layers to make immediate use of that data.

So that traditional "oh, we built it and it sits on the shelf" problem really gets reduced if the output of the model can be written back into Snowflake, which is already connected to the downstream applications and uses.

So there's a little bit of a long story, but I would say, you know, as we look at Snowflake today, increasingly it is taking the upfront feature engineering work and the downstream implementation of all of the data. Um, we're also using separate tools for the model building and testing.
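The flow Mark describes (features built in SQL inside the warehouse, a model run in a separate tool, results written back to the warehouse for downstream use) can be sketched roughly as below. This is only an illustration under stated assumptions, not Rakuten's implementation: an in-memory SQLite database stands in for Snowflake, a hard-coded rule stands in for a model trained in SageMaker, and every table, column, and function name is hypothetical.

```python
import sqlite3


def build_features(conn):
    """Feature engineering as plain SQL, run inside the (stand-in) warehouse."""
    conn.execute(
        """
        CREATE TABLE member_features AS
        SELECT member_id,
               COUNT(*)    AS order_count,
               SUM(amount) AS total_spend
        FROM orders
        GROUP BY member_id
        """
    )


def score_members(rows):
    """Stand-in for the external modeling step: a made-up threshold rule,
    not a trained model."""
    return [
        (member_id, 1.0 if total_spend >= 100 else 0.0)
        for member_id, order_count, total_spend in rows
    ]


def run_pipeline(conn):
    # 1. Build the feature table with SQL.
    build_features(conn)
    # 2. Pull the features out for the separate modeling tool.
    rows = conn.execute(
        "SELECT member_id, order_count, total_spend "
        "FROM member_features ORDER BY member_id"
    ).fetchall()
    scores = score_members(rows)
    # 3. Write the model output back, so downstream consumers
    #    (email, APIs, dashboards) read it from the same hub.
    conn.execute("CREATE TABLE member_scores (member_id INTEGER, score REAL)")
    conn.executemany("INSERT INTO member_scores VALUES (?, ?)", scores)
    conn.commit()
    return conn.execute(
        "SELECT member_id, score FROM member_scores ORDER BY member_id"
    ).fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (member_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, 60.0), (1, 55.0), (2, 20.0)],
    )
    return run_pipeline(conn)


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is the shape rather than the details: because the feature table and the score table both live in the warehouse, either one can be reused for reporting or fed to other systems without a separate hand-off.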

Steve Hamm: [00:39:49] Mark, right now I'd like to ask you to be a bit of a visionary, looking ahead five years or more.

What changes do you see coming in cloud data analytics, both the technologies and how they're applied to business.

Mark Stange-Tregear: [00:40:07] Yeah. So I think that's a good question.

And it's one that I've really been starting to ask myself, I think, because at this point, I think if you asked me that question two or three years ago, we were running with Hadoop. It was good technology. The cloud platforms were just starting to really hit material scale, and the technology was still almost standing in the way of the use.

I think with the evolution of the cloud platforms over the last couple of years, the technology, and, and I've literally been in meetings where I've said this, the technology is getting out of the way. Like, I don't have to worry about, can I parse through a hundred-terabyte table anymore, because I can. That's not even a problem.

It's just a matter of, am I willing to pay to do it? Um, it isn't an issue doing it. Yeah. The question for me then becomes, okay, if the technology is getting out of the way, what should a data platform be doing? Not what can it do? And I do feel like that's a fundamental shift in attitude and in outlook that I'm looking at, and I know other people around me are looking at.

The way I tend to think about this is I think about what are the things that our data platform should be doing. Um, and some of them are pretty, some of them are pretty clear and pretty well understood and well known. Um, you know, it should be offering good, good storage. It should be offering good compute, ideally with good concurrency. You should be able to visualize your data. You should be able to run, you should be able to do things with the data, um, pretty effectively, like sending email or making the data available via APIs.

There are then some other components, which I think that the industry is really just starting to wrap its arms around, um, data cataloging and data discovery.

You know, one of the things, when you've got a, you know, a petabyte of data in a, in a cloud database, unless you're an expert on that cloud database, how are you supposed to find anything in there? It's probably divided into thousands of tables, there's nuances on all of them. So how are you supposed to do that?

And I'm not saying that there aren't vendors out there doing that, but I do think that over the next couple of years there's going to be an evolution there. Because having all this data is one thing. Being able to actually find what you need in a scalable way, I think, is still a growing edge for the industry.

And in particular, I think being able to do that in a, um, in an environment where you're not having to jump into other tools or you're not having to jump into partial solutions is still really not within the scope of anything I've seen today. Um, the, the other big thing I see coming, and I think, um, with the release of the GDPR laws in Europe, um, with CCPA coming out in California, um, Brazil just enacted new laws, I think, um, Japan has now got much stricter privacy laws in place.

I think the notion of data governance, and really focusing on what is the balance between the collection and use of large amounts of data and what are the, what are the consumers' rights around that data, is still an area that's relatively difficult to manage.

I know that we have put a lot of time and effort, um, especially around CCPA, cause we're obviously, we're, we've got a big presence in California, to make sure that we're looking at the spirit of that law, not just, not just the, um, not just the letter of the law, and thinking about what that means for us in the long term.

And I do think that there is another wave of innovation that is likely to happen, helping companies to really make that shift from "store everything and make use of what you, what you need when you need it" to "store what makes sense," and be able to do it in a way that works for the consumers and isn't overly invasive and isn't overly problematic and doesn't allow for, um, sort of unmitigated use in situations that weren't intended by the consumer.

And it's really important to us. I mean, this is why we, right, like, we take our communication and our service offering with our consumers very, very seriously. That's a core value proposition. We don't have a business without the consumers, and really, those laws in a way were a challenge to really look at that from a data point of view and think, are we doing enough here?

Um, and what do we need to do better? Or what would we want to do better to make sure that we can stand up in front of our consumers and say that we're doing everything we can, um, and really following the spirit of these evolving conversations around privacy and, and control of your own personal data.

Steve Hamm: [00:45:05] Oh, that's great. Yeah. And let me drill down here a little bit on something. I mean, when I think about your business, one of the core attributes of your business is that you are compensating the consumer for the, for information. They, they give you a lot of information about themselves. In return, they get a discount. But when you look across the whole field of marketing and retailing and online retailing, you know, in a lot of cases, that's not so.

And, and, you know, people have been talking for years about the whole idea of, well, will businesses have to pay the consumer for the information they provide? Do you see that ever happening?

Mark Stange-Tregear: [00:45:49] I, I see it happening in some places. I think in some, in some situations, I mean, consumers have been offered money for insight for years, right? I mean, consumer research groups, um, there's entire companies built on, you know, sending out surveys and collecting information back. Um, and that's a very deliberate form of that.

I think for a lot of businesses, do I think it's going to become that explicit or that, that transactional? No, not necessarily. I think there will be an evolution, though, in the way that we think about, that we think about the storage of data and the value proposition that the, that the product is providing back to the consumer.

So, I mean, my understanding, just going back to our business and my understanding of our consumers, is that, you know, they understand that we have to track certain things. Like, clearly we need to know where they're shopping to be able to give them cash back at those stores, right? Like, the, there's a basic level of data and information there.

Yeah. The program doesn't work if, if there isn't a, if there isn't an agreement that we will hold that information, right? We need to know where you went shopping, what the cashback rate was offered at the time. We need to know what your, and to some extent, we need to know what you've spent, right? Because otherwise we can't make sure that we're giving you the cash back that you want.

And I think the question becomes what additional information is collected as part of that. Are the consumers comfortable with the additional data that's, um, that's collected? And do they see that there is enough of a value proposition with the product that they are willing for that data to be collected?

I think that it gets even more complicated, or that calculation becomes even more complicated, where potentially, for a lot of services, us included, out there, but I mean, this is true of a lot of services, the data is collected with the intention of improving the consumer experience, genuinely. I mean, I think that I can, I can stand up and say that that is definitely what we're doing.

If we're collecting information, it's because we want to try and improve the consumer experience in terms of making, you know, making it easier to find what we, what you're looking for, in terms of offering you the best deals so that you don't miss a deal, right, when it, when it's out there.

And I think it's a, the difficulty comes from the fact that sometimes you don't know what information will help to improve the customer experience. And this is a conversation that's happening around how much data is it right to collect from a consumer, with the understanding that you're trying to improve that experience, if you then don't find a use for that data, you know, or it doesn't prove particularly beneficial.

I do think that that will be a growing area, the understanding of that dynamic. Um, I think it's already there to a certain extent, but I think that, that, that understanding from consumers will get deeper and will get a little bit more sophisticated over time, um, as some of these conversations, um, that were, you know, about privacy and about data collection online and about security online become more a component of our kind of national and international conversations.

Steve Hamm: [00:49:16] Mark, I want to thank you so much for your time today. Your stories, your insights about what you do with data and how you've been doing it are really fascinating. I loved hearing kind of that operational stuff about how you combine analytics and engineering in your company and how that really, uh, improved communications and makes the, the technologies of your engineering people more useful and, and, and more used.

I thought that was fascinating. And also just the story of the evolution the company has taken over the past few years in how it, how it does its, its big data analytics, you know, from, from SQL Server to Hadoop and then finally to Snowflake and the cloud data warehouse. So it's been really interesting.

I think, I think the listeners today are gonna find it very interesting. So thanks again for your time.

Mark Stange-Tregear: [00:50:07] Absolutely, happy to be here. And thank you for letting me share some of this. It's a really interesting conversation and a really interesting time to be having this conversation.