
This is a software-generated (AWS) transcription and it is not perfect.
Um, I think for me a lot of it is luck, making the right decision at the right time, and for some of those right decisions there was a lot of confusion, where the decision wasn't obvious at the time. So for my background: I studied chemical engineering, I planned on going to medical school, and I ended up getting a job at Micron, and I worked there for five years. Then I had an opportunity to go work for a hedge fund, and that really came from my own personal motivation with high-performance computing. I was doing things on the side, not related to my job, where I loved GPUs. I thought they were really awesome. This was before deep learning, and that allowed me to get a job at a hedge fund. Then working at the hedge fund allowed me to get a job at a Sequoia Capital company called HireVue, where I was their chief data scientist and built out their data science team. And that experience built up my confidence to go to a startup and make the leap three years ago. We were purchased by DataRobot, and now I work for DataRobot as their Chief AI Evangelist, which is a really, really fun job. But the path to get there was not necessarily planned.
Yeah, so DataRobot. They started as an AutoML company; AutoML stands for automated machine learning. Back in 2012 the initial concept of DataRobot was that they could automate the process of building a competitive predictive model, and back around 2012 to 2014 that was a controversial topic. People thought you needed expert data scientists to do that. Now a lot of other companies have followed suit, and they've realized that not only can you do that, but you can do it better than most data scientists, who don't have the experience. Since then DataRobot has really evolved more into an end-to-end AI solution. They go all the way from data capture, data gathering, and cleaning, all the way through the model-building process, and then an entire part of model deployment that a lot of data scientists are not familiar with. So they take care of allowing models to get shipped into production, but also monitoring them if there are problems, and usually for any model that matters, that's a really big deal. So in banking, where you have compliance, or in manufacturing, you can't have models that drift. And they can drift for lots of reasons; they can drift because features drift, meaning the inputs. And then my roles and responsibilities: this is kind of a new area, this evangelism piece. So my role is, I'm a co-host for a podcast called More Intelligent Tomorrow. We have a lot of executives that come on that podcast, um, CEOs and chief data officers; we had Congressman Will Hurd on, it looks like we're going to have someone from the FBI coming on, executives from Google and IBM talking about AI strategy. So that's really fun, to generate thought leadership but also get access to very important decision makers that are harder to get access to. And then the other part of my role is maybe anchored in storytelling.
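As a rough illustration of the feature drift just mentioned: a common check is a two-sample statistical test comparing a feature's training-time distribution against its live distribution. The sketch below uses a Kolmogorov-Smirnov test; the threshold and the synthetic data are assumptions for the example, not how DataRobot actually monitors drift.

```python
import numpy as np
from scipy.stats import ks_2samp

def feature_drift(train_col, live_col, alpha=0.05):
    """Two-sample Kolmogorov-Smirnov test: flags drift when the live
    distribution of a feature differs significantly from training."""
    statistic, p_value = ks_2samp(train_col, live_col)
    return p_value < alpha

rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, 5000)  # feature values seen at training time
live = rng.normal(0.5, 1.0, 5000)   # same feature in production, mean shifted

print(feature_drift(train, train.copy()))  # identical data -> False
print(feature_drift(train, live))          # shifted mean -> True
```

In a real pipeline this check would run per feature on each batch of scoring requests, with alerts when drift persists.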
So if you're going to go give a strategic talk or a keynote, can you do it in a way that the talk is memorable, it's inspiring, and it leads to new opportunities? Doing that is something that is less common. If you go to a data science or AI conference, I would argue that the vast majority of the talks are forgettable.
Um, it's a great question. You have tools that feel more academic, and people won't like this, but I would say R is maybe more for research and Python's for production. When I was in school I was taught MATLAB, and I thought it was great. I became an expert in MATLAB, but I noticed it was really holding me back until I moved to Python. So I use Python. When it comes to machine learning algorithms, I think Python has done a lot to catch up, because R used to lead. If you wanted the latest random forest, you had to use R. But now, in Python, I would argue that not only does Python have a lot of your standard clustering and machine learning algorithms, it also has a lot more deep learning support. In deep learning there is a lot of confusion in the market, and you see it by the number of frameworks. So Google has TensorFlow, Amazon has MXNet, Facebook has PyTorch, Baidu has PaddlePaddle. You have all these different frameworks, and we see that for people who are getting started, a lot of them use TensorFlow because of the Google backing. It's popular, but TensorFlow, for anyone that does things at production scale, is not my first recommendation. So we used MXNet in our startup, but I think PyTorch is starting to overtake MXNet. And so the recommendation for the students: it's really important to use a framework that you feel comfortable with. Um, it's not necessarily hacking, but you feel comfortable customizing, so you can get under the hood. A good comparison would be: if you wanted to be a core contributor and actually contribute code back, I would say in PyTorch or MXNet that would be a lot more straightforward than TensorFlow. So I'm a very outspoken critic of TensorFlow. I hate TensorFlow with a passion. There's a Quora question out there titled "Why does Ben Taylor hate TensorFlow?"
And I don't know if anyone's answered it yet. But yeah, so please use PyTorch or MXNet. And Keras is a good place to learn, but I wouldn't recommend it for production.

Well, I think I have a bias, because DataRobot is being used for a lot of AI. The interesting thing here is that the vast majority of AI actually never makes it to production. So I would say anywhere from 80 to 90% of AI initiatives inside very large companies, companies whose brands we know, whose names we know, are buried in notebooks: Jupyter notebooks, Python scripts. They don't actually go into a formal production pipeline. But then you have customers that use DataRobot, and all their models go through our production pipeline; they're managed, they scale, they run on the cloud, and we can support multi-cloud. And so I don't have a good answer to that question, because I don't have good examples to look at where people are shipping models into production, unless you're a FAANG company. So these very large companies have built things from scratch internally; they have their own pipelines that work, and some of them are public, so you can understand how they do production at Facebook and Netflix. But for most companies that don't have a very strong AI research group: when I was at HireVue, we took advantage of a lot of the Amazon tools, and for our startup we also took a lot of advantage of the Amazon tools, using Lambdas, using serverless, because it allowed for insane scaling. We could support 100 million inferences per month, and we didn't have to worry about it. We could sleep at night, and with serverless it just runs. So if you look at the last 10 or 20 years, huge improvements have been made when it comes to auto-scaling deployment. Even looking at stuff like Docker and imagining how people used to deal with virtualization, it's amazing.
Or the Twitter clone in five minutes, which is really exciting for students, because they can do stuff in a weekend that 10 years ago we could not even imagine being able to do, especially as a single individual.
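The serverless inference pattern described above can be sketched roughly as follows. This is only an illustration, not actual HireVue or DataRobot code: the model is a stub standing in for a trained one, and the handler follows the standard AWS Lambda shape behind an API Gateway proxy integration.

```python
import json

# In a real deployment the model would be loaded from S3 or baked into
# the container image; a stub stands in for it here. Loading at module
# scope means warm Lambda invocations reuse the model, which is part of
# why this pattern scales to very high inference volumes.
class StubModel:
    def predict(self, features):
        # hypothetical scoring rule, in place of a trained model
        return sum(features) / max(len(features), 1)

MODEL = StubModel()

def lambda_handler(event, context):
    """AWS Lambda entry point: one JSON request in, one score out.
    API Gateway + Lambda scales each request independently."""
    body = json.loads(event["body"])
    score = MODEL.predict(body["features"])
    return {"statusCode": 200, "body": json.dumps({"score": score})}

# local smoke test of the handler, outside Lambda
event = {"body": json.dumps({"features": [0.2, 0.4, 0.6]})}
print(lambda_handler(event, None))
```

The operational appeal is exactly what the speaker describes: no servers to manage, and scaling from one request to millions is handled by the platform.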