Pearson Principal Data Scientist and Team Manager
University of Notre Dame Doctor of Philosophy (Ph.D.), Quantitative Psychology

How did you get to where you are today? What is your story? What incidents and experiences shaped your career path?

Summarized By: Jeff Musk on Wed Dec 18 2019
The best place to start this story is probably my undergraduate degree, which was a bachelor's in psychology. I was drawn to psychology in high school, and in my sophomore year at UC Santa Cruz I took a quantitative psychology class, which was basically an advanced statistics class. What really grabbed my attention in that class was that most of the lectures were done entirely through the professor just coding. He would ask a question and answer it by coding in an R terminal, up on a projector. That was the first time I really saw that there was power not just in understanding statistics and the mathematics, but in how coding in general, and more specifically scientific or statistical coding, can help you answer questions and do things with data that just weren't possible before. From that, I started pursuing a quantitative psychology PhD, which I did at the University of Notre Dame. There, a lot of my projects, my master's and my dissertation were focused on the intersection of more traditional psychometrics and high-stakes testing, specifically a field called item response theory, with big data: how do you do this at a scale that the testing industry didn't really have a background in? Now, my draw towards psychometrics was based on my advisor and her expertise in the field. I was drawn toward big data and the scale of the problem through an internship that I did with Pearson, which is my current employer. There, I was working with Dr. John Barons, who's still a vice president at Pearson. He was faced with the challenge of taking massive amounts of student data that Pearson had collected through its existing products and trying to figure out what to do with it in order to draw insights that could both provide feedback to the product and inform future products.
Now that, to me, just reinforced my belief in both the importance of data and the importance of leveraging it not just to make business decisions, but to inform product and product development. Coming out of graduate school, I had my first job at a more traditional assessment company called NWEA. I left that company after a fairly short tenure and came to work at Pearson in a research and development lab that started pursuing more and more machine learning and artificial intelligence capabilities as the talent for those techniques became available, as the team both acquired that talent and developed it internally, and as Pearson as a company started to shift towards investing more and more in AI. And so, over the past three to four years, I have personally developed a skill set that's more aligned with artificial intelligence, specifically deep learning and reinforcement learning, and the company as a whole has shifted its investment to support hiring talent in those fields as well as building the talent internally. Probably a year and a half ago, I got my own team. It started off as a small team of two people that was tasked with bringing reinforcement learning capabilities into an educational product, and we launched that product a few weeks ago. It's a calculus tutor named ADA, and it's in the iOS App Store. Now the team of data scientists underneath me numbers 11 and will probably grow to 13 or 15 before the end of next year. So that's kind of the broad story. Going back to that first interaction with the professor in an R terminal, I don't think I could have predicted that I would end up here, but it has certainly been a journey, and it's tough to pull out just a few of the key incidents or experiences that shaped that career path.
More recently, one of the biggest things has been that my leadership, my VP and other executives, gave me early opportunities for leadership: to be a part of both strategy and vision meetings, and then to execute that vision, bringing it down from the VP level to what we actually do this week in order to meet our product deadlines and build these new capabilities that the world hasn't seen.

What are the responsibilities and decisions that you handle at work? Discuss weekly hours you spend in the office, for work travel, and working from home.

Based on experience at: Principal Data Scientist and Team Manager, Pearson
Summarized By: Jeff Musk on Wed Dec 18 2019
Currently, we have made a strategic decision to hire people who are mostly going to be working in the office. I typically like to work from home about one day a week, or one day every few weeks. That's more for when I need administration time or just time to really dig into something that I want to work on independently. That's typically how we use working from home. Most often, the team and I are working in the office, and that's mostly due to the value of standing in front of the whiteboard with someone, building those personal connections and having those conversations. Going back to the first part of the question, about responsibilities: I am accountable for delivering capabilities in our products. For example, I mentioned the reinforcement learning-based recommenders that are in the app today. I was accountable to our executives to deliver the code that met those product requirements and the product SLAs. Now, the challenge there, and I think a challenge for applied R&D groups in general, is that you have to innovate and push the boundaries of what's possible, but also do that within a successful product launch time frame, with integrations with front end, UX and UI (both design and development), as well as our engineering partners who actually put the final touches on the code. So my responsibility at the beginning of these kinds of projects is to understand what the executive vision is. For example, when our senior vice president, Malena Maranova, joined Pearson about a year ago, she said we were launching a product within a year, and these are the kinds of AI that it's going to do, including reinforcement learning and handwriting recognition. Then it's my job to bring that down a level and say, what does that actually mean?
Working with our product team, that means asking what the actual product requirements are and writing those down; then, taking that a level deeper, defining the API contracts for communication between the different capabilities; and then, going a level deeper still, within a single recommender, what is the actual code that will deliver this? How do we know it's working? And how do we know how to improve it after the product launches? So, really, my role is a bridge between the executives and the day-to-day, mostly coding, tasks. One of my biggest responsibilities is to ensure that my team is both aligned and correctly prioritized to deliver on the executives' vision and strategy. Also, being a team manager, I have a lot of responsibilities to the individuals on my team. Anything relating to HR, promotion and compensation falls on the manager, as does professional development, such as finding support and identifying appropriate conferences, and just everything that comes with being an employee. I'm the person my direct reports come to, just like I go to my own manager when I have similar needs.

What tools (software programs, frameworks, models, algorithms, languages) do you use at work? Do you prefer certain tools more than the others? Why?

Based on experience at: Principal Data Scientist and Team Manager, Pearson
Summarized By: Jeff Musk on Wed Dec 18 2019
The easy answer is that we use Python. But that's a vague answer. The better answer is that we use the tools that are going to get the job done. Most recently, on the back end, that's been Python, often supported by Flask, Django or Falcon, depending on the actual needs. For deep learning, we've mostly used TensorFlow, but there have been cases where we had a solution that TensorFlow didn't support but PyTorch did, and so some people learned PyTorch in order for us to use those tools. From an engineering perspective, we've been using more and more Docker to deploy and to support our integration with the other services. A lot of our cloud infrastructure is now based on Google Cloud, and so we're leveraging a lot of those tools: Datastore, logging, et cetera. In terms of algorithms and models, again, I hire people who can learn and can solve problems. So it's much less important that we are, say, a computer vision team that is really good at CNNs. It's much more important that we are a team that can understand what the needs of the group are and then find the right solution for them. Reinforcement learning is a very interesting example. I just got back from NeurIPS, and if you look at a lot of the reinforcement learning that's happening at NeurIPS, it's happening on video games or simulations. If we take, you know, the basic example of teaching a quadcopter to land in a simulation, you can fail millions of times before you figure out the right way to land that quadcopter. We were launching a product that had zero data to start with, so we had to build a solution that would use reinforcement learning but had never seen data before. That is a very different solution.
We didn't have, and in education we would never have, 10 million tries to get it wrong before we figure out the right approach, and so we tend to lean towards much more interpretable models. That doesn't mean that we don't use deep learning, because we do in many cases. But the more we can explain why the model is working and how it is working, the better. So a lot of our solutions have a Bayesian flavor to them: they not only have priors that allow us to transition smoothly from no data to a little data to a lot of data, but in many cases have a generative capability as well. We can not only use the model for predictions but also understand how it's representing the world and how it is understanding the problem that we've given it. A little more concretely: a lot of the innovation that we do is not in the actual algorithms or the more core computational science part of it. A lot of the innovation that we bring is finding the right solution and adapting it to the right problem. A lot of times we find that in theory and then write it from scratch in, you know, fairly basic Python and NumPy, for example. Other times we see that the right solution involves something that's related to object detection. We've done this in the past, where we found TensorFlow's object detection repo, which sets out how to get from five years ago to the state of the art, and we just started walking down that list of models until we reached one that suited our needs. And so, much more than a clear set of tools, we need and we value people who can find the right solutions to the right problems and adapt their toolset to build that solution. The engineering teams, both front end and back end, have the same philosophy: yes, we might be spending 90% of our time in Python or 90% of our time in Swift, but what's more important is knowing the art of engineering.
It's more important to understand what is going on from a higher-level, fairly theoretical perspective, to understand what you're trying to do as an abstracted engineering task, than it is just to get the lines of code right. So it's not just a data science thing.
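To make the idea of priors carrying a model from no data to a lot of data a little more concrete, here is a minimal sketch of a Beta-Bernoulli bandit with Thompson sampling. It is not the recommender described in this interview; the class and function names, the uniform Beta(1, 1) prior, and the two success rates in the simulation are all assumptions chosen for illustration. It only shows how a Bayesian model can make sensible, exploratory choices on day one and sharpen as feedback arrives.

```python
import random


class BetaBernoulliArm:
    """One recommendable item with a Beta prior over its success rate (hypothetical)."""

    def __init__(self, prior_successes=1.0, prior_failures=1.0):
        # Beta(1, 1) is a uniform prior: with zero data, every rate is equally plausible.
        self.alpha = prior_successes
        self.beta = prior_failures

    def update(self, reward):
        # Conjugate update: each observed success or failure adds one count.
        if reward:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    def posterior_mean(self):
        return self.alpha / (self.alpha + self.beta)

    def sample(self, rng):
        # Draw one plausible success rate from the current posterior.
        return rng.betavariate(self.alpha, self.beta)


def choose_arm(arms, rng):
    """Thompson sampling: sample a rate for each arm and pick the largest draw."""
    draws = [arm.sample(rng) for arm in arms]
    return max(range(len(arms)), key=lambda i: draws[i])


def simulate(true_rates, rounds, seed=0):
    """Run the bandit against made-up true success rates and count pulls per arm."""
    rng = random.Random(seed)
    arms = [BetaBernoulliArm() for _ in true_rates]
    pulls = [0] * len(true_rates)
    for _ in range(rounds):
        i = choose_arm(arms, rng)
        reward = rng.random() < true_rates[i]
        arms[i].update(reward)
        pulls[i] += 1
    return arms, pulls


if __name__ == "__main__":
    arms, pulls = simulate(true_rates=[0.3, 0.7], rounds=2000)
    for i, arm in enumerate(arms):
        print(i, pulls[i], round(arm.posterior_mean(), 3))
```

With no observations, each arm's posterior mean is just the prior mean of 0.5, so early choices are close to random. As rewards accumulate, the posteriors narrow and the sampler spends most of its pulls on the better arm, and the `alpha` and `beta` counts remain directly inspectable, which is the interpretability point made above.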

What things do you like about your job? Were there any pleasant surprises?

Based on experience at: Principal Data Scientist and Team Manager, Pearson
Summarized By: Jeff Musk on Wed Dec 18 2019
I love my job. There have been many pleasant surprises, in part because my career has changed so much since I joined. If I go back to grad school, I was an intern with roughly the same leadership, and that leadership has evolved over time. But my role went from, you know, a big data and analytics kind of role, into a data science role where I was responsible for creating small pieces of capabilities, to a senior data scientist role where I was leading people, and now to a team manager role, where I am responsible for others but not necessarily responsible for writing all the code, even though I'm accountable for that work. One of the things that I appreciate about both Pearson and this team specifically is how much we focus on the people and care about the people. Both my boss and I lead by example on this. We know that if you're not right as a person, if there's something bothering you in your personal life or your health or whatever else comes up just as part of being a human, then you're not going to be a good worker. It's much better to take the time, deal with whatever you need to personally, and then come to work at 100%. So there is certainly a flexible culture in terms of balancing what it takes to be successful at work and what it takes to be successful as a human. That comes with a bit of a trade-off, because while we certainly strive to have, you know, a normal 40-hour workweek, there are times when things are very busy, and that balance goes both ways. But I have certainly appreciated the ability to take care of my family and myself when I need to, while still being in a fairly fast-paced organization. The other thing that I've come to really appreciate, probably over the last year, is the power of alignment. It's a very different job than I thought it was going to be.
You can be the smartest person on your team, and that team can fail if it is not working together in an aligned way. And so there's just the amount of time that it takes to hire people who understand that, who understand that their success is not measured in an individual capacity but as the team's success, and the amount of effort and time that it takes to keep people aligned. My organization is 11 people. The larger R&D organization is about 35, and the larger AI organization at Pearson is, I think, a little bit over 100. And we have the third-party groups that we work with. Keeping that many people focused on the same North Star is incredibly difficult, but it is also really rewarding to have that high-functioning team when you get there. So those are some things I like.

What are the job titles of people you routinely work with inside and outside of your organization? What approaches do you find to be effective in working with them?

Based on experience at: Principal Data Scientist and Team Manager, Pearson
Summarized By: Jeff Musk on Wed Dec 18 2019
In my organization, there are just data scientists at different levels. The more interesting job titles that we interact with are within the R&D team, where there are three categories. One is iOS engineers, one is back-end engineers, and the third is our function that takes on both prototyping and product exploration as well as project management. Given the amount of complexity in our work, everyone is a project manager, but we also have people who have the specific job title of project manager. I think one of the biggest shocks to new people when they join the organization, especially if they come from a data science background rather than a computer science background, is that we have an expectation that we can estimate how long something will take. Now, that is fairly easy, or at least straightforward, if it's a known engineering task that can be well defined. It can be incredibly difficult and frustrating when it's more of a research task. How do you know how long it's going to take to do a lit review and come up with a small proof of concept showing a solution? That's where our project managers help out a lot, in trying to give us the space for creativity and innovation while also providing a structure so that people aren't wandering off for six months exploring a cool idea that never turns out to be something valuable. For working with engineers, the most important thing is to have some engineering chops, to be able to have a shared language with the engineers. Now, what's nice about our organization is that the engineers know they're working in an R&D organization, so they readily come halfway to being a data scientist. And the data scientists know they're working in a product organization, so they will need to build and launch things, and they come the other half of the way.
But it's incredible to get two people pair programming who have very little overlap in their skills. For example, we might be trying to get a prediction model based in TensorFlow onto a certain kind of hardware so that it makes predictions faster. There is little overlap between the skill sets: the data scientist might not have any kind of DevOps skills, and the engineer might have only a little bit of TensorFlow skills, but they figure out how to have a common language and how to achieve something that they couldn't independently. I mentioned two other groups. One that's worth mentioning is UX or UI design. I have come to appreciate that there is a lot to design, and that a lot of information is both communicated and can be manipulated in how a product actually looks and how the user interacts with that product. This has become really front and center for me over the past year with reinforcement learning, because we need to draw a reward signal from some sort of user interaction back to our reinforcement learning back-end service. That required a lot of partnership with the UX folks to be able to say which element the user can interact with is a reward signal, how do we instrument that, and how do we get that back to our service? It still comes down to finding a common language, but it was certainly a new thing for me to find a common language between looking at wireframes and understanding what's being communicated in the design, and where I was thinking, which was in APIs, JSON and Python code. It's a very different thing, and it certainly stretched my skill set this past year. And then the last job title I mentioned is executives or VPs. There you also need to find a common language, but there's no expectation that they're going to come down to your language.
And so going through a crazy day where some things worked and some things didn't, taking all of the details that I'm aware of, and distilling that into one or two bullet points that a VP can digest and act on, actionable insights for him or her, is something that is difficult to do and something that I'm still learning. It's hard when you're in, I don't want to say the chaos of the moment, but when you're really passionate and working through a problem, to be able to step back and say, I don't think the VP needs to know that, because they can't act on it or it doesn't affect what they're going to do in the next day. It's difficult to understand what their day looks like versus what my day looks like. We do a daily stand-up with our VP, which is both for updates and for whatever issues he needs to be aware of, and that short 15-minute update is our one connection point throughout the day. The rest of the day he's working on other things and I'm working on other things, but we have to make sure that we maximize that 15 minutes, and that again comes down to finding a common way to communicate. But it's very different going up to the executive level, and that's a skill that is, I think, difficult to learn and incredibly valuable and desired if you can learn it and work on it.

What major challenges do you face in your job and how do you handle them? Can you discuss a few accomplishments?

Based on experience at: Principal Data Scientist and Team Manager, Pearson
Summarized By: Jeff Musk on Wed Dec 18 2019
The best accomplishment is launching an app. I worked in an R&D lab where our major outcome was a prototype that no one would actually use. It was just to show that a capability was possible with the data that we had. So to not have to explain what I do in a slide deck, but to just tell people to go to the App Store and download ADA calculus, that to me is an amazing accomplishment that I had never seen before. I had never seen my work being used by thousands of people. The major challenge, I would say, goes back to the tension between research and development. It is incredibly difficult to do innovative things on a timeline. But if we only focus on innovation, we'll never launch anything. And if we only focus on timelines, then what we launch won't be interesting and won't be good enough to meet what the market needs. How I handle that, and really how the team is organized to handle that, is through our vastly different skill sets. Even within the data science organization, we draw from all sorts of backgrounds and all sorts of levels of education, from BA to Ph.D., et cetera. Then there are the other teams, where the majority of the R&D team has an engineering background, either front end or back end, and the prototype and project management group. We all keep each other sane, and I think having both that balance and close teamwork keeps us aligned and appropriately prioritized, so that when it is time to be innovative, we are, but we also know when it's time to say, now we just need to build it and make it work. It's a constant balance, which is why I'd call it out as one of the major challenges, and I don't see that going away.
I think that will continue to be a balance, because if we get too satisfied with one or the other, then I don't think we'll be successful as an organization in both pushing the boundaries of what AI can do for education and actually building products that students use.

What are the recent developments in the field? How significant are these improvements over past work? What are their implications for future research & industry applications, if any?

Based on experience at: Principal Data Scientist and Team Manager, Pearson
Summarized By: Jeff Musk on Wed Dec 18 2019
I saw this over the past few years at NeurIPS, but this year it was pretty prominent: there were a lot of socials and a lot of both workshops and tutorials about using AI ethically and using AI for good. One of the main things that I say when giving a sales pitch to a candidate for coming to work here is that we are not going to use your brainpower for increasing ad clicks. We are going to build capabilities that, if we do our job right, can impact millions of students across the world. In part because of that, hearing that we should probably use AI for good comes as a very big surprise to me. It's like, well, what have we been doing as a field this whole time? I mean, it's obvious what we've been doing. We've been making money as a field. But I really think that as individuals look back on their careers, they want to be able to say, I did something positive for the world. And with several things going on in the field, not just the monetization piece of it, but different scandals around privacy, or Cambridge Analytica, there is a desire to use both our brainpower and the advancements in the field for good. I was very glad to see at NeurIPS that it's not just people wanting to do stuff, but people trying to organize in order to pivot either their careers or their organizations and companies towards using our talents for something positive for humanity. The more technical thing that I was really impressed with, and this was a general trend, was the move towards interpretable models. I don't know if I saw many papers at all that just proposed a deeper, more complicated architecture. There was a lot of talk about explainability: how do we bring both interpretability and control into our models? And there was also a lot of work that was not even about deep learning or black-box models.
And so I think that certainly has a huge impact on the work that I do, because if something goes wrong with a model that we deploy, we need to be able to explain why. So there is a baseline of explainability to the work that we do, and I'm very happy to see that there's a lot of academic push for that as well.

What qualities does your team look for while hiring? How does your team interview candidates?

Based on experience at: Principal Data Scientist and Team Manager, Pearson
Summarized By: Jeff Musk on Wed Dec 18 2019
When we write a job requirement, one approach that we have found works is to write it fairly vaguely. We want to cast a really wide net, because people who are successful as data scientists don't all come from the same program, with the same degree and the same toolset. So I tend to shy away from listing very specific packages and versions as what we need because, as I mentioned earlier, the toolset that we use will change, and it changes pretty rapidly. Then, when candidates come in and we interview them, my internal rubric is as follows. One is that candidates need to have a fairly strong background in math and statistics and in the underlying methods that they are expert in. It is very difficult if you only understand a model at the level of the particular implementation in the package that you happen to use. One example that I have seen in interviews in the past is somebody who knows logistic regression really well, but only knows it through scikit-learn. Now, the implementation of logistic regression in scikit-learn uses regularization. If you don't understand the underlying equations that logistic regression uses as its model specification, and how to optimize them, then you are reliant on somebody else's implementation. So you need to have that underlying skill set. The second is coding skills: you don't just have to be good on a whiteboard, you have to be good on a computer, and be able to leverage or evaluate what other people have implemented, or even potentially implement it yourself. Now, we don't do any core work on, say, improving the speed of an algorithm. But we need to be able to go out into the literature, or out to, say, open-source repositories, and say, this is a really good solution, and I'm going to figure out how to use it. The third thing is, you need to be good at something, and I don't necessarily care what that is.
But if you have a strong math background and a good foundation in coding or in computer science, then what is it that you are an expert in? This is what I've seen called a T. You want someone who's broad, but you also want someone who's really deep in something, and again, for me, it's less important what that actually is, because the needs of what we do change so much. For example, you might have expertise in computer vision, in reinforcement learning, or in psychometrics. But whatever it is, it's something that you're really good at. You're not just broad, but you also have the depth. And then the fourth is an interest in or some motivation for working in an education company. Everybody here is here for a reason, and they're not just here to collect a paycheck. Having people who are motivated to do the work and who care about it at a level where it's not just a job really helps with alignment, because everyone is focused on the mission of the team and the mission of the organization, and I think it also really improves both our motivation and our quality of work. So those are the four things that I interview candidates on.
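The logistic regression example from the first rubric item can be sketched in a few lines. The snippet below is an illustration only, written in plain Python rather than with scikit-learn; the function names, the tiny dataset, the learning rate and the penalty strength are all made up for the example. It fits the same model twice by gradient descent on the log-loss, once with no penalty and once with an L2 penalty on the slope, to show how regularization changes the fitted coefficient. scikit-learn's `LogisticRegression` applies an L2 penalty by default (`penalty="l2"`, `C=1.0`), which is why a candidate who only knows the package can be surprised by its coefficients.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, l2=0.0, lr=0.1, steps=5000):
    """Gradient descent on mean log-loss plus (l2 / 2) * w**2.

    One feature plus an intercept; the intercept is not penalized.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w, grad_b = 0.0, 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # derivative of log-loss w.r.t. the logit
            grad_w += err * x
            grad_b += err
        w -= lr * (grad_w / n + l2 * w)   # the penalty adds l2 * w to the gradient
        b -= lr * (grad_b / n)
    return w, b


# Hypothetical, deliberately non-separable data so the unpenalized fit stays finite.
XS = [-2.0, -1.5, -1.0, -0.5, -0.25, 0.0, 0.5, 1.0, 1.5, 2.0]
YS = [0, 0, 0, 1, 1, 0, 1, 1, 1, 1]

if __name__ == "__main__":
    w_plain, b_plain = fit_logistic(XS, YS, l2=0.0)
    w_ridge, b_ridge = fit_logistic(XS, YS, l2=1.0)
    print("no penalty:", w_plain, b_plain)
    print("L2 penalty:", w_ridge, b_ridge)
```

The penalized slope comes out smaller in magnitude than the unpenalized one. Knowing that this shrinkage comes from the extra `l2 * w` term in the gradient, rather than from the data, is the kind of understanding of the underlying equations described above.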

What are different entry-level jobs and subsequent job pathways that can lead students to a position such as yours?

Based on experience at: Principal Data Scientist and Team Manager, Pearson
Summarized By: Jeff Musk on Wed Dec 18 2019
Whatever job pathway you take, it needs to have a fairly rigorous math foundation. I have certainly seen some candidates who have a BA in a non-math-based degree, and then they do a short, you know, three- or four-week Python class, and they don't have the depth necessary. There are lots of ways and lots of areas of study that provide that math and stats background. I think the coding background rarely comes from degree programs other than computer science. It certainly can come from other places, but in my experience that's mostly self-taught: being self-motivated to actually go out and learn, do projects on your own, set up a website on your own, or whatever it is while in school that gets you there. Once out of school, again, I think there are a ton of different pathways. We have people with a BA or a BSc who learned math on their own and learned coding on their own. We have several PhDs from all over the place, from computational linguistics to experimental psychology to biology to chemistry, and several people who have a master's in computer science or a master's in data science. So I think there is certainly a set of foundational skills needed, and I mentioned those before: a foundation in math, a foundation in coding, and being an expert in something that's relevant to what we're doing. There are many ways to get there, as long as you know what you want and what gaps you need to fill. Certainly, we have done internships in the past that would qualify as an entry-level job that would build a path. I've also had a lot of success hiring people who came from a more analytics role, both as a way of getting hands-on with the data and as a way to build up coding skills. I think there are a lot of opportunities if, coming out of a school program, somebody doesn't have all the skills. Something like an analytics-based job is a great way to get into more machine learning.
Because if you don't know how to work with the data, you're not gonna be successful in this.

How did the program prepare you for your career? Think about faculty, resources, alumni, exposure & networking. What were the best parts?

Based on experience at: Doctor of Philosophy (Ph.D.), Quantitative Psychology, University of Notre Dame
Summarized By: Jeff Musk on Wed Dec 18 2019
Most of the work that both the faculty and the graduate students do is on the academic, theoretical, improving-the-equation side of statistics. To me, that was one of the strongest things that I brought from that program. Having gotten through the qualifying exams, some of the toughest classes that I ever took, and the dissertation, nothing is going to be as hard as that in my career, and so I know that I can be given virtually any problem and I can break it down and figure out if I can do it and how to do it. The program also focused quite a bit on presentation skills, which I really appreciate. Every semester, every single graduate student would have to give a talk to the entire department. So by the time you got to doing job interviews five years later, you had already presented your work in front of experts 10 times. Being able to organize your work in a way that communicates it to others, and then to publicly defend your work against questions and engage in discussions about it, set me up for success both in executive communication and in job interviews, because it's not the first time that you're being asked about your work. It's just another toolset that you have. The faculty there are great. I had one professor in particular who challenged me and the class on two things. One was that she said grad school is like shopping, and in the end, you will only have what you decided to put in your cart and what you decided to pay for. I really took that to heart. You have to be active in your program in deciding: I'm going to learn this, and I'm going to kick ass at this. If you take that and invest the time to learn it, then it's something that'll be with you for a long time.
The second thing that she did, and I remember this was in a generalized linear models class, was that there was this whole chunk in the middle of the semester where the students had to give lectures, and that meant giving a lecture on a topic that you had never learned before. It was easily one of the scariest experiences that I've ever had. But that really impressed on me the importance of being a teacher and the importance of understanding: if you can explain and teach a topic to someone else, then you can master the topic yourself. So it's not just thinking, what do I need to learn? It's, what do I need to learn to do something with it and to pass this on to someone else?

How did the program prepare you for your career? Think about faculty, resources, alumni, exposure & networking. What were the best parts?

Based on experience at: Bachelor’s Degree, Psychology, University of California, Santa Cruz
Summarized By: Jeff Musk on Wed Dec 18 2019
In Santa Cruz, I had an instructor who taught his stats class entirely through an R terminal, and to me that was like, "Oh, you can do this? That's amazing." One of the best things that I found in the program there was that it showed me this whole world of quantitative psychology that I didn't know existed when I was a freshman and a sophomore. As I identified that in my junior year, I was able to work very closely with several faculty on their stats and their quant problems, and that really gave me the opportunity, and both the example material and letters of recommendation, to get into the graduate program.