– What is SberData Division and why was it set up?
– It’s no secret that data is the new oil. Banks handle huge amounts of data, and Sberbank has long been committed to using it effectively. Before SberData Division was set up, there was a division that worked with large volumes of data, where my colleagues and I launched pilot projects based on big data analysis. For example, we predicted client flow in bank branches, and our algorithm helped spread the staff workload evenly so that clients didn’t have to queue for too long during peak hours.
– Was it challenging?
– I’m not sure how to define it. Random forest (https://habr.com/post/320726/) is a complex model for some people; for others it’s a piece of cake. It was interesting from a mathematical standpoint. We worked with time series databases. We were tasked with defining peak days, for example. In any case, analyzing seasonal fluctuations and identifying obvious and non-obvious patterns was exciting.
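To make the "peak days" task concrete, here is a minimal sketch in Python. It is not the random forest model mentioned in the answer. It is a much simpler weekday-average baseline, and the function names, the 1.2 threshold and the visit counts are all illustrative assumptions rather than Sberbank data.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean


def weekday_profile(visits):
    """Average visits per weekday (0 = Monday) from a {date: count} mapping."""
    by_weekday = defaultdict(list)
    for day, count in visits.items():
        by_weekday[day.weekday()].append(count)
    return {wd: mean(counts) for wd, counts in by_weekday.items()}


def peak_weekdays(visits, threshold=1.2):
    """Weekdays whose average exceeds `threshold` times the overall daily average."""
    profile = weekday_profile(visits)
    overall = mean(visits.values())
    return sorted(wd for wd, avg in profile.items() if avg > threshold * overall)


# Synthetic data (hypothetical): four weeks in which Mondays and Fridays are busier.
start = date(2024, 1, 1)  # a Monday
typical = {0: 180, 1: 100, 2: 100, 3: 100, 4: 160, 5: 60, 6: 0}
visits = {}
for offset in range(28):
    day = start + timedelta(days=offset)
    visits[day] = typical[day.weekday()]

print(peak_weekdays(visits))  # [0, 4], i.e. Monday and Friday
```

A seasonal profile like this is only a starting point. A model such as a random forest could take the weekday, the day of the month and similar calendar features as inputs to capture less obvious patterns.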
– What came after these pilot projects?
– In 2016 Sberbank established the Data Factory, a comprehensive system for data storage, analysis and model-building. Before the factory, we had multiple systems, each generating its own data. We brought them together and, within the factory, transformed them into workable systems.
– What kinds of data are there and which are the most valuable?
– There are a whole lot of them: ATM logs, transaction data, CRM data, ‘Sberbank Online’ in-app activity. Data is all the footprints a client leaves when interacting with Sberbank. All of it is valuable, though different models require different data. The bank doesn’t need client data to understand that an ATM encashment is due. What it does need to know is when a deposit was made, how much money was deposited, which bills were used, and who withdrew money and when. The client’s marital status isn’t important here. Then again, you never know!
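The ATM example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the bank's actual logic: the `AtmEvent` record, the function names and the threshold are hypothetical, and the sketch assumes a cash-recycling ATM in which deposited bills become available for withdrawal.

```python
from dataclasses import dataclass


@dataclass
class AtmEvent:
    timestamp: str  # ISO 8601 text, kept as a string for simplicity
    kind: str       # "deposit" or "withdrawal"
    bills: dict     # denomination -> number of bills


def remaining_bills(initial, events):
    """Apply events in order and return the bill count left per denomination."""
    stock = dict(initial)
    for event in events:
        sign = 1 if event.kind == "deposit" else -1
        for denomination, count in event.bills.items():
            stock[denomination] = stock.get(denomination, 0) + sign * count
    return stock


def encashment_due(stock, min_total):
    """True when the total cash value in the ATM falls below `min_total`."""
    total = sum(denomination * count for denomination, count in stock.items())
    return total < min_total


# Hypothetical log: the ATM starts with 150,000 in cash.
initial = {1000: 100, 500: 100}
events = [
    AtmEvent("2024-01-01T09:15:00", "withdrawal", {1000: 80, 500: 60}),
    AtmEvent("2024-01-01T10:40:00", "deposit", {500: 10}),
]

stock = remaining_bills(initial, events)
print(stock)                          # {1000: 20, 500: 50}
print(encashment_due(stock, 50_000))  # True: 45,000 is below the threshold
```

Note that nothing in the log identifies the client's personal details. The decision rests on timestamps, amounts and denominations alone, which is the point made in the answer.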
When building a model, the Data Scientist (DS) first estimates what data might be appropriate. The data is then retrieved, processed, sampled and fed into the model. The DS analyzes how effective the data is and, if it’s not sufficient, looks for more. All in all, roughly 80% of a DS’s work is searching for data. A DS should be hungry for data, in a good way, and should always be looking for new data sources.
– Did you work as a DS?
– I did, yes. I started as a DS. After I graduated from the Faculty of Computational Mathematics and Cybernetics of Moscow State University, I went into telecommunications and got a job with MTS. Among other things, I was engaged in data mining, that is, finding interesting, non-obvious patterns that can prove useful down the road. After a while I realized I wanted more than just building models. I wanted a wider perspective and to see products created using these models, because a model is not an end in itself; it must serve as a base for future systems. So I changed jobs, joined Sber and, after a while, became a product owner, just as I wanted. At the moment my main focus is technologies, general infrastructure, data analysis, data retrieval, data compatibility and other things that make a DS’s life easier. It isn’t modeling as such; it’s more about architecture, infrastructure and data flows.
– Which product do you own?
– There is a research lab within the Data Factory, where data specialists test their hypotheses, build models and do data mining. My product provides the data to the lab. We also make prototypes and help Data Scientists assemble data marts, because we know how to combine data effectively. We support each prototype until our clients develop a fully fledged data mart.
– What else did you learn at Sberbank?
– Negotiating. Sberbank is a large company that employs a lot of people. I quickly learned to interact with other teams, security and administrators. Discovering something interesting and making use of it requires negotiating. Teamwork is key at Sberbank, and it’s also something you need to learn, because distributing and delegating tasks is not at all easy. But most importantly, you need to have the right mindset. If you have an enquiring mind and want to get to the bottom of things, you’ll be fine. Those who aren’t ready to go the extra mile leave the company in no time at all.
My effective work habits:
- I always finish what I start;
- I dive deep and figure out how processes work;
- I am eager to learn what’s going on in related areas, to see how they are organized, to borrow ideas;
- I’m always looking to improve my skills and gain new ones: I read books and develop both soft and hard skills.
– Are there any internal courses where you can improve competencies?
– There is the Academy of Technology and Data, which develops courses for d.people, that is, people who work with data. You can study data science, all sorts of algorithms, statistics. There are also good data science courses on the Coursera and edX online platforms. Or, if you are interested in related areas, you can immerse yourself in blockchain technology, which Sberbank has now also started to use; there is a specialized laboratory and specialized courses. There are also AI (artificial intelligence), Internet of things and robotics laboratories. So you can go beyond data science and become a specialist in these areas.
I am a nerd: I always read articles on habr.com as well as articles by our contractors, such as Oracle and DIS Group. I learn to use new tools; for example, what OpenStack is, how to use it and what benefit it can give us. I completed a project management course in line with IPMA standards and received a certificate. I like two MIT Python programming courses on edX. They are 101 courses that explain the basics, including the theory of probability. I took many courses at Sberbank, on product ownership, public speaking, risk management, critical thinking and emotional intelligence. We have a system called Success Factors, where you choose from the courses available to you and sign up. Some of them are taught at the Corporate University and some are online, which is even more convenient. I recently enrolled in a finance course. It’s not exactly what I work with, but I’m trying to get to know finance better. When you start small, just with the basics, everything seems logical and clear. The same is true for maths: when you don’t understand what n-dimensional spaces are and why they are important, they make zero sense to you. But little by little it all becomes clear.
– What brought you into this area? Were you born into a family of mathematicians?
– My mother is an economics, finance and legal specialist, three in one. My sister is a linguist. My mother’s family are doctors, though my grandparents were engineers. I have always been interested in maths; back in school I liked to solve maths problems and take part in contests, because maths is clear and logical, unlike linguistics. Take stylistics, for example: how am I supposed to know which register a word belongs to? Or in our literature classes we discussed what the author wanted to say with his work, and our teachers seemed to have a different opinion. I always wanted to ask them, ‘How do you know that? Did you meet the author in person?’ Soft sciences leave a lot of room for interpretation, while maths is more about ‘either, or’. However, there is an opinion that to solve a dead-end problem you need to step back and take a wider look at it. It doesn’t always work for me; it takes time to learn to do it.
– What are your career plans and expectations?
– I like being a product owner working across various technologies. I love working with data and learning to use new platforms. I’m excited about programming languages and global data architecture in general. I do not consider myself a narrow specialist, such as only an architect or only a data scientist. I’m into multidisciplinary kinds of jobs. If I could, I would be doing blockchain, the Internet of things and robotics at the same time. I also admire Herman Gref’s strategic vision; I would love to learn to recognize potential development vectors and to know what my team should achieve in a given period. Russia offers a lot of opportunities to professionals in my area. We have great programmers, developers, a lot of bright minds. To give you an idea, FindFace (a facial recognition service algorithm) was created by my university friend Artem Kukharenko. He figured out how to optimize algorithms to make them work faster and more efficiently, and became famous.
– What are your hobbies outside of work?
– I do sports and fitness and go to hustle dance classes that take place by the river. I go swimming and, when I can get away from an interesting task and take a vacation, I travel. I want to learn to surf.