TechTalk ep6: And then there was Data

Paramita: Hello and welcome to PwC Luxembourg TechTalk. On today’s episode, we speak with Thierry Kremser, our Data and AI leader. We’ll talk about data analytics and how it’s impacting our business but also about a new philosophy that says all data should be public. So let’s jump right into the conversation.

Paramita: Hi Thierry.

Thierry: Hi.

P: So we'll talk about data analytics of course today. And if you have seen our podcast page we have spoken about data privacy before with Frédéric Vonner. We just did an episode on digital trust with Greg Pitzer. And now we have you for data analytics. So I think it's really…

T: Three combined and interrelated topics.

P: Yes. But to start with… a very basic question: what is data analytics?

T: It's very basic and it's also not easy to answer… so I'll try. So I think the purpose of data analytics is to gain insight from all the data we produce and we have. So it's a set of techniques to collect data, to manipulate, prepare, visualise and report on the insights that we could get from this data.

P: We hear data analytics and AI regularly in the same context. Is there a difference between the two? Is there a connection between the two?

T: Yes there is. AI is let's say the most advanced part of data analytics that uses some mathematics, some algorithms to solve problems to get insight from data. So you can use many types of algorithms and some are known nowadays because there is a lot of news and buzz in the press around these words. So one of the algorithms is deep learning which allows a system to learn by itself what's in the data. There are other algorithms like algorithms to analyse images. All types of problems let's say have a solution in terms of mathematics and algorithms. You have some for analysing texts, you have some for analysing more structured data like databases, you have some to analyse images. And all these methods, all these techniques are used to solve different problems.

P: I have a question for you. Before we started our podcast one of my colleagues he when he heard that we'll have you today, he wanted to ask a question. So we just spoke of AI and algorithms and how they are connected with data processing and everything. And you yourself spoke of you know buzzwords. So his question was because he has a background in sociology…

So his question was: is all the buzz around so-called AI justified at the moment? He says I'm coming from a sociological background. And at university we used to work a lot with regression analysis and structural equation modelling to predict people's behaviour. Isn't what we call A.I. nowadays more of an automated version of that only with more computing power?

T: Indeed we could say so. I think AI is still… there's still a lot of research. So it's not a completely mature domain or technique. AI is an ever-evolving subject and of course computing power has greatly helped in the development of AI. Not only computing power but also something new which we didn't have 20 years ago is the amount of data that we are collecting on the Internet on all our I.T. systems, in cars, in our everyday life. We have electronics and we are producing a lot of data. And of course these mathematics are not necessarily new but they have been fine tuned and using the calculation power, storage power and all the data available to make it now much more powerful than in the past.

P: I think the word intelligence was probably his... So can we call it really “intelligence”. You know I had this episode… our very first episode was on AI and we spoke of general AI… the general artificial intelligence is like real… It resembles real human intelligence and probably when we speak of AI now and automated things we don't really speak of general AI. I think that was your question?

T: Okay so it's a good question. I would like to return the question asking what is human-like intelligence? And if I read the book of Harari “Homo Deus”, Harari challenges the fact that we have a soul. And basically he demonstrates or tries to demonstrate that our brain is just a set of algorithms, very old algorithms which allowed us to survive in the old days. And so the point of these algorithms is to tell you OK if you are in this situation you do like this. And then of course you have different algorithms possible and depending on your history, culture etc. you may favour some different algorithm. At the end of the day, as long as it's only algorithms, it's manipulating data, the mathematics and the artificial intelligence can reproduce that. Maybe not now but at some point in the future they will evolve to a point where they will be better than us. And they are already better than us in many areas. Of course they can do things that our brain cannot do already today like analyse a big set of data, find a needle in a haystack…

And today fortunately these algorithms are not replacing the brain but are rather complementary because usually using artificial intelligence you need at some point a human intervention to confirm and to give guidance to the machine. It's not always the case. There is also this learning in artificial intelligence… there's a long learning phase. Usually you have a long learning phase on a big set of data. And in this long learning phase, you tell the machine OK this is what you need to find. This is correct. This is good. This is not good so that the machine can step by step come to the conclusions you want the machine to make. Now there are also machines which can learn by themselves. And this is really impressive. Like I think it was like 20 maybe 15 years ago there was a famous book on the industrial revolution saying that OK there will be a limit to what computers can do. And it was a very famous book in 2005-2006. The limit was you cannot drive a car. A machine cannot drive a car. And of course six or seven years later Google came with this autonomous car. And if we look at now all the electronics we have in a car we are not so far from this and if you look at what you have in a Tesla it's almost an autonomous car. It's just not activated because it's not allowed. So it's evolving very fast and I think that the gap between human brains and the machine is going to reduce to a point where…

P: Singularity?

T: Yeah… maybe. So what's the future... I don't want to imagine the future but if you want to read the book Homo Deus...

P: I will… I have started with Sapiens because that was the first one so yeah… Thank you I hope this answers some of your questions Ralph.

My next question is so OK we understand what is data analytics… it's processing and mining and collecting data. So how is it impacting business? Where does it come into the business spectrum?

T: So there are different use cases and applications of these techniques. In some areas, it is very mature. If you look at e-commerce and you look at some big e-commerce providers, of course they analyse everything you are buying, everything you're interested in. And thus have been able to propose very relevant articles to buy for years. So this is an area where artificial intelligence is very mature.

You have other areas like looking at banking, finance where it’s also taking off. You can now invest… ask a machine, an artificial intelligence to manage your investment. This is real. This is happening today. My bank is proposing this service. You can also use these techniques to detect fraud. Because machines are able to detect fraud in a way that humans cannot. And let me explain. For years, detection of fraud has been focused on a set of rules like OK if you have transactions above a certain amount or if you have transactions which are unusual because of certain criteria, you need to flag these transactions so that we can assess whether it's a fraud or not. And of course the fraudsters, the bad guys, as soon as they understand these rules, they can try to bypass the rules. And they invent new techniques. With artificial intelligence you are able to monitor the transactions in a way where there are no rules so you can just ask the algorithms to identify automatically what is normal behaviour and what is unusual. And as soon as you detect something unusual, something new, you can ask a human OK I have discovered a new pattern, a new element.

P: You can create an alert.

T: Exactly and you can create an alert. And by combining these rules with these new alerts, you're much stronger in terms of fraud detection.
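As a rough illustration of the idea described above, combining a fixed rule with an automatic check for unusual behaviour, here is a minimal sketch. It is not the method used by any bank or by PwC: the threshold `AMOUNT_LIMIT`, the function names and the sample amounts are all hypothetical, and a simple statistical outlier test (the modified z-score based on the median absolute deviation) stands in for a learned model.

```python
from statistics import median

AMOUNT_LIMIT = 10_000.0  # hypothetical fixed threshold for the classic rule


def rule_flags(amounts, limit=AMOUNT_LIMIT):
    """Classic rule: flag every transaction above a fixed amount."""
    return {i for i, amount in enumerate(amounts) if amount > limit}


def anomaly_flags(amounts, cutoff=3.5):
    """Flag amounts that are far from the usual behaviour in this history.

    Uses the modified z-score built on the median absolute deviation (MAD).
    This is a simple stand-in for a learned model, not a real fraud engine.
    """
    med = median(amounts)
    mad = median(abs(amount - med) for amount in amounts)
    if mad == 0:
        return set()  # no spread in the data, nothing can be called unusual
    return {
        i
        for i, amount in enumerate(amounts)
        if 0.6745 * abs(amount - med) / mad > cutoff
    }


def alerts(amounts):
    """Combine rule-based flags and anomaly flags into one sorted alert list."""
    return sorted(rule_flags(amounts) | anomaly_flags(amounts))


# Hypothetical transaction history for one customer.
history = [42.0, 38.5, 51.0, 45.2, 40.0, 2_500.0, 47.3, 12_000.0]
print(alerts(history))  # indices of transactions a human should review
```

In this made-up history the 2,500 payment stays below the fixed limit, so only the anomaly check picks it up, which is the point made in the conversation: the two approaches together catch more than the rules alone.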

You also have plenty of use cases linked to the analysis of texts. So now you have Natural Language Processing algorithms where you can analyse texts. Last weekend we had a team participating in the Game of Code in Luxembourg. So we have a team of people who spent 24 hours trying to solve or to develop a solution for certain questions. And they worked on all the material of the Luxembourg National Library. So all the archives and all these very old books and articles and documents they had.

So what they did and this is only in 24 hours they could use some Natural Language Processing techniques to analyse all the texts, all the documents. So that then you could research things like OK tell me of all the documents talking about diseases in the 19th century. But more than that they could do an analysis and tell you, year by year, looking at all the diseases mentioned in all the articles during one century, they could tell you, year by year, how many diseases were observed that year. So that they could inform for example someone doing some research in history of medicine. They could tell you OK this year there was an epidemic of flu or diphtheria or whatever and they could do that in 24 hours using cloud-powered computation.
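To make the year-by-year disease count concrete, here is a minimal sketch. It is not the team's actual solution, which is not described in the conversation: it assumes the archive is already available as (year, text) pairs, uses plain keyword matching instead of real Natural Language Processing, and the `DISEASES` list and sample documents are invented for illustration.

```python
import re
from collections import Counter, defaultdict

DISEASES = ("flu", "diphtheria", "cholera")  # hypothetical keyword list


def mentions_by_year(documents, diseases=DISEASES):
    """Count how often each disease keyword appears, grouped by year.

    `documents` is an iterable of (year, text) pairs. Matching is
    case-insensitive and restricted to whole words.
    """
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(d) for d in diseases) + r")\b",
        re.IGNORECASE,
    )
    counts = defaultdict(Counter)
    for year, text in documents:
        for match in pattern.findall(text):
            counts[year][match.lower()] += 1
    # Return plain dicts, ordered by year, for easy reporting.
    return {year: dict(counter) for year, counter in sorted(counts.items())}


# Invented sample documents standing in for digitised archive articles.
archive = [
    (1854, "Cholera reported in the south; more cholera cases feared."),
    (1854, "A mild flu season."),
    (1889, "The flu spreads across Europe. Diphtheria also observed."),
]
print(mentions_by_year(archive))
```

A real pipeline over old newspapers would also have to deal with scanning errors, several languages and historical spellings, which keyword matching does not handle.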

So you can imagine all the implications of that in many use cases.

There are some other use cases in policymaking. Today, if a politician wants to work on a policy, on proposing something, he will ask some experts their views, what to do, what's your opinion, what do you think of this or that. But he will have very little information on the facts, on what's exactly on the market or what the citizens or the world are saying about environment, talking about economy etc. But the amount of data that he can get access to, to make decisions is somehow limited. And now we can easily take the web, take the social networks and grab the information and provide statistics, elements that the politician wouldn't have access to. And we can also monitor the effect of the policy after its implementation. We have tools now to monitor also the social networks so that we can tell a politician and we have a solution, there are solutions in the market and we have implemented such solutions. We can tell people OK look this is when the policy entered into real life. Look at what's being said on the network. Positive, negative. What is the buzz around it. What are the profiles of people talking about…

P: Yes, we’re talking about digital intelligence.

T: We do have solutions. So you see we could talk for hours on all these use cases.

P: Yeah. And I want to come back to the policymakers having access to our data because you know it's… it can be controversial if you think about it. And I want to come back to that and I want to pick your brains on that and what do you think about it. But before that, where is the entry point of data analytics in business? I know that we are doing a kind of a survey now in Luxembourg… do you have an idea where we stand right now in probably in Luxembourg or in Europe as to… do businesses have proper data analysis teams, are they prepared for large, big data processing?

T: I think there is not a single organization or company today that is not at least thinking about data analytics and artificial intelligence. But if you look at the results of the last CEO survey that we did on a set of Luxembourg CEOs, there's clearly a huge gap still between the needs of the CEO and the needs of a company, of the organization and what they actually get. There's a huge gap in terms of understanding of their clients’, their customers’ needs. There is a huge gap in terms of reporting, financial reporting. So there is a gap. Now, this being said of course there are domains as I said earlier like in retail where these techniques probably are more mature than in other domains. But I think in all industries there is clearly a need and some investments made in data and artificial intelligence. The point is that artificial intelligence requires some maturity in terms of skills, skill set but also in terms of data available. And the problem of these companies and organizations is that they have a lot of data but the data is not necessarily properly stored, linked, interfaced or it has a lot of quality issues.

So I think most of the organizations are working first on trying to clean these, all these big amounts of data…

P: What do you mean by clean?

T: Cleaning meaning you have redundancy, records that you do not connect… you know, like some TelCos (telephone companies), they have worked for years on being able to identify that they have the same client on their mobile and on the fixed line… these kinds of things.

So preparing, cleaning the data is still an issue for many companies.
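The TelCo example, recognising the same client in the mobile and the fixed-line systems, can be sketched as a small record-linkage step. This is only an illustration under strong assumptions: the field names (`id`, `name`, `postcode`), the matching key and the sample records are hypothetical, and real customer matching usually needs fuzzier comparison than an exact key.

```python
def normalise(record):
    """Build a comparable key from name and postcode (hypothetical fields)."""
    name = " ".join(record["name"].lower().split())  # lower-case, collapse spaces
    postcode = record["postcode"].replace(" ", "").upper()
    return (name, postcode)


def link_customers(mobile, fixed):
    """Return (mobile_id, fixed_id) pairs that share the same normalised key."""
    fixed_index = {normalise(record): record for record in fixed}
    links = []
    for record in mobile:
        match = fixed_index.get(normalise(record))
        if match is not None:
            links.append((record["id"], match["id"]))
    return links


# Invented records: the same person typed differently in two systems.
mobile = [
    {"id": "M1", "name": "Anna  Weber", "postcode": "l-1234"},
    {"id": "M2", "name": "Marc Thill", "postcode": "L-2345"},
]
fixed = [
    {"id": "F9", "name": "anna weber", "postcode": "L-1234"},
]
print(link_customers(mobile, fixed))
```

Normalising before comparing is what lets the two differently typed entries for the same client line up, while the unrelated mobile customer stays unlinked.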

Then once you have a good set of data, you can of course start to analyse and just maybe the easy thing is to visualize, to produce reports so that already you have reports and visualizing. And you don't need artificial intelligence for that you use the brain, the human brain. And with that you can already get a lot of insight.

Now going to really artificial intelligence, it's just starting in most companies. So it's still more research, a few proof of concept, a few prototypes being developed but it's not very mature yet.

P: And how can they develop that? How can you organize a proper data team?

T: There’s probably not one answer to your question. I strongly believe that the artificial intelligence practice is going to be a commodity in a few years. I don't think we should consider these techniques are limited to a set of you know professors… And I believe that we need to democratize the data analytics and artificial intelligence techniques across the company. I think for me this is probably the most important step to really take this movement forward.

So it means different things. I think it means that people, everybody should be able, for me, in the company to use some basic techniques of data analytics wherever they are in the same way as they are using today Word, Excel, PowerPoint...

P: But would we need some kind of a special training for that?

T: Absolutely. So you need to do upskilling, you need to train people, to provide them with some tools, some new tools so that they can, on their own, manipulate the data, visualize the data, produce reports and why not start to do a little bit of analytics on the data. Of course this doesn't mean that you don't need a team of experts anymore. You still need a team of experts. It's just that today these experts are working on almost everything around data, you know, working on data quality, working on all the steps. I think these teams should focus on training, as we just said, training the organization. They should focus on putting together the policies, the governance around these practices so that of course you can decentralize the activity but at the same time you need to make sure that you have a certain number of rules or policies which guarantee or ensure the quality of the data and the relevance of what these people are doing.

And finally, I think that these teams, these expert teams should have a set of algorithms, artificial intelligence algorithms which they can use to work on the most complex cases.

So if your people are trained using basic data analytics and they will also you know say yeah but if I can do this I could also do that. Oh it's a little bit complex I cannot do it. So then they will call the experts and say look I've done this but I think we could probably do this and that. And then they can give these to the expert team which will be able to hopefully find a solution and give it back to the teams. So by doing this you also increase a lot the number of innovations that you can bring to a company or an organization. Not just you know management giving a few specific use cases to work on but also having everybody able to say look we've been doing this for years but we could probably automate that doing something different.

And if these people don't think about it don't have this in mind we just miss you know the opportunity to improve the efficiency of their process, their work and everything they do.

P: But isn't there a risk when we're talking about democratising data within a company? Isn't there a risk when people like non experts they're handling or processing sensitive data?

T: I think a lot is already done using Excel. You cannot imagine how many processes in companies, even in big companies, where at some point there’s something prepared in Excel by a human without necessarily controls or whatever. And that's it.

So I think yeah I understand the risk and I think… We have launched this at PwC you know. So we've launched this practice of really decentralizing the data work to everyone in the organization. We have upskilled like 400 people already and the outcome is great. We have many new ideas coming and we have people who are not dependent on the experts to do already a lot of things… the basic things.

What's the risk? Of course, there's a risk but this is why you need also proper governance and a set of rules to make sure that you limit the risk or you control what you're doing.

And at the end of the day I think it works very well.

P: Like with the introduction of any new technology...

T: Yes exactly.

P: Where do you think this all is heading to. I mean where do you see the future? I remember when we spoke in your office you were talking about GDPR and how GDPR is actually seeming to be a... Could you, could you please elaborate a little bit on that?

I mean why do you think GDPR looks like a constraint and not a…?

T: So GDPR… I won’t repeat the excellent podcast you did with Frédéric Vonner some days ago… So indeed GDPR protects the privacy of citizens in Europe.

So that's good. And we need it. But at the same time of course if you want to do Big Data analytics, artificial intelligence, you need data and the more data you have let's say the better the solution or the outcome.

So to be able to innovate and test new things you need the data. But of course now in Europe we are limited because the data cannot be used for anything else than its initial purpose.

So you cannot easily say OK I will take the data from the bank, combine it with the data from your telecom operator, mix it together to produce some insights for the bank or for the TelCo. You cannot. It's not allowed. So of course this is limiting the number of things European researchers and engineers can work on to develop their technology and their solution to run data analytics. And then, if you look at China and to a lesser extent US there's no GDPR but it's coming… in the US coming. But in China at least I mean it's open everywhere. So if you are registered to a grocery store then you can use the same registration to pay at the bank. You know everybody can share information and you know the State is now…

P: But what if I don't want that to be shared?

T: You cannot do anything in China and it's worse because now you know the State is going to scrutinise the citizens to kind of measure their level of… are they let's say compliant with the morality of the Chinese government…

So this is scary of course. And it may of course evolve in the future. So I'm not challenging GDPR. But at the same time, it is indeed putting barriers to the development of the artificial intelligence industry in Europe.

And this is a big risk because Europe is already somehow late on many technologies compared to the US or to China looking at production of phones, looking at computers, looking at big software companies. We have a few. We have some in Europe fortunately but most of these companies are not European anymore.

So this being said, I think that politicians have understood the importance of artificial intelligence in Europe. They are investing a lot. In France, President Macron has announced €2 billion investment in artificial intelligence. I think Germany has announced €3 billion.

So clearly, it is a hot topic on the agenda of our politicians. So what can they do? First we could say that they could focus on the specificities of the European market and industries and I’m thinking particularly of the industrial sector because there are a lot of use cases in the industrial sector using data analytics like preventive maintenance where you’re able to detect that parts of an aircraft or in a car are maybe about to fail and need to be maintained and replaced. Using predictive analytics, this is something which is possible.

I think Europe should focus maybe on these areas where Europe is already very strong like in the industrial world where with IOT and all these things to develop some solutions and patterns around artificial intelligence in this world.

At the end of the day, what will be the future? I hope that Europe will be able to catch up with the other countries and I am confident that… because at the same time there’s a lot of universities and researchers and students… so there’s a lot happening there… so we just need to make sure we invest and focus on the right use cases so that we can develop our own expertise and lead some specific domains.

P: I completely understand what you’re saying and the possibilities that free flow of data can have. For example, in medicine, I was reading if you have a… they do it even nowadays, people with kidney problems they have a sugar pump that takes the level of sugar and so if you have an artificial intelligent pump… it can collect data from people who are sick and match those data and find patterns and do predictive analysis and come up with a solution. Which is of course absolutely brilliant. And like you said, more the amount of data, better it is to come up with patterns and do predictive analysis. But then again, when you talk about predictive analytics, we were talking about policy makers some time back and you know there are rumours about what happened in the US… some say that they probably used data from Facebook to manipulate voters. So where do we draw the line?

I asked the same question to Fred and he said that awareness is important… I mean how much data we give up. But then you know we talked about dataism, this new philosophy that says that all data should be public… So where do we draw the line?

T: I don’t know if all data should be public. I read a book called The Circle… in this book, there’s the story of a tech company like Google. And this company is trying to make everything open to a point where they can follow a politician 24/7. So there’s a camera with the politician… every time the politician is talking, doing anything, everything is directly published on the internet. And it’s very scary and actually the book is very scary. I don’t think we’ll end up in a world where everything is open. There is a risk of social networks and manipulation of elections and it has proven to be more than a risk. It is a big threat. But I think, as Frédéric said, awareness is there. Awareness of politicians and I think they are creating laws and they are putting pressure on companies like Facebook and others. And at the same time I think there’s more and more citizens are also aware. So GDPR is already supporting that and I think more and more citizens are now cautious of what they read on the internet because the amount of fake data, fake news, manipulation is increasing.

Now coming back to your point on dataism and as you’re reading Sapiens, I don’t know if you’ve read that, the thing which really for me is interesting is that humans created calculations first. The first thing we started to write wasn’t text. It was numbers and basic operations to be able to manage a huge amount of data which they could not manage just with their brains. Like manage taxes, agricultural problems, manage populations like urbanism and things like that. So they first created mathematics before creating “writing”. I think this also demonstrates how data management, mathematics is important to the development of humankind. So today we are reaching another level of mathematics where we are analysing automatically, learning automatically and making decisions without human brain intervention using artificial intelligence. For me it’s just kind of normal evolution. Now is it a risk for the future? I would say let’s see… it’s hard to predict…

P: I hope we could do a predictive analysis on that…

T: We should but we can be confident that politicians, citizens, looking at GDPR, looking at all the pressure on Facebook to avoid fake news being circulated, I think all of this is showing that while AI is developing we need to regulate this market. And I think for this part, Europe is ahead.

P: Unfortunately, we have to stop here. I really want to continue the discussion because it’s so interesting. Thank you so much for being here.

T: Thank you. It was a nice moment.

P: I hope that we can chat sometime in the future.

T: Whenever you want. Thank you.

P: What an interesting conversation that was. I hope you enjoyed it. And don’t forget as usual to comment with #PwCTechTalk and I’ll catch you next time.
