Should We Be Worried About AI and Machine Learning? | Azeem Azhar, Schibsted Media | BoS Europe 2016


Azeem Azhar, Product Manager, Schibsted Media Group

It is early days but change is coming. Faster than you think. You should worry, but are you concerned about the right things? Should you concern yourself with terrible tales of ‘The Singularity’, or with how AI and Machine Learning will change the way that your business runs?

Slides, Video, & Notes below

Slides of Azeem’s talk at BoS Conference Europe 2016 here

Video

Come to a Business of Software Conference in 2018

BoS USA 2018 -or- BoS Europe 2018

Don't Miss a Thing - Get BoS Updates

Want us to let you know about new talk videos, speaker AMAs, Business of Software Conference and other event updates? Join the smart people who get BoS updates. Unsubscribe anytime. We will never sell your email address.

Notes

  • Boom or Winter?
    • What is AI?
      • umbrella term for a way of solving problems
      • ANI (artificial narrow intelligence) – Siri
        • This is where we are… with a few things happening in AGI
      • AGI (artificial general intelligence) – humanish
        • Nowhere near this… but making progress.
      • ASI (artificial super intelligence) – What happens when machines can make better and smarter machines?
    • Umbrella Term…
      • Machine learning, NLP, symbolic reasoning, control theory, deep learning, knowledge representation
    • The Money is Flowing
      • 7x increase of VC in 5 years. This can be a boom or a bubble… there’s at least something going on.
      • People think there is going to be a breakthrough… then it disappears off the agenda and retreats back into university lab. Every 10 years we go from boom to winter and in every boom we think there won’t be a winter.
    • The Big Guys are Muscling In… Google, FB, MS, Apple, Amazon, Salesforce, IBM, etc.
      • “I think we aren’t going to see a winter because there are commercial big dogs + investments, with customers driving things.”
  • Accelerants
    • 7 Accelerants
      • 1-3: Development of core technology (compute, data, collaboration)
      • 4-6: Structural changes in the software industry.
      • 7: The previous 6 build up and converge into a lock-in where AI is a baseline and not a competitive edge.
    • 1. Moore’s Law
      • We create more transistors every 0.1s than there are stars in the galaxy
      • The # transistors doubles as price halves…
      • Moore’s law is coming to the end of its applicability
      • It’s worth pausing and thinking about what this means
      • Apple Watch is more powerful than a Cray 2
        • 1985: Cray 2. Fits in a room. 1.6 GFlops. $15 million.
        • 2010: iPhone. Fits in pocket. 1.9 GFlops.
        • 2015: Apple Watch. Fits on wrist. ~3 GFlops.
        • When you carry your laptop with a tablet and phone stacked on top (while balancing your coffee on it)… you have more power than the entire US military and academia had.
    • 2. Data Explosion
      • 44 ZB of data per year by 2020 – 44x in 11 years.
        • More data drives the ability to apply these exponential learning methods + maths. High processing power is also required to achieve this.
    • 3. Collaboration
      • The pace of learning used to be very very slow.
      • MUCH MORE open source and sharing of ideas, code, etc.
      • We are standing on the shoulders of giants.
    • 4. Software eats the world
      • A lot of things we thought were NOT software problems are now software problems. This is happening in domain after domain after domain.
      • Problems that need solving provide opportunity for AI to help solve.
    • 5. APIs & Microservices
      • A lot of software used to be monolithic insane thick code…
      • Now: microservices, APIs, clean semantics… this can be optimized, which makes a lot of more smaller problems for solving & optimization. AI can help make sense of this.
    • 6. Open Source
      • Major global companies release internal machine learning & AI frameworks.
      • Google, MS, FB: cognitively rich API services available.
      • Anyone can use these resources to grow and improve!
      • Cloud Vision, created by Google and used internally for its own services, is accessible to us.
    • 7. Lock-in Loop
      • AI… improves product … PRODUCT
        • PRODUCT … creates data… DATA… improves AI… back to AI
        • PRODUCT … increases profits…. PROFITS… invested in improving AI … back to AI
      • In more industries… this is becoming a base-level expectations…
        • E.g.: Video games, search, mobile UI, defense, finance, CRM, health, etc.
      • These things drive a reinforcing cycle
  • Implications
    • 1. STUFF IS GETTING BETTER!
      • Better than human:
        • Object detection in images
          • error rate for categorizing images… from 2011 to now, machines are better!
        • Wide range of Video Games
          • … computer will beat humans in certain genres really easily
        • Chess
          • 2500+ Grandmaster
          • Stockfish (computer engine)… better than the best human
          • Even more important… these systems can play each other
        • AlphaGO
          • A computer beat a human a few years before scientists thought it would happen
      • Good as Human
        • Verbal IQ: measuring an AI system’s performance on a verbal IQ test for young children… ConceptNet scores like an average four-year-old (VIQ ≈ 100)
        • Suturing Intestinal Tissue
        • Predictions & Betting: Unanimous AI… predicted top 4 finishers in 2016 Kentucky Derby. 540-1 odds!!
      • Approaching Human
        • Visual Q&A
    • 2. Applications are Getting better!
      • Better foundational blocks > Integrated Applications
        • Blocks: NLP, CV, APIs, uSVC, Data Eng, Data Quality
        • Applications:
          • Customer Service: one solution charges ~$120k/year for 1M customer interactions…
          • Querying: Viv (from former Siri guys)
          • More interactions on mobile are being driven NOT by touch, but voice + AI
          • Legal
          • Engineering
            • Company does inspections of pipes to look for metal fatigue
        • Other:
          • Q&A; discovery; precedent detection based on IBM Watson
          • Tesla: Autonomous vehicles; networked learning system… every Tesla learns from every Tesla
          • Emma: autonomous writing tool
          • Xi: NLP-rich scheduling & calendaring assistant
    • Actions
      • Play or Pray
        • When you build it in… the system learns from itself.
        • People learn from when AI doesn’t work, then feed that back into the system.
        • It’s so easy to do a basic integration… why wouldn’t you?
      • PLAY
        • Build a data moat
          • Core algorithms / frameworks are in open source
            • You won’t be able to construct a better algorithm than the open source solutions…
          • Unique AI needs unique data
            • You need to differentiate on the data you have… likely your customer data.
          • Build a data acquisition strategy
            • How can I build this strategy that gives me data that other people don’t have.
          • Example: Clarifai does image identification as a service
            • The more customers they have, the more data they have; the more errors get caught and fed back, the more accurate the system becomes.
            • “network effect” of data
        • Clean APIs / Semantically sensible interfaces. How will you ensure coherent semantics in exposed interfaces?
          • Easier to integrate in other intelligent systems
            • Good architecture makes it easier to integrate into other systems both ways.
            • Viv allows anyone with an API to integrate into AI
          • Easier to optimise
          • Narrow AI
        • HUMANS NEEDED! Keep the human in the loop.
          • Ongoing training
            • Your system will continue to make mistakes.. humans still needed :-)
            • 94% correct still means 6% wrong. This facilitates retraining and better data for learning.
          • Last-mile
            • Many of your systems are interfacing with the real world.
          • Risk management
            • Who is going to be responsible when these things go wrong?
        • Processes not oneshots. Build the right processes
          • Ongoing training investment
          • Understand your tradeoffs
            • All systems have tradeoffs. There will be false positives and negatives… and machines can’t evaluate the tradeoffs… they can only enact the laws we provide.
          • How to secure better data.
  • Should you be scared of AI?
    • Huge opportunities to build better products & reach new markets
    • Competitive threat if you do not innovate
    • Expensive
  • Your View?
    • Gartner Hype Cycle curve
  • Questions
    • As data becomes a form of capital… this is really scary for small business owners… do you have any hope for me?
      • Big companies have a big data advantage in SOME domains, but not in others… like YOUR customers and YOUR marketing funnel.
    • What about ethics?
      • People outside technology especially freak out.
      • Doing experiments for your business are cheap… then you can make a decision for build vs. buy
      • TensorFlow


Transcript

Azeem Azhar: Thanks, Mark! Brilliant! So I think I might start off – can you hear me at the back? Yeah? Perfect!

My name is Azeem and I work at a company called Schibsted Media Group, which is a large Norwegian media company with 7,000 employees in 29 countries. I also run a newsletter called Exponential View – the URL is there if you want to sign up. We cover things to do with AI and some other bits and pieces. And you can find me on Twitter, and my email address is there.

So Mark asked me to talk about AI, and because this is the afternoon slot, just after lunch, the too-long-didn’t-listen summary is: needed a nap [laughter] – so you can now tune out. But just to give you a sense of the structure of the talk, in case you do need a post-lunch nap and wake up thinking ‘where am I?’, there are four sections. A quick discussion: is there a boom or winter in AI? The second section: what are the accelerants of all of this? The third: what are the implications? And the fourth: what kind of actions can you take? Tiny bit of housekeeping now – can we start a timer so I know where I am? That would be great!

What is AI?

So let’s kick off with what is AI? We’ve probably all read about DeepMind and what’s going on, and it’s worth framing out this map, which is that AI is essentially an umbrella term for talking about a way of solving complex problems. It ranges from ANI – artificial narrow intelligence, which is stuff we’re familiar with because we’re interacting with it quite frequently, like Siri – through to AGI, artificial general intelligence, which is human-ish: an AI system that can do most of the things a human can. We’re nowhere near this generalised AI; we’re making progress, and we’re closer to it now than we were 10 years ago, but then we are probably as close to it now as we thought we were 10 years ago. And then ASI, which is this idea of artificial super intelligence: what happens when machines can start to make better and more clever machines, so an evolutionary paradigm takes over. Where we are today is the green blob here, with a few things happening in the labs, with people trying to figure out the problems of generalised learning, which we as humans are quite good at.

AI is a really broad umbrella term that’s really helpful for marketers and investors as a way of amping up their companies. Within that term it encapsulates a lot of things we might have given different names to: NLP, symbolic reasoning, knowledge representation, control theory, and the other big ones you’ve heard about, machine learning and deep learning. But essentially it’s an umbrella term. Not everything that uses machine learning is going to be an AI system, and not everything that tries to be an AI system will have machine learning in it. It’s a bit complicated and diffuse; I don’t think it’s worth having a tight definition, other than to be a bit sceptical and ask the five whys when you come across something.

So there is a lot of money flowing into AI at the moment. This is a chart from CB Insights, and there’s been a 7-fold increase of venture capital investment in the last 5 years in what they call ‘artificial intelligency’. That is not my typo – I’ve taken it from their website, I’m pleased to say [laughter] – but there are my own typos later on. There’s been a tremendous increase, which, as many of you who are seasoned in this industry will know, can mean there’s a boom or a bubble. In 5 years’ time there will be people who don’t get seats when the exit musical chairs stop, but there is definitely something going on and we will start seeing value come in. When you think about AI, it’s often a story of what are called AI booms and winters: people think there’s going to be a breakthrough, we try and try and we don’t have one, and it disappears off the agenda except in the university labs. We seem to have come out of a winter and into a new spring, and we think it’s different each time. I’m standing on this stage telling you it’s different this time, but you can read into that what you will.

The big guys are muscling in. Salesforce has acquired some companies that thought about machine learning and intelligence – one is RelateIQ and the other is PredictionIO. IBM has got this whole Watson division of cognitive computing services; they said they will put $1 billion into it. And of course Apple has invested pretty heavily in Siri, with the acquisition of that company and making it the default interface for the iPhone for the future. Google, Microsoft, Facebook and Amazon have all either put key AI technologies into open source or launched AI-based cognitive computing services as APIs in the last year or so. So there’s a lot of interest, and in that sense an inevitability about this stuff, and this is one reason I think we won’t see a winter: there are commercial investments by commercial companies, with customers using those things.

So what’s driving it?

I think that there are 7 real accelerants behind all of this that are part of a system that is getting bigger, better and faster. The first 3 are really about core technology and its development. The first is Moore’s law and the increasing price-performance of processing over the last 10-15 years. The second is the data explosion – the explosion in availability of data that we can use to train machines. The third is the very fast rise of collaboration, so that the researchers and engineers working on the first 2 can share their ideas with each other and learn and improve quickly.

Then there’s another set of 3 which has to do with structural changes in the software industry and the way in which software interacts with the world. The first is the idea that software is eating the world – I will dig into that later. The second is that software is increasingly designed in a modular, reasoned form, with microservices and APIs around it, and that, for a number of reasons, makes it easier for AI systems to be useful. And the third is the fact that the major tech companies are rapidly open-sourcing their core technologies and putting them out there.

And those layers build up to what we call lock-in, and the lock-in is essentially having AI become integrated into your product or business: if you don’t have it, you can’t compete. Therefore, everybody has to have it.

Moore’s Law

So let’s talk about Moore’s law. It’s a 60-year-old law, now coming to its end, which was the prediction that the available power in a CPU was going to double every 18 months at the same price. It’s a bit more complicated than that – it’s about the number of transistors you can put on a chip – but the fact that Moore’s law has run for 60 years and is only now nearing its end is just mind-boggling. Back in 1955, about the time Mark was starting at university, the average transistor cost about 10 bucks, and on the order of 10 transistors were made a year. By 2014, the average transistor cost one ten-billionth of a dollar and we were making 10 to the 20 of them – a hundred billion billion transistors. So we as humanity create more transistors every tenth of a second than there are stars in our galaxy. Just think back 60 years, to his teenage years: it was 4-5 transistors a year for the whole of our species. Something phenomenal has happened here, and we can talk about the death of Moore’s law, but it’s still going on for a little while longer and there are other types of improvement behind it. I know we’ve all been in the software industry for a while, but it’s worth pausing and thinking about what that means.
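
That arithmetic is easy to sanity-check. Here is a quick back-of-envelope sketch in Python using the talk's figures (the ~10^20 transistors a year and the per-transistor prices are the speaker's numbers; the ~10^11 star count for the Milky Way is the commonly quoted order of magnitude):

```python
# Sanity-check the transistor claims with back-of-envelope arithmetic.

SECONDS_PER_YEAR = 365.25 * 24 * 3600

transistors_per_year = 1e20   # the talk's 2014 figure
cost_1955 = 10.0              # ~$10 per transistor in 1955
cost_2014 = 1e-10             # ~one ten-billionth of a dollar in 2014

# Price improvement over ~60 years: eleven orders of magnitude.
price_improvement = cost_1955 / cost_2014

# Transistors made every 0.1 s, vs. ~1e11 stars in the Milky Way.
per_tenth_second = transistors_per_year / SECONDS_PER_YEAR * 0.1
stars_in_galaxy = 1e11

print(f"price improvement: {price_improvement:.0e}")
print(f"transistors per 0.1 s: {per_tenth_second:.1e}")
print(per_tenth_second > stars_in_galaxy)
```

At roughly 3×10^11 transistors per tenth of a second, the "more than stars in the galaxy" claim holds up, at least against the lower end of star-count estimates.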

This is the Cray 2. 1985. I was 13 years old and trying to get in to watch Top Gun, which was a 15-rated film. The Cray 2 is what I wanted more than anything in the world, apart from an F-14: 1.4 GFlops, and it filled a room. This is the Apple iPhone 4, which many of us have broken and discarded in our drawers: 1.9 GFlops, fits in my pocket. And this, the Apple Watch: greater than 3 GFlops, sits on my wrist. The Cray cost $15 million; the watch costs $300. It’s worth pausing and thinking about that. If any one of you has done that thing where you run into a meeting room with your iPad stacked on your laptop, your phone stacked on top of that, and a smartwatch – in your hands you’ve got more computing power than the entire US military and academia had in 1980, just as you run around balancing a coffee on top of it all. So I think it’s important to visualise how this rate of innovation has absolutely, fundamentally changed things in ways we haven’t thought about. I always ask: what are the analogies? And there are none we’re used to, unless you’re a cosmologist and we’re talking about the first seconds of the Big Bang.
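
Put those figures together and the price-performance collapse is easy to quantify. A quick sketch using the numbers quoted above (the talk's figures, not official benchmarks):

```python
# Dollars per GFLOP for two of the machines quoted in the talk.
cray2_price, cray2_gflops = 15_000_000, 1.4   # Cray 2, 1985: fills a room
watch_price, watch_gflops = 300, 3.0          # Apple Watch, 2015: fits on a wrist

cray2_usd_per_gflop = cray2_price / cray2_gflops   # ~$10.7M per GFLOP
watch_usd_per_gflop = watch_price / watch_gflops   # $100 per GFLOP

improvement = cray2_usd_per_gflop / watch_usd_per_gflop
print(f"price-performance improvement: {improvement:,.0f}x")
```

Roughly a 100,000-fold improvement in dollars per GFLOP over 30 years, which is the point of the slide.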

More recently, GPUs – graphics processing units, which are very good at parallelised operations – have been a step-change technology. On the left-hand side we’re showing the power and performance curve comparing, in green, the Titan X, which is an NVIDIA GPU, against the Intel Xeon, a high-end CPU. What you see, just in images processed per second, is that the GPU on a processor-by-processor basis is 6 times better than the CPU. In power efficiency – and we’re now getting to the stage where power, heat and dissipation matter – it’s again 6 times better than the CPU. What’s intriguing is what has happened on ImageNet, which is a benchmark for computer vision: how well does a set of algorithms identify what’s in a particular image? The error rate on ImageNet has completely collapsed, from 28% down to 7% in 2014. And at the same time – you don’t need an algorithm to spot this – the number of entries in the ImageNet competition that use GPUs has gone from 0 to 110. It’s the availability of GPUs, and their ability to run these machine vision algorithms and AI systems for computer vision, that has let people apply some great algorithms and drive the error rate of vision detection down from 28% to 7%. I will get back to that.

The Data Explosion

So the first accelerant was about Moore’s law, computing power and improving our computing efficiency. The second is that, like a child, AI systems and machine learning systems need data to learn from. Data sets were difficult to come by 50-60 years ago; today, they really aren’t. According to IDC, there are going to be 44 zettabytes of data produced every year by 2020. Now, a zettabyte is 1,000 exabytes, so a million petabytes, so a billion terabytes, so a trillion gigabytes – and that’s a 44x increase in 11 years. The reason is Instagram, Snapchat, YouTube, Facebook, computers generating data – and we haven’t even thought about the internet of things; I just put that up there. In case you want to use this graph: I have done the interpolation myself. All I had was a point up here and a point down there, and I drew the curve between them, so please don’t use it! That’s why it’s got ‘my analysis’ down there. As we know, there’s a huge amount of data out there, and it’s now available to train systems. Why is that important? Because to train an algorithm to recognise this as an ugly shirt, you have to show it 1,000 pictures of an ugly shirt and 1,000 of a nice-looking shirt, and it has to look at both of those – so you have to have 2,000 pictures, 1,000 of Mark [laughter] sorry, you gave me instructions to make fun of you [laughter]! But of course, having that data is what drives our ability to actually apply this maths in sensible ways. You need tons of data to drive these great maths, but if you have tons of data the maths runs really slowly, so you need super-fast processors that are also cheap in order to do it in any kind of economical way. And that’s the triumvirate; the pieces fit together.
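
The unit ladder in that paragraph goes by fast, so it is worth writing down. A small sketch using the talk's IDC figure:

```python
# Unit ladder from the talk: ZB -> EB -> PB -> TB -> GB (decimal units).
ZB_IN_EB = 1_000
ZB_IN_PB = 1_000_000
ZB_IN_TB = 1_000_000_000
ZB_IN_GB = 1_000_000_000_000

annual_data_zb = 44                       # IDC's 2020 projection, per the talk
annual_data_gb = annual_data_zb * ZB_IN_GB
print(f"{annual_data_zb} ZB = {annual_data_gb:.2e} GB")

# "44x in 11 years" implies a compound growth rate of roughly 41% a year.
growth = 44 ** (1 / 11)
print(f"~{(growth - 1) * 100:.0f}% per year")
```

The compound-growth line is the useful one: a 44x jump over 11 years is "only" about 41% growth a year, sustained.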

Collaboration

So the next component that I think is really important is the idea of collaboration. Again, 20 years ago, in a previous AI winter, the pace of learning was extremely slow: academics did their academia, they went through peer review, and it took a long time to get anywhere. Now preprints happen, people share at conferences, code gets put up on GitHub and people do clever things with it, there are competitions, and there’s a lot of discussion on Stack Overflow and so on. The result is that we’re standing on the shoulders of giants. A great example: how many of you saw Deep Dream – what happens when an algorithm dreams? That came about because Google released something and put the code on GitHub, and people started to play around with it. What Google had released lets you put some images into a deep learning network and pull out some crazy intermediate stages, and then some guys got together and connected that to video-generation software, which allowed you to feed a music video in, and the algorithm started to generate its own music video that looked as if someone was on an acid trip. So collaboration is, I think, really one of the important accelerants.

Then we’ve got these structural things happening in the industry. Marc Andreessen came up with this idea that software is eating the world. Show of hands: how many have heard that software is eating the world? Most of us. What he means is that lots of things we thought were not software problems are now becoming software problems. Running a mini-cab company, the way Uber does, is now as much a software business as it is about having wheels on the road, and that’s happening in domain after domain. The last parts of analogue processing get minimised, which means more things are software problems, which means they are amenable to being solved by software – and amenable to being solved by optimisation and learning software like AI. And that creates some huge opportunities.

So you’ve got the great opportunity, but the other thing that’s changed is that software architectures have gone from being that thing on the left to this thing on the right. What I’m showing on the left is the idea of monolithic, heavily intertwined code. That’s what a lot of software used to look like. Can anyone put their hand up and say they have been responsible for producing software that looks like that? We’re all guilty as charged, right? And now we all want to check our self-awareness: do we all preach this – clean semantics? Of course we do! And it’s good that we do, because the thing on the left is really hard to go in and optimise. We all know this, and it’s often easier to just throw it away. Remember what I said at the beginning: we know how to do ANI, we don’t know how to do artificial general intelligence, and we sure as hell don’t know how to make the artificial super intelligence that could make sense of that mess. When you have these microservices-driven and API-driven systems, you can put very tightly defined optimisation components against the inputs, outputs and contracts in between, and do a better job. That’s pretty important, and if you’ve seen the demo of Viv, you know it can be very smart because it can get access to open APIs on well-defined third-party services.

Open Source Strategies from Majors

The 6th thing is that in the last year the majors have really, really gone out with open source, making their services available. Google, Facebook and Amazon have all released and supported internal machine learning and AI frameworks – the core ones they happen to use in production. The big one is Google’s TensorFlow, which is what they now use in DeepMind to do all their research. So you, as an AI researcher or engineer, can use exactly the same tools that Google uses internally, and that has really accelerated things.

If you don’t have AI specialists, you can still go off and get Google- and Microsoft-quality image recognition just by using these APIs. Google has a vision API called Cloud Vision, which is the same machine vision API that Google uses internally for most of its services, and for a few dollars a month you can get access to it to tag your own images in your applications. That dramatically increases people’s access to this.

The AI Lock-In Loop

And all of this leads to what a friend of mine, Barney Pell, calls the AI lock-in loop. Essentially it’s a system that flows like this. Someone gets a bit of AI, in this umbrella sense, and the result is that it improves the product. Because the product is now better, two things happen. There’s more usage, so the product creates more data from every interaction between users and the product; it also wins market share and, theoretically, increases profits. The increased data generated can be used to train the AI to get better, and the profits get reinvested into the AI system and the product, which gets better still. And you get into this loop of increasing returns.
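
The lock-in loop can be caricatured as a toy difference model. This is purely illustrative: the functional forms and constants below are my own assumptions, not anything from the talk; the point is just that a data → quality → users feedback produces increasing returns:

```python
import math

# Toy model of the AI lock-in loop: product -> data -> better AI -> more users.
users, data = 1_000.0, 0.0
trajectory = []
for quarter in range(12):
    data += users                  # usage generates training data
    quality = math.log1p(data)     # AI quality grows with data (diminishing returns)
    users *= 1 + 0.02 * quality    # a better product wins more users
    trajectory.append(users)

# Growth compounds: each quarter's user gain is larger than the last.
gains = [b - a for a, b in zip(trajectory, trajectory[1:])]
print(all(b > a for a, b in zip(gains, gains[1:])))
```

Even with logarithmic (diminishing-return) quality gains, the compounding of users and data produces the increasing-returns curve the talk describes.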

So now you’ve got a whole range of sectors where you simply cannot compete if you don’t have AI of some sort. You can’t build a major Xbox video game or FPS without AI. You can’t enter internet search without heavy machine learning. With mobile UIs, from Siri to Cortana, you can’t enter that space without it. Increasingly, in areas like defence, you won’t be able to compete: in 5 years you won’t be able to sell a drone that doesn’t have autonomous navigation capabilities. Even in spaces like CRM, unless you start to have AI-like things – lead scoring, funnel optimisation, tagging – you won’t be able to compete. And you can expect the other categories to start to fall: finance is increasingly difficult, and in health as well a lot of the innovation is about being able to make judgments on the data available, and that’s done by machine learning systems. So we get to this scenario where these things drive a reinforcing cycle, where dollars, engineers and research have to go in to make the products better.

Better Than Human

And the result, in this highly insightful slide, is that stuff is getting better. Here are some examples, starting with things that are better than humans. This is the error rate on the ImageNet image detection task – the average error rate of the top 5 entries, which started at nearly 30%. You can see it declining, and a point was reached last year where the top entries now detect and categorise images better than humans do. So we’ve got better than humans at recognising pictures, and it’s happened pretty quickly.

Another example where machines are now better than humans is in a wide range of video games. This is data from DeepMind: it shows 30 video games and the percentage by which the computer is better than the best human. So don’t play DeepMind’s computer at Boxing – it really obliterates you. But even at your favourite game, Pong, it’s 132% better than the human benchmark. So that class of video games has been dealt with, and mastered in quite a generalised way as well – it’s a deep Q-network, the same one running across all these games. Chess fell a long time ago – that was ’96, with Deep Blue – but just to give you a sense, 2800 to 2900 is the Elo rating of a grandmaster. Stockfish, which you can download for your MacBook, is already at 3200 up here, and at the end of December the highest Elo rating belonged to a thing called Komodo, at 3350. And the thing about this is that these computers can play each other and make themselves better. So chess has gone, and as we know Go has gone too, on the basis of 30 million human games – AlphaGo is 1,920 CPUs and 280 GPUs, that’s its brain, plus petabytes of storage. And AlphaGo beat a human a few years before most computer scientists thought we would get a computer to beat the best humans in the world. And I don’t know if you know, but Lee Sedol, who was playing one of the games, had to play like a maniac – he just had to do something really crazy that you would never normally do, because every other edge case had been mastered by the machine.
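
Those Elo numbers translate directly into win expectancy via the standard Elo expected-score formula (the formula is the usual Elo definition, not something from the talk):

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

# Stockfish (~3200) vs. a 2800-rated grandmaster:
print(f"{expected_score(3200, 2800):.3f}")  # ~0.909

# Komodo (~3350) vs. the same grandmaster:
print(f"{expected_score(3350, 2800):.3f}")  # ~0.960
```

A 400-point Elo gap means the engine expects to score roughly 10 points out of every 11 against the grandmaster, which is why "these engines can only meaningfully play each other" follows from the ratings alone.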

Where it’s getting as good as humans is verbal IQ. I’m sorry about the amount of text on this slide, but I’ll read it aloud. Essentially, some researchers trained a deep neural network on the kind of verbal IQ questions our kids do in school. Last year, ConceptNet achieved the verbal IQ of an average 4-year-old human. So that’s as good as a human, and with the traditional rate of improvement of these systems, it will soon be much better than a 4-year-old and measuring up against a 7-year-old.

Just last week we got better-than-human suturing of intestinal tissue. This is a medical robot over here, suturing a pig’s intestine over here, and in the red box is the error rate. These are the typical error rates of human surgeons, and here, the star down here, is the machine – a robot controlled by a bunch of software that is getting better, and it has done very, very well. Also last week, Unanimous AI, which uses a slightly different approach, predicted the top 4 finishers of the Kentucky Derby at 540-to-1 odds. Somebody put a dollar on that – the journalist who wrote it up – and made $541. This is actually fantastic news, because it could put all the bookmakers out of business, since we could predict scores on the Premier League anywhere we look, and we can do that social good [laughter]. So that’s pretty impressive.

And on more complex tasks we’re approaching human standards. This is visual Q&A, where you give the computer an image and ask it questions. What vegetable is on the plate? The computer says broccoli. How many school buses are visible? It said 2 – I said 1 when I first saw this, and I’m sure some of you did as well; there are two. It doesn’t always get it right, and we can really punish it for getting this one wrong: what is the number on the table? The neural net said 4; the real number was 40. We’re starting to see more and more of these things, and – back to what I said about open source and collaboration – the minute these results are available, they’re in a blog post, the approach is discussed, and hundreds of researchers are looking at it.

Applications are getting better

So the result of all of this is that applications are getting better, because the quality of the natural language processing is getting much better, APIs are better designed, microservices are better understood, data engineering (how we flow data around systems) is getting better, and data quality (how we manage the quality of data for these systems) is also improving. The result is that you can build integrated applications over the top, and I just chose 4 examples here. Customer service is a very interesting one: there are lots of companies now working in the domain of automating customer service. Is anyone here working on automating customer service through AI or chat bots? Over there? The eye on my mobile guys? So there’s a lot of innovation here, tons of companies in this sector. In one example, a company is charging about $120,000 a year to handle a million transactions over Facebook Messenger. That’s essentially the cost of 1 customer service rep, or maybe a couple, but they handle a million customer interactions. And for the domains they’re focused on, product configuration and ticket sales, they are getting up to 97% task completion without touching a human. That’s a combination of the NLP and the APIs all coming together in customer service.

In terms of querying, it’s getting much better. Last week a company founded by the ex-Siri guys, called Viv, demoed in New York. It’s like a Siri or Cortana that allows you to go off and ask a complicated question with many parts and context, and it has to break down the context and figure it all out. It’s quite impressive, and you can see how it would start to replace the touch mobile interface. Personally, as an anecdote, maybe one in ten, getting on for one in five, of my phone interactions are now driven by Siri, and you can see that increasing over time. A couple of other areas, randomly. In legal, you’ve seen a lot of work happen with electronic discovery, but just in the last 2 weeks you’ve started to see legal AI applications, some of them built on IBM Watson, be hired by law firms. I think secretly we’re all delighted; no one will be upset if every lawyer loses their job. And in engineering as well, just as a random domain, there’s a company in the UK called Tractable which does inspections of pipes and installations to look for metal fatigue and so on, and it’s more accurate than a human inspector and much cheaper. We’re seeing the same things in agriculture: people are flying drones over fields, taking photos, and having the AI identify certain types of parasitic infections on the wheat or tomatoes, and it can be done far better, faster and cheaper than by a human.

Lawyers, Drivers, PAs, Journalists

And to give a deeper flavour of that: I talked about Ross, the lawyer precedent detection system, which is built on IBM Watson. In reality, the lawyers will not go away, because the AI will make mistakes, which means more liability suits, which means more work for the lawyers [laughter]. The Tesla stuff is interesting because Tesla has an autonomous vehicle system which is pretty good, but what’s really interesting is that every Tesla learns from every other Tesla. What’s fascinating about what they’ve built is that they have hundreds of thousands of cars out there, and all of them are sending back: these are the road conditions I saw, this is what I did and didn’t do well, this is where I was uncertain. The models are getting better and better, and it’s gonna be harder for people to compete. Emma is an autonomous writing tool; there was a great video last week where they pitched a journalist called Sara against a robo one called Emma to see who wrote the better copy. Look that up to see who won. And there’s x.ai, which has an NLP-rich scheduling and calendar assistant that I’ve been using for a while. It works reasonably well some of the time, and the rest of the time it will try to book you lunch with me at 9 AM. So there are a lot of apps going on, and from some of the work I do across many domains, pretty much every human first name has now gone to an AI system. It’s the new domain squatting.

Implications for Technology Businesses

So what are the implications for technology businesses like ours? One of my major arguments is that it’s kind of either play or pray. You choose to play, or you pray that something happens: a meteor hits the planet, or IBM acquires your company so you can open that spa you’ve always wanted, something like that. Because it’s going to be everywhere, and the moment you start to embed AI into apps and it works well, your products get much better and you start to give people little delighters, and those delighters quickly become hygiene factors. If you’re not playing, the people who are playing are getting all the data and the learning experience for why the AI doesn’t work, and they can feed those elements back into the AI so that it does work, and it gets better and better. Eventually they get to a point where they are optimising their allocation of resources far more efficiently, because a lot of these algorithms are nothing more than optimisation algorithms.

So if you’re in the software business you really have to play, in the way that in 94 or 95 there were some companies like Prodigy and GEnie and CompuServe which said, we don’t have to play on the web. I don’t think you really have a choice. It varies from company to company; there will be some companies where it’s not appropriate, and some exceptions, but it’s really important to say: there’s this new set of technologies coming to the mainstream that might give us incremental benefit, and we need to think about how we make use of them in our stack. Part of the reason is also that if you’re not doing it, it’s genuinely so easy to do a basic version now, because the stuff is open source and kids are coming out with masters in machine learning from UCL and other universities, that there will be products out there that start to use this stuff. DCU, Trinity, sorry about that. DCU, Trinity, Cork, Limerick, Wicklow, is that enough? [laughter]

So I’ve got four general ideas about how you do that well:

  • so one is about the importance of building a data moat;
  • the importance of having semantically sensible interfaces;
  • the importance of still keeping people in the loop;
  • and the importance of recognising that this is a process and not a one-shot thing.

So I will just dive into each of those and then we’ll have some time for questions.

Data Moat

So the first thing to ask yourself is: how am I gonna build a data moat? The reason you need to ask this is that the core algorithms and frameworks are all open source; the ones you need have been open source for a long time, and you are not likely to be able to innovate on the algorithm in any meaningful way compared to a DeepMind or Facebook or a university research lab. That’s where the proper innovation, the low-level science innovation around the algorithm, is gonna happen. But rest assured, if the average entrant to ImageNet is achieving better-than-human accuracy and they are publishing their code on GitHub, you can do well with what’s available and accessible out there.

So I think you will not be able to construct a competitive advantage or defensibility through your algorithmic work. But in order to make what I call unique AI, you need unique data. You need to be able to differentiate on the data that you’ve got, which quite often will be your own customer data, and then you need to think about: how do I build a data acquisition strategy that gives me really good data that other people don’t have, so that even if they have the same framework as me and apply the same maths, I do a better job of it?

A good example of a company that is building a data moat is a business called Clarifai. Clarifai does machine vision, recognising images as a service. You send it a photo and it says it’s got a VW Golf in it; you send it another and it says it’s got a border collie in it, and that tagging is often used in brand analytics. What Clarifai has in its data acquisition strategy is that the more customers they have, the more data they get to look at, and the more frequently they can improve by retraining the system on the cases where mistakes were made. That’s a business where you get a data network effect. Thinking about how you establish that, because you have customers and analytics about them, and how you can turn that into something that gets richer as you get new customers, is a pretty important strategy. It’s challenging if you have a small number of very high-value customers, but there are still things you can do there. When you’re a company with a few hundred or a few thousand customers, you can start to do something with the data that makes it differentiated and then ultimately makes it defensible.
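The data network effect described above can be sketched as a simple feedback loop: serve predictions, collect customer corrections, and fold them back in as new labelled training data. This is a minimal illustrative sketch, not Clarifai’s actual API; all names and the dictionary-backed "model" are invented stand-ins for a real retraining pipeline.

```python
def tag_image(model, image):
    """Return the model's best label for an image (stand-in for real inference)."""
    return model.get(image, "unknown")

def data_moat_loop(model, customer_images, corrections):
    """Serve predictions, then fold customer corrections back into the model.

    Every correction becomes new labelled training data, so the system
    improves as the customer base grows -- the data network effect.
    """
    for image in customer_images:
        label = tag_image(model, image)
        if image in corrections and corrections[image] != label:
            # A mistake the customer flagged: "retrain" on the corrected label.
            model[image] = corrections[image]
    return model

model = {"photo_1": "border collie"}
model = data_moat_loop(model, ["photo_1", "photo_2"], {"photo_2": "VW Golf"})
# photo_2 is now labelled thanks to customer feedback
```

The key design point is that the correction loop runs continuously in production, so each new customer makes the model harder for a competitor with the same open-source framework to match.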

Semantically sensible interfaces

So the next question to ask is: how do we ensure coherent semantics in the interfaces that are exposed? This is really important; it’s about the value of having clean APIs. I’m sure many of you remember from Star Wars Episode IV the way R2-D2 was able to plug into the Death Star and access the plans. That’s an antipattern and a good example of why you don’t always want a clean interface, but it was helpful to the Rebel Alliance [laughter]. The same thing is true when you think about your business: have clean and well-defined APIs. It’s over 10 years since Jeff Bezos wrote that memo at Amazon saying, expose everything through clean service interfaces or I will personally come and fire you, so most of us are already here. The benefit is that if your software is well architected and separable, it becomes easy to integrate into your own systems, but also into other people’s systems.

A good example of that is Viv, the voice-based AI that launched a week or so ago. Viv will be a developer platform that allows anyone with an API to plug into the power of Viv, so that you can have a completely natural conversation and access your service. I saw the booking people talking earlier today: they have clean APIs, so they can plug into Viv and all of a sudden expose their product to people through a voice interface. If you don’t have clean APIs, you can’t do that and you’re behind the curve. It’s important to be able to do this, because it allows you to start to integrate into some of the other systems that people will build.

The second benefit is that it becomes easier to optimise if you have these clean interfaces, and the problem becomes soluble by narrow AI rather than by general AI. A great example is Siri and Viv. These systems look like general AI because you can ask Siri all sorts of questions: what’s the weather like in Dublin? Who won the Spurs game a few days ago? If anyone knows the answer to that, tell David, the next speaker. They look like general intelligence, but they’re not; they are a mishmash of narrow AIs that have all been tied together. So you can do that to create a Wizard of Oz effect and make your system look smarter than it is.
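The "mishmash of narrow AIs" pattern can be sketched as an intent router: classify the question, then dispatch to a narrow service behind a clean interface. This is a toy illustration; the intents, keyword rules, and stub services are all invented (a real assistant would use trained NLP models and live APIs).

```python
def weather_service(query):
    return "Rainy in Dublin"    # stand-in for a real weather API

def sports_service(query):
    return "Score unavailable"  # stand-in for a real sports-results API

INTENT_HANDLERS = {
    "weather": weather_service,
    "sports": sports_service,
}

def classify_intent(query):
    """Crude keyword intent classifier -- a real system would use NLP."""
    q = query.lower()
    if "weather" in q:
        return "weather"
    if "game" in q or "score" in q:
        return "sports"
    return "unknown"

def assistant(query):
    """Route a question to the narrow AI that can answer it."""
    handler = INTENT_HANDLERS.get(classify_intent(query))
    if handler is None:
        return "Sorry, I can't help with that."
    return handler(query)
```

Because each handler sits behind a well-defined interface, the whole thing looks like one general intelligence to the user while each piece remains a narrow, independently optimisable service.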

Keep the human

I’ve got the wrong slide here. Keeping the human in the loop is important. That shouldn’t be R2-D2; that should be Ricky Gervais dancing in The Office. I put my pictures in the wrong slides. Keep the human, don’t have a robot in there! You need humans. The reason is that your AI system is gonna need ongoing training, because it’s gonna continue to make mistakes. Think about the examples I gave you: customer service where they responded to 90% of the queries, or image classification getting 94% of things correct. That means 6% of things are wrong, and that 6% needs to be manually reviewed, reinforced and retrained back into the system, so you will need humans there.

The second reason is that you will need humans for maybe the first mile, and certainly the last mile, because many of your systems end up interfacing with the real world or require a personal touch, and that will be delivered by a person for a few more years at least.

The final one is risk management. We haven’t quite got to the point where we’re clear about who’s gonna be responsible when these things go wrong, which is why, with the work around autonomous cars, regulators say cars can be self-driving as long as there is a fully alert driver sitting in the driver’s seat with their hands on the wheel. Because we don’t know who is gonna be responsible, and this ties back to the automated lawyers: they are there, waiting for the call. So from a risk management perspective, you probably want to have humans in the loop.

Build the right processes

I know you are all waiting for this, so this is Ricky Gervais in The Office. What’s going on here? I’m so sorry, guys, I seem to have mixed up my slides. Build the right processes. An AI system is not a one-shot piece of code that you put together. A lot of the code we’ve built in the past is business logic that is separable from the underlying data: you could build it in 1986, pass its tests, do a few updates, and it could still be running 20 years later. Not so with this kind of stuff, where you’re learning from the data. There has to be an ongoing process. I touched on this with the idea of training the system, but it’s also that in any optimisation system with AI you will start to make trade-offs. If the job of your AI is, for example, to triage whether someone is a potential criminal or not, you will wrongly accuse some innocent people and miss some who are guilty. You have to understand these trade-offs; machines can’t evaluate the trade-offs, they just enact the rules we give them, so you need an ongoing process. The final part is: how do I continue to secure better data? This is a different kind of product thinking to traditional software, where I need a bit of code to handle my user login and I can write it and leave it like that. This is not like that, and it does require an ongoing process.
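The trade-off in the triage example above, wrongly accusing innocents versus missing the guilty, is the classic precision/recall tension, and it is exactly the kind of thing a human has to keep evaluating. A small sketch with made-up scores and labels:

```python
def precision_recall(scores, labels, threshold):
    """Compute (precision, recall) for a given decision threshold.

    precision: of those flagged, how many were truly positive.
    recall:    of the true positives, how many were flagged.
    """
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

scores = [0.9, 0.8, 0.6, 0.4, 0.2]          # model's suspicion scores
labels = [True, False, True, True, False]   # ground truth

# A strict threshold flags fewer innocents but misses more real cases...
strict = precision_recall(scores, labels, 0.7)
# ...while a lax threshold catches more real cases but flags more innocents.
lax = precision_recall(scores, labels, 0.3)
```

The machine just enacts whichever threshold you give it; choosing where to sit on this curve is a value judgment that stays with the humans running the process.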

So just to summarise for this,

  • lots of opportunities to build better products and to reach new markets,
  • there is a competitive threat if you don’t innovate,
  • it’s expensive and it requires internal capability building for most companies so it will be difficult to do.

The good news is that AI is unlikely to end humanity in the near future [laughter].

So we’ve got a few minutes left, and I guess one of my observations is that every new technology, whether it’s IoT or digital cameras or laser printers, goes through this hype cycle. There’s some technology trigger, everyone gets excited (this is measured by the number of TechCrunch articles), and there’s a peak of expectations. We all overpromise and underdeliver, not intentionally, but we get fat and lazy. We fall into the trough of disillusionment, and after that, and a CEO change, there’s a slope of enlightenment and a plateau of productivity, and we all have laser printers and fax machines and we’re super happy for it.

So perhaps unconventionally, I’d love to hear where people think we are with AI. Can I do it with a show of hands, or ask people directly? I’m the boss, great! We’ll have 3 choices: slope of enlightenment, trough of disillusionment, or peak of inflated expectations. Who thinks we are at the peak of inflated expectations? So we’ve got approximately 20 people. Who thinks we’re in the trough of disillusionment? 4. You’re either cynical or optimistic, I can’t work out which: it’s either gonna get better or it’s just terrible. Who thinks we’re on the slope of enlightenment? Interesting! That was slightly more than the peak of inflated expectations, maybe 30 people. So we’re an optimistic bunch here.

And I guess I probably agree with all of that. You probably know more about this than I do, so with all the caveats of what I know and don’t know, I would guess that in some areas we’re about here, and those areas include things like machine vision, which is really good, and machines understanding human text, which is getting better and better. I think we are really up here when it comes to complex task completion type work: Amy, the scheduling assistant, is a really nice product but still fails more frequently than it succeeds in real-world use cases, and fails badly.

So it’s quite interesting, but what I think is relevant for us, if we’re being pragmatic about building our businesses, is that there is a bunch of stuff in here and in here that you can now get access to, and you can start to say: we need to put this into our systems because it will meet this customer need, or not. Cool! Well, that is me saying carpe diem. I’m @azeem on Twitter if you want to reach me, thank you [clapping].



Q&A

Mark Littlewood: Thank you very much, Azeem! A couple of questions. I have to say I can’t recommend Exponential View more highly! You can Google it. I can’t mention it often enough, actually. Questions?

Audience Question: Hi, so my concern is that as data becomes a form of capital, this is really bad news for small business owners, because now we have to compete against big companies who have a financial advantage, and that scares me! Do you have any hope for me [laughter]?

Azeem Azhar: On the limited information available, and I can barely even see you with the light in my eyes: big companies have got a big data advantage in certain domains, but there are lots of places where they don’t have it, like knowing your specific customers and your marketing funnel and so on. If you are trying to go up against Google in machine translation, Chinese to Finnish, you’ve probably got a big, big problem, and likewise on image detection, unless you’re seeing a million new hand-labelled images a day. But if it’s about your own customer data and what you see in your market, and your ability to source that niche data in other ways, maybe through crowdsourcing or partnerships with auxiliary businesses, then I think you can build your own data moat and make sense of it. Choose your battles, essentially.

Audience Question: Ok, cheeky question. What AI do you use to generate your newsletter?

Azeem Azhar: It’s super AI, because it’s artificial human intelligence: it’s just me posting stuff.

Audience Question: My real question is around the ethics of AI. I’ve recently done a series of events in the US talking about AI, and it’s the topic I’ve been looking at for the last year. The biggest feedback I get, particularly from people who don’t work in technology circles, is: but what about all this data floating around about me, and what are the implications for my life, my insurance policy, how I’m gonna deal with my doctor, that kind of thing?

Azeem Azhar: So I think this question about ethics is really important, and what has been very helpful about the AI discussion is that it has made us talk about it. Like you and the people you’ve been talking to, I was thinking about it over the last year as well, and I realised it’s not so much about AI ethics as it is about product and business ethics. The things that make these AI ethics questions complex are actually complexities we’ve already had. When I phone up my utility provider and they route me through a decision tree to a customer service agent or to an IVR system, that is a set of business optimisations they have decided upon, to which I have no right of redress. What the AI debate is starting to do is shine a light on all of the quick shortcuts we’ve made, very pragmatic decisions that are not unethical but a-ethical, taken without an ethical frame. A really good example: there was a great blog post written by a woman who said, I’m a physical product designer, and most products in the world are designed for men, even if they are meant to be used by everybody. Even the handle size on a door is designed for a male hand, not a female hand, which is smaller. That’s a product ethics question, and it has a lot in common with almost every other product ethics question. What the debate has done is raise some scary, sexy questions about autonomous killing machines, but more interestingly it starts to turn up questions about everyday optimisations, problems we run into. We have one type of credit score, and as consumers we have no right of redress over whether the system is good or bad. So I think it’s super relevant, but it needs to be broader than just questions about AI, except in the killing space.

Azeem Azhar: One more! Go on!

Audience Question: If AI is gonna be the core differentiator within your industry as a business, how much of the capability do you need in house, and what sort of capability are you looking at? Do you just cobble something together with OpenCV, or do you need a mathematician to do that? How much can you rely on other businesses to do that stuff for you?

Azeem Azhar: That’s a really good question! I think it depends on your business. You have to make an immediate tactical trade-off about what you’re gonna lose or gain from this investment, but you also have to think about the bigger strategic question, which is: if someone else in my sector gets AI, how do I deal with it? To come back to your specific question, taking image recognition as an example, the way I would approach it would be to say: I can very cheaply go to Google Vision or the Microsoft equivalent or the Clarifai guys and very quickly see whether my customers genuinely get a benefit from my being able to tag images automatically. It will cost me 200 bucks, and if they do get a benefit, I can then make a buy-or-build decision, or hire someone to do it. Do I want to rent this capability from Microsoft because it’s useful but not strategic, or do I have a path to make it strategic, and am I therefore willing to put additional resources into hiring somebody or allocating somebody to build this ourselves? What’s helpful, at least in the image recognition space, and it’s also true in extraction and NLP, is that there are APIs you can access cheaply to figure out whether it’s gonna give you value long term.

Mark Littlewood: Ok, I’m sorry, we don’t have time for that, but why don’t you come and sit next to Azeem and have a chat in the break? Ladies and gentlemen, Azeem Azhar! Thank you! [clapping]


