by Simon Bisson
Artificial Intelligence is the new frontier for software development. Simon Bisson checks out the state of the art.
HardCopy Issue: 70 | Published: November 4, 2016
If there’s one lesson to be learnt from the last few years, it’s that we’re in the early stages of a new industrial revolution, one where we’ll finally be able to deliver on many of the promises of artificial intelligence (AI). But this isn’t the AI of science fiction, where we’re building robots that will end up replacing us; instead it’s much more mundane, a world where highly focused machines handle complex and repetitive tasks, or fill in for us where they can save time and effort.
Yes, that’s going to mean changes in what jobs are available, but it’s also going to mean more time for creative work, for exploring new ideas and trying out new things. Some aspects, like self-driving cars, are going to change the way we live, while others, like prediction engines, will make the world less risky.
Just a few years ago, AI seemed to be one of those things, like nuclear fusion, that was always thirty years away. In reality, of course, each new breakthrough rapidly became part of the day-to-day fabric of computing. Apple’s Newton used machine learning-based handwriting recognition in the mid-1990s. It wasn’t successful, but just a few years later the same underlying tools were powering the far more accurate handwriting engine used by Microsoft’s Windows XP Tablet Edition.
What changed? The answer was simple: computers were more powerful, and we had much more data we could use to train those algorithms. That was 13 years ago, and today’s computer systems are even more powerful, with the resources of the cloud to power a new generation of machine learning algorithms. Similarly, we’ve been able to take advantage of the arrays of computing engines in GPUs to build massively parallel neural network systems. We’ve also got access to even more data, along with the tools and storage needed to use it to train our new AI systems.
Low-level AI
More complex AI problems can be addressed using the deep learning and neural networking algorithms that are coming out of research labs at Microsoft, Google, and Facebook. These tools aren’t for the faint of heart: they’re complex engines that need a lot of compute power, either in your own high performance computing cluster or in the cloud.
Microsoft Research has been working on many approaches to AI. One is the Computational Network Toolkit (CNTK), which gives you a set of tools for building and training various neural networks, running on both CPUs and GPUs. You can download a VM or a container ready to run, or work with the source code on GitHub. A series of prebuilt solutions can help you get started, but be warned: this is at heart a research tool, and gives you extremely low-level access to the neural networks you’re building.
Google’s TensorFlow takes a different approach, with a data flow graph connecting mathematical operations. Nodes can run on CPUs and GPUs, with Python used to link existing operators and C++ to add new ones. The result is flexible and powerful, and models can move from development laptops to hyper-scale cloud systems and GPU arrays without needing any code changes.
Tools like CNTK and TensorFlow have quickly become important resources for AI researchers, overshadowing more specialised tooling like Facebook’s Lua-based Torch and the academic Caffe deep learning framework.
And then there are the lessons we’ve learned building hyper-scale search engines. The Googles and Bings of the world aren’t just huge databases of content and links; they’re massive machine learning systems that aim to understand document relevance so they can give you the best answers to your search queries, by using your context (what you’re doing, where you are and so forth) to refine their output.
Modern AI is not just one technology: it’s a range of different approaches that are suitable for different tasks. Some, like machine learning, build on familiar rules-based approaches and statistical analysis, while others use neural networks to find patterns in an almost intuitive fashion.
The key to much modern AI is the combination of powerful computer hardware with large amounts of data. Today’s deep learning systems take a statistical approach to working with data, taking advantage of cloud scale to process and build knowledge maps. It’s an approach that works well with natural language processing, and is being used to handle machine translation in near real time. One intriguing result is that deep learning has been able to find links between words in different languages, for example linking ‘man’ and ‘woman’ to ‘king’ and ‘queen’ without being given direct definitions.
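That ‘man’ is to ‘woman’ as ‘king’ is to ‘queen’ relationship can be sketched as simple vector arithmetic. The three-dimensional vectors below are invented purely for demonstration; real systems learn embeddings with hundreds of dimensions from large text corpora.

```python
# Toy illustration of the word-vector analogy: king - man + woman ~ queen.
# The vectors are hand-made for demonstration, not learned from data.
import math

vectors = {
    "man":   [0.9, 0.1, 0.0],
    "woman": [0.9, 0.9, 0.0],
    "king":  [0.9, 0.1, 0.8],
    "queen": [0.9, 0.9, 0.8],
}

def cosine(a, b):
    # Cosine similarity: 1.0 means the vectors point the same way
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

# king - man + woman should land nearest to queen
target = [k - m + w for k, m, w in
          zip(vectors["king"], vectors["man"], vectors["woman"])]
best = max(vectors, key=lambda word: cosine(vectors[word], target))
print(best)  # queen
```

No definitions are involved: the relationship falls out of the geometry of the vectors, which is exactly what makes the result from real learned embeddings so striking.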
Neural networks take a different approach, training networks to respond to specific inputs, and using the outputs as a basis of a control system. Newer techniques, such as deep feedforward networks, take advantage of convolutional layers to model non-linear relationships, and provide much of the basis of recent improvements in image recognition. They are also being used to examine complex pattern spaces, an approach recently used by Google DeepMind’s AlphaGo to beat a professional human Go player for the first time. Other neural network approaches have been used to improve speech recognition.
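The convolution at the heart of those image recognition networks is just a small kernel slid across an image, responding where it finds the feature it encodes. The 4×4 ‘image’ and edge-detecting kernel below are invented for illustration; real networks learn thousands of kernels from training data.

```python
# A single convolution step: slide a 2x2 kernel over a tiny image.
# The kernel responds strongly where a dark-to-light vertical edge appears.

image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
kernel = [
    [-1, 1],
    [-1, 1],
]

def convolve(img, k):
    kh, kw = len(k), len(k[0])
    out = []
    for i in range(len(img) - kh + 1):
        row = []
        for j in range(len(img[0]) - kw + 1):
            # Sum of element-wise products of kernel and image patch
            row.append(sum(k[a][b] * img[i + a][j + b]
                           for a in range(kh) for b in range(kw)))
        out.append(row)
    return out

print(convolve(image, kernel))  # peaks in the middle column, where the edge is
```

A trained network stacks many such layers, with non-linear activations between them, so that later layers respond to combinations of simple features like this edge.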
Combining different techniques can result in major leaps forward. Tying together a neural network-powered speech recognition system with a deep learning-driven translator gives you something like Microsoft’s Skype Translator, which takes speech in one language and gives you near-real time translated subtitles at both ends of the conversation.
Using modern AI
Much of the AI around at the moment is mundane. Predictive text used to be built up from Markov chains of likely words, held in a local database. Now, however, it’s a machine learning service. Microsoft-owned SwiftKey recently switched its Android swipe keyboard to one that uses a neural net to predict the next word – or even the next phrase – you’re likely to use, learning both from your own social media postings and from a large corpus of data collected from users all over the world.
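The older, local approach is easy to sketch: count which words follow which in a body of text, and predict the most frequent follower. The miniature corpus below is invented for illustration.

```python
# A first-order Markov chain next-word predictor, as used by older
# predictive text systems. The corpus here is a toy example.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

# For each word, count the words observed immediately after it
chain = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    chain[current][following] += 1

def predict(word):
    # Most frequent follower of `word` in the training text
    return chain[word].most_common(1)[0][0]

print(predict("the"))  # 'cat' follows 'the' more often than 'mat' or 'fish'
```

The neural approach differs in that it can generalise to sequences it has never seen, rather than only replaying counts from its training data.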
Similarly, machine learning systems sit at the heart of credit card fraud detection, aiming to identify patterns of unusual usage. Perhaps a transaction is impossible, based on distance, or perhaps a pattern of small transactions indicates that someone is trying to see if a card has been reported compromised and blocked. Using context from current transactions, and patterns that are linked to stolen accounts, machine learning systems can quickly respond in a predefined manner: sending messages to registered phone numbers to get additional authentication where there’s a low probability of the card being stolen (your first transaction after you get off a plane), or blocking a card outright when there’s a sudden unusual purchase (such as buying an airline ticket between two suspicious locations).
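The distance-based signal is simple enough to sketch directly: if the card would have had to travel faster than an airliner between two purchases, something is wrong. The coordinates and timings below are invented examples, and a real system would combine many such signals with learned weightings.

```python
# Flag 'impossible travel' between two card transactions.
import math

def distance_km(lat1, lon1, lat2, lon2):
    # Haversine great-circle distance between two points on Earth
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2 +
         math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 6371 * 2 * math.asin(math.sqrt(a))

def impossible_travel(prev, curr, max_speed_kmh=900):
    # True if the implied speed between transactions exceeds an airliner's
    km = distance_km(prev["lat"], prev["lon"], curr["lat"], curr["lon"])
    hours = (curr["time"] - prev["time"]) / 3600
    return hours > 0 and km / hours > max_speed_kmh

london = {"lat": 51.5, "lon": -0.1, "time": 0}
sydney = {"lat": -33.9, "lon": 151.2, "time": 2 * 3600}  # two hours later
print(impossible_travel(london, sydney))  # True: ~17,000 km in two hours
```

The same check with a 24-hour gap would pass, which is why a rule like this raises a soft challenge rather than an automatic block.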
Much of the recent AI hype has been around self-driving vehicles. Here developers have taken advantage of several advances in machine learning, using image recognition to locate a car on a map and on a road in relation to the rest of the cars around it. Prediction algorithms are also used to determine how other vehicles are going to behave, while other tools interpret radar signals to give a 3D view of the space around a vehicle which can be used as a framework for the rest of the car’s sensor suite. It’s a set of complex problems that couldn’t be handled without machine learning, coupled with the years of training and research that have gone into building autonomous driving control systems.
Like much deep AI work, early self-driving car research was sponsored by the military, with the aim of creating lower-cost, lower-risk convoys of trucks. Using prizes such as the DARPA Grand Challenge, self-driving vehicles quickly moved from rough desert roads to simulated streets and on to the open road. You don’t need to be in a fully autonomous vehicle to take advantage of machine learning; many driver-assist systems, like automatic braking, take advantage of neural networks – enough for NVIDIA to deliver GPU-based chipsets intended for use in car sensor suites.
The results of this set of changes are significant: it’s become cheaper and cheaper to add AI to applications and devices thanks to widely available algorithms, relatively cheap off-the-shelf hardware such as NVIDIA’s GPU-derived chipsets, and publicly accessible datasets that can help train machine learning systems.
One area where AI is key is in the development of chatbots: conversational interfaces to applications and services. What’s most important here is understanding just what a user wants, a requirement that means a bot will need some form of natural language processing coupled with some way of delivering an appropriate response. That’s where AI comes in, offering deep learning-powered natural language tooling, followed by machine learning decision trees to deliver the result. A demo at Microsoft’s BUILD 2016 event showed an AI-powered chatbot parsing pizza orders, and being trained to understand slang and colloquialisms in tandem with a human call centre operative.
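At its simplest, the parsing half of that pipeline extracts an intent and a handful of entities from an utterance. The keyword lists below are invented for illustration; a production bot would use a trained language model rather than word matching, precisely so it can cope with slang and phrasing it hasn’t seen.

```python
# A deliberately tiny sketch of intent and entity extraction for a
# pizza-ordering bot. Keyword lists are toy stand-ins for a trained model.

SIZES = {"small", "medium", "large"}
TOPPINGS = {"pepperoni", "mushroom", "cheese", "ham"}

def parse_order(utterance):
    words = utterance.lower().replace(",", " ").split()
    return {
        "intent": "order_pizza" if "pizza" in words else "unknown",
        "size": next((w for w in words if w in SIZES), None),
        "toppings": [w for w in words if w in TOPPINGS],
    }

print(parse_order("I'd like a large pepperoni pizza"))
# {'intent': 'order_pizza', 'size': 'large', 'toppings': ['pepperoni']}
```

The parsed structure then feeds the decision logic that actually fulfils the order, which is where the machine learning decision trees mentioned above come in.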
The chatbot future is one where low-skilled call centre tasks are handled by machines, leaving humans handling exceptions and escalations. That’s one part of tomorrow’s AI-powered world, where humans aren’t directly replaced by machines so much as having the machine handle the boring parts of their jobs. That said, it is a future we’re going to have to find our way into carefully, with an understanding of the disruptions our software can cause.
Image recognition tools are another area where modern AI techniques have significant advantages over older technologies. It used to be difficult to tell a puppy from a kitten; now we’re able to identify individual breeds of dog. What’s changed is the scale of the data we have to train our neural networks. Services like Flickr and Google Photos have collected millions of tagged images, and we’ve been able to use this content to train a new class of image recognition tools. Train a neural network across a big enough set of data, and it’s possible to start building cross-links that make it easier to recognise content and to use user-confirmation to reinforce links. You may have noticed that services like OneDrive are already automatically classifying images, and the same techniques are being used to process hand-written cheques, read road signs and interpret road conditions.
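That reinforcement-by-confirmation loop can be sketched with a toy nearest-centroid classifier: classify an image by its closest class prototype, then nudge that prototype toward examples the user confirms. The two-dimensional ‘feature vectors’ below are invented stand-ins for the rich features a real network extracts.

```python
# Toy reinforcement from user confirmation: a nearest-centroid classifier
# whose class centroids drift toward confirmed examples over time.

centroids = {"dog": [0.8, 0.2], "cat": [0.2, 0.8]}  # invented image features

def classify(features):
    # Pick the class whose centroid is closest (squared Euclidean distance)
    return min(centroids,
               key=lambda label: sum((f - c) ** 2
                                     for f, c in zip(features, centroids[label])))

def confirm(label, features, rate=0.1):
    # The user confirmed the label: move its centroid toward the example
    centroids[label] = [c + rate * (f - c)
                        for c, f in zip(centroids[label], features)]

photo = [0.7, 0.3]
guess = classify(photo)   # 'dog'
confirm(guess, photo)     # user agrees, so the 'dog' prototype adapts
print(guess, centroids["dog"])
```

Each confirmation makes future photos like this one slightly easier to classify, which is the essence of using user feedback to reinforce learned links.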
Using AI toolkits
The best way to start with modern AI is to pick a toolkit and start training it with your data. Relatively high level tooling, like Google’s Cloud Prediction API or Microsoft’s Azure ML, build on the research work already carried out by the big search engines. You can use these tools with your own data to train your own choice of algorithms, before hooking a service’s RESTful API into your application or workflow.
At the highest level you’ll find tooling that offers basic pre-trained machine learning. Here services like Microsoft’s Cognitive Services APIs can be used to quickly add some of the benefits of AI to an application. These APIs are often task-based, delivering tools like face and speech recognition, as well as helping interpret user text input and offering machine-powered translations.
Such high level services simplify the process of building and using AI in your applications. With APIs like these, all you need is a REST package with the data you want to process, and a subscription. You can then use variants of the same tools used to translate Skype messages in your own applications, or to parse natural language to understand user sentiment and context. Other options allow your software to recognise faces, and even understand just what that expression might mean.
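Calling one of these services really is little more than a signed REST request. The sketch below builds such a request in plain Python; the endpoint URL and JSON body shape are illustrative placeholders rather than the exact contract of any particular service, though the subscription-key header style is typical of Microsoft’s Cognitive Services.

```python
# Hedged sketch of calling a pre-trained vision API over REST.
# The endpoint and body shape are placeholders, not a real service contract.
import json
import urllib.request

def build_analyze_request(image_url, subscription_key):
    endpoint = "https://example-region.api.example.com/vision/analyze"  # placeholder
    body = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Typical API-key header for subscription-based cloud AI services
            "Ocp-Apim-Subscription-Key": subscription_key,
        },
        method="POST",
    )

req = build_analyze_request("https://example.com/cat.jpg", "YOUR-KEY")
# urllib.request.urlopen(req) would return JSON describing tags, faces, etc.
print(req.get_method(), req.full_url)
```

Billing is per transaction, so requests like this are usually batched or rate-limited rather than fired for every event.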
At a lower level, tools from Azure, Google and Amazon Web Services allow you to use machine learning to process large scale data from a range of sources. Big data and algorithms derived from the tools used to build and run search engines mean you can train and test machine learning systems using graphical expression builders and workflow engines. While it’s important to have a large initial training set of data, it’s also important to have a set of your own test data so you can determine the statistical significance of outputs, tuning your machine learning system to give you an acceptable level of false positives and false negatives.
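The evaluation step is the same whatever platform you use: score the trained model on held-out test rows and count how often it cries wolf versus how often it misses. The labels and predictions below are invented for illustration.

```python
# Count false positives and false negatives on a held-out test set.
# Labels and model outputs here are invented example data.

actual    = [1, 0, 1, 1, 0, 0, 1, 0]  # 1 = fraud, 0 = legitimate
predicted = [1, 0, 0, 1, 1, 0, 1, 0]  # model output on the same test rows

false_positives = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
false_negatives = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)

print(false_positives, false_negatives)  # 1 1
```

Which of the two error types matters more depends on the application: a fraud system might tolerate false positives to avoid missed fraud, while a maintenance alert that fires too often will simply be ignored.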
One key use case for these services is handling data generated by Internet of Things (IoT) sensors. While you may want to record historic data, you’re also going to need actionable information about significant outliers. Training a machine learning system to spot likely failures can help reduce costs by letting you schedule pre-emptive maintenance before any failure occurs. That’s likely to be critically important if you’re, for example, GE monitoring jet engines on a 777, or ThyssenKrupp monitoring lifts and escalators.
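A minimal version of that outlier-spotting step needs nothing more than a rolling baseline: flag any reading that sits several standard deviations away from recent behaviour. The sensor readings below are invented, and real systems learn far richer failure signatures than a single threshold.

```python
# Flag a sensor reading that deviates sharply from its recent history.
# Readings are invented; a real system would learn failure signatures.
import statistics

readings = [20.1, 20.3, 19.9, 20.0, 20.2, 20.1, 19.8, 20.0, 27.5]

def is_outlier(history, value, threshold=3.0):
    # True if `value` is more than `threshold` standard deviations
    # away from the mean of the recent history
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    return abs(value - mean) > threshold * stdev

history, latest = readings[:-1], readings[-1]
print(is_outlier(history, latest))  # True: schedule pre-emptive maintenance
```

Filtering a stream this way at the edge also keeps costs down on per-transaction services, since only the anomalous readings need to be sent to the cloud at all.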
That’s why Microsoft is now offering Azure ML as part of its Azure IoT platform, working with partners to deliver starter kits that include IoT devices and sensors to help you learn how to use machine learning with data streams. By combining a kit based on familiar Arduino hardware with a trial Azure ML cloud service, you can quickly pick up any new skills before using them on a full-scale project.
These are subscription services, so you pay per transaction – something you need to bear in mind if you’re using a machine learning service in conjunction with an IoT data stream. Another thing to watch out for is the machine learning algorithm you choose; all the main cloud providers offer several different machine learning models which support different use cases and different types of data. If you don’t get the results you think you need, it’s worth applying a different algorithm to your training data – you may well be surprised by new outputs. Another option is preprocessing data using statistical tools and languages, including R (the specialist language for statistical computing and graphics). That way you can feed a machine learning algorithm with data that’s already reduced to a set of outliers, so you don’t overload a server with raw data.
So the future seems bright, but we can’t look at AI without considering some of the possible issues. And while the issue of machines replacing people may be the most obvious, it’s not the most important.
Perhaps the biggest risk coming from the new wave of machine-learning driven AI is its lack of transparency. Why did a machine refuse a credit card transaction? Why did it slow down your car and let another in front? Why is another refusing to let an aircraft fly until an apparently perfectly serviceable engine is replaced?
What we’re doing is putting trust in algorithms that have been generated from big data. While it’s data we’re reasonably sure is accurate, the resulting decision loops and weightings can be almost impossible to document – and the more data we use to train a machine learning system, the more complex the resulting algorithm can become.
Nevertheless, modern AI is a powerful tool that can help solve many problems that only a few years ago were seen as near impossible, but are now within the reach of any developer. While the underlying technologies are a huge topic in foundational computer science, tools like Azure ML mean you don’t need to be a post-doctoral researcher to keep up with the state of the art – and to build it into your software and your services.