How to Create Artificial Intelligence Software

Introduction

If you keep abreast of the latest developments in IT, you will have heard about Artificial Intelligence or Machine Learning.

Articles are shared online daily; software vendors and service providers have begun to offer AI as-a-service thereby making it easier to integrate artificial intelligence into your existing software products without the need for a PhD or Data Scientists.

The commoditisation of artificial intelligence is upon us.

In this article, suitable for developers, technical founders and CTOs that are exploring artificial intelligence, we cover the following points:

Definition

Before we get into what’s involved in building software that leverages artificial intelligence, let’s start with a definition of what the term “artificial intelligence” means:

Artificial intelligence (AI) is intelligence exhibited by machines. In computer science, the field of AI research defines itself as the study of “intelligent agents”: any device that perceives its environment and takes actions that maximize its chance of success at some goal.  Colloquially, the term “artificial intelligence” is applied when a machine mimics “cognitive” functions that humans associate with other human minds, such as “learning” and “problem solving”.

Source: Wikipedia

History of Artificial Intelligence

In 1950, the pioneering British scientist Alan Turing proposed what became known as the Turing Test.  Turing, who developed a machine that helped crack the German Enigma code, laid the groundwork for modern computing and theorized about AI.  The paper that introduced the test opened with the question:

 Can machines think?

In the test, Turing proposed:

a human evaluator would judge natural language conversations between a human and a machine designed to generate human-like responses. The evaluator would be aware that one of the two partners in conversation is a machine, and all participants would be separated from one another.

Source: Wikipedia

His test and paper were criticised. That said, the test is still influential and remains an important topic in the realm of artificial intelligence and machine learning.

The Evolution of Artificial Intelligence

The Turing Test was meant to display artificial intelligence but it only demonstrated one “type” of AI – “weak AI”.

What types?
There are two flavours of artificial intelligence, and you can see a definition of each here:

Strong AI
Strong Artificial Intelligence (AI) is a type of machine intelligence that is equivalent to human intelligence. Key characteristics of strong AI include the ability to reason, solve puzzles, make judgments, plan, learn, and communicate. It should also have consciousness, objective thoughts, self-awareness, sentience, and sapience.  Strong AI is also called True Intelligence or Artificial General Intelligence (AGI).

Weak AI
Machine intelligence that is limited to a specific or narrow area. Weak Artificial Intelligence (AI) simulates human cognition and benefits mankind by automating time-consuming tasks and by analysing data in ways that humans sometimes can’t.  This sometimes gets called Narrow AI.

Source: Investopedia

At the time of writing, Strong AI simply does not exist.  Some people think it may never exist, whereas others think it may arrive soon.  Some examples of Weak AI include:

  • Self-driving cars
  • Siri
  • Alexa etc.

These are deemed “weak” as they can only operate within a specific niche or problem domain, and it doesn’t take much to throw them off balance.  The AI behind these weak AI agents isn’t solely restricted to organisations such as Tesla, Apple or Amazon.

In recent years, we have seen the commoditisation of artificial intelligence APIs from vendors such as Microsoft with their Cognitive Services platform and IBM’s well-known Watson.

Business pioneers such as Elon Musk, Reid Hoffman and Peter Thiel also launched OpenAI, which is a non-profit research company.  OpenAI’s mission is to:

build safe AGI, and ensure AGI’s benefits are as widely and evenly distributed as possible

Let’s explore some of these offerings and how they segue into building your own artificial intelligence software.

IBM Watson

Most people will be familiar with the IBM Research project called Watson.  Back in 2011, the reigning champions of the TV show Jeopardy! competed against Watson and lost!  Watson was an early form of artificial intelligence that could understand natural language (NLP).

Fast forward a few years and, after several acquisitions, IBM are now a major player in the artificial intelligence and machine learning space.

IBM’s Bluemix Catalog contains a vast array of cognitive APIs for developers to implement, which include but are not limited to:

  • Language Translation
  • Natural Language Classification
  • Speech to Text
  • Personality Insights
  • Visual recognition

Microsoft Cognitive Services

Microsoft also have their own artificial intelligence and machine learning implementations in the form of their Cognitive Services platform which is available over Azure and REST endpoints.

The Cognitive Services platform offers similar APIs to those in IBM’s Bluemix Catalog. That said, there are more options for integrating with the Bing platform, which include but are not limited to the following:

  • Bing Autosuggest
  • Bing Entities Search
  • Bing Image Search
  • Bing News Search
  • Bing Speech
  • Bing Spell Check
  • Bing Video Search
  • Bing Web Search

The role of Big Data in Artificial Intelligence

Big Data is all around us. It is the name given to data sets that are so large and complex that traditional data processing techniques cannot handle them. These data sets exist in your enterprise network, outside your network and in other systems such as social media channels.  It’s the data locked away in everyday IT systems.

The sheer volume of Big Data being generated is staggering.  The following infographic by SocialMediaToday illustrates just how much data is generated!

[Infographic: Data Never Sleeps]

This shows no signs of slowing down. As user adoption increases for online services such as Twitter, Instagram and Facebook, we can be certain the amount of user-generated content is only set to increase.

But what’s the connection between Big Data and artificial intelligence?
Consider some of these stats from the above infographic.  Every 60 seconds:

  • 6 million searches are made on Google
  • 46,000 users post photos on Instagram
  • 1 million videos are watched on YouTube
  • 15 million text messages are sent

This is clearly a lot of information.  Traditional software programming and data processing techniques such as ETL simply cannot scale and process such datasets.  To land this idea, consider the following example:

You’re a developer tasked with writing an email spam filter.  Here are some of the steps you might take to write such software:

  1. First, you would get samples of spam email to see what it looks like. You’d notice some key words or phrases such as “buy Viagra” or “make money”.  You’d add these to a “black list”.
  2. Next, you would code some string manipulation methods, possibly in C# / .NET, to identify these words or phrases. You could then flag emails that contained these.
  3. Finally, you would test the application, and iterate over steps 1 and 2 until your code is catching a decent enough percentage of spam emails.
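
The steps above can be sketched in a few lines. This is only an illustration of the hard-coded approach, written in Python rather than C# for brevity; the `BLACKLIST` contents and the function names `is_spam` and `catch_rate` are hypothetical and not taken from any real product.

```python
from typing import Sequence

# Step 1: a hand-maintained "black list" of phrases seen in spam samples.
# Hypothetical contents; it would need editing for every new kind of spam.
BLACKLIST = ["buy viagra", "make money"]


def is_spam(email_text: str) -> bool:
    """Step 2: flag an email if it contains any blacklisted phrase."""
    lowered = email_text.lower()  # compare case-insensitively
    return any(phrase in lowered for phrase in BLACKLIST)


def catch_rate(spam_samples: Sequence[str]) -> float:
    """Step 3: fraction of known spam samples that the filter catches."""
    if not spam_samples:
        return 0.0
    caught = sum(1 for sample in spam_samples if is_spam(sample))
    return caught / len(spam_samples)
```

Calling `catch_rate` on a fresh batch of spam samples shows how quickly a fixed list falls behind: any phrase that is not already in `BLACKLIST` slips through until someone edits the source.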

This approach would work but the result would be a maintenance nightmare that contains lots of hard coded string manipulation business rules.  Not to mention that each time “new” spam types were encountered, you’d need to lift the hood on your application source code and change it.

What’s the solution?

This is where artificial intelligence and machine learning really shine.

By implementing models and leveraging algorithms to process what appears to be vast quantities of unstructured data in a variety of different forms, software can be developed and “trained” to identify patterns in datasets.

It can also attempt to categorise specific records in datasets and in some cases, learn how to do this with minimal human intervention.  This takes us onto building applications that leverage artificial intelligence.

Building software with Artificial Intelligence

So far, and to recap, we’ve covered:

  • the history of artificial intelligence and machine learning
  • the types of artificial intelligence
  • some of the machine learning offerings
  • the role of Big Data

If you’re thinking of exploring artificial intelligence or machine learning, you need to ask yourself the question:

Do I really need to implement artificial intelligence in my software solution?

Which raises another question:

How do you know if you need to implement artificial intelligence!?

Here are some guidelines:

  • Do you deal with vast quantities of data?
  • Is the data in various formats?
  • Are you dealing with frequently changing parameters?
  • Is the data arriving at ever-increasing velocity?

If you have answered yes to most of these points, then you might want to consider integrating artificial intelligence and machine learning into your tech stack.

Managing Email Spam

Consider our email spam problem from earlier on. Remember the problems you’d encounter by implementing this application using traditional programming techniques?  Well, most of these can be alleviated by implementing an AI / ML approach.

The Problem
Email spam detection is ultimately a text classification problem.

Fortunately, this is a well-studied area and several techniques can be employed by the machine to determine which Category a new incoming email belongs to.

Algorithm and approach
Before we delve into the technicalities of which API to use, it can be helpful to further understand the background of how such a solution can be approached using Supervised Learning.

If you’re unfamiliar with this, Supervised Learning is a technique in which a “Classifier” is constructed from examples that have already been labelled.  The Classifier is responsible for categorizing text into a Category (sometimes known as a Label).

Some classification techniques include:

  • Naïve Bayes
  • Maximum Entropy
  • Support Vector Machines (SVM)

Naïve Bayes is a relatively simple and accurate way for the machine to identify which category an email belongs to.

Bayes’ Theorem is a result in probability theory used to update predictions in light of new evidence.  The theorem was discovered by an English Presbyterian minister and mathematician called Thomas Bayes and published posthumously in 1763.

The rule is written like this:

p(A|B) = p(B|A) p(A) / p(B)

To deconstruct the rule, here are descriptions of each component that forms it:

p(A|B) 

‘The probability of A given B’.  This basically means the probability of finding observation A, given that some part of evidence B is there.  This is what we want to find out. (Boone)

p(B|A)

This is the probability of the evidence turning up, given that the outcome obtains.

p(A)

This is the probability of the outcome occurring, without the knowledge of the new evidence.

p(B)

This is the probability of the evidence arising, without regard to the outcome.
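
As a minimal sketch, the rule and its four components map directly onto a small function. The name `bayes_rule` and its parameter names are illustrative only, and the numbers in the usage line are made up to show the arithmetic.

```python
def bayes_rule(p_b_given_a: float, p_a: float, p_b: float) -> float:
    """Return p(A|B) = p(B|A) * p(A) / p(B)."""
    if p_b <= 0:
        # The rule is undefined when the evidence B can never occur.
        raise ValueError("p(B) must be greater than zero")
    return p_b_given_a * p_a / p_b


# Made-up values: p(B|A) = 0.5, p(A) = 0.2, p(B) = 0.25
print(round(bayes_rule(0.5, 0.2, 0.25), 2))  # 0.4
```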

We can apply this rule to our email spam problem and plug in the parameters:

P(spam|words) = P(words|spam) P(spam) / P(words)

This is all well and good but may be hard to visualize without data and values. So, taking the above into consideration, imagine we had a database of 100 emails.

We also believe that emails which contain the word “buy” are spam emails.  We can now apply the Bayes Rule:

Training Data

  • 100 emails in total
  • 60 of those 100 emails are spam
  • 48 of those 60 emails that are spam have the word “buy”
  • 12 of those 60 emails that are spam don’t have the word “buy”
  • 40 of those 100 emails aren’t spam
  • 4 of those 40 emails that aren’t spam have the word “buy”
  • 36 of those 40 emails that aren’t spam don’t have the word “buy”

What is the probability that an email is spam if it has the word “buy” in the content?

The answer to the above is as follows:

There are 48 emails that are spam and have the word “buy”.

And there are 52 emails that have the word “buy”: 48 that are spam plus 4 that aren’t spam.

So the probability that an email is spam if it has the word “buy” is 48/52 ≈ 0.92
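
The same count-based answer can be checked with a short script. The counts come from the training data above; the dictionary layout and variable names are just one illustrative way to hold them.

```python
# Counts from the training data: (label, contains "buy") -> number of emails
counts = {
    ("spam", True): 48,
    ("spam", False): 12,
    ("not spam", True): 4,
    ("not spam", False): 36,
}

# All emails that contain "buy", regardless of label
emails_with_buy = sum(n for (_, has_buy), n in counts.items() if has_buy)

# Share of those emails that are spam
p_spam_given_buy = counts[("spam", True)] / emails_with_buy

print(emails_with_buy)             # 52
print(round(p_spam_given_buy, 2))  # 0.92
```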

As mentioned previously, the rule and its notation are based on probabilities, so we can redefine the problem to use probabilities rather than quantities, using the same database of emails:

  • 60% of those emails are spam
  • 80% of those emails that are spam have the word “buy”
  • 20% of those emails that are spam don’t have the word “buy”
  • 40% of those emails aren’t spam
  • 10% of those emails that aren’t spam have the word “buy”
  • 90% of those emails that aren’t spam don’t have the word “buy”

What is the probability that an email is spam if it has the word “buy”?  The notation to arrive at the answer looks like this:

  • P(spam) = the probability that an email is spam
  • P(not spam) = the probability that an email isn’t spam 
  • P(“buy”|spam) = the probability that an email that is spam has the word “buy”
  • P(“buy”|not spam) = the probability that an email that isn’t spam has the word “buy”
  • P(spam|“buy”) = the probability that an email that has the word “buy” is spam

So P(spam|“buy”) is the answer we are looking for.

P(“buy”|spam) * P(spam) is the proportion of all emails that are spam and have the word “buy”.
P(“buy”|not spam) * P(not spam) is the proportion of all emails that aren’t spam and have the word “buy”.
Summing the two, P(“buy”|spam) * P(spam) + P(“buy”|not spam) * P(not spam), gives the proportion of all emails that have the word “buy”, which is P(“buy”).

Meaning the resulting equation looks like this (this is Bayes’ Theorem):

P(spam|“buy”) = P(“buy”|spam) * P(spam) / (P(“buy”|spam) * P(spam) + P(“buy”|not spam) * P(not spam))

Or, to inject the numbers:  0.8 * 0.6 / (0.8 * 0.6 + 0.1 * 0.4) = 0.48 / 0.52 ≈ 0.923

The result of the simulation was: 0.922248596074798, which agrees closely with this exact value.
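
The article does not show how its simulation was run, so the following is only a sketch of one plausible approach. `posterior` evaluates the formula exactly, while `simulate` generates random emails with the stated probabilities and measures the share of “buy” emails that are spam. Both function names, the sample size and the seed are assumptions made for illustration.

```python
import random


def posterior(p_word_given_spam: float, p_spam: float,
              p_word_given_not_spam: float) -> float:
    """Exact P(spam|word) using Bayes' rule with the expanded denominator."""
    joint_spam = p_word_given_spam * p_spam
    joint_not_spam = p_word_given_not_spam * (1.0 - p_spam)
    return joint_spam / (joint_spam + joint_not_spam)


def simulate(n_emails: int = 100_000, seed: int = 0) -> float:
    """Estimate P(spam|"buy") by generating random emails (assumed method)."""
    rng = random.Random(seed)
    spam_with_buy = 0
    total_with_buy = 0
    for _ in range(n_emails):
        is_spam = rng.random() < 0.6                         # P(spam) = 0.6
        has_buy = rng.random() < (0.8 if is_spam else 0.1)   # P("buy"|label)
        if has_buy:
            total_with_buy += 1
            if is_spam:
                spam_with_buy += 1
    return spam_with_buy / total_with_buy


print(round(posterior(0.8, 0.6, 0.1), 3))  # 0.923
```

A simulated estimate fluctuates slightly around the exact value from run to run, which would explain a figure such as 0.9222 instead of 0.9231.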

In plain English
After running incoming emails through this classification model, we can assume that an email containing the word “buy” is spam with a probability of about 92%, so it should be placed in the spam folder!

Training Data
The classifier must be “trained” with sample datasets before it can determine future text classifications. In our email example, we had 100 emails; some contained the word “buy” and others didn’t.  Without accurate training data, the machine cannot make reliable predictions about future events.
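
To make the idea of training concrete, here is a minimal sketch of a Naïve Bayes text classifier built from word counts, with add-one (Laplace) smoothing so that a word never seen for a label does not force its probability to zero. The class name, the whitespace tokenisation and the four sample emails are all hypothetical. A production system would use far more data and a tested library.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesClassifier:
    """Multinomial Naive Bayes over lower-cased, whitespace-separated words."""

    def __init__(self):
        self.label_counts = Counter()            # emails seen per label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        self.vocabulary = set()                  # every word seen in training

    def train(self, text, label):
        words = text.lower().split()
        self.label_counts[label] += 1
        self.word_counts[label].update(words)
        self.vocabulary.update(words)

    def predict(self, text):
        total_emails = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # Work in log space: log P(label) + sum of log P(word|label)
            score = math.log(count / total_emails)
            denominator = (sum(self.word_counts[label].values())
                           + len(self.vocabulary))
            for word in text.lower().split():
                if word not in self.vocabulary:
                    continue  # ignore words never seen during training
                score += math.log(
                    (self.word_counts[label][word] + 1) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data
classifier = NaiveBayesClassifier()
classifier.train("buy cheap pills now", "spam")
classifier.train("make money fast buy now", "spam")
classifier.train("meeting agenda for monday", "not spam")
classifier.train("lunch on monday", "not spam")

print(classifier.predict("buy now"))  # spam
```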

APIs
With the theory and example out of the way, you now have an appreciation for how probabilities can be determined by the machine.  Rather than code up numerous combinations of string patterns and complex business rules, spam identification can be determined using maths.  This is a use case the machine will excel in!

The great news is that you don’t need to code up theorems such as the Bayes Rule. The heavy lifting is now being done for you by the likes of Microsoft and IBM!  You can create an account on Microsoft Azure and leverage artificial intelligence APIs that get exposed as REST endpoints.  You simply pass in the parameters and the APIs return JSON with your results.

For example, Azure offers a Topic Detection API. This returns the detected topics for a list of submitted text records. A topic is identified with a key phrase, which can be one or more related words.

This is ideal for mining short, human-written text such as reviews and user feedback.  Introducing an API like this into your stack could help your business gain actionable insights from existing datasets.
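
Calling a REST endpoint of this kind usually comes down to posting JSON with an API key in a header. The sketch below only builds such a request and does not send it. The endpoint URL is a placeholder, and the payload shape and header name are assumptions to check against the current Azure documentation for the service you use.

```python
import json
import urllib.request

# Placeholder values: use the endpoint and key from your own Azure resource.
ENDPOINT = "https://example.invalid/text/analytics/topics"
API_KEY = "<your-subscription-key>"


def build_request(texts):
    """Build (but do not send) a JSON POST request for a list of text records."""
    payload = {
        "documents": [
            {"id": str(index), "text": text}  # assumed record shape
            for index, text in enumerate(texts, start=1)
        ]
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": API_KEY,  # assumed header name
        },
        method="POST",
    )


request = build_request(["Great service", "Delivery was slow"])
# To send it and parse the JSON response:
#     with urllib.request.urlopen(request) as response:
#         results = json.load(response)
```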

You can see an example of the Topic Detection API in action in the following screenshot:

[Screenshot: Topic Detection API demo]

Note how the API has detected the Language, Key Topics and Sentiment.  In this screenshot, you can see the underlying JSON:

[Screenshot: Topic Detection API JSON results]

Imagine if you had to manually code classes and custom APIs to determine these data points!  Whilst these APIs do attract a cost, the tricky thinking has already been done, thereby freeing you up to solve the business problem.

Additional Applications of Artificial Intelligence

We’ve covered email classification and explained how one can mine data to glean actionable insights. Other potential areas where artificial intelligence is starting to be applied include:

AdTech and Sentiment Analysis
UK start-up SocialOpinion leverages artificial intelligence to identify the current sentiment towards brands, products and services on social media. In addition to this, Microsoft Cognitive Services LUIS is also used to try and determine sales leads.

Crime prediction
The company PredPol built a software product that leverages big data, machine learning and analytics.  The vision was to see if software could use historical data sets to anticipate crime locations and times and allow officers to pre-emptively prevent these crimes from occurring.

The software uses the following to help achieve this:

  • Past type of crime
  • Crime location
  • Date/time of crime

Research has shown that additional crimes tend to occur close to the original crime spot.  At the start of each shift, officers examine Google Maps views which are overlaid with boxes that indicate potential criminal hotspots.

Driverless Cars
NVIDIA, a firm over two decades old with a history in computer graphics, partnered with Audi to build the next generation of autonomous vehicles, or the “AI Car” as it’s being dubbed. Powered by NVIDIA DRIVE PX, this technology can understand in real time what’s happening around the vehicle, locate itself on a map and plan safe paths ahead.

The Future of Artificial Intelligence

At the time of writing, Facebook had developed an AI system which created its own language.  The system developed a more refined vocabulary to communicate with itself, and researchers at Facebook shut down the system when they realised it was no longer communicating in readable English.

The system was originally trained in English but soon diverged from this when it realised there were more efficient ways to communicate with each “AI agent”.  Matrix anyone?!

Summary

We’ve covered a lot in this blog post, from the origins of AI through to how you can implement it (and whether you should), and we’ve explored some of the current APIs out there which shield you from the complex algorithms that are often involved.

As much as we’d like to be able to, it can be hard to predict the future of artificial intelligence.  Cognitive computing is an evolving space. If only we had AI for the AI!

Thanks for reading this blog and if you’ve enjoyed it, please feel free to share it with your colleagues, friends or anyone else that you think might be interested. You can also subscribe to our blog to get the latest updates.

Jamie Maguire

Software Architect | Consultant | Developer | Tech Author