These days, buzzwords like Artificial Intelligence (AI) and Machine Learning (ML) are used quite often. Maybe they were already around and I have only started noticing them recently, who knows. But I always had a belief, or perhaps a myth, that machine learning is going to replace traditional computer programming as we know it. As the term suggests, if machines start learning by themselves (as depicted in Hollywood movies), then sooner or later all programmers will become redundant. Right?
Well, not so fast. I am glad I took the openSAP course on machine learning, Enterprise Machine Learning in a Nutshell. It has helped clear up a few of my doubts on this topic. If you have similar doubts, I would recommend you take this free and quick course. But if you are in a hurry, I will try to shed some light on the topic in this blog.
So, the initial computer programs were procedural/sequential (and many still are). Such programs are well suited to solving computational problems. However, as the application of computers evolved, particularly to solve complex business challenges, more sophisticated software became necessary. This eventually led to modular thinking in program design, and object-oriented methodologies were adopted. Various software design patterns were also proposed to solve recurring challenges, each with its own pros and cons. Programs were developed as discrete modules and combined as required through Application Programming Interfaces, also known as APIs.
SAP is one such example: an ERP made up of several modules that interact and integrate into one huge piece of software.
This approach to software development has worked exceptionally well, and a large number of business challenges can be solved with it. However, even today there are some challenges that need unconventional ways of thinking, because our original rule-based thinking does not work well for them.
What is the rule-based programming approach?
Recently we welcomed our second baby daughter, and when I am out shopping for baby clothes for her, I see that there is a trend (at least in the US) to dress boys in blue and girls in pink. I am sure newborns don’t care what color clothes they wear; in fact, I think at this age they are color blind. But anyway, parents think boys = blue and girls = pink. So, going with this belief, if I have to write a program to sort baby clothes, the program would simply look up the color of each item, apply a simple binary IF..THEN rule, and sort the clothes accordingly.
This program assumes one thing: that the input is going to be a heap of baby clothes. But what if the input is not just baby clothes but tons of clothes of all forms and sizes, including men’s and women’s clothing in XL, XXL and petite sizes? Or what if the heap includes not just clothes but toys, kitchen utensils, hardware tools and so on? At this point our conventional rule-based wisdom would be limited to writing complex rule structures, like some gigantic CASE statement with a branch for every sort of item.
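Here is a minimal sketch of that rule-based sorter in Python. The item fields (name, color) and the extra "unsorted" bin are my own assumptions for illustration; the point is only that every decision is a rule someone typed in by hand.

```python
def sort_baby_clothes(items):
    """Sort items into bins using a fixed, hand-written color rule."""
    bins = {"boys": [], "girls": [], "unsorted": []}
    for item in items:
        color = item["color"].lower()
        if color == "blue":
            bins["boys"].append(item["name"])
        elif color == "pink":
            bins["girls"].append(item["name"])
        else:
            # Anything the rule does not cover ends up here (assumed bin).
            bins["unsorted"].append(item["name"])
    return bins


heap = [
    {"name": "onesie", "color": "Blue"},
    {"name": "romper", "color": "pink"},
    {"name": "bib", "color": "yellow"},
]
result = sort_baby_clothes(heap)
print(result)
```

The yellow bib already falls through the rule, which hints at what happens when the input gets more varied.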
This is technically very difficult, if not impossible, to program, because every new kind of item needs yet another hand-written rule.
This is where machine learning steps in.
A widely quoted definition of machine learning comes from Tom Mitchell, one of the pioneers of the field:
“A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”
In other words, a usual C or Java program will give us the same results consistently, whether it is tested with 10 records, 10,000 records or 10,000,000 records. At least that’s what we expect from a well-written conventional program, right? A machine learning algorithm, however, gets better at making approximations as the record set grows, and in certain cases even becomes better than its human counterpart. It measures its performance P at a given task T, let’s say identifying shirts, and improves its capability at handling that task as it gains more experience E by encountering more samples of the same task.
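To make E, T and P a bit more concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the task T is telling shirts from trousers by garment length in centimetres, the experience E is the number of labelled examples seen, and the performance P is accuracy on a fixed test set. The "learning" is just placing a threshold halfway between the average lengths of the two classes, which is my own simplification and not something from the course.

```python
def train(examples):
    """Learn a length threshold: the midpoint between the two class means."""
    shirts = [length for length, label in examples if label == "shirt"]
    trousers = [length for length, label in examples if label == "trousers"]
    return (sum(shirts) / len(shirts) + sum(trousers) / len(trousers)) / 2


def accuracy(threshold, test_set):
    """Performance measure P: fraction of test items labelled correctly."""
    correct = 0
    for length, label in test_set:
        predicted = "shirt" if length < threshold else "trousers"
        correct += predicted == label
    return correct / len(test_set)


# Hypothetical labelled data: (garment length in cm, label).
training = [
    (50, "shirt"), (90, "trousers"), (70, "shirt"), (100, "trousers"),
    (80, "shirt"), (92, "trousers"), (84, "shirt"), (110, "trousers"),
]
test_set = [
    (60, "shirt"), (72, "shirt"), (78, "shirt"), (82, "shirt"),
    (88, "trousers"), (95, "trousers"), (105, "trousers"), (110, "trousers"),
]

# Experience E grows from 2 to 4 to 8 examples; P is measured each time.
scores = [accuracy(train(training[:n]), test_set) for n in (2, 4, 8)]
print(scores)  # [0.625, 0.75, 1.0]
```

With this hand-picked data the accuracy rises as more examples are seen. Real data is messier and the improvement is rarely this smooth, but the idea is the same.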
So here machine learning means that the computer can approximate complex decision functions based on data. It’s about computers learning from data rather than being explicitly programmed. This makes software progressively more intelligent. Computers can now work on unstructured information such as natural-language text, images and videos, rather than only the neat rows and tables of structured data that we have traditionally worked with.
The concept behind machine learning is that you first define the challenge and then train the computer to identify the solution. So you feed the machine learning algorithm a large set of existing historical data, known as training data. This data helps the algorithm generate a model. You can then use this model to generate useful predictions by feeding it test data and, later, live data. The course I took did not go into the details of implementing the model or designing the learning environment, hence there are a lot of open questions here. But if you are curious about the internal workings of the model, check out some introductory reading at https://www.cs.swarthmore.edu/~meeden/cs63/f11/ml-intro.pdf, or I found the complete book online at Amazon here.
Are you ready for Machine Learning?
How would you know whether a given business challenge is really suited to machine learning, or whether it can be solved using conventional programming methods?
In particular, you would ask the following questions.
- Can you formulate the problem clearly?
- Do you have sufficient examples?
- Do you have a regular pattern in the data?
- Can you find a meaningful representation of your data?
- And finally, how will you define the success criteria?
Though machine learning may sound like a cool concept, it may not be the best approach in every case. The question a business needs to ask itself is: do we really want to automate this task? Some tasks are better handled manually, for example many HR and employee interactions that depend on interpersonal relationships.
The following graph is a great way to determine which use cases are best suited to machine learning, which should be handled with a rule-based programming approach, and which are best left manual.
* screen-shot from course learning slide.
This means that we now have the tools at hand. Which approach we use for which problem will determine our success.
Machine Learning (ML) vs Artificial Intelligence (AI)
Machine learning is actually just a subset of Artificial Intelligence. It has subfields of its own, such as artificial neural networks and deep learning, and it overlaps a lot with data mining and knowledge discovery in databases. Then there are applications of machine learning in computer vision, where it is used for face recognition and object detection. In natural language processing, machine learning is used for machine translation and sentiment analysis, and in e-commerce it powers recommenders that suggest products you might want to buy.
Applications of Machine Learning
ML has huge potential, estimated at upwards of $4 billion by 2020, and is set to transform knowledge work as we know it. Robots and automation are already assisting in blue-collar jobs, where humans and machines work together. ML is making this kind of cooperation possible in knowledge work as well.
What makes machine learning possible today is, above all, the huge volume of data available for machines to learn from. The main enabling factors are:
- Huge volumes of data – Big Data, Social media
- Networks with more variables
- Big computing capabilities – GPUs that can crunch massive amounts of sample data sets
- Improvements in machine learning algorithms, deep learning and reinforcement learning
A few examples of machine learning highlighted in the course:
Since SAP is an ERP solution, the course puts more emphasis on applying machine learning in an enterprise application landscape. The goal is to transform the data an enterprise already has into new business value. In a traditional Business Intelligence (BI) context we also try to get insight from data, but typically the data is structured and resides in a database. We define a query based on a set of rules, and the query fetches the data, which we can then visualize in a dashboard or another graphical explorer. Typical BI reports also focus on the current state of the business and on historical data.
In contrast, a machine learning approach for an enterprise application starts with historical training data and an idea of the kind of problem to solve. Based on this data, a learning model is developed. The computer goes through the training process, and the final model can be fed new data to produce predictions. And as the data evolves and new data is encountered, the model is retrained to make it better, rather than the data merely being analyzed.
In an enterprise context, models for nontrivial tasks are already available, pre-trained by somebody else, and data scientists are being trained in how this process works. But for a general audience the model is pretty much a black box. We can still evaluate it by comparing its predictions with the true answers and measuring its accuracy.
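Evaluating a black-box model can be sketched in a few lines of Python. The black_box function below is a hypothetical stand-in for a pre-trained model (a real one would be far more capable), and the labelled examples are invented. The sketch treats the model as something we can only call, and counts how often its predictions agree with the true answers.

```python
from collections import Counter


def evaluate(model, labelled_examples):
    """Compare a model's predictions with the true labels.

    Returns the accuracy and a count of (true label, predicted label) pairs.
    """
    confusion = Counter()
    for features, true_label in labelled_examples:
        confusion[(true_label, model(features))] += 1
    correct = sum(n for (true, predicted), n in confusion.items()
                  if true == predicted)
    return correct / len(labelled_examples), confusion


def black_box(text):
    """Hypothetical stand-in for a pre-trained model we cannot look inside."""
    return "positive" if "good" in text else "negative"


examples = [
    ("good phone", "positive"),
    ("bad battery", "negative"),
    ("not good", "negative"),
    ("good value", "positive"),
]
accuracy, confusion = evaluate(black_box, examples)
print(accuracy)  # 0.75
```

The pair counts show where the model goes wrong: here one negative review ("not good") was predicted as positive.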
People leave reviews of the products they use, the movies they watch, the restaurants they visit, the services they use and so on. If a product manufacturer wants to find out, let’s say, how the market has received their newly launched product, they would have to hire a huge analysis team to gather all the online product reviews and social media posts that mention the product, and then classify each one as good, bad or neutral. So the task here is to classify consumer reviews into three possible labels: positive, negative or neutral.
So learning here really means that the computer is trying to approximate the function that maps reviews to their labels. The experience we provide for the computer to learn from is a large set of reviews and their correct labels. The machine learning algorithm tries to approximate this mapping from reviews to labels and gives us a model. This model is basically a function that we can use to classify a new review, that is, to predict its most likely label.
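As a rough illustration of "a model is a function from reviews to labels", here is a tiny naive Bayes text classifier in plain Python. The course did not prescribe this algorithm; I picked it because it fits in a few lines. The six training reviews are invented, and a real system would need thousands of them.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    return text.lower().split()


def train(labelled_reviews):
    """Count words per label; these counts are the learned model."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in labelled_reviews:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return {"word_counts": word_counts,
            "label_counts": label_counts,
            "vocabulary": vocabulary}


def predict(model, text):
    """Return the most likely label (naive Bayes with add-one smoothing)."""
    total = sum(model["label_counts"].values())
    vocab_size = len(model["vocabulary"])
    best_label, best_score = None, -math.inf
    for label, n in model["label_counts"].items():
        counts = model["word_counts"][label]
        denominator = sum(counts.values()) + vocab_size
        score = math.log(n / total)
        for word in tokenize(text):
            score += math.log((counts[word] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented training data: (review text, correct label).
reviews = [
    ("great product works perfectly", "positive"),
    ("love it great value", "positive"),
    ("terrible quality broke quickly", "negative"),
    ("awful waste of money", "negative"),
    ("it is okay nothing special", "neutral"),
    ("average product does the job", "neutral"),
]
model = train(reviews)
print(predict(model, "great value love it"))  # positive
print(predict(model, "awful quality waste"))  # negative
```

Nobody wrote a rule saying "great" means positive. The mapping came entirely from the labelled examples.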
Support Ticket Classification
Here the task is to classify support tickets received from customers and categorize them so that they can be routed to the appropriate agent.
If we apply the questions to determine whether this is a machine learning problem, we get the following.
- Is the volume of tickets high enough to justify machine learning? Yes (tickets can also come in through email, social media, phone calls etc.)
- Can simple rules be applied to identify the tickets? No (people may use any terms and words to report an issue)
- Can we formulate a problem statement? Yes (given a customer support ticket, predict its correct service category)
- Do we have sufficient examples for the machine to learn from? Yes (we already said we have a huge number of tickets coming in)
- Do we have a regular pattern in the data? Yes (people typically use common words like bill, payment etc.)
So this is a good candidate for machine learning.
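If you like checklists in code, the assessment above could be written down as a small Python helper. The question keys and the all-answers-must-be-yes logic are my own simplification, not a formula from the course.

```python
# Hypothetical keys for the questions applied to the ticket example.
QUESTIONS = {
    "high_volume": "Is the volume high enough to justify machine learning?",
    "simple_rules_insufficient": "Do simple rules fail to solve the problem?",
    "clear_problem_statement": "Can we formulate a problem statement?",
    "sufficient_examples": "Do we have sufficient examples to learn from?",
    "regular_pattern": "Is there a regular pattern in the data?",
}


def assess(answers):
    """Return (is_candidate, failed_questions) for a dict of yes/no answers."""
    failed = [QUESTIONS[key] for key in QUESTIONS
              if not answers.get(key, False)]
    return not failed, failed


ticket_answers = {
    "high_volume": True,
    "simple_rules_insufficient": True,
    "clear_problem_statement": True,
    "sufficient_examples": True,
    "regular_pattern": True,
}
is_candidate, failed = assess(ticket_answers)
print(is_candidate)  # True
```

Any question answered "no", or left unanswered, shows up in the failed list, which is a useful prompt for a second look before committing to machine learning.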
Another example is HR/recruiting, where recruiters have to go through piles of resumes to find an accurate fit for a job. Machine learning can be used to go through the keywords within a resume and match them to the job description, assigning a score between 0 and 1. Problems like this, where the model outputs a grade on a continuous scale, are regression problems, as opposed to classification problems, where the model assigns each item to a category. So a candidate scoring a perfect 1 would be a perfect match, and one scoring 0 would be no match at all. (Now I understand why my resume would not be picked up for most of the jobs I have applied for, and if it’s the same for you, you had better add in those keywords 😛)
The course also goes through some more examples of how computer vision can be applied: to check whether products have been lined up properly on the shop shelf for display and sale, and in the fashion industry to identify the latest trends in the market so that a company can apply these learnings in designing new products.
Finally, the key takeaways are:
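To show what a score on a 0-to-1 scale looks like, here is a simple keyword-overlap score in Python. To be clear, this is a hand-written formula and not a trained regression model; a real regression model would learn from past examples how to predict such a score. The resume text and keywords are invented.

```python
def match_score(resume_text, job_keywords):
    """Fraction of job keywords found in the resume, between 0.0 and 1.0."""
    resume_words = set(resume_text.lower().split())
    keywords = {keyword.lower() for keyword in job_keywords}
    if not keywords:
        return 0.0
    return len(keywords & resume_words) / len(keywords)


score = match_score(
    "Python developer with SQL and machine learning experience",
    ["python", "sql", "java", "learning"],
)
print(score)  # 0.75
```

Three of the four keywords appear in the resume, so the score is 0.75: a graded answer rather than a yes/no category.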
- Machine learning is about computers learning from data without being explicitly programmed.
- This requires large amounts of historical data, but it does not require us to hand-code substantial amounts of rules.
- Not all problems require machine learning; many can be solved with our conventional rule-based thinking.
- To find out whether a particular problem is in fact an ML problem, you need to apply the questions listed above.
- Machine learning comes with separate training and inference phases.
- In the training phase, the machine is fed a large set of data to produce a prediction model.
- The inference phase is about integrating the model into existing transactional applications using microservices and API abstractions.
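The split between the two phases can be sketched in Python like this. The train function runs once over historical data and returns the model as a JSON string, which could be stored anywhere. The infer function only needs that string and a new ticket, so it could sit behind an API in a separate service. The function names, the word-count model and the sample tickets are all my own assumptions for illustration.

```python
import json
from collections import Counter


def train(labelled_tickets):
    """Training phase: learn which words are typical for each category."""
    counts = {}
    for text, category in labelled_tickets:
        counts.setdefault(category, Counter()).update(text.lower().split())
    # Serialize the model so it can be shipped to the inference side.
    return json.dumps({c: dict(words) for c, words in counts.items()})


def infer(model_json, text):
    """Inference phase: load the stored model and categorize a new ticket."""
    model = json.loads(model_json)
    words = text.lower().split()
    scores = {category: sum(word_counts.get(w, 0) for w in words)
              for category, word_counts in model.items()}
    return max(scores, key=scores.get)


# Invented historical tickets: (ticket text, service category).
history = [
    ("my bill is wrong", "billing"),
    ("payment failed on my bill", "billing"),
    ("cannot log in to my account", "login"),
    ("password reset link broken", "login"),
]
model_json = train(history)

print(infer(model_json, "question about my payment"))   # billing
print(infer(model_json, "password reset not working"))  # login
```

The application calling infer never sees the training data, only the finished model.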
That is machine learning in a nutshell. My next goal is to dig deeper into this domain. I have found some good courses on this topic on Coursera and Udemy. I think one needs to learn Python programming to design machine learning models, something I have always been thinking of doing. Anyway, check out the links at the bottom of the page for a listing of some of the courses I found interesting.
Hope you liked this post. I would like to know your thoughts on machine learning and how you plan to apply it in your career or any more examples that you may want to suggest.
Important links and Interesting reads
- Enterprise Machine Learning in a Nutshell
- Record of Achievement – Linkin Pereira
- When Did Girls Start Wearing Pink? – Art and Culture – Smithsonian
- Software Design Patterns
- Machine Learning – by Tom M. Mitchell
- Tom M. Mitchell – Home Page at Carnegie Mellon University