Machine learning, a subset of artificial intelligence, helps make sense of historical data and supports decision making. It is a set of techniques for finding patterns in data and building mathematical models around those findings.
Once we build and train a machine learning model to form a mathematical representation of the data, we can use that model to make predictions about new data. For example, in retail, a trained model can use historical purchase data to predict whether a user will buy a particular product.
Types of machine learning algorithms
Machine learning algorithms can be divided into three categories:
- Supervised machine learning
- Unsupervised machine learning
- Reinforcement machine learning
In businesses, we mostly use supervised machine learning algorithms for performing tasks like categorical classification (binary and multiclass), activity monitoring, predicting a numerical value, and a lot more. We also use unsupervised machine learning techniques for a few applications like grouping or clustering, dimensionality reduction, and anomaly detection.
While both of these approaches have many practical applications for businesses, reinforcement learning (RL) currently has more limited business applications, such as path optimization in the transit industry. However, RL is an area of extensive research and may take on a growing share of the work now done with supervised and unsupervised learning. It is a powerful approach that holds a lot of promise for businesses.
A case in point
Why is reinforcement learning so powerful?
Here is the story of AlphaGo and AlphaGo Zero.
Go is one of the world’s oldest board games still played today. It is so complex that the number of legal board positions (roughly 2 x 10^170) far exceeds the estimated number of atoms in the observable universe (around 10^80).
DeepMind built AlphaGo on reinforcement learning algorithms; it learned by analyzing records of human games and by playing against itself. In October 2015, it beat the professional player Fan Hui 5-0.
In March 2016, AlphaGo was set to take on the Go champion Lee Sedol. Many Go experts expected Lee Sedol to win comfortably, possibly 5-0.
DeepMind invited Fan Hui back to assess how strong AlphaGo had become and how much it had improved. During this assessment, Fan Hui found a major weakness in AlphaGo, but there was no time to correct it before the match.
To everyone’s surprise, AlphaGo won the match 4-1. Lee exposed a weakness in AlphaGo’s play and won the fourth game. AlphaGo nevertheless won the fifth game despite that weakness.
AlphaGo was initially trained on records of games played by human experts. The next version, AlphaGo Zero, was given only the basic rules and learned the game purely by playing against itself. After just three days of training, it surpassed the version of AlphaGo that had beaten Lee Sedol.
Although this was achieved with reinforcement learning, the system relied on deep convolutional neural networks (CNNs) to process the board position, which is represented much like an image. CNNs are a type of deep learning model widely used in business applications.
When to use machine learning
Machine learning is a powerful tool, but it should not be used indiscriminately: it is computationally expensive and requires models to be trained and updated on a regular basis. It is sometimes better to rely on conventional software than on machine learning.
For certain use cases, we can build a robust solution without machine learning by relying on rules, simple calculations, or predetermined processes for results and decision-making. These are easy to program and do not need any exhaustive learning. Hence, experts suggest reserving machine learning for two scenarios:
- Inability to code the rules:
  - Tasks that cannot be done by deploying a set of rules
  - Difficulty identifying and implementing rules
  - Multiple rules that must work hand in hand and are difficult to code
  - Other factors that make the rules difficult to code
  - Overlapping rules that lead to inaccurate code
- Data scale is high:
  - You can define rules from a few samples, but it is difficult to scan millions of records for a better prediction
Machine learning suits both of the above scenarios because it derives a mathematical model that captures the rules from the data and can handle large-scale problems.
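To make the idea of deriving a rule from data concrete, here is a minimal sketch that learns a single threshold rule from labeled examples instead of hand-coding it. The function name `learn_threshold` and the toy order-value data are hypothetical and used only for illustration; real applications would learn far richer models.

```python
# Hypothetical toy data: (order_value, label) pairs, where label 1 marks the class
# we want to detect. Nothing here comes from a real dataset.
def learn_threshold(samples):
    """Return (threshold, errors) for the cut-off that misclassifies the fewest samples.

    The learned rule predicts label 1 when value >= threshold, else label 0.
    """
    values = sorted({value for value, _ in samples})
    # Candidates: every observed value, plus one above the maximum (predict all 0).
    candidates = values + [values[-1] + 1]
    best_threshold, best_errors = None, len(samples) + 1
    for threshold in candidates:
        errors = sum((value >= threshold) != bool(label) for value, label in samples)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold, best_errors


samples = [(12, 0), (25, 0), (40, 0), (55, 1), (70, 1), (90, 1)]
threshold, errors = learn_threshold(samples)
print(threshold, errors)  # 55 0
```

The same search would be impractical to do by hand over millions of records, which is where a learning algorithm earns its cost.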
Steps for developing machine learning applications
Building a machine learning application is an iterative process that follows a sequence of steps. Below are the steps involved in developing machine learning applications:
Frame the problem
The first step is to frame a machine learning problem in terms of what we want to predict and what kind of observation data we have to make those predictions. A prediction is generally a label or target answer; it may be a yes/no label (binary classification), a category (multiclass classification), or a real number (regression).
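As a rough illustration of how the target decides the problem type, the sketch below guesses the task from a handful of example labels. The helper `problem_type` is hypothetical and its heuristic is deliberately simple (floats mean regression, two distinct values mean binary); in practice this is a design decision, not something to infer automatically.

```python
def problem_type(labels):
    """Guess the kind of prediction task from example target values (simple heuristic)."""
    distinct = set(labels)
    # Real-valued targets suggest regression.
    if any(isinstance(value, float) for value in distinct):
        return "regression"
    # Exactly two distinct labels suggest a yes/no decision.
    if len(distinct) == 2:
        return "binary classification"
    return "multiclass classification"


print(problem_type(["yes", "no", "yes"]))       # binary classification
print(problem_type(["shoes", "bags", "hats"]))  # multiclass classification
print(problem_type([19.99, 5.0, 42.5]))         # regression
```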
Collect and clean the data
Once we frame the problem and identify what kind of historical data we have for prediction modeling, the next step is to collect the data from a historical database or from open datasets or from any other data sources.
Not all of the collected data is useful for a machine learning application. We may need to remove irrelevant data, which can hurt prediction accuracy or add computation without improving the result.
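A minimal cleaning sketch, assuming the data arrives as a list of dictionaries, might drop rows with missing required fields and exact duplicates. The function `clean_rows` and the field names are hypothetical; real pipelines usually need domain-specific checks as well.

```python
def clean_rows(rows, required_fields):
    """Drop rows with missing required fields and exact duplicates, preserving order."""
    seen = set()
    cleaned = []
    for row in rows:
        # Skip rows where any required field is absent or None.
        if any(row.get(field) is None for field in required_fields):
            continue
        # A hashable fingerprint of the row, used to detect exact duplicates.
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


rows = [
    {"user_id": 1, "product": "shoes", "bought": 1},
    {"user_id": 2, "product": None, "bought": 0},     # missing product
    {"user_id": 1, "product": "shoes", "bought": 1},  # duplicate of the first row
    {"user_id": 3, "product": "bags", "bought": 0},
]
cleaned = clean_rows(rows, required_fields=["user_id", "product", "bought"])
print([row["user_id"] for row in cleaned])  # [1, 3]
```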
Prepare data for ML application
Once the data is ready for the machine learning algorithm, we need to transform it into a form that the ML system can understand. Machines cannot work with an image or text directly; we need to convert it into numbers. This step may also require building a data pipeline, depending on the needs of the machine learning application.
Sometimes raw data does not reveal all the facts about the target label. Feature engineering is a technique for creating additional, more relevant features by combining two or more existing features with an arithmetic operation.
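Two common ways of turning non-numeric data into numbers are one-hot encoding for categories and word counts for text. The sketch below shows both under simple assumptions (whitespace tokenization, a fixed vocabulary); the helper names `one_hot` and `bag_of_words` are hypothetical.

```python
def one_hot(values):
    """Encode each categorical value as a 0/1 vector, one position per category."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    vectors = []
    for value in values:
        vector = [0] * len(categories)
        vector[index[value]] = 1
        vectors.append(vector)
    return categories, vectors


def bag_of_words(text, vocabulary):
    """Count how often each vocabulary term appears in the text (whitespace tokens)."""
    words = text.lower().split()
    return [words.count(term) for term in vocabulary]


categories, vectors = one_hot(["red", "green", "red", "blue"])
print(categories)  # ['blue', 'green', 'red']
print(vectors)     # [[0, 0, 1], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
print(bag_of_words("Good product good price", ["good", "price", "bad"]))  # [2, 1, 0]
```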
For example, in a compute engine it is common for RAM and CPU usage to both reach 95%, but something is wrong when RAM usage is at 5% and CPU usage is at 93%. We can use the ratio of RAM to CPU usage as a new feature, which may provide a better prediction. Deep learning models can often learn useful features from raw data on their own, so they typically need less explicit feature engineering.
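The RAM-to-CPU ratio described above can be sketched as a derived feature. The field names `ram_pct` and `cpu_pct` and the helper `add_ram_cpu_ratio` are assumptions made for this example; the small `epsilon` only guards against division by zero.

```python
def add_ram_cpu_ratio(samples, epsilon=1e-9):
    """Return copies of the samples with an extra 'ram_cpu_ratio' feature."""
    enriched = []
    for sample in samples:
        ratio = sample["ram_pct"] / (sample["cpu_pct"] + epsilon)
        enriched.append({**sample, "ram_cpu_ratio": ratio})
    return enriched


samples = [
    {"ram_pct": 95.0, "cpu_pct": 95.0},  # both high: ratio close to 1
    {"ram_pct": 5.0, "cpu_pct": 93.0},   # suspicious: ratio close to 0.05
]
for sample in add_ram_cpu_ratio(samples):
    print(round(sample["ram_cpu_ratio"], 3))
```

A model can now separate the two situations with one number, rather than having to discover the relationship between two raw columns.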
Training a model
Before we train the model, we need to split the data into training and evaluation sets, because we need to monitor how well the model generalizes to unseen data. During training, the algorithm learns the patterns that map the features to the label.
The learned mapping can be linear or non-linear, depending on the activation function and the algorithm. Several hyperparameters affect both the quality of learning and the training time, such as the learning rate, regularization, batch size, number of passes (epochs), and optimization algorithm.
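A simple way to make such a split is to shuffle the rows and hold back a fraction for evaluation. This is a minimal sketch; the function name `train_eval_split`, the 20% evaluation fraction, and the fixed seed are illustrative choices, not recommendations.

```python
import random


def train_eval_split(rows, eval_fraction=0.2, seed=42):
    """Shuffle the rows reproducibly and hold back a fraction for evaluation."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    eval_size = int(len(shuffled) * eval_fraction)
    return shuffled[eval_size:], shuffled[:eval_size]


train, evaluation = train_eval_split(range(10))
print(len(train), len(evaluation))  # 8 2
```

Shuffling before splitting matters when the data is ordered, for example by date or by class; for time series, a chronological split is usually the safer choice.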
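To show where these hyperparameters appear in practice, here is a small sketch of mini-batch gradient descent fitting a straight line. It is a toy under stated assumptions (one feature, mean squared error, plain SGD as the optimizer); the function `train_linear` and its default values are hypothetical.

```python
import random


def train_linear(xs, ys, learning_rate=0.05, epochs=200, batch_size=2, l2=0.0, seed=0):
    """Fit y ~ w * x + b with mini-batch gradient descent on mean squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):  # one epoch = one full pass over the data
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradients of the mean squared error over the current batch.
            grad_w = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            grad_b = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            # L2 regularization adds a penalty l2 * w**2, whose gradient is 2 * l2 * w.
            w -= learning_rate * (grad_w + 2 * l2 * w)
            b -= learning_rate * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 2x + 1, without noise
w, b = train_linear(xs, ys)
print(round(w, 2), round(b, 2))  # should land close to w = 2, b = 1
```

Raising the learning rate too far makes the updates diverge, while too few epochs leaves the model short of the solution, which is why these values are usually tuned on the evaluation set.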
Evaluating and improving model accuracy
Model accuracy, in the broad sense, measures how well a model performs on an unseen validation set. After each round of training, we evaluate the model on that set. Depending on the application, we can use different metrics. For example, for classification we may use precision and recall or the F1 score; for object detection, we may use IoU (intersection over union).
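The metrics named above follow directly from their standard definitions. The sketch below assumes binary labels encoded as 0 and 1 and boxes given as (x_min, y_min, x_max, y_max); the function names are hypothetical.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1 for binary labels encoded as 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union else 0.0


precision, recall, f1 = precision_recall_f1([1, 0, 1, 1, 0, 0], [1, 1, 1, 0, 0, 0])
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.667 0.667 0.667
print(round(iou((0, 0, 2, 2), (1, 1, 3, 3)), 3))            # 0.143
```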
If a model is not doing well, the problem usually falls into one of two classes: 1) over-fitting or 2) under-fitting.
When a model does well on the training data but not on the validation data, it is over-fitting: the model is not generalizing well. Remedies include regularizing the algorithm, reducing the number of input features, eliminating redundant features, and using resampling techniques such as k-fold cross-validation.
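For k-fold cross-validation, the data is divided into k folds and each fold takes one turn as the validation set. The sketch below only generates the index splits and assumes the rows were already shuffled; the generator name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k contiguous folds."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


for train_idx, validation_idx in k_fold_indices(7, 3):
    print(train_idx, validation_idx)
```

Averaging the validation score across the folds gives a steadier estimate of how the model generalizes than a single split does.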
In the under-fitting scenario, a model does poorly on both the training and validation datasets. Remedies may include training with more data, evaluating different algorithms or architectures, using more passes, and experimenting with the learning rate or optimization algorithm.
After iterative training, the algorithm produces a model that maps input data to those labels, and this model can be used to predict on unseen data.
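The two failure modes can be told apart by comparing training and validation scores. The helper below is a rough sketch: the function name `diagnose_fit` and the thresholds `target` and `max_gap` are hypothetical and would need to be chosen per application.

```python
def diagnose_fit(train_score, validation_score, target=0.9, max_gap=0.05):
    """Classify a training run from its scores (higher is better, range 0 to 1)."""
    # Poor performance even on the training data points to under-fitting.
    if train_score < target:
        return "under-fitting"
    # Good training score but a large drop on validation points to over-fitting.
    if train_score - validation_score > max_gap:
        return "over-fitting"
    return "ok"


print(diagnose_fit(0.99, 0.80))  # over-fitting
print(diagnose_fit(0.70, 0.68))  # under-fitting
print(diagnose_fit(0.93, 0.91))  # ok
```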
Serving a model in production
Once the model does well on unseen data, it can be used for prediction. This is the most important phase for businesses, and also one of the most difficult for business-oriented machine learning applications. In this phase, we deploy the model in production to make predictions on real-world data and derive results.
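At its simplest, deployment means saving the trained parameters in one place and loading them where predictions are served. The sketch below assumes a linear model with hypothetical parameters `w` and `b` stored as JSON; production systems add versioning, monitoring, and an API layer on top of this.

```python
import json
import os
import tempfile


def save_model(params, path):
    """Training side: persist the learned parameters."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(params, handle)


def load_model(path):
    """Serving side: restore the parameters saved by the training job."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def predict(params, x):
    """Apply the linear model y = w * x + b."""
    return params["w"] * x + params["b"]


with tempfile.TemporaryDirectory() as model_dir:
    model_path = os.path.join(model_dir, "model.json")
    save_model({"w": 2.0, "b": 1.0}, model_path)
    served = load_model(model_path)

prediction = predict(served, 3.0)
print(prediction)  # 7.0
```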
Machine learning is an enabling technology, but without a proper plan and careful execution for training and evaluating models, a project may fail. Hence, businesses that want to build complex machine learning systems often benefit from hiring AI and machine learning service providers while they focus on their core competencies.
eInfochips provides Artificial Intelligence & Machine Learning offerings to help organizations build highly-customized solutions running on advanced machine learning algorithms. We help companies integrate these algorithms with image & video analytics, as well as with emerging technologies such as augmented reality & virtual reality to deliver utmost customer satisfaction and gain a competitive edge over others.