Machine Learning Algorithms Comparison

Artificial Intelligence, and especially Machine Learning, were created to ease the work of developers and programmers.

Instead of writing many lines of code, you choose a Machine Learning algorithm and then decide on a programming language. That can be tricky.

Related course: Python Machine Learning Course

Why? To start, there are four types of machine learning algorithms.

Machine Learning Algorithms

Supervised Learning

Supervised learning is based on labeled training data.

The basis of supervised learning is a set of training examples, called the training data, in which every example comes with a known label.

A model trained on this labeled set is then used to predict the unknown labels of other objects.

It has two types:

regression (if the label is a real number)
classification (if the label comes from a limited, unordered set of categories).
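
To make the difference between the two types concrete, here is a minimal sketch in plain Python. It uses a nearest-neighbour predictor, which is my choice for illustration and not something the article prescribes. The flat sizes, categories and prices are made-up toy data, and the function names are hypothetical.

```python
from collections import Counter

def nearest_labels(train, x, k):
    """Return the labels of the k training examples whose feature is closest to x."""
    ranked = sorted(train, key=lambda pair: abs(pair[0] - x))
    return [label for _, label in ranked[:k]]

def predict_class(train, x, k=3):
    """Classification: the label is a category, so take a majority vote."""
    return Counter(nearest_labels(train, x, k)).most_common(1)[0][0]

def predict_value(train, x, k=3):
    """Regression: the label is a real number, so take the average."""
    labels = nearest_labels(train, x, k)
    return sum(labels) / len(labels)

# Made-up labeled training data: (size in square metres, label)
flats_by_type = [(25, "studio"), (30, "studio"), (35, "studio"),
                 (70, "family"), (80, "family"), (90, "family")]
flats_by_price = [(25, 100.0), (30, 120.0), (35, 140.0),
                  (70, 280.0), (80, 320.0), (90, 360.0)]

# Predict the unknown labels of a new object (a 32 square metre flat)
predicted_type = predict_class(flats_by_type, 32)
predicted_price = predict_value(flats_by_price, 32)
print(predicted_type, predicted_price)
```

The training data and the prediction code are the same shape in both cases. Only the kind of label, and therefore the way neighbours are combined, changes.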

Unsupervised Learning

Unsupervised learning works with unlabeled data.

The basis of unsupervised learning is having less information about the objects: the data are not labeled, classified or categorized.

Unsupervised learning can group similar objects into clusters and separate out the objects that fit no cluster, treating them as anomalies.
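
As a rough sketch of that idea, the plain-Python code below clusters unlabeled points with a basic k-means loop and then flags points that lie far from every cluster centre. The data, the starting centroids and the distance threshold are all assumptions picked for illustration. Real projects would normally use a library implementation.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Basic k-means: assign points to the nearest centroid, then move
    each centroid to the mean of the points assigned to it."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]  # keep an empty cluster's centroid
            for i, cluster in enumerate(clusters)
        ]
    return centroids

def distance_to_nearest(p, centroids):
    return min(math.dist(p, c) for c in centroids)

# Made-up unlabeled data: two tight groups plus one stray point
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
          (5.0, 5.0), (5.1, 4.9), (4.9, 5.2),
          (9.0, 0.0)]

# Assumption: start from one point of each visible group
centroids = kmeans(points, centroids=[points[0], points[3]])

# Assumed rule of thumb: farther than `threshold` from every centroid = anomaly
threshold = 3.0
anomalies = [p for p in points if distance_to_nearest(p, centroids) > threshold]
print(anomalies)
```

Note that the stray point still gets assigned to a cluster and pulls that centroid towards itself. The anomaly check is a separate step layered on top of the clustering.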

Semi-supervised Learning

Semi-supervised learning works with both labeled and unlabeled data.

Combining the pros and cons of supervised and unsupervised learning, semi-supervised learning is especially useful for those who can't label all of their data.

The training set contains both labeled and unlabeled examples, with the aim of improving accuracy.
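
One common way to combine the two kinds of data is self-training: fit a model on the few labeled examples, let it label the unlabeled examples it is confident about, and fit again. The sketch below assumes a very simple nearest-class-mean classifier on one feature. The data, the `min_margin` confidence rule and all function names are hypothetical.

```python
def class_means(labeled):
    """Mean feature value per class, from (value, label) pairs."""
    groups = {}
    for value, label in labeled:
        groups.setdefault(label, []).append(value)
    return {label: sum(values) / len(values) for label, values in groups.items()}

def predict(means, value):
    """Return (label, margin): the nearest class mean, and how much
    closer it is than the runner-up (used as a confidence score)."""
    ranked = sorted(means, key=lambda label: abs(means[label] - value))
    best, runner_up = ranked[0], ranked[1]
    margin = abs(means[runner_up] - value) - abs(means[best] - value)
    return best, margin

def self_train(labeled, unlabeled, min_margin=1.0, rounds=5):
    """Repeatedly add confidently predicted unlabeled values to the labeled set."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(rounds):
        means = class_means(labeled)
        still_unlabeled = []
        for value in unlabeled:
            label, margin = predict(means, value)
            if margin >= min_margin:
                labeled.append((value, label))  # pseudo-label
            else:
                still_unlabeled.append(value)
        if len(still_unlabeled) == len(unlabeled):
            break  # nothing new was labeled, so stop
        unlabeled = still_unlabeled
    return class_means(labeled)

# Made-up data: only two labeled examples, seven unlabeled ones
labeled = [(1.0, "low"), (10.0, "high")]
unlabeled = [0.5, 1.5, 2.0, 6.5, 7.0, 7.5, 8.0]

before = predict(class_means(labeled), 5.0)[0]
after = predict(self_train(labeled, unlabeled), 5.0)[0]
print(before, after)
```

With only the two labeled examples, the value 5.0 is closer to the "low" example. After the unlabeled values have shifted the class means, it lands on the "high" side. Whether such a shift actually improves accuracy depends on the data, which this toy example does not measure.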

Reinforcement Learning

Reinforcement learning is about taking actions.

It is different from the previous ones, because there is no prepared dataset for reinforcement learning.

Reinforcement learning is concerned with how software agents should take actions in an environment to maximize rewards. The agent is trained to behave in the most effective way.
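
A small sketch can show what "no dataset, only actions and rewards" looks like in practice. The code below uses tabular Q-learning, one well-known reinforcement learning method chosen here for illustration. The corridor environment, the reward and all parameter values are assumptions.

```python
import random

N_STATES = 5        # positions 0..4 in a corridor; position 4 is the goal
ACTIONS = (-1, 1)   # step left or step right

def step(state, action):
    """Hypothetical environment: move along the corridor, reward 1.0 at the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                # exploit; ties between equal values are broken at random
                action = max(ACTIONS, key=lambda a: (q[(state, a)], rng.random()))
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # move the estimate towards reward + discounted future value
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q

q = train()
# The learned behaviour: the best-valued action in every non-goal position
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

The agent is never told that "right" is correct. It only sees the reward at the end of the corridor, and the learned values propagate that information back to the earlier positions.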

Algorithms

So, knowing this, let's do a quick summary of six machine learning algorithms.

Linear regression & Linear classifier: If any algorithms count as the simplest, it should be these. They are to be used when you have thousands of features and need to provide decent quality.

More complex algorithms than these can suffer from overfitting, while linear regression and linear classifiers can handle a huge number of features.
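
To show how little machinery linear regression needs, here is a minimal sketch of ordinary least squares for a single feature in plain Python. The article talks about thousands of features. One feature is an assumption made to keep the example short, and the data are made up.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x

# Made-up data that roughly follows y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept  # predict the label for a new x
print(slope, intercept, prediction)
```

The whole model is two numbers, which is one reason such models are hard to overfit compared with more flexible algorithms.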

Logistic regression: performs binary classification, so the label outputs are binary. It takes a linear combination of features and applies a non-linear function (the sigmoid) to it. Although the output is non-linear, the decision boundary is linear in the features, so this is one of the simplest classifiers.
Decision trees: branches and leaves save lives. This algorithm is a predictive model that goes from observations to conclusions. Real people can make decisions with a decision tree, which makes it pretty understandable. This easy-to-interpret model is also commonly used as the building block of Random forest or Gradient boosting.
K-means: if your goal is to assign labels according to the features of objects, but you don't have any labels, this is called a clustering task, and this algorithm makes it possible. But there is a range of clustering methods with different advantages and disadvantages, which you should consider first.
Principal component analysis (PCA): you can apply it when you have a wide range of features that are highly correlated with each other, so that models can easily overfit on such high-dimensional data. This algorithm is great for reducing dimensionality with minimal loss of information.
Neural networks: every specific task has a lot of different architectures or a range of layers / components. When working with images, neural networks are ideal. Training them is computationally very expensive, but they represent a new era of algorithms.
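
The logistic regression entry in the list above describes a linear combination of features passed through a non-linear function. That can be sketched in plain Python as below. The training loop is batch gradient descent on the log-loss, which is one common way to fit the model and an assumption on my part. The data, learning rate and epoch count are made up.

```python
import math

def sigmoid(z):
    """The non-linear function: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, x):
    """Linear combination of the features, then the sigmoid."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train_logistic(samples, labels, learning_rate=0.1, epochs=1000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n = len(samples)
    weights, bias = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for x, y in zip(samples, labels):
            error = predict_proba(weights, bias, x) - y
            grad_w = [g + error * xi for g, xi in zip(grad_w, x)]
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias

# Made-up data: two already-centered features, binary label
samples = [(-2.0, -1.0), (-1.0, -2.0), (-1.5, -0.5),
           (2.0, 1.0), (1.0, 2.0), (1.5, 0.5)]
labels = [0, 0, 0, 1, 1, 1]

weights, bias = train_logistic(samples, labels)
# Binary output: class 1 if the predicted probability is above 0.5
predicted = [1 if predict_proba(weights, bias, x) > 0.5 else 0 for x in samples]
print(predicted)
```

The 0.5 cut-off corresponds to the linear combination being zero, which is why the boundary between the two classes is a straight line (or a plane, with more features).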
