- What is a perceptron?
- What is the working of a multilayer perceptron (MLP)?
- What are the advantages and disadvantages of an MLP?

In machine learning, the **perceptron** is an algorithm for supervised learning of binary classifiers. A binary classifier is a function which can decide whether or not an input, represented by a vector of numbers, belongs to some specific class. It is a type of linear classifier, i.e. a classification algorithm that makes its predictions based on a linear predictor function combining a set of weights with the feature vector.

The perceptron is an algorithm for learning a binary classifier called a threshold function, which maps its input **x** (a real-valued vector) to a single binary output value f(**x**):

f(**x**) = 1 if **w** · **x** + *b* > 0, and 0 otherwise

where **w** is a vector of real-valued weights, **w** · **x** = w₁x₁ + … + wₙxₙ is the dot product, *n* is the number of inputs to the perceptron, and *b* is the bias. The bias shifts the decision boundary away from the origin and does not depend on any input value; in other words, a bias value shifts the activation function curve up or down.
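As a sketch (not from the original post), the threshold function above and the classic perceptron update rule can be written in plain Python. The AND dataset here is just an illustrative choice of a linearly separable problem:

```python
# Minimal perceptron sketch: f(x) = 1 if w·x + b > 0, else 0,
# trained with the classic perceptron learning rule.

def predict(w, b, x):
    # Dot product of weights and inputs, shifted by the bias.
    s = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if s > 0 else 0

def train(samples, labels, epochs=20, lr=1.0):
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = y - predict(w, b, x)  # -1, 0, or +1
            # Update weights and bias only when the prediction is wrong.
            w = [wi + lr * error * xi for wi, xi in zip(w, x)]
            b += lr * error
    return w, b

# Example: learn logical AND, which is linearly separable.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 0, 0, 1]
w, b = train(X, y)
print([predict(w, b, x) for x in X])  # → [0, 0, 0, 1]
```

Because AND is linearly separable, the perceptron convergence theorem guarantees this update rule finds a separating boundary in a finite number of passes.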

A **multilayer perceptron (MLP)** is a class of feedforward artificial neural network (ANN). The term MLP is used ambiguously: sometimes loosely to refer to any feedforward ANN, sometimes strictly to refer to networks composed of multiple layers of perceptrons (with threshold activation). Multilayer perceptrons are sometimes colloquially referred to as “vanilla” neural networks, especially when they have a single hidden layer.

**Working**

**a**. Each input x is multiplied by its corresponding weight w; call these products k.

**b**. Add up all the products to get the Weighted Sum.

**c**. Apply an activation function to the weighted sum. In the classic MLP, the decision/activation function is a step function.

Note that a weight indicates the strength of the corresponding connection: the weights represent the relative importance of each input to the classification decision.
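Steps a–c above can be traced with concrete numbers; the inputs, weights, and bias below are arbitrary illustrative values:

```python
# Step-by-step trace of a single perceptron unit.
x = [0.5, -1.0, 2.0]   # inputs (made-up values)
w = [0.8, 0.2, -0.3]   # weights, one per input
b = 0.1                # bias

# a. Multiply each input by its weight.
k = [wi * xi for wi, xi in zip(w, x)]  # [0.4, -0.2, -0.6]

# b. Add up all the products (plus the bias) to get the weighted sum.
weighted_sum = sum(k) + b              # ≈ -0.3

# c. Apply the step activation function.
output = 1 if weighted_sum > 0 else 0  # 0, since the sum is negative
print(k, weighted_sum, output)
```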

Each layer can contain a large number of perceptrons, and there can be multiple layers, so a multilayer perceptron can quickly become a very complex system. The multilayer perceptron has another, more common name: a neural network. A three-layer MLP is called a **Non-Deep or Shallow Neural Network**, while an MLP with four or more layers is called a **Deep Neural Network**. One difference between an MLP and a neural network is that in the classic perceptron, the decision function is a step function and the output is binary. In neural networks that evolved from MLPs, other activation functions can be used, which produce real-valued outputs, usually between 0 and 1 or between -1 and 1. This allows for probability-based predictions or classification of items into multiple labels.
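As a rough sketch of that difference, here is the forward pass of a tiny network with two inputs, one hidden layer, and sigmoid activations; all the weights are made-up values, not trained ones. Replacing the step function with a sigmoid turns the hard 0/1 output into a real value between 0 and 1:

```python
import math

def sigmoid(z):
    # Smooth activation replacing the step function; output is in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    # One fully connected layer: weighted sum per neuron, then sigmoid.
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

# Arbitrary illustrative weights: 2 inputs -> 2 hidden units -> 1 output.
W_hidden = [[0.5, -0.6], [0.3, 0.8]]
b_hidden = [0.0, -0.1]
W_out = [[1.2, -0.7]]
b_out = [0.05]

x = [1.0, 0.5]
hidden = layer(x, W_hidden, b_hidden)
output = layer(hidden, W_out, b_out)[0]
print(output)  # a real value strictly between 0 and 1
```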

The **advantages** of MLP are:

- Capability to learn non-linear models.
- Capability to learn models in real-time (on-line learning).

The **disadvantages** of MLP include:

- MLPs with hidden layers have a non-convex loss function with more than one local minimum, so different random weight initializations can lead to different validation accuracy.
- MLP requires tuning a number of hyperparameters such as the number of hidden neurons, layers, and iterations.
- MLP is sensitive to feature scaling.
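On the last point, a common remedy is to standardize each feature to zero mean and unit variance before training; a minimal sketch in plain Python (the age/income values are made up for illustration):

```python
def standardize(column):
    # Rescale one feature to zero mean and unit variance.
    mean = sum(column) / len(column)
    variance = sum((v - mean) ** 2 for v in column) / len(column)
    std = variance ** 0.5
    return [(v - mean) / std for v in column]

# Two features on very different scales (illustrative values).
ages = [20, 30, 40, 50]
incomes = [20_000, 45_000, 60_000, 90_000]

# After standardizing, both features vary over a comparable range,
# so neither dominates the weighted sums during training.
print(standardize(ages))
print(standardize(incomes))
```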

Thanks @tanyachawla, it was a great help. If you could send code for an MLP from scratch, that would be great.