Softmax non-linearity function

The softmax function takes in a vector of real numbers and returns a probability distribution. Its definition is as follows. Let x be a vector of real numbers (positive, negative, or zero; there are no constraints).

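Then the i’th component of Softmax(x) is

$$\mathrm{Softmax}(x)_i = \frac{e^{x_i}}{\sum_j e^{x_j}}$$

where the sum in the denominator runs over all components j of x.

The output is a probability distribution: each component is strictly positive, because the exponential of a real number is always positive, and the components sum to 1, because every component is divided by the same total.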
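
As a rough illustration, here is a minimal Python sketch of the standard softmax definition, where each component is exp(x_i) divided by the sum of exp(x_j) over all j. The function name `softmax` and the sample input are chosen for this example only. Subtracting the maximum before exponentiating is a common numerical-stability trick and is an addition here, not part of the definition; it leaves the result unchanged because the common factor cancels in the ratio.

```python
import math

def softmax(x):
    """Return the softmax of a list of real numbers as a list of probabilities."""
    # Assumption for this sketch: shift by the max so exp() never overflows.
    # The shift cancels between numerator and denominator.
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical input with positive, negative and zero entries.
probs = softmax([2.0, -1.0, 0.0])
print(probs)       # every entry is strictly positive
print(sum(probs))  # the entries sum to 1, up to floating-point rounding
```

Larger inputs receive larger probabilities, so the ordering of the components of x is preserved in the output.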

