Activation Functions and Loss Functions
Two critical components of deep neural networks are activation functions and loss functions: together they determine how the network learns and what it optimizes during training. In this section, we will look at the activation functions and loss functions most commonly used in deep learning.
Activation Functions
Activation functions introduce non-linearity into the neural network, allowing it to learn complex patterns and make accurate predictions. Here are some commonly used activation functions:
1. Sigmoid Function
The sigmoid function, also known as the logistic function, maps any real-valued input to the open interval (0, 1). It is often used for the output of binary classifiers, where the result can be interpreted as the probability of belonging to the positive class.
import numpy as np

def sigmoid(x):
    # Squash any real-valued input into the range (0, 1)
    return 1 / (1 + np.exp(-x))
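As a quick sanity check, here is a small usage sketch with arbitrary example inputs; the specific values are purely illustrative.

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x))  # approximately [0.119, 0.5, 0.881]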
2. Rectified Linear Unit (ReLU)
ReLU is a popular activation function for hidden layers. It sets all negative values to zero and keeps positive values unchanged. ReLU is computationally efficient and helps alleviate the vanishing gradient problem.
def relu(x):
    # Zero out negative values and keep positive values unchanged
    return np.maximum(0, x)
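A brief illustrative example, again with made-up inputs, shows how negative entries are clipped to zero:

x = np.array([-3.0, -0.5, 0.0, 2.0])
print(relu(x))  # [0. 0. 0. 2.]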
3. Softmax Function
The softmax function is commonly used in multi-class classification problems. It converts a vector of raw scores (logits) into probabilities that sum to 1, so the model can represent a probability distribution over multiple classes.
def softmax(x):
    # Subtract the maximum for numerical stability before exponentiating
    e_x = np.exp(x - np.max(x))
    return e_x / np.sum(e_x)
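The sketch below, using arbitrary logits, confirms that the outputs form a valid probability distribution. Note that this implementation operates on a single 1D vector of scores; handling batched inputs would require normalizing along a specific axis.

logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
print(probs)          # approximately [0.659, 0.242, 0.099]
print(np.sum(probs))  # 1.0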
Loss Functions
Loss functions quantify the error or discrepancy between the predicted outputs and the true targets. By minimizing the loss, the neural network can learn the optimal set of parameters. Here are some commonly used loss functions:
1. Mean Squared Error (MSE)
MSE is a widely used loss function for regression problems. It is the average of the squared differences between the predicted and true values; minimizing it pushes the predictions toward the targets.
def mse(y_true, y_pred):
    # Average of the squared differences between targets and predictions
    return np.mean((y_true - y_pred) ** 2)
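For example, with a few illustrative target and prediction values:

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])
print(mse(y_true, y_pred))  # 0.375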
2. Binary Cross Entropy
Binary cross entropy is commonly used for binary classification problems. It measures the dissimilarity between the predicted probabilities and the true binary labels, penalizing confident predictions that turn out to be wrong. Minimizing it improves classification accuracy.
def binary_cross_entropy(y_true, y_pred):
    # Clip predictions away from 0 and 1 to avoid log(0)
    epsilon = 1e-15
    y_pred = np.clip(y_pred, epsilon, 1 - epsilon)
    # Average the per-example loss over the batch
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
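A minimal usage sketch, assuming the mean-reduced version above and using illustrative labels and predicted probabilities:

y_true = np.array([1.0, 0.0, 1.0, 0.0])
y_pred = np.array([0.9, 0.1, 0.8, 0.3])
print(binary_cross_entropy(y_true, y_pred))  # approximately 0.198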
3. Categorical Cross Entropy
Categorical cross entropy is used for multi-class classification problems. It quantifies the dissimilarity between the predicted class probabilities (typically softmax outputs) and the true, one-hot encoded labels. Minimizing the categorical cross entropy loss helps the model correctly discriminate between multiple classes.
def categorical_cross_entropy(y_true, y_pred):
    # Clip predictions away from 0 to avoid log(0)
    epsilon = 1e-15
    y_pred = np.clip(y_pred, epsilon, 1 - epsilon)
    # Sum over classes; y_true is expected to be one-hot encoded
    return -np.sum(y_true * np.log(y_pred))
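To tie the pieces together, here is a small sketch that runs arbitrary raw scores through softmax and evaluates the categorical cross entropy against a one-hot target for the first class:

logits = np.array([2.0, 1.0, 0.1])
y_true = np.array([1.0, 0.0, 0.0])  # one-hot target: the first class is correct
y_pred = softmax(logits)
print(categorical_cross_entropy(y_true, y_pred))  # approximately 0.417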
Understanding activation functions and loss functions is crucial for building effective deep neural networks. Choosing functions suited to the task can noticeably improve a model's performance and accuracy, so experiment with different combinations and observe their impact on your own deep learning projects.