Activation Functions and Loss Functions
Two critical components of deep neural networks are activation functions and loss functions: together they determine how the network learns and what it optimizes during training. In this section, we will look at the activation functions and loss functions most commonly used in deep learning.
Activation Functions
Activation functions introduce non-linearity into the neural network, allowing it to learn complex patterns and make accurate predictions. Here are some commonly used activation functions:
1. Sigmoid Function
The sigmoid function, also known as the logistic function, maps any real-valued input to the open interval (0, 1). It is often used for the output of binary classifiers, where the result can be interpreted as the probability of belonging to the positive class.
import numpy as np

def sigmoid(x):
    # Squash any real-valued input into the range (0, 1)
    return 1 / (1 + np.exp(-x))
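As a quick sanity check, here is a small usage sketch with arbitrary example inputs; the specific values are purely illustrative.

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x))  # approximately [0.119, 0.5, 0.881]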
2. Rectified Linear Unit (ReLU)
ReLU is a popular activation function for hidden layers. It sets all negative values to zero and keeps positive values unchanged. ReLU is computationally efficient and helps alleviate the vanishing gradient problem.
def relu(x):
    # Zero out negative values and keep positive values unchanged
    return np.maximum(0, x)
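A brief illustrative example, again with made-up inputs, shows how negative entries are clipped to zero:

x = np.array([-3.0, -0.5, 0.0, 2.0])
print(relu(x))  # [0. 0. 0. 2.]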
3. Softmax Function
The softmax function is commonly used in multi-class classification problems. It converts a vector of raw scores (logits) into probabilities that sum to 1, so the model can represent a probability distribution over multiple classes.
def softmax(x):
    # Subtract the maximum for numerical stability before exponentiating
    e_x = np.exp(x - np.max(x))
    return e_x / np.sum(e_x)
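The sketch below, using arbitrary logits, confirms that the outputs form a valid probability distribution. Note that this implementation operates on a single 1D vector of scores; handling batched inputs would require normalizing along a specific axis.

logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
print(probs)          # approximately [0.659, 0.242, 0.099]
print(np.sum(probs))  # 1.0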
Loss Functions
Loss functions quantify the error or discrepancy between the predicted outputs and the true targets. By minimizing the loss, the neural network can learn the optimal set of parameters. Here are some commonly used loss functions:
1. Mean Squared Error (MSE)
MSE is a widely used loss function for regression problems. It is the average of the squared differences between the predicted and true values; minimizing it pushes the predictions toward the targets.
def mse(y_true, y_pred):
    # Average of the squared differences between targets and predictions
    return np.mean((y_true - y_pred) ** 2)
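For example, with a few illustrative target and prediction values:

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])
print(mse(y_true, y_pred))  # 0.375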
2. Binary Cross Entropy
Binary cross entropy is commonly used for binary classification problems. It measures the dissimilarity between the predicted probabilities and the true binary labels, penalizing confident predictions that turn out to be wrong. Minimizing it improves classification accuracy.
def binary_cross_entropy(y_true, y_pred):
    # Clip predictions away from 0 and 1 to avoid log(0)
    epsilon = 1e-15
    y_pred = np.clip(y_pred, epsilon, 1 - epsilon)
    # Average the per-example loss over the batch
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
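A minimal usage sketch, assuming the mean-reduced version above and using illustrative labels and predicted probabilities:

y_true = np.array([1.0, 0.0, 1.0, 0.0])
y_pred = np.array([0.9, 0.1, 0.8, 0.3])
print(binary_cross_entropy(y_true, y_pred))  # approximately 0.198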
3. Categorical Cross Entropy
Categorical cross entropy is used for multi-class classification problems. It quantifies the dissimilarity between the predicted class probabilities (typically softmax outputs) and the true, one-hot encoded labels. Minimizing the categorical cross entropy loss helps the model correctly discriminate between multiple classes.
def categorical_cross_entropy(y_true, y_pred):
    # Clip predictions away from 0 to avoid log(0)
    epsilon = 1e-15
    y_pred = np.clip(y_pred, epsilon, 1 - epsilon)
    # Sum over classes; y_true is expected to be one-hot encoded
    return -np.sum(y_true * np.log(y_pred))
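To tie the pieces together, here is a small sketch that runs arbitrary raw scores through softmax and evaluates the categorical cross entropy against a one-hot target for the first class:

logits = np.array([2.0, 1.0, 0.1])
y_true = np.array([1.0, 0.0, 0.0])  # one-hot target: the first class is correct
y_pred = softmax(logits)
print(categorical_cross_entropy(y_true, y_pred))  # approximately 0.417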
Understanding activation functions and loss functions is crucial for building effective deep neural networks. Choosing functions suited to the task can noticeably improve a model's performance and accuracy, so experiment with different combinations and observe their impact on your own deep learning projects.