Training Deep Neural Networks with Particle Swarm Optimization (PSO)

Neural networks are among the best-known and most widely used algorithms. In supervised learning, the output data is labeled: we compare the predicted output with the actual label and compute an error. Training a neural network works the same way: we define the network architecture, compute the error by comparing actual and predicted labels, and then use some optimization algorithm to minimize that error. The most widely used training algorithms are backpropagation and gradient descent, but in principle any optimization algorithm can train a neural network model. In this post we will see how to train a neural network model with particle swarm optimization in Python, using NumPy.

Neural Network

Importing the Python Libraries

from sklearn.datasets import load_iris
from pso_numpy import *
import numpy as np

We import sklearn to load the Iris flower dataset, pso_numpy to use the PSO algorithm, and numpy to perform the neural network's forward pass.

Loading the Dataset

#load iris dataset..
data = load_iris()
#Store input & target in X and Y..
X = data.data
Y = data.target

Load the Iris dataset from sklearn, assign the input data to X and the target labels to Y.

Defining the Architecture

#define no of nodes in each layer..
INPUT_NODES = 4
HIDDEN_NODES = 20
OUTPUT_NODES = 3

Define the number of input, hidden, and output nodes in the neural network model.
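
Since PSO will later treat all of the network's parameters as one flat vector, it helps to count them up front: (4 × 20) weights + 20 biases + (20 × 3) weights + 3 biases = 163 dimensions per particle. This is exactly the no_dim value computed in the main block further below.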

One-Hot Encoding

def one_hot_encode(Y):
    """
    Create one-hot encoded vectors from target labels (Y).
    :param Y: int(N, )
    :return: int(N, C)
        Returns an array of shape (N, C) where C is the number of classes.
    """
    num_unique = len(np.unique(np.array(Y)))
    zeros = np.zeros((len(Y), num_unique))
    zeros[range(len(Y)), Y] = 1
    return zeros

One-hot encoding is used when we want to compute the categorical cross-entropy loss: it assigns a unique vector to each target label (class). This function takes Y as input and returns a one-hot encoded vector for each sample.
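
As a quick sanity check (a hypothetical snippet, not part of the original script), encoding a small label array produces one row per sample:

labels = np.array([0, 2, 1])
print(one_hot_encode(labels))
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]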


Softmax Activation

def softmax(logits):
    """
    Apply softmax function on logits and return probabilities.
    :param logits: double(N, C)
        Logits of each instance for each class.
    :return: double(N, C)
        Probability for each class of each instance.
    """
    exps = np.exp(logits)
    return exps / np.sum(exps, axis=1, keepdims=True)

The softmax function computes per-class probabilities from the logits (the output of the last layer with no activation applied).
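
For example (hypothetical values, assuming the softmax defined above), each output row sums to 1:

logits = np.array([[2.0, 1.0, 0.1]])
print(softmax(logits))        # ~[[0.659 0.242 0.099]]
print(softmax(logits).sum())  # 1.0

One caveat: this version exponentiates the raw logits, so very large logit values can overflow np.exp; a common hardening step is to subtract np.max(logits, axis=1, keepdims=True) before exponentiating.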


Loss Functions

def Negative_Likelihood(probs, Y):
    """
    Calculates Negative Log Likelihood loss.
    :param probs: double(N, C)
        Probability of each instance for each class.
    :param Y: int(N, )
        Integer representation of each class.
    :return: double
        Returns value of loss calculated.
    """
    num_samples = len(probs)
    correct_logprobs = -np.log(probs[range(num_samples), Y])
    return np.sum(correct_logprobs) / num_samples


def Cross_Entropy(probs, Y):
    """
    Calculates Categorical Cross Entropy loss.
    :param probs: double(N, C)
        Probability of each instance for each class.
    :param Y: int(N, C)
        One-hot encoded representation of classes.
    :return: double
        Returns value of loss calculated.
    """
    num_samples = len(probs)
    ind_loss = np.max(-1 * Y * np.log(probs + 1e-12), axis=1)
    return np.sum(ind_loss) / num_samples

We use one of these two loss functions depending on the input: if we leave the target labels as integers, we use negative log likelihood; if we one-hot encode them, we use categorical cross-entropy. Here probs holds the computed probabilities and Y holds the actual target classes.
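
As a small illustration (hypothetical probabilities, not from the dataset), the two losses agree when Y is supplied in the matching format:

probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.2, 0.2, 0.6]])
y_int = np.array([0, 1, 2])               # integer labels
y_hot = one_hot_encode(y_int)             # one-hot labels
print(Negative_Likelihood(probs, y_int))  # ~0.364
print(Cross_Entropy(probs, y_hot))        # ~0.364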

Forward Pass

def forward_pass(X, Y, W):
    """
    Performs forward pass during Neural Net training.
    :param X: double(N, F)
        X is input where N is number of instances and F is number of features.
    :param Y: int(N, ) | int(N, C)
        Y is target where N is number of instances and C is number of classes
        (in case of one-hot encoded target).
    :param W: double(N, )
        Weights where N is number of total weights (flattened).
    :return: double
        Returns loss of forward pass.
    """
    if isinstance(W, Particle):
        W = W.x
    # Unflatten W into the two weight matrices and two bias vectors..
    w1 = W[0:INPUT_NODES * HIDDEN_NODES].reshape((INPUT_NODES, HIDDEN_NODES))
    b1 = W[INPUT_NODES * HIDDEN_NODES:(INPUT_NODES * HIDDEN_NODES) + HIDDEN_NODES].reshape((HIDDEN_NODES,))
    w2 = W[(INPUT_NODES * HIDDEN_NODES) + HIDDEN_NODES:(INPUT_NODES * HIDDEN_NODES) + HIDDEN_NODES +
           (HIDDEN_NODES * OUTPUT_NODES)].reshape((HIDDEN_NODES, OUTPUT_NODES))
    b2 = W[(INPUT_NODES * HIDDEN_NODES) + HIDDEN_NODES + (HIDDEN_NODES * OUTPUT_NODES):(INPUT_NODES *
           HIDDEN_NODES) + HIDDEN_NODES + (HIDDEN_NODES * OUTPUT_NODES) + OUTPUT_NODES].reshape((OUTPUT_NODES,))
    z1 = np.dot(X, w1) + b1
    a1 = np.tanh(z1)
    z2 = np.dot(a1, w2) + b2
    logits = z2
    probs = softmax(logits)
    return Negative_Likelihood(probs, Y)
    #return Cross_Entropy(probs, Y)  # used in case of one-hot vector target Y...

This function performs the neural network's forward pass, computes the error from the predicted and actual labels, and returns that error to PSO, which optimizes it and updates the weights. It takes X (the input data), Y (the target labels), and W (the weights of the connections between neurons in the network's layers).
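
For the 4-20-3 architecture defined earlier, the slicing above carves the 163-dimensional flat vector W at fixed offsets (a worked breakdown, not extra code in the original):

# w1: W[0:80]    -> reshaped to (4, 20)
# b1: W[80:100]  -> reshaped to (20,)
# w2: W[100:160] -> reshaped to (20, 3)
# b2: W[160:163] -> reshaped to (3,)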

Prediction

def predict(X, W):
    """
    Performs forward pass during Neural Net test.
    :param X: double(N, F)
        X is input where N is number of instances and F is number of features.
    :param W: double(N, )
        Weights where N is number of total weights (flattened).
    :return: int(N, )
        Returns predicted classes.
    """
    w1 = W[0:INPUT_NODES * HIDDEN_NODES].reshape((INPUT_NODES, HIDDEN_NODES))
    b1 = W[INPUT_NODES * HIDDEN_NODES:(INPUT_NODES * HIDDEN_NODES) + HIDDEN_NODES].reshape((HIDDEN_NODES,))
    w2 = W[(INPUT_NODES * HIDDEN_NODES) + HIDDEN_NODES:(INPUT_NODES * HIDDEN_NODES) + HIDDEN_NODES +
           (HIDDEN_NODES * OUTPUT_NODES)].reshape((HIDDEN_NODES, OUTPUT_NODES))
    b2 = W[(INPUT_NODES * HIDDEN_NODES) + HIDDEN_NODES + (HIDDEN_NODES * OUTPUT_NODES):(INPUT_NODES *
           HIDDEN_NODES) + HIDDEN_NODES + (HIDDEN_NODES * OUTPUT_NODES) + OUTPUT_NODES].reshape((OUTPUT_NODES,))
    z1 = np.dot(X, w1) + b1
    a1 = np.tanh(z1)
    z2 = np.dot(a1, w2) + b2
    logits = z2
    probs = softmax(logits)
    Y_pred = np.argmax(probs, axis=1)
    return Y_pred

Takes X (the input) and W (the trained weights, once PSO training is complete) and returns the predicted classes.

Accuracy

def get_accuracy(Y, Y_pred):
    """
    Calculates accuracy.
    :param Y: int(N, )
        Correct labels.
    :param Y_pred: int(N, ) | double(N, C)
        Predicted labels of shape (N, ) or (N, C) in case of one-hot vector.
    :return: double
        Accuracy.
    """
    return (Y == Y_pred).mean()
    #return (np.argmax(Y, axis=1) == Y_pred).mean()  # used in case of one-hot vector and loss is Negative Likelihood.

Computes the accuracy on the test data from the actual and predicted labels. It takes Y (actual labels) and Y_pred (predicted labels), counts the correct predictions, and returns their mean.
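
For instance (hypothetical labels):

Y_true = np.array([0, 1, 2, 2])
Y_hat = np.array([0, 2, 2, 2])
print(get_accuracy(Y_true, Y_hat))  # 0.75 -> 3 of 4 correct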

Running It

if __name__ == '__main__':
    no_solution = 100
    no_dim = (INPUT_NODES * HIDDEN_NODES) + HIDDEN_NODES + (HIDDEN_NODES * OUTPUT_NODES) + OUTPUT_NODES
    w_range = (0.0, 1.0)
    lr_range = (0.0, 1.0)
    iw_range = (0.9, 0.9)  # iw -> inertial weight...
    c = (0.5, 0.3)  # c[0] -> cognitive factor, c[1] -> social factor...
    s = Swarm(no_solution, no_dim, w_range, lr_range, iw_range, c)
    #Y = one_hot_encode(Y)  # Encode here...
    s.optimize(forward_pass, X, Y, 100, 1000)
    W = s.get_best_solution()
    Y_pred = predict(X, W)
    accuracy = get_accuracy(Y, Y_pred)
    print("Accuracy: %.3f" % accuracy)

Here we simply define no_solution (the number of particles in the PSO), no_dim (the number of dimensions per particle), w_range (the weight range), lr_range (the learning-rate range), iw_range (the inertia-weight range), and a tuple c holding the cognitive and social parameters. We then initialize the Swarm and call its optimize function with forward_pass (the forward-pass function), X (input), Y (labels), print_step (how often to log the loss), and the number of iterations. After optimization, we call get_best_solution() on the Swarm object to get the best set of weights, pass those weights to predict to obtain the output, and finally compute the model's accuracy.
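
Note that this script evaluates on the same data it was trained on. A minimal variation (a sketch, reusing the functions above; train_test_split comes from sklearn) that holds out a test set would look like:

from sklearn.model_selection import train_test_split

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.3, random_state=0)
s = Swarm(no_solution, no_dim, w_range, lr_range, iw_range, c)
s.optimize(forward_pass, X_train, Y_train, 100, 1000)
W = s.get_best_solution()
print("Test accuracy: %.3f" % get_accuracy(Y_test, predict(X_test, W)))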

Results


The results on the training set are quite good. With well-chosen PSO parameter values and enough training iterations, the approach can give good results for other applications as well.

Particle Swarm Optimization

Now that we are done with the neural network side, let's discuss particle swarm optimization.

Particle

import numpy as np

#This is a PSO (inertia weight) variation...

class Particle:
    """
    Particle class represents a solution inside a pool (Swarm).
    """
    def __init__(self, no_dim, x_range, v_range):
        """
        Particle class constructor.
        :param no_dim: int
            No of dimensions.
        :param x_range: tuple(double)
            Min and Max value (range) of dimension.
        :param v_range: tuple(double)
            Min and Max value (range) of velocity.
        """
        self.x = np.random.uniform(x_range[0], x_range[1], (no_dim,))  # particle position in each dimension...
        self.v = np.random.uniform(v_range[0], v_range[1], (no_dim,))  # particle velocity in each dimension...
        self.pbest = np.inf
        self.pbestpos = np.zeros((no_dim,))

Create the Particle class, whose constructor takes no_dim (the number of dimensions), x_range (the range of the search space), and v_range (the velocity range for each dimension).
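
A quick instantiation (hypothetical values) shows what a particle holds before its first evaluation:

p = Particle(no_dim=5, x_range=(0.0, 1.0), v_range=(0.0, 1.0))
print(p.x.shape, p.v.shape)  # (5,) (5,)
print(p.pbest)               # inf until the particle is first evaluated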

Swarm

class Swarm:
    """
    Swarm class represents a pool of solutions (particles).
    """
    def __init__(self, no_particle, no_dim, x_range, v_range, iw_range, c):
        """
        Swarm class constructor.
        :param no_particle: int
            No of particles (solutions).
        :param no_dim: int
            No of dimensions.
        :param x_range: tuple(double)
            Min and Max value (range) of dimension.
        :param v_range: tuple(double)
            Min and Max value (range) of velocity.
        :param iw_range: tuple(double)
            Min and Max value (range) of inertia weight.
        :param c: tuple(double)
            c[0] -> cognitive parameter, c[1] -> social parameter.
        """
        self.p = np.array([Particle(no_dim, x_range, v_range) for i in range(no_particle)])
        self.gbest = np.inf
        self.gbestpos = np.zeros((no_dim,))
        self.x_range = x_range
        self.v_range = v_range
        self.iw_range = iw_range
        self.c0 = c[0]
        self.c1 = c[1]
        self.no_dim = no_dim

Create the Swarm class and pass the parameters to its constructor: no_particle (the number of particles, where each particle is an independent solution, i.e. one set of weights), no_dim (the number of dimensions, where a dimension corresponds to a single weight or bias of the neural network model), x_range (the range of the search space, i.e. the weight range in neural network terms), v_range (the velocity range per dimension, i.e. the learning rate per weight parameter), iw_range (the inertia-weight range), and c, the pair of cognitive and social factors.
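
In standard inertia-weight PSO terms, each particle is moved by the update rule that optimize implements below:

v = iw * v + c0 * r1 * (pbest - x) + c1 * r2 * (gbest - x)
x = x + v

where r1 and r2 are fresh uniform random vectors in [0, 1) drawn at every step.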

Optimization

    def optimize(self, function, X, Y, print_step, iter):
        """
        optimize is used to start optimization.
        :param function: function
            Function to be optimized.
        :param X: input
            Used in forward pass.
        :param Y: target
            Used to calculate loss.
        :param print_step: int
            Print pause between two adjacent prints.
        :param iter: int
            No of iterations.
        """
        for i in range(iter):
            # Evaluate every particle and update the personal and global bests..
            for particle in self.p:
                fitness = function(X, Y, particle.x)
                if fitness < particle.pbest:
                    particle.pbest = fitness
                    particle.pbestpos = particle.x.copy()
                if fitness < self.gbest:
                    self.gbest = fitness
                    self.gbestpos = particle.x.copy()
            # Move every particle..
            for particle in self.p:
                # Here iw is inertia weight...
                iw = np.random.uniform(self.iw_range[0], self.iw_range[1], 1)[0]
                particle.v = iw * particle.v \
                    + (self.c0 * np.random.uniform(0.0, 1.0, (self.no_dim,)) * (particle.pbestpos - particle.x)) \
                    + (self.c1 * np.random.uniform(0.0, 1.0, (self.no_dim,)) * (self.gbestpos - particle.x))
                #particle.v = particle.v.clip(min=self.v_range[0], max=self.v_range[1])
                particle.x = particle.x + particle.v
                #particle.x = particle.x.clip(min=self.x_range[0], max=self.x_range[1])
            if i % print_step == 0:
                print('iteration#: ', i + 1, ' loss: ', fitness)
        print("global best loss: ", self.gbest)

The optimize function takes function (the forward pass that computes the loss), X (input), Y (labels), print_step (how often to log the loss), and iter (the number of iterations, or epochs, to train the model). In each epoch it evaluates the function with X, Y, and each particle's weights, computes the loss, and updates the particles.
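
The Swarm class is not tied to neural networks; as a minimal sketch (assuming the Particle and Swarm classes above; sphere is a made-up objective), it can minimize any function with the same (X, Y, W) signature:

def sphere(X, Y, W):
    # X and Y are ignored; Swarm.optimize simply passes them through.
    return np.sum(W ** 2)  # global minimum 0.0 at the origin

s = Swarm(no_particle=30, no_dim=5, x_range=(-5.0, 5.0), v_range=(-1.0, 1.0),
          iw_range=(0.4, 0.9), c=(0.5, 0.3))
s.optimize(sphere, None, None, print_step=100, iter=500)
print(s.get_best_solution())  # should be close to the zero vector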

Best Solution

    def get_best_solution(self):
        """
        :return: array of parameters/weights.
        """
        return self.gbestpos

After optimization finishes, this returns the best solution (the best set of weights) found.
