Neural networks are among the best-known and most widely used algorithms. In supervised learning we have labeled output data: we compare the predicted output against the actual labels and compute an error. Training a neural network works the same way: we define the network architecture, compute the error by comparing actual and predicted labels, and then minimize that error with some optimization algorithm. The most widely used training algorithms are backpropagation and gradient descent, but in principle any optimization algorithm can train a neural network model. In this post we'll see how to train a neural network with particle swarm optimization (PSO) in Python using NumPy.
Importing the Python Libraries
from sklearn.datasets import load_iris
from pso_numpy import *
import numpy as np
We import sklearn to load the Iris flower dataset, pso_numpy to use the PSO algorithm, and numpy to perform the neural network's forward pass.
Loading the Dataset
#load iris dataset..
data = load_iris()
#Store input & target in X and Y..
X = data.data
Y = data.target
Load the Iris dataset from sklearn and assign the input data to X and the target labels to Y.
Defining the Architecture
#define no of nodes in each layer..
INPUT_NODES = 4
HIDDEN_NODES = 20
OUTPUT_NODES = 3
Define the number of input, hidden, and output nodes in the neural network model.
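Since PSO will optimize all weights and biases as a single flat vector, it helps to know how many parameters this 4-20-3 architecture has. A quick check (this computation mirrors the no_dim expression used later in the script):

```python
# Total trainable parameters for a 4-20-3 fully connected network:
# w1 (4x20) + b1 (20) + w2 (20x3) + b2 (3)
INPUT_NODES = 4
HIDDEN_NODES = 20
OUTPUT_NODES = 3

n_params = (INPUT_NODES * HIDDEN_NODES + HIDDEN_NODES
            + HIDDEN_NODES * OUTPUT_NODES + OUTPUT_NODES)
print(n_params)  # 163
```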
One-Hot Encoding
def one_hot_encode(Y):
    """
    create one-hot encoded vectors from target labels(Y).
    :param Y: int(N, )
    :return: int(N, C)
        Returns an array of shape(N, C) where C is number of classes.
    """
    num_unique = len(np.unique(np.array(Y)))
    zeros = np.zeros((len(Y), num_unique))
    zeros[range(len(Y)), Y] = 1
    return zeros
One-hot encoding is used when we want to compute the categorical cross-entropy loss: each target label (class) is assigned a unique binary vector. This function takes Y as input and returns a one-hot encoded vector for each class.
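For instance, calling the function on a small label array (a quick illustrative check, not part of the original script):

```python
import numpy as np

def one_hot_encode(Y):
    # one row per sample, one column per class; 1 at the true class
    num_unique = len(np.unique(np.array(Y)))
    zeros = np.zeros((len(Y), num_unique))
    zeros[range(len(Y)), Y] = 1
    return zeros

encoded = one_hot_encode([0, 2, 1])
print(encoded)
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]
```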
Softmax Activation
def softmax(logits):
    """
    Apply softmax function on logits and return probabilities.
    :param logits: double(N, C)
        Logits of each instance for each class.
    :return: double(N, C)
        probability for each class of each instance.
    """
    exps = np.exp(logits - np.max(logits, axis=1, keepdims=True)) #subtract row max to avoid overflow...
    return exps / np.sum(exps, axis=1, keepdims=True)
The softmax function computes per-class probabilities from the logits (the output of the last layer without any activation applied).
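A quick sanity check: each row of the output should sum to 1, and identical logits should give a uniform distribution (illustrative only; the max-subtraction here is a numerically stable variant of the same function):

```python
import numpy as np

def softmax(logits):
    # subtracting the row max leaves the result unchanged but avoids overflow
    exps = np.exp(logits - np.max(logits, axis=1, keepdims=True))
    return exps / np.sum(exps, axis=1, keepdims=True)

logits = np.array([[2.0, 1.0, 0.1],
                   [0.0, 0.0, 0.0]])
probs = softmax(logits)
print(probs.sum(axis=1))  # [1. 1.]
print(probs[1])           # uniform: every class gets 1/3
```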
Loss Functions
def Negative_Likelihood(probs, Y):
    """
    Calculates Negative Log Likelihood loss.
    :param probs: double(N, C)
        Probability of each instance for each class.
    :param Y: int(N, )
        Integer representation of each class.
    :return: double
        Returns value of loss calculated.
    """
    num_samples = len(probs)
    correct_logprobs = -np.log(probs[range(num_samples), Y])
    return np.sum(correct_logprobs) / num_samples

def Cross_Entropy(probs, Y):
    """
    Calculates Categorical Cross Entropy loss.
    :param probs: double(N, C)
        Probability of each instance for each class.
    :param Y: int(N, C)
        One-hot encoded representation of classes.
    :return: double
        Returns value of loss calculated.
    """
    num_samples = len(probs)
    ind_loss = np.max(-1 * Y * np.log(probs + 1e-12), axis=1)
    return np.sum(ind_loss) / num_samples
We can use either of these two loss functions depending on the input: negative log likelihood if the target labels are not one-hot encoded, categorical cross-entropy if they are. Here probs holds the computed probabilities and Y the actual target labels.
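The two are equivalent when the labels match: negative log likelihood on integer labels gives the same value as cross-entropy on their one-hot encoding (a small check using the two functions above; the 1e-12 term makes the match approximate rather than exact):

```python
import numpy as np

def Negative_Likelihood(probs, Y):
    # mean of -log p(true class), Y as integer labels
    num_samples = len(probs)
    correct_logprobs = -np.log(probs[range(num_samples), Y])
    return np.sum(correct_logprobs) / num_samples

def Cross_Entropy(probs, Y):
    # mean of -log p(true class), Y as one-hot vectors
    num_samples = len(probs)
    ind_loss = np.max(-1 * Y * np.log(probs + 1e-12), axis=1)
    return np.sum(ind_loss) / num_samples

probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
Y_int = np.array([0, 1])
Y_onehot = np.eye(3)[Y_int]
print(Negative_Likelihood(probs, Y_int))  # ~0.2899
print(Cross_Entropy(probs, Y_onehot))     # same value, up to the 1e-12 shift
```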
Forward Pass
def forward_pass(X, Y, W):
    """
    Performs forward pass during Neural Net training.
    :param X: double(N, F)
        X is input where N is number of instances and F is number of features.
    :param Y: int(N, ) | int(N, C)
        Y is target where N is number of instances and C is number of classes in case of
        one-hot encoded target.
    :param W: double(N, )
        Weights where N is number of total weights(flatten).
    :return: double
        Returns loss of forward pass.
    """
    if isinstance(W, Particle):
        W = W.x
    w1 = W[0 : INPUT_NODES * HIDDEN_NODES].reshape((INPUT_NODES, HIDDEN_NODES))
    b1 = W[INPUT_NODES * HIDDEN_NODES:(INPUT_NODES * HIDDEN_NODES) + HIDDEN_NODES].reshape((HIDDEN_NODES, ))
    w2 = W[(INPUT_NODES * HIDDEN_NODES) + HIDDEN_NODES:(INPUT_NODES * HIDDEN_NODES) + HIDDEN_NODES +
           (HIDDEN_NODES * OUTPUT_NODES)].reshape((HIDDEN_NODES, OUTPUT_NODES))
    b2 = W[(INPUT_NODES * HIDDEN_NODES) + HIDDEN_NODES + (HIDDEN_NODES * OUTPUT_NODES):(INPUT_NODES *
           HIDDEN_NODES) + HIDDEN_NODES + (HIDDEN_NODES * OUTPUT_NODES) + OUTPUT_NODES].reshape((OUTPUT_NODES, ))
    z1 = np.dot(X, w1) + b1
    a1 = np.tanh(z1)
    z2 = np.dot(a1, w2) + b2
    logits = z2
    probs = softmax(logits)
    return Negative_Likelihood(probs, Y)
    #return Cross_Entropy(probs, Y) #used in case of one-hot vector target Y...
This function performs the network's forward pass, computes the error from the predicted and actual labels, and returns that error to PSO, which minimizes it and updates the weights. It takes X (the input data), Y (the target labels), and W (the weights of the connections between neurons in the network layers).
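The index arithmetic that slices W into w1, b1, w2, and b2 can be hard to follow. The same unpacking can be sketched with a running offset (unpack_weights is a hypothetical helper for illustration, not part of the original code):

```python
import numpy as np

def unpack_weights(W, input_nodes=4, hidden_nodes=20, output_nodes=3):
    # walk through the flat vector, carving out each layer's parameters
    shapes = [(input_nodes, hidden_nodes), (hidden_nodes,),
              (hidden_nodes, output_nodes), (output_nodes,)]
    params, offset = [], 0
    for shape in shapes:
        size = int(np.prod(shape))
        params.append(W[offset:offset + size].reshape(shape))
        offset += size
    return params  # [w1, b1, w2, b2]

w1, b1, w2, b2 = unpack_weights(np.arange(163, dtype=float))
print(w1.shape, b1.shape, w2.shape, b2.shape)  # (4, 20) (20,) (20, 3) (3,)
```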
Prediction
def predict(X, W):
    """
    Performs forward pass during Neural Net test.
    :param X: double(N, F)
        X is input where N is number of instances and F is number of features.
    :param W: double(N, )
        Weights where N is number of total weights(flatten).
    :return: int(N, )
        Returns predicted classes.
    """
    w1 = W[0: INPUT_NODES * HIDDEN_NODES].reshape((INPUT_NODES, HIDDEN_NODES))
    b1 = W[INPUT_NODES * HIDDEN_NODES:(INPUT_NODES * HIDDEN_NODES) + HIDDEN_NODES].reshape((HIDDEN_NODES,))
    w2 = W[(INPUT_NODES * HIDDEN_NODES) + HIDDEN_NODES:(INPUT_NODES * HIDDEN_NODES) + HIDDEN_NODES +
           (HIDDEN_NODES * OUTPUT_NODES)].reshape((HIDDEN_NODES, OUTPUT_NODES))
    b2 = W[(INPUT_NODES * HIDDEN_NODES) + HIDDEN_NODES + (HIDDEN_NODES * OUTPUT_NODES):(INPUT_NODES *
           HIDDEN_NODES) + HIDDEN_NODES + (HIDDEN_NODES * OUTPUT_NODES) + OUTPUT_NODES].reshape((OUTPUT_NODES,))
    z1 = np.dot(X, w1) + b1
    a1 = np.tanh(z1)
    z2 = np.dot(a1, w2) + b2
    logits = z2
    probs = softmax(logits)
    Y_pred = np.argmax(probs, axis=1)
    return Y_pred
It takes X (the input) and W (the trained weights, once PSO training is complete).
Accuracy
def get_accuracy(Y, Y_pred):
    """
    Calculates accuracy.
    :param Y: int(N, )
        Correct labels.
    :param Y_pred: int(N, ) | double(N, C)
        Predicted labels of shape(N, ) or (N, C) in case of one-hot vector.
    :return: double
        Accuracy.
    """
    return (Y == Y_pred).mean()
    #return (np.argmax(Y, axis=1) == Y_pred).mean() #used in case of one-hot vector target Y...
Computes the accuracy on the test data from the actual and predicted labels. It takes Y (the actual labels) and Y_pred (the predicted labels), counts the correct predictions, and returns their mean.
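For example (illustrative values only):

```python
import numpy as np

Y = np.array([0, 1, 2, 2])       # actual labels
Y_pred = np.array([0, 1, 1, 2])  # predicted labels
accuracy = (Y == Y_pred).mean()  # 3 of 4 predictions correct
print(accuracy)  # 0.75
```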
Running the Training
if __name__ == '__main__':
    no_solution = 100
    no_dim = (INPUT_NODES * HIDDEN_NODES) + HIDDEN_NODES + (HIDDEN_NODES * OUTPUT_NODES) + OUTPUT_NODES
    w_range = (0.0, 1.0)
    lr_range = (0.0, 1.0)
    iw_range = (0.9, 0.9) # iw -> inertia weight...
    c = (0.5, 0.3) # c[0] -> cognitive factor, c[1] -> social factor...
    s = Swarm(no_solution, no_dim, w_range, lr_range, iw_range, c)
    #Y = one_hot_encode(Y) #Encode here...
    s.optimize(forward_pass, X, Y, 100, 1000)
    W = s.get_best_solution()
    Y_pred = predict(X, W)
    accuracy = get_accuracy(Y, Y_pred)
    print("Accuracy: %.3f" % accuracy)
Here we define no_solution (the number of particles in the PSO), no_dim (the number of dimensions of each particle), w_range (the weight range), lr_range (the learning-rate range), iw_range (the inertia-weight range), and a tuple c holding the cognitive and social parameters. We then initialize the Swarm and call its optimize method with forward_pass (the forward-pass function), X (input), Y (labels), print_step (the interval at which to log the loss), and the number of iterations. After optimization, we call get_best_solution() on the Swarm object to get the best set of weights, pass those weights to predict to get the output, and finally compute the model's accuracy.
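Note that the script above trains and evaluates on the same data, so the reported accuracy is a training accuracy. A held-out split gives a more honest estimate; a minimal NumPy-only sketch (150 is the Iris sample count, and the commented lines assume the X, Y, predict, and get_accuracy names from the script above):

```python
import numpy as np

rng = np.random.default_rng(0)
idx = rng.permutation(150)                  # shuffle the 150 Iris indices
train_idx, test_idx = idx[:105], idx[105:]  # 70/30 split
# X_train, Y_train = X[train_idx], Y[train_idx]  # train the swarm on these
# X_test, Y_test = X[test_idx], Y[test_idx]      # evaluate predict() on these
print(len(train_idx), len(test_idx))  # 105 45
```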
Results
As you can see, the results on the training set are quite good. With well-chosen PSO parameter values and enough training iterations, it can give good results for other applications as well.
Particle Swarm Optimization
Now that we're done discussing the neural network, let's discuss particle swarm optimization.
Particle
import numpy as np

#This is a PSO (inertia weight) variation...
class Particle:
    """
    Particle class represents a solution inside a pool(Swarm).
    """
    def __init__(self, no_dim, x_range, v_range):
        """
        Particle class constructor.
        :param no_dim: int
            No of dimensions.
        :param x_range: tuple(double)
            Min and Max value(range) of dimension.
        :param v_range: tuple(double)
            Min and Max value(range) of velocity.
        """
        self.x = np.random.uniform(x_range[0], x_range[1], (no_dim, )) #particle position in each dimension...
        self.v = np.random.uniform(v_range[0], v_range[1], (no_dim, )) #particle velocity in each dimension...
        self.pbest = np.inf
        self.pbestpos = np.zeros((no_dim, ))
We create the Particle class, whose constructor takes no_dim (the number of dimensions), x_range (the search-space range), and v_range (the velocity range in each dimension).
Swarm
class Swarm:
    """
    Swarm class represents a pool of solutions(particles).
    """
    def __init__(self, no_particle, no_dim, x_range, v_range, iw_range, c):
        """
        Swarm class constructor.
        :param no_particle: int
            No of particles(solutions).
        :param no_dim: int
            No of dimensions.
        :param x_range: tuple(double)
            Min and Max value(range) of dimension.
        :param v_range: tuple(double)
            Min and Max value(range) of velocity.
        :param iw_range: tuple(double)
            Min and Max value(range) of inertia weight.
        :param c: tuple(double)
            c[0] -> cognitive parameter, c[1] -> social parameter.
        """
        self.p = np.array([Particle(no_dim, x_range, v_range) for i in range(no_particle)])
        self.gbest = np.inf
        self.gbestpos = np.zeros((no_dim, ))
        self.x_range = x_range
        self.v_range = v_range
        self.iw_range = iw_range
        self.c0 = c[0]
        self.c1 = c[1]
        self.no_dim = no_dim
We create the Swarm class and pass to its constructor no_particle (the number of particles, where each particle is an independent solution, i.e. one set of weights), no_dim (the number of dimensions, where a dimension corresponds to a single weight or bias of the neural network model), x_range (the search-space range, i.e. the weight range for the network), v_range (the velocity range in each dimension, i.e. a learning rate for each weight parameter), iw_range (the inertia-weight range), and c, the pair of cognitive and social factors.
Optimize
def optimize(self, function, X, Y, print_step, iter):
    """
    optimize is used to start optimization.
    :param function: function
        Function to be optimized.
    :param X: input
        Used in forward pass.
    :param Y: target
        Used to calculate loss.
    :param print_step: int
        Print pause between two adjacent prints.
    :param iter: int
        No of iterations.
    """
    for i in range(iter):
        for particle in self.p:
            fitness = function(X, Y, particle.x)
            if fitness < particle.pbest:
                particle.pbest = fitness
                particle.pbestpos = particle.x.copy()
            if fitness < self.gbest:
                self.gbest = fitness
                self.gbestpos = particle.x.copy()
        for particle in self.p:
            #Here iw is inertia weight...
            iw = np.random.uniform(self.iw_range[0], self.iw_range[1], 1)[0]
            particle.v = iw * particle.v + (self.c0 * np.random.uniform(0.0, 1.0, (self.no_dim, )) *
                         (particle.pbestpos - particle.x)) + (self.c1 * np.random.uniform(0.0, 1.0, (self.no_dim, ))
                         * (self.gbestpos - particle.x))
            #particle.v = particle.v.clip(min=self.v_range[0], max=self.v_range[1])
            particle.x = particle.x + particle.v
            #particle.x = particle.x.clip(min=self.x_range[0], max=self.x_range[1])
        if i % print_step == 0:
            print('iteration#: ', i + 1, ' loss: ', fitness)
    print("global best loss: ", self.gbest)
The optimize function takes function (which computes the loss via the forward pass), X (input), Y (labels), print_step (the interval at which to log the loss), and iter (the number of iterations, i.e. the number of epochs to train the model). In each epoch it calls the function with X, each particle's weights, and Y, computes the loss, and updates the particles.
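A single velocity-and-position update for one 2-dimensional particle, mirroring the inner loop above (the numbers are arbitrary illustrative values, not taken from a real run):

```python
import numpy as np

rng = np.random.default_rng(42)
x = np.array([0.5, 0.5])          # current position
v = np.array([0.1, -0.1])         # current velocity
pbestpos = np.array([0.4, 0.6])   # this particle's best-known position
gbestpos = np.array([0.2, 0.8])   # the swarm's best-known position
iw, c0, c1 = 0.9, 0.5, 0.3        # inertia, cognitive, social

# new velocity = inertia term + pull toward pbest + pull toward gbest
v = (iw * v
     + c0 * rng.uniform(0.0, 1.0, 2) * (pbestpos - x)
     + c1 * rng.uniform(0.0, 1.0, 2) * (gbestpos - x))
x = x + v
print(x, v)
```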
Best Solution
def get_best_solution(self):
    '''
    :return: array of parameters/weights.
    '''
    return self.gbestpos
After optimization completes, this returns the best solution (the best set of weights) found by the swarm.