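Note: Section 1.1 summarizes the principle of the BP neural network that takes influencing factors as input, i.e., how a conventional BP model is trained (skip it if you already know this material). Section 1.2 onward covers the BP neural network prediction model driven by historical values.
When a BP neural network is used for prediction, there are two main types of model, distinguished by the input indicators they consider: models driven by influencing factors (Section 1.1) and models driven by historical values (Section 1.2).
As shown in Figure 1, a BP network trained with MATLAB's newff function is, in most cases, a three-layer network (input layer, hidden layer, output layer). The following analogy helps explain how such a network works: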
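1) Input layer: comparable to a person's sense organs. Just as the senses take in information from the outside world, the input port of the network receives the input data.
2) Hidden layer: comparable to the brain. The brain analyzes what the senses pass to it; likewise, the hidden layer maps the data x passed from the input layer, which can be summarized by the formula hiddenLayer_output=F(w*x+b). Here w and b are the weight and threshold parameters, F() is the mapping rule, also called the activation function, and hiddenLayer_output is the value the hidden layer produces for the incoming data. In other words, the hidden layer maps the influencing-factor data x to a mapped value.
3) Output layer: comparable to the limbs. After the brain has processed the sensory information (the hidden-layer mapping), it directs the limbs to act (a response to the outside world). Similarly, the output layer of the BP network maps hiddenLayer_output once more: outputLayer_output=w*hiddenLayer_output+b, where w and b are the output layer's own weight and threshold parameters, and outputLayer_output is the network's output value (also called the simulated value or predicted value). Think of it as the action the brain sends outward, such as a baby slapping a table.
4) Gradient descent: the deviation between outputLayer_output and the y value supplied to the network is computed, and the algorithm adjusts the weights, thresholds, and other parameters accordingly. Picture the baby slapping at the table and missing: depending on how far off the swing was, the baby adjusts so that each new swing lands closer to the table, until it finally hits.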
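The four steps above can be sketched in a few lines of code. This is a minimal illustration in plain Python rather than MATLAB's newff, and it rests on assumptions the text does not state: F is taken to be a sigmoid, the error is the squared error, and the names `init_params`, `forward`, `train_step`, `total_error` and the toy data are made up for the example.

```python
import math
import random


def sigmoid(z):
    """Activation function F; the sigmoid is an assumed choice."""
    return 1.0 / (1.0 + math.exp(-z))


def init_params(n_in, n_hidden, seed=0):
    """Small random weights w and zero thresholds b for both layers."""
    rng = random.Random(seed)
    return {
        "w_hidden": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                     for _ in range(n_hidden)],
        "b_hidden": [0.0] * n_hidden,
        "w_out": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b_out": 0.0,
    }


def forward(p, x):
    """hiddenLayer_output = F(w*x+b); outputLayer_output = w*hiddenLayer_output+b."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(p["w_hidden"], p["b_hidden"])]
    output = sum(w * h for w, h in zip(p["w_out"], hidden)) + p["b_out"]
    return hidden, output


def train_step(p, x, y, lr=0.1):
    """One gradient-descent update on the squared error 0.5*(output - y)**2."""
    hidden, output = forward(p, x)
    err = output - y
    for j, h in enumerate(hidden):
        # Error propagated back through output weight j and the sigmoid slope.
        delta = err * p["w_out"][j] * h * (1.0 - h)
        p["w_out"][j] -= lr * err * h
        p["b_hidden"][j] -= lr * delta
        for i, xi in enumerate(x):
            p["w_hidden"][j][i] -= lr * delta * xi
    p["b_out"] -= lr * err
    return 0.5 * err ** 2


def total_error(p, samples):
    """Sum of squared errors over all (x, y) samples."""
    return sum(0.5 * (forward(p, x)[1] - y) ** 2 for x, y in samples)


# Toy data (made up for illustration): the target is simply x1 + x2.
samples = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
           ([1.0, 0.0], 1.0), ([1.0, 1.0], 2.0)]
params = init_params(n_in=2, n_hidden=3)
error_before = total_error(params, samples)
for _ in range(2000):
    for x, y in samples:
        train_step(params, x, y)
error_after = total_error(params, samples)
print(error_before, error_after)
```

Each call to `train_step` plays the role of one "swing and adjust" in the baby analogy: the network predicts, measures how far off it was, and nudges w and b in the direction that shrinks the error.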
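Here is another example to reinforce the idea:
The BP neural network in Figure 1 has an input layer, a hidden layer, and an output layer. How does BP use these three layers to make the output value outputLayer_output approach the given y value ever more closely, and so train an accurate model?
The chain of ports in the figure suggests a subway ride: imagine Figure 1 as a subway line. One day Wang takes the subway home. He boards at the starting station (input), passes many intermediate stations (hiddenLayer), and then realizes he has overshot his stop (outputLayer corresponds to his current position). Based on the distance (the error, Error) between his current position and home (the target, Target), he goes back to an intermediate station (hiddenLayer) and rides again (the error is propagated backward, and gradient descent updates w and b). If he misses again, he repeats the adjustment.
Taken together, the baby and subway examples show what a complete BP training pass involves: data is fed to the input, mapped by the hidden layer, and turned into the BP simulated value by the output layer; the parameters are then adjusted according to the error between the simulated value and the target value, so that the simulated value keeps approaching the target. For example: (1) the baby is disturbed by an external factor (x) and reacts by slapping the table (predict); the brain keeps adjusting the arm's position until the slap lands accurately (y, Target). (2) Wang boards at his starting station (x), overshoots (predict), and keeps returning to intermediate stations to adjust until he gets home (y, Target).
These steps involve the influencing-factor data x and the target data y (Target). Given x and y, the BP algorithm looks for the relationship between them so that x can be mapped to approximate y; this is what the BP neural network algorithm does. One more point: everything described so far is BP model training. The resulting model may fit the training data well, but is the relationship it has learned (the bp network) accurate and reliable? To find out, we feed test inputs x1 (with known actual values y1) into the trained bp network to obtain the corresponding BP output (predicted value) predict1, then plot the results and compute metrics such as MSE, MAPE, and R² to see how close predict1 is to y1. This is the BP model's testing process: predicting on data and checking the predictions against the actual values.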
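The three test metrics named above can be computed as follows. This is a sketch in plain Python using the standard textbook definitions, which the text itself does not spell out; the function names and the sample values for y1 and predict1 are illustrative, not results from the article.

```python
def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be nonzero."""
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up test values, only to show the calls.
y1 = [1.0, 2.0, 3.0]
predict1 = [1.0, 2.0, 4.0]
print(mse(y1, predict1), mape(y1, predict1), r_squared(y1, predict1))
```

Lower MSE and MAPE and an R² closer to 1 mean that predict1 tracks y1 more closely.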
Figure 1. Structure of a three-layer BP neural network
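Take electric load forecasting as an example to distinguish the two types of model. When forecasting the electric load over some period:
One approach considers climate-related indicators at time t, such as the air humidity x1 and temperature x2 at that moment, together with factors such as holidays x3, and uses them to predict the load at time t. This is the model described in Section 1.1.
The other approach assumes that the load varies with time, for example that the load values at times t-1, t-2, and t-3 are related to the load at time t, i.e., y(t)=F(y(t-1),y(t-2),y(t-3)). When a BP neural network is trained this way, the influencing-factor values fed to the network are the historical loads y(t-1), y(t-2), y(t-3), and the target output given to the network is y(t). The number 3 here is called the autoregressive order, or delay.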
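Turning a load series into training samples for the second approach might look like this. It is a small Python sketch; the function name `make_lagged_samples` and the sample series are hypothetical, and only the layout of inputs [y(t-1), y(t-2), y(t-3)] and target y(t) comes from the formula above.

```python
def make_lagged_samples(series, order=3):
    """Build (inputs, targets) where inputs[k] = [y(t-1), ..., y(t-order)]
    and targets[k] = y(t), for every t that has enough history."""
    inputs, targets = [], []
    for t in range(order, len(series)):
        inputs.append([series[t - k] for k in range(1, order + 1)])
        targets.append(series[t])
    return inputs, targets


# Made-up load values, only to show the shape of the result.
load = [10, 20, 30, 40, 50]
inputs, targets = make_lagged_samples(load, order=3)
print(inputs)   # [[30, 20, 10], [40, 30, 20]]
print(targets)  # [40, 50]
```

A series of length N yields N - order samples, because the first `order` points have no complete history.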
```matlab
% -------------------------------------------------------------------------
% Chicken Swarm Optimization (CSO) (demo)
% Programmed by Xian-Bing Meng
% Updated at Jun 21, 2015.
% Email: x.b.meng12@gmail.com
%
% This is a simple demo version only implemented the basic idea of CSO for
% solving the unconstrained problem, namely Sphere function.
% The details about CSO are illustrated in the following paper.
% Xian-Bing Meng, et al. A new bio-inspired algorithm: Chicken Swarm
% Optimization. The Fifth International Conference on Swarm Intelligence
%
% The parameters in CSO are presented as follows.
% FitFunc    % The objective function
% M          % Maximal generations (iterations)
% pop        % Population size
% dim        % Dimension
% G          % How often the chicken swarm can be updated.
% rPercent   % The population size of roosters accounts for "rPercent"
%            % percent of the total population size
% hPercent   % The population size of hens accounts for "hPercent" percent
%            % of the total population size
% mPercent   % The population size of mother hens accounts for "mPercent"
%            % percent of the population size of hens
%
% Using the default value,CSO can be executed using the following code.
% [ bestX, fMin ] = CSO
% -------------------------------------------------------------------------

%*************************************************************************
% Revision 1
% Revised at May 23, 2015
% 1.Note that the previous version of CSO doesn't consider the situation
%   that there maybe not exist hens in a group.
%   We assume there exist at least one hen in each group.

% Revision 2
% Revised at Jun 24, 2015
% 1.Correct an error at line "100".
%*************************************************************************

% Main programs

function [ bestX, fMin ] = CSO( FitFunc, M, pop, dim, G, rPercent, hPercent, mPercent )
% Display help
help CSO.m

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% set the default parameters
if nargin < 1
    FitFunc = @Sphere;
    M = 1000;
    pop = 100;
    dim = 20;
    G = 10;
    rPercent = 0.15;
    hPercent = 0.7;
    mPercent = 0.5;
end

rNum = round( pop * rPercent );    % The population size of roosters
hNum = round( pop * hPercent );    % The population size of hens
cNum = pop - rNum - hNum;          % The population size of chicks
mNum = round( hNum * mPercent );   % The population size of mother hens

lb = -100 * ones( 1, dim );   % Lower bounds
ub = 100 * ones( 1, dim );    % Upper bounds

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Initialization
for i = 1 : pop
    x( i, : ) = lb + (ub - lb) .* rand( 1, dim );
    fit( i ) = FitFunc( x( i, : ) );
end
pFit = fit;   % The individual's best fitness value
pX = x;       % The individual's best position corresponding to the pFit

[ fMin, bestIndex ] = min( fit );   % fMin denotes the global optimum
% bestX denotes the position corresponding to fMin
bestX = x( bestIndex, : );

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Start the iteration.
for t = 1 : M

    % This parameter is to describe how closely the chicks would follow
    % their mother to forage for food. In fact, there exist cNum chicks,
    % thus only cNum values of FL would be used.
    FL = rand( pop, 1 ) .* 0.4 + 0.5;

    % The chicken swarm's status about hierarchal order, dominance
    % relationship, mother-child relationship, the roosters, hens and the
    % chicks in a group will remain stable during a period of time. These
    % statuses can be updated every several (G) time steps. The parameter G
    % is used to simulate the situation that the chicken swarm have been
    % changed, including some chickens have died, or the chicks have grown
    % up and became roosters or hens, some mother hens have hatched new
    % offspring (chicks) and so on.
    if mod( t, G ) == 1 || t == 1

        [ ~, sortIndex ] = sort( pFit );
        % How the chicken swarm can be divided into groups and the identity
        % of the chickens (roosters, hens and chicks) can be determined all
        % depend on the fitness values of the chickens themselves. Hence we
        % use sortIndex(i) to describe the chicken, not the index i itself.

        motherLib = randperm( hNum, mNum ) + rNum;
        % Randomly select mNum hens which would be the mother hens.
        % We assume that all roosters are stronger than the hens, likewise,
        % hens are stronger than the chicks. In CSO, the strong is reflected
        % by the good fitness value. Here, the optimization problems is
        % minimal ones, thus the more strong ones correspond to the ones
        % with lower fitness values.

        % Given the fact the 1 : rNum chickens' fitness values maybe not
        % the best rNum ones. Thus we use sortIndex( 1 : rNum ) to describe
        % the roosters. In turn, sortIndex( (rNum + 1) :(rNum + 1 + hNum ))
        % to describe the mother hens, .....chicks.
        % Here motherLib include all the mother hens.

        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        % Randomly select each hen's mate, rooster. Hence we can determine
        % which group each hen inhabits using "mate". Each rooster stands
        % for a group. For simplicity, we assume that there exist only one
        % rooster and at least one hen in each group.
        mate = randpermF( rNum, hNum );

        % Randomly select cNum chicks' mother hens
        mother = motherLib( randi( mNum, cNum, 1 ) );
    end

    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    for i = 1 : rNum   % Update the rNum roosters' values.

        % randomly select another rooster different from the i (th) one.
        anotherRooster = randiTabu( 1, rNum, i, 1 );
        if( pFit( sortIndex( i ) ) <= pFit( sortIndex( anotherRooster ) ) )
            tempSigma = 1;
        else
            tempSigma = exp( ( pFit( sortIndex( anotherRooster ) ) - ...
                pFit( sortIndex( i ) ) ) / ( abs( pFit( sortIndex(i) ) )...
                + realmin ) );
        end

        x( sortIndex( i ), : ) = pX( sortIndex( i ), : ) .* ( 1 + ...
            tempSigma .* randn( 1, dim ) );
        x( sortIndex( i ), : ) = Bounds( x( sortIndex( i ), : ), lb, ub );
        fit( sortIndex( i ) ) = FitFunc( x( sortIndex( i ), : ) );
    end

    for i = ( rNum + 1 ) : ( rNum + hNum )   % Update the hNum hens' values.

        other = randiTabu( 1, i, mate( i - rNum ), 1 );
        % randomly select another chicken different from the i (th)
        % chicken's mate. Note that the "other" chicken's fitness value
        % should be superior to that of the i (th) chicken. This means the
        % i (th) chicken may steal the better food found by the "other"
        % (th) chicken.

        c1 = exp( ( pFit( sortIndex( i ) ) - pFit( sortIndex( mate( i - ...
            rNum ) ) ) )/ ( abs( pFit( sortIndex(i) ) ) + realmin ) );

        c2 = exp( ( -pFit( sortIndex( i ) ) + pFit( sortIndex( other ) )));

        x( sortIndex( i ), : ) = pX( sortIndex( i ), : ) + ( pX(...
            sortIndex( mate( i - rNum ) ), : )- pX( sortIndex( i ), : ) )...
            .* c1 .* rand( 1, dim ) + ( pX( sortIndex( other ), : ) - ...
            pX( sortIndex( i ), : ) ) .* c2 .* rand( 1, dim );
        x( sortIndex( i ), : ) = Bounds( x( sortIndex( i ), : ), lb, ub );
        fit( sortIndex( i ) ) = FitFunc( x( sortIndex( i ), : ) );
    end

    for i = ( rNum + hNum + 1 ) : pop   % Update the cNum chicks' values.
        x( sortIndex( i ), : ) = pX( sortIndex( i ), : ) + ( pX( ...
            sortIndex( mother( i - rNum - hNum ) ), : ) - ...
            pX( sortIndex( i ), : ) ) .* FL( i );
        x( sortIndex( i ), : ) = Bounds( x( sortIndex( i ), : ), lb, ub );
        fit( sortIndex( i ) ) = FitFunc( x( sortIndex( i ), : ) );
    end

    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % Update the individual's best fitness value and the global best one
    for i = 1 : pop
        if ( fit( i ) < pFit( i ) )
            pFit( i ) = fit( i );
            pX( i, : ) = x( i, : );
        end

        if( pFit( i ) < fMin )
            fMin = pFit( i );
            bestX = pX( i, : );
        end
    end
end

% End of the main program

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% The following functions are associated with the main program
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% This function is the objective function
function y = Sphere( x )
y = sum( x .^ 2 );

% Application of simple limits/bounds
function s = Bounds( s, Lb, Ub )
% Apply the lower bound vector
temp = s;
I = temp < Lb;
temp(I) = Lb(I);

% Apply the upper bound vector
J = temp > Ub;
temp(J) = Ub(J);
% Update this new move
s = temp;

%--------------------------------------------------------------------------
% This function generate "dim" values, all of which are different from
% the value of "tabu"
function value = randiTabu( min, max, tabu, dim )
value = ones( dim, 1 ) .* max .* 2;
num = 1;
while ( num <= dim )
    temp = randi( [min, max], 1, 1 );
    if( length( find( value ~= temp ) ) == dim && temp ~= tabu )
        value( num ) = temp;
        num = num + 1;
    end
end

%--------------------------------------------------------------------------
function result = randpermF( range, dim )
% The original function "randperm" in Matlab is limited to the case where
% the number of values requested is no bigger than the range. This function
% handles the case where dim is bigger than range.
temp = randperm( range, range );
temp2 = randi( range, dim, 1 );
index = randperm( dim, ( dim - range ) );
result = [ temp, temp2( index )' ];
```
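To make the population bookkeeping in the listing above easier to follow, here is a small Python sketch of how the swarm is split into roosters, hens, chicks, and mother hens, and how roles are assigned by sorted fitness. It mirrors only the `rNum`/`hNum`/`cNum`/`mNum` arithmetic and the `sortIndex` idea. The names `matlab_round`, `split_population`, and `assign_roles` are hypothetical helpers, and `matlab_round` is an assumed stand-in for MATLAB's `round` on non-negative values.

```python
import math


def matlab_round(value):
    """Round half up for non-negative values (Python's round() rounds half to even)."""
    return int(math.floor(value + 0.5))


def split_population(pop=100, r_percent=0.15, h_percent=0.7, m_percent=0.5):
    """Group sizes computed the same way as rNum, hNum, cNum, mNum in the listing."""
    r_num = matlab_round(pop * r_percent)
    h_num = matlab_round(pop * h_percent)
    c_num = pop - r_num - h_num
    m_num = matlab_round(h_num * m_percent)
    return {"roosters": r_num, "hens": h_num,
            "chicks": c_num, "mother_hens": m_num}


def assign_roles(fitness, r_num, h_num):
    """Sort by fitness (lower is better): best r_num are roosters,
    the next h_num are hens, and the rest are chicks."""
    order = sorted(range(len(fitness)), key=lambda i: fitness[i])
    return order[:r_num], order[r_num:r_num + h_num], order[r_num + h_num:]


print(split_population())
# {'roosters': 15, 'hens': 70, 'chicks': 15, 'mother_hens': 35}
print(assign_roles([5.0, 1.0, 3.0, 4.0, 2.0], r_num=1, h_num=2))
# ([1], [4, 2], [3, 0])
```

With the default parameters, every rooster leads a group, and the 15 chicks each follow one of the 35 mother hens chosen at random.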
Figure 2. Convergence curve of the chicken swarm algorithm
The test statistics are shown in the table below.
Model | Test-set accuracy | Training-set accuracy |
---|---|---|
BP neural network | 100% | 95% |
CSO-BP | 100% | 99.8% |
"Prediction of Water Resource Demand in Ningxia Based on a BP Neural Network" (《基于BP神经网络的宁夏水资源需求量预测》)