Why is my neural network overfitting?

Juan on 31 Oct 2014
Commented: Greg Heath on 20 Feb 2016
Hello! I'm trying to build a forecasting program using neural networks. The training function I'm using is Bayesian regularization ('trainbr').
The results look good at first, but when I inspect the performance I notice that the training error has decreased while the test error has not.
In fact, when I test the network on additional new values, the results are pretty awful. I believe this is because the network became overfitted.
My question is: how can I prevent the 'trainbr' function from overfitting? Every time I train the network, the error on the values assigned for testing does not decrease.
inputs = tonndata(x,false,false);                  % convert input data to cell-array form
targets = tonndata(t,false,false);                 % convert target data to cell-array form
net = feedforwardnet([15,13],'trainbr');           % two hidden layers (15 and 13 nodes), Bayesian regularization
net.trainParam.lr = 0.05;                          % learning rate (a gradient-descent parameter; trainbr does not use it)
net.trainParam.mc = 0.9;                           % momentum constant (likewise a gradient-descent parameter)
net.inputs{1}.processFcns = {'removeconstantrows','mapminmax'};
net.outputs{2}.processFcns = {'removeconstantrows','mapminmax'};
net.divideFcn = 'dividerand';                      % random division into train/val/test
net.divideMode = 'time';
net.divideParam.trainRatio = 85/100;
net.divideParam.valRatio = 0/100;                  % no validation set
net.divideParam.testRatio = 15/100;
net.layers{1}.transferFcn = 'logsig';
net.layers{2}.transferFcn = 'logsig';
net.performFcn = 'mse';                            % mean squared error
net = train(net,inputs,targets);
outputs = net(inputs);
errors = gsubtract(targets,outputs);
performance = perform(net,targets,outputs)
  1 Comment
Greg Heath on 20 Feb 2016
What are the sizes of your input and target matrices?
[I N] = size(input)
[O N] = size(target)
Ntrn = N - round(0.15*N)
Ntrneq = Ntrn*O
You have two hidden node layers; you only need one. The number of unknown weights is
Nw = (I+1)*15 + (15+1)*13 + (13+1)*O
and the number of training degrees of freedom is
Ntrndof = Ntrneq - Nw
You would like Ntrndof to be as high as possible.
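For example, with purely illustrative sizes (these numbers are assumptions, not taken from the question), the [15,13] topology is already badly underdetermined:
I = 5; O = 1; N = 200;                     % hypothetical sizes
Ntrn = N - round(0.15*N)                   % 170 training examples
Ntrneq = Ntrn*O                            % 170 training equations
Nw = (I+1)*15 + (15+1)*13 + (13+1)*O       % 312 unknown weights
Ntrndof = Ntrneq - Nw                      % -142: more weights than equations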
Greg


Accepted Answer

Greg Heath on 2 Nov 2014
The best approach for regression is to start with FITNET using as many defaults as possible. The default I-H-O node topology contains Nw = (I+1)*H+(H+1)*O unknown weights, and Ntrn training examples yield Ntrneq = Ntrn*O training equations, leaving Ntrndof = Ntrneq-Nw training degrees of freedom. The average variance of the training target examples is MSEtrn00 = mean(var(target')). For Ntrndof > 0, obtaining a mean-square error lower than MSEtrngoal = 0.01*Ntrndof*MSEtrn00/Ntrneq yields a normalized, DOF-adjusted MSE of NMSEtrna <= 0.01 and a corresponding adjusted training R-squared of R2trna = 1-NMSEtrna >= 0.99. That is interpreted as the successful modeling of at least 99% of the variance in the target.
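In code, those quantities look like this (a minimal sketch, assuming input x is I-by-N, target t is O-by-N, and H is a candidate hidden-layer size):
[I, N] = size(x);
[O, ~] = size(t);
H = 10;                                    % candidate number of hidden nodes (assumption)
Ntrn    = N - round(0.15*N);               % training examples after a 15% test split
Ntrneq  = Ntrn*O;                          % training equations
Nw      = (I+1)*H + (H+1)*O;               % unknown weights of an I-H-O fitnet
Ntrndof = Ntrneq - Nw;                     % training degrees of freedom
MSEtrn00   = mean(var(t'));                % average target variance
MSEtrngoal = 0.01*Ntrndof*MSEtrn00/Ntrneq  % goal corresponding to R2trna >= 0.99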
The training objective is to minimize H subject to the constraint R2trna >= 0.99. This is usually achieved by trial and error over a double for loop, with an outer loop over hidden node candidates h = Hmin:dH:Hmax and an inner loop over i = 1:Ntrials random weight initializations (see the sketch below). I have posted many, many examples. Search NEWSGROUP and ANSWERS using
greg fitnet Ntrials
If Ntrneq < ~2*Nw, validation stopping and/or regularization should be used to mitigate the problem of overtraining and an overfit net.
The best approach to avoiding overtraining is to use BOTH validation-set stopping AND regularization.
HOWEVER, FOR SOME STRANGE REASON, using validation stopping with TRAINBR is NOT AVAILABLE IN THE NNTOOLBOX!!!
Your choice of TRAINBR instead of FITNET is not wrong. However, you have made numerous errors, especially by not accepting as many defaults as possible.
Why not just use the syntax in
help trainbr
doc trainbr
with the double loop approach?
Don't forget to initialize the RNG before the first loop so that you can duplicate results.
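A hedged sketch of that double loop (FITNET with TRAINBR; the grid Hmin:dH:Hmax, Ntrials, and the data names x and t are illustrative assumptions, not prescribed values):
% double-loop search for the smallest H achieving R2trna >= 0.99;
% assumes x is I-by-N and t is O-by-N (illustrative names)
rng('default')                             % initialize the RNG so runs can be duplicated
[I, N] = size(x);  [O, ~] = size(t);
MSEtrn00 = mean(var(t'));                  % reference: average target variance
Hmin = 1; dH = 2; Hmax = 15; Ntrials = 10; % illustrative search grid
done = false;
for h = Hmin:dH:Hmax
    Nw = (I+1)*h + (h+1)*O;                % weights of this candidate topology
    for i = 1:Ntrials                      % random weight initializations
        net = fitnet(h,'trainbr');
        net.divideParam.trainRatio = 0.85;
        net.divideParam.valRatio   = 0;    % trainbr does not use validation stopping
        net.divideParam.testRatio  = 0.15;
        [net, tr] = train(net, x, t);
        ttrn = t(:, tr.trainInd);          % training-set targets and outputs
        ytrn = net(x(:, tr.trainInd));
        Ntrndof = numel(ttrn) - Nw;        % training degrees of freedom
        MSEtrna = sum((ttrn(:)-ytrn(:)).^2)/Ntrndof;   % DOF-adjusted training MSE
        R2trna  = 1 - MSEtrna/MSEtrn00;    % adjusted training R-squared
        if Ntrndof > 0 && R2trna >= 0.99   % success at the smallest h so far
            bestnet = net;  bestH = h;  done = true;
            break
        end
    end
    if done, break, end
end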
Hope this helps.
Thank you for formally accepting my answer
Greg

More Answers (1)

Greg Heath on 31 Oct 2014
Do not use feedforwardnet for forecasting the future.
Use a time-series net with net.divideFcn = 'divideblock';
Use nncorr to determine significant target-target feedback delays and input-target delays.
Search on
greg narxnet % Note the use of the 'biased' nncorr option to overcome a bug
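A minimal open-loop NARXNET sketch along those lines (the delays 1:2 and hidden size 10 are placeholder assumptions; the significant delays should come from the nncorr analysis):
% minimal open-loop NARXNET sketch; delays and hidden size are placeholders
X = tonndata(x,false,false);
T = tonndata(t,false,false);
net = narxnet(1:2, 1:2, 10);               % input delays, feedback delays, hidden nodes
net.divideFcn = 'divideblock';             % contiguous blocks preserve temporal order
[Xs, Xi, Ai, Ts] = preparets(net, X, {}, T);   % shift the series to fill the delay lines
net = train(net, Xs, Ts, Xi, Ai);
Y = net(Xs, Xi, Ai);
performance = perform(net, Ts, Y)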
Hope this helps.
Thank you for formally accepting my answer
Greg
  3 Comments
Greg Heath on 1 Nov 2014
In order to test your overfitting hypothesis, vary the number of hidden nodes and compare regularization with validation stopping (a comparison sketch follows).
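A hedged sketch of that comparison on the same topology (H, x, and t are assumed names for the candidate hidden size and your data):
% compare validation stopping (trainlm) with regularization (trainbr)
net1 = fitnet(H,'trainlm');        % default dividerand supplies a validation set -> early stopping
net2 = fitnet(H,'trainbr');        % Bayesian regularization; no validation stopping
net2.divideParam.trainRatio = 0.85;
net2.divideParam.valRatio   = 0;   % give the unused validation share back to training
net2.divideParam.testRatio  = 0.15;
[net1, tr1] = train(net1, x, t);
[net2, tr2] = train(net2, x, t);
% compare performance on each net's held-out test set
msetst1 = perform(net1, t(:,tr1.testInd), net1(x(:,tr1.testInd)))
msetst2 = perform(net2, t(:,tr2.testInd), net2(x(:,tr2.testInd)))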
To understand nncorr, search NEWSGROUP and ANSWERS using
nncorr
or, better
greg nncorr
Alternative correlation functions are in other toolboxes, which I do not have. Use the MATLAB commands
lookfor autocorrelation
lookfor crosscorrelation
You can also take the inverse FFT of the spectral densities. Search on
fft correlation
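That route is the Wiener-Khinchin relation; a minimal sketch (assuming the series x is a row vector):
% autocorrelation via the inverse FFT of the power spectral density
x = x - mean(x);                    % remove the mean first
n = numel(x);
S = abs(fft(x, 2*n)).^2;            % power spectrum, zero-padded to avoid circular wraparound
r = real(ifft(S));                  % raw (biased) autocorrelation
r = r(1:n) / r(1);                  % keep nonnegative lags, normalize so the lag-0 value is 1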
Hope this helps.
Greg
Juan on 1 Nov 2014
Edited: Juan on 1 Nov 2014
Hello Greg,
Thank you for your answer. Indeed, when I train my network with 'trainlm' the performance is worse.
Do you have any idea why Bayesian regularization is doing this?
Regards
Juan

