The Test Set and Early Stopping

As the network learns, connection weights are adjusted so that the network can model general rules that underlie the data. If there are some general rules that apply to a large proportion of the patterns in the dataset, the network will repeatedly see examples of these rules and they will be the first to be learned. Subsequently, it will turn its attention to more specialized rules of which there are fewer examples in the dataset. Once it has learned these rules as well, if training is allowed to continue, the network may start to learn specific samples within the data. This is undesirable for two reasons. Firstly, since these particular patterns may never be seen when the trained network is put to use, any time spent learning them is wasted. Secondly, [Pg.38]

Variation of error in the training set (solid line) and test set (broken line) with epoch. [Pg.39]

Early stopping, as a method for preventing overfitting, requires enough samples that both the training and test sets are representative of the dataset. In fact, it is desirable to have a third set, known as a validation set, which acts as a secondary test of the quality of the network. The reason is that, although the test set is not used to train the network, it is nevertheless used to determine the point at which training is stopped, so to this extent the form of the trained network is not completely independent of the test set. [Pg.39]
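The three-way split described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure from the source: a linear model on polynomial features stands in for the network, the data are synthetic, and the names train_with_early_stopping, patience, lr and max_epochs are hypothetical. It follows this text's terminology, in which the test set decides when training stops and the validation set provides the secondary check (many machine learning texts swap these two names).

```python
import numpy as np


def mse(X, y, w):
    """Mean squared error of the linear model X @ w against targets y."""
    return float(np.mean((X @ w - y) ** 2))


def train_with_early_stopping(X_train, y_train, X_test, y_test,
                              lr=0.1, max_epochs=5000, patience=100):
    """Full-batch gradient descent that keeps the weights with the lowest test-set error.

    Training stops once the test-set error has not improved for `patience`
    epochs (an assumed stopping rule), or when max_epochs is reached.
    """
    w = np.zeros(X_train.shape[1])
    # Epoch 0 is the untrained model; it is the best seen so far by definition.
    best_w, best_err, best_epoch = w.copy(), mse(X_test, y_test, w), 0
    for epoch in range(1, max_epochs + 1):
        # One epoch: a gradient step on the mean squared training error.
        residual = X_train @ w - y_train
        w -= lr * 2.0 * (X_train.T @ residual) / len(y_train)
        test_err = mse(X_test, y_test, w)
        if test_err < best_err:
            best_w, best_err, best_epoch = w.copy(), test_err, epoch
        elif epoch - best_epoch >= patience:
            break  # test-set error has stopped improving
    return best_w, best_err, best_epoch


# Synthetic data (an assumption for illustration): a noisy sine curve.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=90)
y = np.sin(np.pi * x) + rng.normal(0.0, 0.2, size=90)
X = np.vander(x, 10, increasing=True)  # polynomial features 1, x, ..., x^9

# Three disjoint sets: training, test (stopping), validation (final check).
X_train, y_train = X[:30], y[:30]
X_test, y_test = X[30:60], y[30:60]
X_val, y_val = X[60:], y[60:]

w, test_err, stop_epoch = train_with_early_stopping(X_train, y_train, X_test, y_test)
val_err = mse(X_val, y_val, w)
print(f"best epoch: {stop_epoch}, test error: {test_err:.4f}, validation error: {val_err:.4f}")
```

Because the stopping epoch is chosen to minimise the test-set error, that error is an optimistic estimate of performance; the validation error, computed on data that played no part in either fitting or stopping, is the more independent figure.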
