February 27, 2018
Increasing stability of training of Deep CNNs with Stochastic Gradient Descent method. Application to Image classification tasks
Jenny Benois-Pineau, University of Bordeaux
Supervised learning has been the major approach to the classification of visual data during the last two decades. With the advent of GPUs, well-known supervised classifiers such as Artificial Neural Networks (ANNs) have come to the fore for these problems.
A specific case of ANNs is the Convolutional Neural Network (CNN), designed specifically for visual information classification tasks such as object recognition and localization, visual tracking, saliency prediction and image categorization. In contrast to the usual fully connected Artificial Neural Networks, their main characteristic is the limitation of the receptive field of neurons by a convolution operation, with subsequent data reduction by pooling of features. By stacking convolution, pooling and non-linearity layers into deeper and deeper architectures, classifiers such as AlexNet, VGG, GoogLeNet and ResNet have been built.
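One conv-pool-nonlinearity stage, the building block stacked to form these deep architectures, can be sketched in a few lines of NumPy (a minimal illustration, not the implementation of any of the networks named above; the toy kernel and input are arbitrary):

```python
import numpy as np

def conv2d(x, k):
    # Valid 2-D convolution (cross-correlation, as in CNN libraries):
    # the kernel limits each output neuron's receptive field.
    h, w = x.shape
    kh, kw = k.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def relu(x):
    # Non-linearity layer
    return np.maximum(x, 0.0)

def max_pool(x, s=2):
    # s-by-s max pooling: subsequent data reduction by pooling of features
    h, w = x.shape
    return x[:h - h % s, :w - w % s].reshape(h // s, s, w // s, s).max(axis=(1, 3))

# One conv -> non-linearity -> pool stage on a toy 6x6 input
x = np.arange(36, dtype=float).reshape(6, 6)
k = np.array([[1.0, 0.0], [0.0, -1.0]])  # arbitrary 2x2 kernel for illustration
y = max_pool(relu(conv2d(x, k)))
print(y.shape)  # spatial resolution reduced from 6x6 to 2x2
```

Real architectures repeat this stage many times, with learned multi-channel kernels, before one or more fully connected classification layers.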
Training deep CNNs requires a large amount of labelled data and, despite the availability of computational resources, remains a heavy task. For parameter optimisation, first-order methods such as gradient descent are used, namely stochastic gradient descent (SGD). The conditions for convergence of these optimizers, i.e. the convexity of the objective function, are not guaranteed. This is why different forms of SGD have been proposed.
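The basic SGD update referred to here can be sketched as follows on a toy convex least-squares objective (an illustration of the generic update rule w ← w − η·∇f, with invented data, not the smoothing approach presented in the talk):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy objective: f(w) = mean((X @ w - y)**2), minimised at w_true
X = rng.normal(size=(200, 3))
w_true = np.array([1.5, -2.0, 0.5])
y = X @ w_true

w = np.zeros(3)   # parameters to learn
lr = 0.1          # learning rate (step size)
for step in range(500):
    i = rng.integers(0, len(X), size=16)             # random mini-batch
    grad = 2 * X[i].T @ (X[i] @ w - y[i]) / len(i)   # stochastic gradient estimate
    w -= lr * grad                                   # SGD update: w <- w - lr * grad

print(w)  # close to w_true on this convex problem
```

On the non-convex objectives of deep CNNs, the same update carries no such convergence guarantee, which is precisely the instability the talk addresses.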
Still, the optimisation process remains unstable, and it is therefore difficult to identify the stopping iteration number.
In our talk we will present the main principles of deep CNNs for image classification tasks and develop an approach we propose to smooth the objective during training. We study it experimentally and present results on the well-known MNIST image database.
(with A. Zemmari, LaBRI UMR 5800, University of Bordeaux)