Evaluation of Regularization Techniques for Transformers-Based Models

Hugo S. Oliveira, Pedro Ribeiro and Helder P. Oliveira

2023

Abstract

In recent years, the great success of transformer-based models, initially employed in Natural Language Processing (NLP) tasks, has led to the development of several transformer variants for a wide range of domains, such as vision. With a sufficient amount of training data and proper training, transformers can perform excellently compared to their Convolutional Neural Network (CNN) counterparts in vision tasks. However, the main drawbacks of transformers are their known memory requirements, which grow quadratically with the input image size and often exceed the capacity of the available training platform, and a strong tendency to overfit. Several works address the memory problem by relaxing the model architecture, but mostly at the cost of reduced prediction capability. In this work, we evaluate Random Patch erasing at the image patch level of the transformer model as a regularization technique to reduce overfitting while at the same time reducing training time. The evaluated regularization technique achieves competitive results on several medical image classification datasets. The evaluated Vision Transformer (ViT) models can be trained on a single GPU, reaching results similar to their CNN counterparts, obtaining accuracies of 91.2% and 79.2% on two competitive image datasets, and reducing the training time of the transformer models by 22% on average.
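
As an illustration of the general idea of erasing patches at the patch level, the following is a minimal sketch and not the authors' implementation. It assumes that "erasing" means dropping a random subset of patch tokens before they enter the transformer; the function name `random_patch_erase`, the `erase_ratio` parameter and its default value are hypothetical and do not come from the paper.

```python
import random


def random_patch_erase(patches, erase_ratio=0.25, rng=None):
    """Drop a random subset of patch tokens, keeping the original order.

    patches: list with one entry per image patch (e.g. patch embeddings).
    erase_ratio: fraction of patches to remove (hypothetical default).
    rng: optional random.Random instance for reproducibility.
    Returns the kept patches and their original indices.
    """
    if not 0.0 <= erase_ratio < 1.0:
        raise ValueError("erase_ratio must be in [0, 1)")
    if not patches:
        return [], []
    rng = rng or random.Random()
    # Always keep at least one patch so the sequence is never empty.
    n_keep = max(1, len(patches) - int(len(patches) * erase_ratio))
    # Sample indices without replacement, then restore spatial order.
    keep_idx = sorted(rng.sample(range(len(patches)), n_keep))
    return [patches[i] for i in keep_idx], keep_idx


# A 224x224 image split into 16x16 patches gives 14 * 14 = 196 tokens.
tokens = list(range(196))
kept, kept_idx = random_patch_erase(tokens, erase_ratio=0.25, rng=random.Random(0))
print(len(kept))
```

Under these assumptions, 147 of the 196 tokens remain, so the self-attention score matrix shrinks to (147/196)^2 = 0.5625 of its original size. This figure only illustrates the quadratic scaling mentioned in the abstract; it is not a reproduction of the reported 22% reduction in training time.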

Keywords

Transformers; Regularization; Vision Transformers

Digital Object Identifier (DOI)

doi 10.1007/978-3-031-36616-1_25

Journal/Conference/Book

11th Iberian Conference on Pattern Recognition and Image Analysis

Reference (text)

Hugo S. Oliveira, Pedro Ribeiro and Helder P. Oliveira. Evaluation of Regularization Techniques for Transformers-Based Models. Proceedings of the 11th Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA), pp. 312-319, Springer, Alicante, Spain, June, 2023.

Bibtex

@inproceedings{ribeiro-IbPRIA23,
  author = {Hugo S. Oliveira and Pedro Ribeiro and Helder P. Oliveira},
  title = {Evaluation of Regularization Techniques for Transformers-Based Models},
  doi = {10.1007/978-3-031-36616-1_25},
  booktitle = {11th Iberian Conference on Pattern Recognition and Image Analysis},
  pages = {312--319},
  publisher = {Springer},
  month = {June},
  year = {2023}
}