
Robustness test-time augmentation via learnable aggregation and anomaly detection

What is it about?

Test-time augmentation (TTA) is a widely adopted technique in computer vision that improves a model's predictions by aggregating its outputs on multiple augmented copies of a test sample, without any additional training or hyperparameter tuning. While prior research has demonstrated the effectiveness of TTA on visual tasks, applying it to natural language processing (NLP) remains challenging: texts vary in length, tokens are discrete, and augmentation can drop tokens entirely. These factors make it difficult for standard TTA to preserve the label invariance of augmented text samples. This paper therefore proposes a novel TTA technique called Defy, which combines a nearest-neighbor anomaly detection algorithm with an adaptive weighting network and a bidirectional KL-divergence regularization term between the predictions on the original sample and the aggregated augmented samples, encouraging the model to make consistent and reliable predictions across the various augmentations.
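To make the moving parts concrete, below is a minimal sketch in PyTorch of the kind of pipeline described above. The encoder, augmenter, threshold rule, and all names here are illustrative assumptions rather than the paper's actual implementation; the sketch only shows the three ingredients named in the summary: nearest-neighbor anomaly scoring of augmented samples, learnable aggregation weights, and a bidirectional KL term tying the aggregated prediction to the original one.

import torch
import torch.nn.functional as F


def knn_anomaly_scores(emb_aug, emb_ref, k=5):
    # Mean distance from each augmented embedding to its k nearest
    # reference embeddings; a large score suggests the augmentation
    # drifted away from data the model knows.
    dists = torch.cdist(emb_aug, emb_ref)               # (n_aug, n_ref)
    knn_dists, _ = dists.topk(k, dim=1, largest=False)  # k smallest distances
    return knn_dists.mean(dim=1)                        # (n_aug,)


def aggregate_with_bidirectional_kl(logits_orig, logits_aug, weight_logits, keep_mask):
    # Hard-mask anomalous augmentations, aggregate the remaining
    # predictions with learnable weights, and compute a bidirectional KL
    # term between the original and aggregated predictions.
    weight_logits = weight_logits.masked_fill(~keep_mask, float("-inf"))
    weights = F.softmax(weight_logits, dim=0)                    # (n_aug,)

    probs_aug = F.softmax(logits_aug, dim=-1)                    # (n_aug, C)
    probs_agg = (weights.unsqueeze(-1) * probs_aug).sum(dim=0)   # (C,)
    probs_orig = F.softmax(logits_orig, dim=-1)                  # (C,)

    # Bidirectional KL: KL(orig || agg) + KL(agg || orig).
    kl = (probs_orig * (probs_orig / probs_agg).log()).sum() \
       + (probs_agg * (probs_agg / probs_orig).log()).sum()
    return probs_agg, kl


# Toy usage: random tensors stand in for a real text encoder, augmenter,
# and weighting network.
torch.manual_seed(0)
n_aug, n_classes, dim = 8, 4, 32
emb_aug = torch.randn(n_aug, dim)
emb_ref = torch.randn(100, dim)                 # e.g. embeddings of training samples

scores = knn_anomaly_scores(emb_aug, emb_ref, k=5)
keep = scores <= scores.mean() + scores.std()   # simple cutoff, an assumption

probs_agg, kl = aggregate_with_bidirectional_kl(
    torch.randn(n_classes),                     # original-sample logits
    torch.randn(n_aug, n_classes),              # augmented-sample logits
    torch.randn(n_aug),                         # weighting-network outputs
    keep,
)
print(probs_agg, kl)

In the actual method the aggregation weights come from a trained weighting network and the KL term acts as a regularizer; random tensors stand in for both here so the sketch stays self-contained.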

Why is it important?

Beyond the performance gains, the paper uses Defy as a point of comparison to expose a failure mode of common TTA methods: augmentation can impair the semantic meaning of a text, shifting the model's prediction from correct to incorrect. Extensive experiments show that Defy consistently outperforms existing TTA methods across a variety of text classification tasks and brings consistent improvements to different mainstream models.
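As a rough illustration of that failure mode, the snippet below computes a "correct-to-corrupt" rate: the fraction of test samples the plain model classifies correctly but the TTA-aggregated model gets wrong. This is a hypothetical evaluation sketch; the function name and the random stand-in predictions are assumptions, not the paper's protocol.

import torch


def correct_to_corrupt_rate(preds_orig, preds_tta, labels):
    # Fraction of samples the plain model got right but TTA got wrong.
    was_correct = preds_orig == labels
    now_wrong = preds_tta != labels
    return (was_correct & now_wrong).float().mean().item()


# Toy usage with random predictions over 3 classes.
torch.manual_seed(0)
labels = torch.randint(0, 3, (1000,))
preds_orig = torch.randint(0, 3, (1000,))
preds_tta = torch.randint(0, 3, (1000,))
print(f"correct-to-corrupt rate: {correct_to_corrupt_rate(preds_orig, preds_tta, labels):.3f}")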

The following have contributed to this page:
Yu Xiang
' ,"url"));