Machine learning for anomaly detection. Performance study considering anomaly distribution in an imbalanced dataset

Published in IEEE, 2021

Recommended citation: El Hajjami, S., Malki, J., Berrada, M., & Fourka, B. (2020, November). Machine learning for anomaly detection. Performance study considering anomaly distribution in an imbalanced dataset. In 2020 5th International Conference on Cloud Computing and Artificial Intelligence: Technologies and Applications (CloudTech) (pp. 1-8). IEEE. http://doi.org/10.1109/CloudTech49835.2020.9365887

Abstract: The continuous dematerialization of real-world data contributes greatly to the growing volume of data exchanged. In this context, anomaly detection is becoming an increasingly important data analysis task: it identifies abnormal data, which is of particular interest and may require action. Recent advances in artificial intelligence, particularly machine learning, have led to important breakthroughs in this area. Typically, these techniques have been designed for balanced data sets or rely on certain assumptions about the distribution of the data. Real applications, however, usually face an imbalanced data distribution, where normal data are abundant and abnormal cases are very few. This makes anomaly detection similar to looking for a needle in a haystack. In this article, we develop an experimental setup for the comparative analysis of two types of machine learning techniques applied to anomaly detection systems. We study their performance taking into account the anomaly distribution in an imbalanced dataset.

Keywords: Anomaly Detection; Data Analysis; Artificial Intelligence; Machine Learning; Imbalanced Data.
