Machine learning for anomaly detection. Performance study considering anomaly distribution in an imbalanced dataset

Abstract: The continuous dematerialization of real-world data contributes greatly to the rapid growth in the volume of exchanged data. In this context, anomaly detection is becoming an increasingly important data analysis task: it identifies abnormal data, which is of particular interest and may require action. Recent advances in artificial intelligence approaches, such as machine learning, are producing important breakthroughs in this area. Typically, these techniques have been designed for balanced datasets or make certain assumptions about the distribution of the data. Real applications, however, usually face an imbalanced data distribution, in which normal data are present in large quantities and abnormal cases are generally very few. This makes anomaly detection similar to looking for a needle in a haystack. In this article, we develop an experimental setup for the comparative analysis of two types of machine learning techniques applied to anomaly detection systems. We study their performance taking into account the anomaly distribution in an imbalanced dataset.

Date of Conference: 24-26 Nov. 2020
Date Added to IEEE Xplore: 02 March 2021

ISBN Information:
Electronic ISBN: 978-1-7281-6175-4
Print on Demand (PoD) ISBN: 978-1-7281-6176-1

INSPEC Accession Number: 20509523
DOI: 10.1109/CloudTech49835.2020.9365887
Publisher: IEEE
Conference Location: Marrakesh, Morocco