Isolation Forests: The good, the bad and the ugly

Anomaly detection underlies numerous real-life machine learning use cases, such as predictive maintenance, network intrusion detection, system health monitoring, fraud detection, and novelty detection. Because many of these application areas are both highly relevant and sensitive, robustness and reliability are primary concerns when designing anomaly detection systems. A good understanding of the mathematical principles behind the algorithms used in practice is therefore highly desirable.

In this talk we investigate a simple algorithm called isolation forest, which has gained wide popularity over the last decade. Despite its success, the reasons for the good performance of isolation forest are currently only partially understood. We review some of the recent literature that shows strengths and weaknesses of the algorithm, and conclude with a few observations that might lead to further research directions.
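
The abstract does not spell out the algorithm, so the following is only a minimal illustrative sketch of the isolation forest idea as it is commonly described: build random trees that split on a random feature at a random threshold, and score points by how few splits are needed to isolate them. All function names, the dictionary-based tree layout, and the toy data are assumptions made for this example and are not taken from the talk.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Normalising term: approximate average path length of an
    unsuccessful binary-search-tree lookup among n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split on a random feature at a random threshold."""
    # Stop when the node holds at most one point or the height limit is hit.
    if len(points) <= 1 or depth >= max_depth:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # no split possible along this feature
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, x, depth=0):
    """Number of splits x passes through, plus a correction for
    leaves that still contain several points."""
    if "size" in tree:
        return depth + c_factor(tree["size"])
    child = tree["left"] if x[tree["dim"]] < tree["split"] else tree["right"]
    return path_length(child, x, depth + 1)


def fit_forest(data, n_trees=100, sample_size=256, seed=0):
    """Build n_trees trees, each on a random subsample of the data.
    Assumes data is a list with at least two points."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(data, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    return trees, sample_size


def anomaly_score(forest, x):
    """Score in (0, 1); values closer to 1 indicate shorter average
    paths, i.e. points that are easier to isolate."""
    trees, sample_size = forest
    mean_path = sum(path_length(t, x) for t in trees) / len(trees)
    return 2.0 ** (-mean_path / c_factor(sample_size))


# Toy data (assumed for illustration): a Gaussian cluster plus one far point.
data_rng = random.Random(1)
data = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
data.append((8.0, 8.0))

forest = fit_forest(data)
print("far point :", anomaly_score(forest, (8.0, 8.0)))
print("centre    :", anomaly_score(forest, (0.0, 0.0)))
```

In this sketch the far point should receive a noticeably higher score than a point in the middle of the cluster, because a random split separates it from the rest after only a few steps. In practice one would more likely use an established implementation such as scikit-learn's `IsolationForest` rather than hand-rolled code like this.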
