Title:
Explaining Anomalies with Sapling Random Forests
Authors:
Pevný, T.; Kopp, Martin
Document type:
Conference paper
Conference/Event:
ITAT 2014. European Conference on Information Technologies - Applications and Theory /14./, Demänovská dolina (SK), 2014-09-25 / 2014-09-29
Year:
2014
Language:
eng
Abstract: The main objective of anomaly detection algorithms is to find samples that deviate from the majority. Although a vast number of algorithms designed for this purpose already exist, almost none of them explain why a particular sample was labelled as an anomaly. To address this issue, we propose an algorithm called Explainer, which expresses the explanation of a sample's deviation in disjunctive normal form (DNF), which is easy for humans to understand. Since Explainer treats anomaly detection algorithms as black boxes, it can be applied in many domains to simplify the investigation of anomalies. The core of Explainer is a set of specifically trained trees, which we call sapling random forests. Since their training is fast and memory efficient, the whole algorithm is lightweight and applicable to large databases, data streams, and real-time problems. The correctness of Explainer is demonstrated on a wide range of synthetic and real-world datasets.
Keywords:
anomaly explanation; decision trees; feature selection; random forest
Project number:
GA13-17187S (CEP), GPP103/12/P514 (CEP)
Project provider:
GA ČR, GA ČR
Source document:
ITAT 2014. Information Technologies - Applications and Theory. Part II, ISBN 978-80-87136-19-5
Institution: Ústav informatiky AV ČR
Document availability information:
The document is available in the repository of the Academy of Sciences.
Original record: http://hdl.handle.net/11104/0236783