|
Java Byte Code Emulator Suitable for Malware Detection and Analysis
Kubernát, Tomáš ; Rogalewicz, Adam (referee) ; Drahanský, Martin (advisor)
The goal of this thesis is to create a virtual machine that emulates a running programs written in Java programing language, which would be suitable for malware analysis and detection. The emulator is able to detect arguments of exploitable methods from Java standard classes, the order of calling these exploitable methods and also execution the test application. Overall functionality was tested on appropriate examples in which held its own measurements. At the end of the paper we describe testing of the emulator, which also contains tables and graphs for better results visualization.
|
|
Java Byte Code Emulator Suitable for Malware Detection and Analysis
Kubernát, Tomáš ; Rogalewicz, Adam (referee) ; Drahanský, Martin (advisor)
The goal of this thesis is to create a virtual machine that emulates a running programs written in Java programing language, which would be suitable for malware analysis and detection. The emulator is able to detect arguments of exploitable methods from Java standard classes, the order of calling these exploitable methods and also execution the test application. Overall functionality was tested on appropriate examples in which held its own measurements. At the end of the paper we describe testing of the emulator, which also contains tables and graphs for better results visualization.
|
| |
|
Feature extraction from Android application packages and its usage in machine learning for malware classification
Smrž, Dominik ; Bálek, Martin (advisor) ; Kofroň, Jan (referee)
In this Thesis, we propose a machine-learning based classification algorithm of applications for a popular mobile phone operating system Android that can dis- tinguish malicious samples from benign ones. Feature extraction for the machine learning is based on static analysis of the application's bytecode with focus on API and method calls. We show various ways to transform the most frequent API and method calls into numeric (histogram-based) features. We further examine the specifics of the extracted features and discuss their importance. The dataset used for experiments in this Thesis contains more than 200,000 samples with approxi- mately half of them malicious and half of them benign. Further, multiple machine learning algorithms are examined and their performance is evaluated. The size of our dataset prevents overfitting and hence provides a reliable basis for training of classification models. The results of the experiments show that the proposed algo- rithm achieves very low false positive rate under 2.9% while preserving specificity over 93.6%. 1
|