RELATED WORK
The key step in employing machine learning for malware
detection is to determine the features that represent
executable files. According to the type of static features
used, machine learning methods can be divided into different
models.
978-1-4577-0536-6/11/$26.00 ©2011 IEEE
Schultz et al. [4] were the first to introduce the idea of
applying machine learning methods for the detection of
different types of malware based on their respective binary
codes. They used three different feature extraction
approaches: byte-sequence n-grams, system resource
information features, and string features. String features
provided the best performance, with 97.11% accuracy, when
compared to the other features. N-grams of byte codes were
used by Kolter et al. [5]. They gathered 3621 files as a
dataset, which yielded more than 255 million unique n-grams.
In their experiment, boosted decision trees achieved a true
positive rate of 98% at a desired false positive rate of 5%.
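As an illustration of the byte-sequence n-gram features discussed above, the following minimal sketch slides a window of n bytes over a file's raw contents and counts each distinct n-gram. The function name `byte_ngrams`, the choice of n = 4, and the sample bytes are assumptions made for this example; they are not taken from [4] or [5].

```python
from collections import Counter


def byte_ngrams(data: bytes, n: int = 4) -> Counter:
    """Count every overlapping n-byte window in `data` (hypothetical helper)."""
    if n <= 0:
        raise ValueError("n must be positive")
    # A buffer shorter than n yields an empty range, hence an empty Counter.
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


# Illustrative input only: a few bytes resembling the start of a PE file.
sample = bytes.fromhex("4d5a90000300000004000000")
counts = byte_ngrams(sample, n=4)

# Each distinct n-gram would become one candidate feature for a classifier.
unique_ngrams = len(counts)
```

Because every overlapping window is a candidate feature, the number of unique n-grams grows quickly with the size of the collection, which is consistent with the very large feature count reported for [5].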
Data mining methods (logistic regression, neural
networks, and decision trees) were used by Siddiqui et al. [6]
to automatically identify critical instruction sequences that
can classify programs as malicious or benign. In their model,
programs were first disassembled, and the instruction
sequences were then extracted from the disassembled files.
Collections of viruses, trojans, and worms were used, and
the highest overall accuracy reached was 98.4%. Moskovitch
et al. [7] provided an extensive evaluation using a test
collection comprising more than 30,000 files. Different
settings of OpCode-sequence n-gram representations and five
types of classifiers yielded an accuracy of up to 99%.
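The opcode-based representation can be sketched in the same spirit: operands are discarded and n-grams are formed over the sequence of instruction mnemonics rather than over raw bytes. The listing format (one instruction per line, mnemonic first), the helper name `opcode_ngrams`, and n = 2 are assumptions for illustration, not the setup used in [6] or [7].

```python
from collections import Counter


def opcode_ngrams(listing: str, n: int = 2) -> Counter:
    """Count n-grams of mnemonics in a disassembly listing (hypothetical helper)."""
    # Keep only the first token (the mnemonic) of each non-empty line.
    opcodes = [line.split()[0].lower()
               for line in listing.splitlines() if line.strip()]
    return Counter(tuple(opcodes[i:i + n])
                   for i in range(len(opcodes) - n + 1))


# Made-up x86 fragment used purely as sample input.
listing = """
push ebp
mov ebp, esp
push ebx
mov eax, [ebp+8]
pop ebx
"""
opcode_counts = opcode_ngrams(listing, n=2)
```

Dropping the operands makes the features insensitive to register and address choices, at the cost of losing the information they carry.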
Unlike earlier studies, our work employs graph features
that capture both the code sequences and the control flow
of a program. We extract graph features to achieve accurate
virus detection because they were found to be associated
with the underlying semantic and syntactic information of
the file structure. In addition, the stability of the model
is analyzed by testing on datasets of different types of
malware.
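To make the notion of graph features concrete, the sketch below computes a few simple structural statistics from a control flow graph stored as an adjacency mapping from each basic block to its successors. This section does not specify which graph features are used, so the statistics chosen here (node count, edge count, branching nodes, maximum out-degree), the helper name `cfg_features`, and the toy graph are all assumptions for illustration only.

```python
from typing import Dict, List


def cfg_features(cfg: Dict[str, List[str]]) -> Dict[str, int]:
    """Summarize a control flow graph given as block -> successor blocks."""
    # Collect blocks that appear as keys or only as successors.
    nodes = set(cfg)
    for successors in cfg.values():
        nodes.update(successors)
    out_degrees = [len(successors) for successors in cfg.values()]
    return {
        "nodes": len(nodes),
        "edges": sum(out_degrees),
        # Blocks with more than one successor end in a conditional branch.
        "branch_nodes": sum(1 for d in out_degrees if d > 1),
        "max_out_degree": max(out_degrees, default=0),
    }


# Toy graph of a single loop: entry -> test -> (body -> test | exit).
toy_cfg = {
    "entry": ["test"],
    "test": ["body", "exit"],
    "body": ["test"],
    "exit": [],
}
features = cfg_features(toy_cfg)
```

Statistics of this kind depend on how blocks are connected rather than on the exact bytes of the instructions, which is the property that distinguishes graph features from the n-gram features surveyed above.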