A Virus Detection Scheme Based on Features of Control Flow Graph
Zongqu Zhao
School of Computer Science and Technology
Henan Polytechnic University
Jiaozuo, China
zhaozong_qu@
Abstract—For well-known reasons, signature-based virus
detection schemes perform poorly when they deal with
previously unknown viruses. Recently, machine learning
methods have been introduced to build new approaches to
virus detection. They adopt classification algorithms to
learn patterns in binary code files in order to classify
unknown files. In
this paper, we present a method based on graph features
that can be used in the machine learning process, and we
design a virus detection model based on it. The features
are extracted from the Control Flow Graph (CFG) of an
executable. Our detection model follows a three-step
methodology: (1) create the CFG of each executable,
(2) extract features from the CFG and create the training
data, (3) generate classifiers with specific machine
learning algorithms, and
detect viruses with these classifiers. Because the number
of features is fixed, our model avoids the excessive number
of features that other feature extraction methods can
produce and needs no feature filtering step, which makes it
efficient and scalable. In our experiments, we achieved a
detection rate as high as 95.9% and a false positive rate
as low as 5.9% on novel malware.
Keywords- virus detection, data mining, control flow graph
I. INTRODUCTION
Traditional signature-based virus detection methods usually
acquire signatures, such as byte sequences and specific
strings of a virus, by static analysis, and organize them
into a signature set. One key problem is how to generate
signatures that lead to low matching cost and few false
alarms. Another is how to organize the signature set
effectively, so as to produce a highly efficient scanning
engine.
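The matching step described above can be illustrated with a minimal sketch. It is not taken from any real scanning engine: the signature names, the byte patterns and the `scan` helper are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical signature set: name -> byte sequence (invented for this sketch).
SIGNATURES: Dict[str, bytes] = {
    "Demo.A": bytes.fromhex("deadbeef90"),
    "Demo.B": b"EVIL_MARKER",
}


def scan(data: bytes, signatures: Dict[str, bytes]) -> List[Tuple[str, int]]:
    """Return (signature name, offset) for each signature found in data."""
    hits = []
    for name, pattern in signatures.items():
        offset = data.find(pattern)  # first occurrence, or -1 if absent
        if offset != -1:
            hits.append((name, offset))
    # Report hits in the order they appear in the file.
    return sorted(hits, key=lambda hit: hit[1])


clean = b"\x00\x01\x02 ordinary program bytes"
infected = b"\x4d\x5a" + bytes.fromhex("deadbeef90") + b"payload"

print(scan(clean, SIGNATURES))     # []
print(scan(infected, SIGNATURES))  # [('Demo.A', 2)]
```

This sketch scans the data once per signature, so its cost grows with the size of the signature set. Scanning engines commonly use multi-pattern matching (for example the Aho-Corasick algorithm) to keep that cost low. Note also that exact matching is brittle: a single extra byte placed inside the matched sequence is enough to make `scan` miss it.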
In order to evade signature-based detection, a virus
designer can employ obfuscation techniques such as garbage
code insertion, equivalent code transformation, equivalent
register transformation and equivalent instruction
transformation. These techniques can cause signature-based
detection to fail to a certain degree. Furthermore, it is a