Abstract—A software birthmark is a unique characteristic of a
program that can be extracted without its source code.
By comparing the birthmarks of an original program and a modified
program, their code similarity can be measured. Birthmarks
can also be used to measure the similarity between existing
programs in order to detect code theft or malware. Software birthmark
methods are mainly divided into static and dynamic methods.
In related work using dynamic methods, birthmarks have been
extracted from API function names, call frequencies, grammar
structures, opcodes, etc. Extracting a birthmark from API
function names or call frequencies can increase resilience, but
it may cause false positives in the similarity result. Conversely,
extraction based on grammar structure or opcodes can increase
similarity, but it decreases resilience, producing different
extraction results even for programs with the same structure. This
paper proposes a dynamic software birthmark method that
satisfies resilience and uniqueness simultaneously by
categorizing instructions according to their function and
removing consecutive duplicates, which reflects a program's
unique characteristics while preserving the meaning of its
instructions. The method is verified through experiments.
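As a rough illustration of the two steps named in the abstract (categorization by instruction function, then removal of consecutive duplicates), a minimal sketch follows. The category table, the category names, and the function names `categorize`, `collapse_runs`, and `birthmark` are assumptions made for this example only; they are not the categories or implementation used by the authors.

```python
from itertools import groupby

# Hypothetical mapping from x86 mnemonics to functional categories.
# The actual categories used in the paper are not given in the abstract.
CATEGORY = {
    "mov": "DATA", "push": "DATA", "pop": "DATA", "lea": "DATA",
    "add": "ARITH", "sub": "ARITH", "inc": "ARITH", "dec": "ARITH",
    "and": "LOGIC", "or": "LOGIC", "xor": "LOGIC", "not": "LOGIC",
    "cmp": "COMPARE", "test": "COMPARE",
    "jmp": "BRANCH", "je": "BRANCH", "jne": "BRANCH",
    "call": "BRANCH", "ret": "BRANCH",
}


def categorize(trace):
    """Map each executed mnemonic to its functional category."""
    return [CATEGORY.get(mnemonic.lower(), "OTHER") for mnemonic in trace]


def collapse_runs(categories):
    """Keep one element from each run of identical consecutive categories."""
    return [key for key, _ in groupby(categories)]


def birthmark(trace):
    """Categorize an instruction trace, then drop consecutive duplicates."""
    return collapse_runs(categorize(trace))


# Two made-up traces that use different instructions for the same structure.
trace_a = ["mov", "mov", "add", "cmp", "jne"]
trace_b = ["push", "lea", "sub", "inc", "test", "je"]

print(birthmark(trace_a))
print(birthmark(trace_b))
```

Under these assumed categories, both traces reduce to the same category sequence, which is the kind of resilience to instruction substitution that the abstract aims for. How two such sequences are then compared is not specified here.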
Index Terms—Dynamic software birthmark, code theft
detection, information security, dynamic analysis.
The authors are with the School of Information and Communication
Engineering, Sungkyunkwan University, Korea (e-mail:
dhlee@security.re.kr, yschoi@security.re.kr, jwjung@security.re.kr,
jykim@security.re.kr, dhwon@security.re.kr).
Cite: Donghoon Lee, Younsung Choi, Jaewook Jung, Jiye Kim, and Dongho Won, "An Efficient Categorization of the Instructions Based on Binary Excutables for Dynamic Software Birthmark," International Journal of Information and Education Technology, vol. 5, no. 8, pp. 571-576, 2015.