Finding third-party components with binary analysis
University of Oulu, Faculty of Information Technology and Electrical Engineering, Department of Computer Science and Engineering, Computer Science
|Persistent link:||http://urn.fi/URN:NBN:fi:oulu-201504161389|
|Publish Date:||2015-04-20|
|Thesis type:||Master's thesis (tech)|
The increasing use of open-source software (OSS) libraries as building blocks in the software industry has introduced numerous problems and vulnerabilities into many popular software suites. The more libraries a product uses, the greater the risk that it is exposed to the vulnerabilities inherent in these third-party components. Vulnerability management is the process of mitigating the impact of these vulnerabilities in software development. License management is another concern with OSS: violating OSS licenses can lead to legal issues and may harm a business. A few commercial tools exist for managing vulnerabilities and licenses. The implementation introduced in this thesis was developed to improve one of them, Codenomicon AppCheck.
This thesis introduces a method for detecting software libraries in binary code. Each library is assigned a unique set of signatures, where a signature is a sequence of bytes extracted from the read-only data section of the library. Two new methods for signature extraction are demonstrated. The signatures are detected in the input binary data using the Aho-Corasick string matching algorithm, which allows detection in a single pass over the input data. The signatures found are then evaluated to decide which libraries the input data contains.
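As an illustration of the matching step, the following is a minimal sketch of Aho-Corasick matching over byte signatures. It is not the thesis implementation: the library names (`libfoo`, `libbar`), the signature bytes, the function names, and the `min_ratio` decision rule are all assumptions made for this example, since the abstract does not specify how the found signatures are evaluated.

```python
from collections import deque


def build_automaton(signatures):
    """Build an Aho-Corasick automaton from {library: [byte signatures]}."""
    goto = [{}]   # goto[state][byte] -> next state
    fail = [0]    # failure link of each state
    out = [[]]    # (library, signature) pairs recognised at each state

    # Phase 1: insert every non-empty signature into a trie.
    for library, sigs in signatures.items():
        for sig in sigs:
            if not sig:
                continue
            state = 0
            for byte in sig:
                if byte not in goto[state]:
                    goto.append({})
                    fail.append(0)
                    out.append([])
                    goto[state][byte] = len(goto) - 1
                state = goto[state][byte]
            out[state].append((library, sig))

    # Phase 2: breadth-first traversal to compute failure links and
    # to inherit the outputs of the state each failure link points to.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for byte, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and byte not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(byte, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def scan(data, automaton):
    """Single pass over the input bytes; returns {library: set of signatures found}."""
    goto, fail, out = automaton
    found = {}
    state = 0
    for byte in data:
        while state and byte not in goto[state]:
            state = fail[state]
        state = goto[state].get(byte, 0)
        for library, sig in out[state]:
            found.setdefault(library, set()).add(sig)
    return found


def detect_libraries(data, signatures, min_ratio=0.5):
    """Hypothetical decision rule: report a library when at least
    min_ratio of its distinct signatures occur in the input."""
    found = scan(data, build_automaton(signatures))
    detected = []
    for library, sigs in signatures.items():
        unique = {s for s in sigs if s}
        if unique and len(found.get(library, ())) / len(unique) >= min_ratio:
            detected.append(library)
    return sorted(detected)


# Made-up signatures and input, for illustration only.
SIGNATURES = {
    "libfoo": [b"foo: bad header", b"foo v1.2"],
    "libbar": [b"bar init failed", b"bar: out of memory"],
}
BINARY = b"\x00\x01foo: bad header\x00junk foo v1.2\x00bar init"
DETECTED = detect_libraries(BINARY, SIGNATURES)
```

Because the failure links let the automaton fall back to the longest matching suffix instead of rescanning, every input byte is read exactly once regardless of how many signatures are loaded.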
The implementation was tested using 14 OSS libraries and 8 OSS applications. Each library was included in at least one of the applications. For each application, the libraries expected to be found were determined by examining the application's source code. The libraries found were compared with the expected ones, and accuracy was measured using the F-measure. The results showed that the new signature extraction methods were valid and that the implementation could detect OSS libraries in binary data. The new signature extraction methods also extended the coverage of Codenomicon AppCheck.
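The comparison of found and expected libraries can be sketched as follows. This assumes the balanced F-measure (F1, with `beta = 1`), since the abstract does not state which weighting was used; the function name and the library names in the example are hypothetical.

```python
def f_measure(expected, found, beta=1.0):
    """F-measure of a set of found libraries against the expected set.

    Precision is the share of found libraries that were expected;
    recall is the share of expected libraries that were found.
    """
    expected, found = set(expected), set(found)
    true_positives = len(expected & found)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(found)
    recall = true_positives / len(expected)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical result for one application: two of the three expected
# libraries were found, plus one library that was not expected.
SCORE = f_measure({"liba", "libb", "libc"}, {"liba", "libb", "libd"})
```

A score of 1.0 means the found set matches the expected set exactly, and both missed libraries and spurious detections lower it.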
© Antti Väyrynen, 2014. This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.