ApkDiffer: Accurate and Scalable Cross-Version Diffing Analysis for Android Applications
Software diffing (a.k.a. code alignment) is a fundamental technique for distinguishing similar from dissimilar code between two software products. It enables various kinds of critical security analysis, e.g., n-day bug localization and software plagiarism detection. To date, many diffing tools have been proposed for aligning binaries. However, few research efforts have addressed cross-version Android app diffing, which largely hinders the security assessment of apps in the wild. Existing diffing works usually adopt scalability-oriented alignment algorithms and therefore suffer from significant alignment errors when handling the large codebases of modern apps.
To fill this gap, we propose ApkDiffer, a method-level (i.e., function-level) diffing tool dedicated to aligning versions of the same closed-source Android app. ApkDiffer balances scalability and effectiveness through a two-stage, decomposition-based alignment solution. It first decomposes the codebase of each app version into multiple functionality units; it then precisely aligns the methods that serve equivalent app functionalities across versions. Our evaluation shows that ApkDiffer noticeably outperforms existing alignment algorithms in precision and recall while keeping its time cost acceptable. In addition, we used ApkDiffer to track the one-year evolution of 100 popular Google Play apps. By pinpointing the code locations where app versions deviate in privacy collection, we show that app updates may pose ever-evolving privacy threats to end-users.
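To make the two-stage idea concrete, the following is a minimal Python sketch of decomposition-based alignment. It is not ApkDiffer's actual algorithm, which the abstract does not specify. The sketch assumes that each method comes with a functionality-unit label and a set of features (here, the framework APIs it calls), and it swaps in greedy Jaccard matching for both stages. All names (`decompose`, `greedy_match`, `diff`), the thresholds, and the sample data are hypothetical.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 1.0 when both are empty."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def decompose(methods):
    """Stage 1 (stand-in): group methods by a unit label supplied in the input.

    methods maps method_id -> (unit_label, frozenset_of_features).
    How units are really derived is an assumption left outside this sketch.
    """
    units = defaultdict(dict)
    for method_id, (unit_label, features) in methods.items():
        units[unit_label][method_id] = features
    return dict(units)


def unit_features(unit):
    """Summarize a unit as the union of its methods' features."""
    merged = set()
    for features in unit.values():
        merged |= features
    return merged


def greedy_match(left, right, threshold):
    """One-to-one matching: take candidate pairs in descending similarity."""
    scored = sorted(
        ((jaccard(lf, rf), lid, rid)
         for lid, lf in left.items()
         for rid, rf in right.items()),
        key=lambda t: (-t[0], t[1], t[2]),
    )
    used_left, used_right, pairs = set(), set(), {}
    for score, lid, rid in scored:
        if score < threshold:
            break  # every remaining candidate scores lower
        if lid in used_left or rid in used_right:
            continue
        pairs[lid] = rid
        used_left.add(lid)
        used_right.add(rid)
    return pairs


def diff(old_methods, new_methods, unit_threshold=0.3, method_threshold=0.5):
    """Stage 2: match units first, then align methods only inside matched units."""
    old_units = decompose(old_methods)
    new_units = decompose(new_methods)
    unit_pairs = greedy_match(
        {label: unit_features(u) for label, u in old_units.items()},
        {label: unit_features(u) for label, u in new_units.items()},
        unit_threshold,
    )
    aligned = {}
    for old_label, new_label in unit_pairs.items():
        aligned.update(
            greedy_match(old_units[old_label], new_units[new_label], method_threshold)
        )
    return aligned


# Hypothetical sample data: two versions with renamed classes and methods.
OLD = {
    "a.A.login": ("net", frozenset({"HttpURLConnection.connect", "String.getBytes"})),
    "a.A.track": ("loc", frozenset({"LocationManager.getLastKnownLocation", "Log.d"})),
}
NEW = {
    "b.X.f": ("u1", frozenset({"HttpURLConnection.connect", "String.getBytes", "Log.d"})),
    "b.Y.g": ("u2", frozenset({"LocationManager.getLastKnownLocation",
                               "TelephonyManager.getDeviceId"})),
}
```

With the default thresholds, `diff(OLD, NEW)` aligns only `a.A.login` with `b.X.f`. The location-related method falls below the method threshold and is left unaligned, which is the kind of deviation a privacy-focused diff would flag for inspection. Restricting method comparison to matched units is what keeps the pairwise cost manageable on large codebases in this sketch.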