The static analysis of Android application is the heart of the εxodus platform. The main idea is to detect the presence of known trackers in an application without running it. In fact, εxodus cannot take too much time for each analysis so the detection method has to be efficient in time, reproducible and free of false-positives.
Android applications are developed in JVM compatible languages like Java, Kotlin, etc. Programs running on JVM have numerous differences compared to C or Go programs, the major one is class names are readable directly in the binary file of the program without requiring decompilation. JVM identifies a type by its fully qualified class name like org.exodus.Report
, org.exodus
is the package name and Report
is the class name.
Google provides dexdump
which is a program meant to parse .dex
files. In the Java world, compiled code is contained in .class
files (one per class), in the Android world, compiled code is stored in .dex
files (one or more per program). An Android application is contained in an .apk
file which is a signed ZIP archive. The .apk
file contains the application program (.dex
files) and application assets like fonts, pictures, etc.
When we run dexdump
on an .apk
file, we have such output:
Class #0 -
Class descriptor : 'Landroid/databinding/Observable;'
Access flags : 0x0601 (PUBLIC INTERFACE ABSTRACT)
Superclass : 'Ljava/lang/Object;'
Interfaces -
Static fields -
Instance fields -
Direct methods -
Virtual methods -
#0 : (in Landroid/databinding/Observable;)
name : 'addOnPropertyChangedCallback'
type : '(Landroid/databinding/Observable$OnPropertyChangedCallback;)V'
access : 0x0401 (PUBLIC ABSTRACT)
code : (none)
#1 : (in Landroid/databinding/Observable;)
name : 'removeOnPropertyChangedCallback'
type : '(Landroid/databinding/Observable$OnPropertyChangedCallback;)V'
access : 0x0401 (PUBLIC ABSTRACT)
code : (none)
source_file_idx : 25639 (Observable.java)
Class #1 -
Class descriptor : 'Landroid/databinding/BaseObservable;'
[...]
This output is not really human-friendly but for each class declared in the application we have a Class #x
and a Class descriptor
. The class descriptor is the JVM representation of application name-spaces (fully qualified class name). In this example, Class #0
is android.databinding.Observable
with android.databinding
the package name and Observable
the name of the declared class.
We can cleanup the dexdump
output by selecting Class descriptor
and sorting them:
$ dexdump my.apk | grep "Class descriptor" | sort | uniq
Class descriptor : 'Landroid/databinding/adapters/AbsListViewBindingAdapter;'
Class descriptor : 'Landroid/databinding/adapters/AbsListViewBindingAdapter$1;'
Class descriptor : 'Landroid/databinding/adapters/AbsListViewBindingAdapter$OnScroll;'
Class descriptor : 'Landroid/databinding/adapters/AbsListViewBindingAdapter$OnScrollStateChanged;'
Class descriptor : 'Landroid/databinding/adapters/AbsSeekBarBindingAdapter;'
Class descriptor : 'Landroid/databinding/adapters/AbsSpinnerBindingAdapter;'
Class descriptor : 'Landroid/databinding/adapters/ActionMenuViewBindingAdapter;'
Class descriptor : 'Landroid/databinding/adapters/AdapterViewBindingAdapter;'
Class descriptor : 'Landroid/databinding/adapters/AdapterViewBindingAdapter$OnItemSelected;'
Class descriptor : 'Landroid/databinding/adapters/AdapterViewBindingAdapter$OnItemSelectedComponentListener;'
Class descriptor : 'Landroid/databinding/adapters/AdapterViewBindingAdapter$OnNothingSelected;'
Class descriptor : 'Landroid/databinding/adapters/AutoCompleteTextViewBindingAdapter;'
Class descriptor : 'Landroid/databinding/adapters/AutoCompleteTextViewBindingAdapter$1;'
Class descriptor : 'Landroid/databinding/adapters/AutoCompleteTextViewBindingAdapter$FixText;'
In Java, C++, Python and other languages, name-spaces are really important and when you provide a library, you have to ensure that the library name-space never collides with other libraries.
εxodus digs into .apk
files to find trackers and these trackers can be identified by their own name-space. As an example, Amazon Ads library root name-space/package is com.amazon.device.ads
, this library shares the com.amazon
package name with other Amazon libraries. So, at this point, a tracker signature is a name-space (a package name).
By now, εxodus knows 152 trackers and if we have a look to Flurry we have detection rule com.flurry.
which is a regex.
So, lets use grep
to see if my.apk
contains Flurry’s classes
$ dexdump my.apk | grep "Class descriptor" | sort | uniq | grep -E "com.flurry." | head -n 10
Class descriptor : 'Lcom/flurry/android/Constants;'
Class descriptor : 'Lcom/flurry/android/FlurryAgent;'
Class descriptor : 'Lcom/flurry/android/FlurryAgent$1;'
Class descriptor : 'Lcom/flurry/android/FlurryAgent$1$1;'
Class descriptor : 'Lcom/flurry/android/FlurryAgent$2;'
Class descriptor : 'Lcom/flurry/android/FlurryAgent$3;'
Class descriptor : 'Lcom/flurry/android/FlurryAgent$Builder;'
Class descriptor : 'Lcom/flurry/android/FlurryAgentListener;'
Class descriptor : 'Lcom/flurry/android/FlurryEventRecordStatus;'
Class descriptor : 'Lcom/flurry/android/FlurryGamingAgent;'
as we can see, my.apk
contains Flurry’s classes.
Thus, when you submit an application for analysis, εxodus downloads it from Google Play, extracts the .apk
file, call dexdump
on the .dex
files, stores the filtered output into a file which will be used during reports update in order to speed up the analysis, takes the detection rule of each known tracker and see if class names are matching. If a detection rule matches one or more class names, we consider the tracker embedded, otherwise, we consider it as missing.
If you want to help us to improve the εxodus knowledge base, contact us and ask for an ETIP account. ETIP is a collaborative platform meant to ease investigation and classification on trackers.
If you are an application developer, you can use exodus-standalone to analyse your application before its publication on Google Play.