I just finished pentesting a mobile app for a financial institution, and I wrote this mainly as a note for future manual deobfuscation work. I have read a lot of articles and tested many tools for deobfuscating Android apps, but they are mostly aimed at malware analysis. Sometimes I need to deobfuscate and test an app for pentesting purposes instead.
Most of the time it doesn’t matter whether we are analyzing malware or analyzing some apps, but there are differences. For example, when testing a bank or financial app (with a team):
- We can be sure that the app is not malicious, so we can safely use a real device
- The obfuscation usually stops at the DEX level and does not patch native code (the Dalvik VM), because the vendor wants to ensure portability
- We need to be able to run and test the app, not just extract strings to guess its capabilities (in some malware analysis, extracting strings is all you need)
- Sometimes we need to modify and repack the app to bypass root checking, SSL pinning, etc., and redistribute the APK to team members (you don’t usually repack a malware APK for testing)
You may ask: if this is for pentesting, why don’t you just ask for the debug version of the app? In many cases we can get it, and it makes our job really easy. In some cases, though, due to a contract between the bank and the app vendor (or some other legal or technical reason), they can only give us a Play Store or iTunes URL.
I can’t tell you about the app that I tested, but I can describe the protection used.
Try automated tools
Before doing anything manually, it is worth trying the deobfuscator tools and websites that can handle many obfuscation cases. One of them is APK Deguard. It only works with APK files up to 16 MB, so if you have a lot of asset files, just delete the assets to get within the limit. This tool can recognize libraries, so you will sometimes get perfectly reconstructed method and class names. Unfortunately, there are also bugs: some variables and methods just disappear from a class, and sometimes it generates classes that are only 4 bytes in size (just the word: null).
I tried several other tools that looked promising, such as simplify (really promising, but really slow when I tested it) and Dex-Oracle (it didn’t work for this app). JADX also has a simple renamer for obfuscated names, but it was not enough for this case.
Every time I find a tool that doesn’t work, I usually spend some time seeing whether I can make it work. In the end, the manual way is sometimes the best.
Use Xposed Framework
In some cases the Xposed framework is a nice option: I can log any method call or replace existing methods. One thing that I don’t quite like is that we need to reboot (or soft reboot) every time we update a module.
There are also modules such as JustTrustMe that work with many apps to bypass the SSL pinning check, but they don’t work with all apps. For example, the last time I checked, it didn’t work for Instagram (but of course, someone could have patched it by now to make it work again). RootCloak also works to hide root from most apps, but this module hasn’t been updated for quite some time.
Sadly, neither tool worked for the app that I tested: the app was still able to detect that the device was rooted, and SSL pinning was still not bypassed.
Use Frida
Frida is also an interesting tool that works most of the time. Some interesting scripts have already been written for Frida, for example: AppMon.
Both Frida and Xposed have a weakness when it comes to tracing execution inside a method. For example, we can’t print a certain value in the middle of a method.
Unpack and Repack
This is the most basic step: checking whether the app verifies its own signature. Initially, I used a real device (not an emulator) that was unrooted and had a locked bootloader. We can unpack and rebuild the app using apktool:
apktool d app.apk
cd app
apktool b
Re-sign dist/app.apk and install it on the device. In my case the app wouldn’t run; it just showed a toast saying: “App is not official”.
Find Raw Strings
We can use:
grep -r const-string smali/
This extracts all strings in the code. In my case, I was not able to find many strings. The strings that I did find were used for loading classes. This means that we need to be careful when renaming a class: it could be referenced from somewhere else as a string.
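To spot the strings that may be class names before renaming anything, the grep output can be filtered further. The following is a minimal sketch, not part of the original workflow: the class `ConstStrings` and its methods are hypothetical names, and the class-name check is only a rough heuristic for dotted Java names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: pulls string literals out of smali lines and
// flags the ones that look like dotted Java class names.
class ConstStrings {
    // Matches lines such as:  const-string v0, "com.example.Foo"
    // (and the const-string/jumbo variant).
    private static final Pattern CONST_STRING = Pattern.compile(
        "^\\s*const-string(?:/jumbo)?\\s+[vp]\\d+,\\s*\"(.*)\"\\s*$");

    // Rough heuristic for a dotted class name such as a.b.C
    private static final Pattern CLASS_LIKE = Pattern.compile(
        "[A-Za-z_$][\\w$]*(\\.[A-Za-z_$][\\w$]*)+");

    // Returns the literal of every const-string instruction, in order.
    static List<String> literals(List<String> smaliLines) {
        List<String> found = new ArrayList<>();
        for (String line : smaliLines) {
            Matcher m = CONST_STRING.matcher(line);
            if (m.matches()) {
                found.add(m.group(1));
            }
        }
        return found;
    }

    // True if the literal could be a class name used for class loading.
    static boolean looksLikeClassName(String literal) {
        return CLASS_LIKE.matcher(literal).matches();
    }
}
```

Any literal flagged by `looksLikeClassName` deserves a manual look before the matching class is renamed.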
Add Logging Code
With some effort, we can debug a smali project, but I prefer debug logging, which I use for two things: deobfuscating strings and tracing execution.
To add logging, I created a Java file, which I then compiled to smali. Its method can print any Java Object. First, add the smali file for the logger to the smali directory.
To insert logging code manually, we just need to add:
invoke-static {v1}, LLogger;->printObject(Ljava/lang/Object;)V
Replace v1 with the register that we want to print.
Most of the time, the deobfuscator method has the same parameters and return type everywhere. In this case, the signature is:
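A minimal sketch of what such a logger could look like is shown here. It is an illustration, not the original file: only the class name `Logger` and the `printObject(Object)` signature come from the smali call above, while the `describe` helper and the `DEOBF` prefix are assumptions. It prints through `System.out`, which ends up in logcat on Android.

```java
import java.lang.reflect.Array;

// Sketch of a logger that can print any Java object.
// It lives in the default package so the smali name is LLogger;
public class Logger {
    // Turns any object, including null and arrays, into a readable string.
    static String describe(Object o) {
        if (o == null) {
            return "null";
        }
        Class<?> c = o.getClass();
        if (c.isArray()) {
            StringBuilder sb = new StringBuilder(c.getComponentType().getName());
            sb.append("[");
            int n = Array.getLength(o);
            for (int i = 0; i < n; i++) {
                if (i > 0) {
                    sb.append(", ");
                }
                sb.append(describe(Array.get(o, i)));
            }
            return sb.append("]").toString();
        }
        return c.getName() + ": " + o;
    }

    // Called from smali as:
    // invoke-static {v1}, LLogger;->printObject(Ljava/lang/Object;)V
    public static void printObject(Object o) {
        // "DEOBF" is an arbitrary prefix (assumption) to make grepping logcat easier.
        System.out.println("DEOBF " + describe(o));
    }
}
```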
.method private X(III)Ljava/lang/String;
We can write a script that:
- Finds the deobfuscation method
- Injects a call to log the String
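The two steps above can be sketched as follows. This is a minimal illustration, assuming the deobfuscator is recognized only by its `(III)Ljava/lang/String;` signature and that the string is logged right before each `return-object`. The class name `DeobfLogInjector` is hypothetical, and the plain `invoke-static {vN}` form assumes the returned register is below v16.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical injector: adds a Logger call before every return-object
// inside methods whose signature is (III)Ljava/lang/String;
class DeobfLogInjector {
    private static final Pattern METHOD = Pattern.compile(
        "^\\.method .*\\(III\\)Ljava/lang/String;\\s*$");
    private static final Pattern RETURN_OBJECT = Pattern.compile(
        "^(\\s*)return-object\\s+([vp]\\d+)\\s*$");

    static List<String> inject(List<String> smali) {
        List<String> out = new ArrayList<>();
        boolean inTarget = false;
        for (String line : smali) {
            if (METHOD.matcher(line).matches()) {
                inTarget = true;          // entering a deobfuscator method
            } else if (line.trim().equals(".end method")) {
                inTarget = false;         // leaving the current method
            }
            Matcher m = RETURN_OBJECT.matcher(line);
            if (inTarget && m.matches()) {
                // Reuse the register that is about to be returned, so no
                // extra register is needed. Assumes the register is < v16.
                out.add(m.group(1) + "invoke-static {" + m.group(2)
                    + "}, LLogger;->printObject(Ljava/lang/Object;)V");
            }
            out.add(line);
        }
        return out;
    }
}
```

Because the call reuses the register holding the return value, the `.locals` count of the method does not need to change.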
Printing the result string in the deobfuscate method is easy, but we have a problem: where (which line, which file) does the string come from?
We can add logging code with more information like this:
const-string v1, "Line 1 file http.java"
invoke-static {v1}, LMyLogger;->logString(Ljava/lang/String;)V
But this requires an unused register to hold the string, which is complicated because we need to track which registers are currently unused. Alternatively, we could increase the local register count and use the last register, but that doesn’t work if the method has already used all the registers.
I used another approach: we can use a stack trace to find where the method was called from. To identify the line, we just add a new “.line” directive in the smali file before the call to the deobfuscate method. To make the obfuscated class easier to recognize, add a “.source” directive at the top of the smali file. Initially we don’t know what the class does, so just give it a unique identifier using a UUID.
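A minimal sketch of the stack trace idea follows, assuming `logString` is called directly from inside the deobfuscate method. Only the `MyLogger.logString(String)` name comes from the smali shown earlier; the `format` helper and the output layout are assumptions. The file name reported by the stack trace is whatever “.source” says, and the line number is whatever the nearest “.line” says.

```java
// Sketch: report which call site asked for a deobfuscated string.
public class MyLogger {
    // Builds "source:line class.method -> string" for one call site.
    static String format(StackTraceElement site, String s) {
        return site.getFileName() + ":" + site.getLineNumber() + " "
            + site.getClassName() + "." + site.getMethodName() + " -> " + s;
    }

    public static void logString(String s) {
        StackTraceElement[] st = new Throwable().getStackTrace();
        // st[0] = logString itself
        // st[1] = the deobfuscate method that called us (assumption)
        // st[2] = the code that asked for the string
        StackTraceElement site = st.length > 2 ? st[2] : st[st.length - 1];
        System.out.println(format(site, s));
    }
}
```

With a UUID in “.source” and unique “.line” values, each log line points to exactly one place in one smali file.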
Tracing Startup
In Java, we can create a static initializer, and it will be executed (once) when the class is first used. In smali, the static initializer is the <clinit> method, so we should add logging code at the beginning of <clinit>. The Java equivalent looks like this:
class Test {
static { System.out.println("test"); }
}
I used a UUID here (I randomly generate a UUID and put it as a constant string in every class), which helps me work with obfuscated names.
class Test {
    static {
        System.out.println("c5922d09-6520-4b25-a0eb-4f556594a692");
    }
}
If that message appears in logcat, then we know that the class is called/used. I could do something like this to edit the name:
vi $(grep -r UUID smali|cut -f 1 -d ':' )
Or we can also set up a directory full of UUIDs, each one a symbolic link to the original file.
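Building that UUID-to-file mapping can be automated. The following is a minimal sketch, assuming every instrumented .smali file contains exactly one lowercase marker UUID; the class `UuidIndex` and its method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Stream;

// Hypothetical index: maps each marker UUID to the smali file holding it.
class UuidIndex {
    private static final Pattern UUID_RE = Pattern.compile(
        "[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}");

    // Returns the first UUID found in the text, or null if there is none.
    static String firstUuid(String text) {
        Matcher m = UUID_RE.matcher(text);
        return m.find() ? m.group() : null;
    }

    // Walks the smali directory and records UUID -> file for every match.
    static Map<String, Path> index(Path smaliRoot) {
        Map<String, Path> result = new TreeMap<>();
        try (Stream<Path> files = Files.walk(smaliRoot)) {
            files.filter(p -> p.toString().endsWith(".smali")).forEach(p -> {
                try {
                    String text = new String(Files.readAllBytes(p), StandardCharsets.UTF_8);
                    String uuid = firstUuid(text);
                    if (uuid != null) {
                        result.put(uuid, p);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }
}
```

Each entry of the returned map can then be turned into a symbolic link with `Files.createSymbolicLink(link, target)`, using the UUID as the link name.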
Writing new smali code
We can easily write simple smali code by hand, but for more complicated code we should just write in Java, and convert it back to smali. It is also a good idea to make sure it works on the device.
javac *.java
dx --dex --output=classes.dex *.class
zip Test.zip classes.dex
apktool d Test.zip
Now we have a smali file that we can inject (copy it to the smali folder).
This approach can also be used to test parts of the code from the app itself. We can extract the smali code, add a main method, and run it:
adb push Test.zip /sdcard/
adb shell ANDROID_DATA=/sdcard dalvikvm -cp /sdcard/Test.zip NameOfMainClass
Think in Java level
There are several classes in the app that extract a dex file from a byte array to a temporary name, and then remove the file. The array is encrypted and the filename is random. The first thing that we want to know is: is this file important? Will we need to patch it?
To keep the file, we can just patch the string deobfuscator: if it returns “delete”, we return “canRead” instead. The two methods have compatible signatures: both are “()Z” (a method that takes no parameters and returns a boolean).
It turns out that replacing the file (for patching) is a bit more difficult. It’s a bit complicated to follow in the smali code, but in general this is what happens:
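Why the swap is safe can be illustrated in plain Java. This is a minimal sketch, assuming the app looks up the `java.io.File` method by its deobfuscated name through reflection, which the patch implies but the smali is not shown here. The class `KeepTempFile` and both method names are hypothetical.

```java
import java.io.File;
import java.lang.reflect.Method;

// Hypothetical illustration of the "delete" -> "canRead" patch.
class KeepTempFile {
    // Stand-in for the patched string deobfuscator: every decoded string
    // passes through unchanged, except "delete".
    static String patchName(String decoded) {
        return "delete".equals(decoded) ? "canRead" : decoded;
    }

    // Mimics a reflective call of a no-argument boolean File method, i.e.
    // one with descriptor ()Z such as delete(), canRead() or exists().
    static boolean callFileMethod(File f, String name) {
        try {
            Method m = File.class.getMethod(name);
            return (Boolean) m.invoke(f);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Since `File.delete()` and `File.canRead()` both take no arguments and return a boolean, the calling code keeps working after the swap, and the temporary file simply stays on disk.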
- It generates several random Unicode characters using SecureRandom (note that because this is a “secure” random, altering the seed of SecureRandom won’t give you predictable file names)
- It decrypts the built in array into a zip file in memory
- It reads the zip file from a certain fixed offset
- It inflates (decompresses) the zip data manually
- It writes the decompressed result to a random dex file name generated at step 1
- It loads the dex file
- It deletes the temporary dex file
I tried patching the byte array, but then I also needed to adjust a lot of numbers inside it (sizes and offsets). After thinking at the Java level, the answer was simply to write new Java code that does what we want. So this is what I did instead:
I created a class named: FakeOutputStream, then patched the code so instead of finding java.io.FileOutputStream, it will load FakeOutputStream.
The FakeOutputStream writes the original data to /sdcard/orig-x-y, where x and y are the offset and size. Then, instead of the original data, it loads the content of /sdcard/fake-x-y and writes that to the temporary file.
With this in place, when I first ran the app, it generated /sdcard/orig-x-y, and I could reverse engineer the generated DEX. I can also modify the dex file and push it as /sdcard/fake-x-y, and that file will be loaded instead.
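A minimal sketch of such a class is shown here. It is an illustration under several assumptions, not the original code: that the app constructs the stream from a `File`, that it writes the dex through `write(byte[], int, int)` or `write(byte[])`, and that x and y are the `off` and `len` arguments of that call. The `baseDir` field is hypothetical and exists so the directory can be changed for testing off-device. If no fake file exists yet, this sketch falls back to the original data.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

// Sketch of a drop-in replacement for java.io.FileOutputStream.
public class FakeOutputStream extends FileOutputStream {
    // Assumption: where dumps and replacements live; the post uses /sdcard.
    static File baseDir = new File("/sdcard");

    // Assumption: the app constructs the stream from a File.
    public FakeOutputStream(File target) throws IOException {
        super(target);
    }

    @Override
    public void write(byte[] b) throws IOException {
        write(b, 0, b.length);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        String suffix = off + "-" + len;
        // Keep a copy of what the app wanted to write.
        try (FileOutputStream dump = new FileOutputStream(new File(baseDir, "orig-" + suffix))) {
            dump.write(b, off, len);
        }
        // If a replacement was pushed, write that to the temporary file instead.
        File fake = new File(baseDir, "fake-" + suffix);
        if (fake.isFile()) {
            byte[] replacement = readAll(fake);
            super.write(replacement, 0, replacement.length);
        } else {
            super.write(b, off, len);
        }
    }

    // Reads a whole file into memory using only java.io classes.
    private static byte[] readAll(File f) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (FileInputStream in = new FileInputStream(f)) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
        }
        return buf.toByteArray();
    }
}
```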
Time to Patch
Once we can decrypt all the file contents, we can start patching things, such as removing the root check, package signature check, debugger check, SSL pinning check, etc.
Having the dex file outside of the main APK has an advantage: we can easily test adding or replacing method just by replacing the dex file outside the app.