Last Friday, I received a URL which used several exploits to spread malware. As always, I started to investigate it. As you may know, these sites use JavaScript to exploit vulnerabilities in web browsers, ActiveX controls or third-party software, and of course JS obfuscation is used most of the time.
I don’t like using web browsers to de-obfuscate this kind of code, basically because these scripts are dangerous and I want to avoid an infection. I know that some researchers use debugging techniques to de-obfuscate them, but I really think there are safer, faster and more automated methods to do the same job.
I prefer to use Rhino for these tasks. Rhino is “an open-source implementation of JavaScript written entirely in Java”. With this JS engine and a Linux system I’m able to de-obfuscate this code without using any web browser. I recommend the CanSecWest presentation Reverse Engineering Malicious Javascript (Jose Nazario, Ph.D., Arbor Networks).
I’m going to show the process with an example (the same js code I received on Friday).
This is a special case, because it uses a trick to detect modification of the code: arguments.callee.toString(). This call returns the source code of the function in which it is called. Any modification of the code affects the final result and therefore defeats automated de-obfuscation.
The JS code has two functions: bodipyri(ii) and cynolapy(a1,b1). The function names are dynamically generated every time the page is loaded. Analyzing the second function, I saw there was a return with an eval call:
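To make the trick concrete, here is a minimal sketch of what arguments.callee.toString() gives back. It is not code from the page I analysed: the names original and tampered are made up for the illustration, and the functions are built with the Function constructor only because functions created that way do not inherit strict mode from the surrounding code (arguments.callee is not allowed in strict mode). In the Rhino shell, print would take the place of console.log.

```javascript
// Hypothetical demo, not code from the analysed page.
// Functions built with the Function constructor do not inherit strict mode,
// so arguments.callee is available inside them.
var original = new Function("return arguments.callee.toString();");
var tampered = new Function("var x = 1; return arguments.callee.toString();");

// Each call returns the source text of the function that is running.
var originalText = original();
var tamperedText = tampered();

console.log(originalText);
console.log(originalText === tamperedText); // false: the edit shows up in the text
```

Anything computed from that text, such as a decryption key, changes as soon as the function body is edited.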
function cynolapy(a1,b1)
{
    if(!b1){
        return eval(bodipyri("ZG9jdW1lbnQud3JpdGUoY[DELETED]2csIiIpKSk7"));
    }
This function is first called with only one argument:
cynolapy('YSYsMTs5IHAkOGlvIid7ZDZ9IGo5cD4[DELETED]VrY2SVSXZnKzJakIk=');
so the function cynolapy returns the eval result.
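As a side note, the visible start of the string passed to bodipyri in the excerpt above looks like plain Base64. That is my reading of the characters shown, not something confirmed by the sample, whose bodipyri function is not reproduced here. The sketch below is a small standalone decoder (base64Decode is a name I made up) applied only to the first 20 visible characters, leaving the deleted part alone.

```javascript
// Minimal Base64 decoder for the standard alphabet.
// Assumption: the literal is ordinary Base64; the sample's own decoder
// (bodipyri) is not shown in this post and may do more than this.
function base64Decode(text) {
  var alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  var clean = text.replace(/=+$/, ""); // drop trailing padding
  var out = "";
  var bits = 0;
  var bitCount = 0;
  for (var i = 0; i < clean.length; i++) {
    // Append the 6 bits encoded by this character.
    bits = (bits << 6) | alphabet.indexOf(clean.charAt(i));
    bitCount += 6;
    if (bitCount >= 8) {
      // A full byte is available: emit it and keep the leftover bits.
      bitCount -= 8;
      out += String.fromCharCode((bits >> bitCount) & 0xff);
      bits &= (1 << bitCount) - 1;
    }
  }
  return out;
}

// Only the visible prefix of the literal is decoded.
var prefix = base64Decode("ZG9jdW1lbnQud3JpdGUo");
console.log(prefix); // document.write(
```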
I deleted all the HTML code, replaced the eval call with a print, and ran Rhino against the file. This was the result:
document.write(cynolapy(a1,arguments.callee.toString().replace(/\s/g,"")));
The JS code recursively calls the same function, but with a second argument. This new argument is the well-known arguments.callee.toString() trick used to detect code modifications. Since I had replaced eval with print, I was modifying the value of the second parameter and therefore changing the final result. That is bad, because it prevents an automated de-obfuscation of the code. This example has to be de-obfuscated manually.
The expression arguments.callee.toString().replace(/\s/g,"") returns the source code of the cynolapy function with all whitespace stripped.
Therefore, in order to get the real JS code we have to call this function with this second argument. I took the original JS code again (with the eval call) and added the following code at the end of the file:
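A short sketch of what that whitespace stripping does, and of why swapping eval for print is not harmless. The function text below is invented for the illustration (demo is not a name from the sample); only the regular expression is the one discussed above.

```javascript
// Hypothetical function text, used only to illustrate the effect of the regex.
var source = [
  "function demo(a1, b1)",
  "{",
  "    if (!b1) {",
  "        return eval(a1);",
  "    }",
  "}"
].join("\n");

// Remove every whitespace character (spaces, tabs, newlines).
var stripped = source.replace(/\s/g, "");
console.log(stripped); // functiondemo(a1,b1){if(!b1){returneval(a1);}}

// The same text after the analyst's edit: eval replaced by print.
var edited = source.replace("eval(", "print(").replace(/\s/g, "");
console.log(stripped === edited); // false
```

Reformatting the function (indentation, line breaks) leaves the stripped text untouched, but changing a single token does not.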
print(cynolapy('YSYsMTs5IHAkNGlhIj97aDZ9[DELETED]24viw==','functiontumawyzu(a1,b1){if(!b1){returneval(cynolapy("ZG9[DELETED]returno;}'));
What I’m doing here is calling the cynolapy function with the expected second parameter. Running this file through Rhino prints the de-obfuscated code.
This de-obfuscated code loads an iframe pointing to a new site. That site is used to exploit several vulnerabilities: ANI/ANR, Java/ByteVerify, ADODB.Stream,…
PandaLabs is developing an automated engine to de-obfuscate JS, but sometimes that is not possible and we need to do it manually.
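The manual step described above can be sketched with a toy model. Everything in it is an assumption made for illustration: the sample's real decoding routine is not shown in this post, so a simple checksum-derived XOR key stands in for it, and keyFrom, toyDecode, served and edited are names I made up. The point is only that passing the whitespace-stripped original source explicitly as the second argument gives the right result, while the text of an edited function does not.

```javascript
// Hypothetical toy, not the real sample. The checksum-based key is an
// assumption; the post does not show the sample's actual decoding routine.
function keyFrom(source) {
  var key = 0;
  for (var i = 0; i < source.length; i++) {
    key = (key + source.charCodeAt(i)) % 256;
  }
  return key;
}

// Second-stage decoder: b1 plays the role of the whitespace-stripped source.
function toyDecode(a1, b1) {
  var key = keyFrom(b1);
  var out = "";
  for (var i = 0; i < a1.length; i++) {
    out += String.fromCharCode(a1[i] ^ key);
  }
  return out;
}

// Function text exactly as served, and the analyst's edited copy.
var served = "function toy(a1, b1) { if (!b1) { return eval(stage(a1)); } }";
var edited = served.replace("eval(", "print(");

var servedStripped = served.replace(/\s/g, "");
var editedStripped = edited.replace(/\s/g, "");

// Attacker side: the payload is encoded with the key of the served source.
var payload = "iframe";
var encoded = [];
for (var i = 0; i < payload.length; i++) {
  encoded.push(payload.charCodeAt(i) ^ keyFrom(servedStripped));
}

// Analyst side: supply the original stripped source by hand.
console.log(toyDecode(encoded, servedStripped)); // iframe
console.log(toyDecode(encoded, editedStripped) === payload); // false: wrong key
```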