Obfuscation and (non-)detection of malicious PDF files

More than two months ago I spoke at Rooted CON (Madrid) about some techniques to obfuscate and hide malicious PDF files. I gave the same talk at CARO 2011 (Prague) a few days ago, with updated slides and a demo of peepdf.

The idea is that it's possible to use some malformations in the documents, like those described by Julia Wolf, together with the PDF specification itself, to keep the files hidden from antivirus engines and parsers. Bad guys can effectively use these techniques to create an undetectable exploit and use it as an attack vector. Some of the techniques are the following:


  • Using the /Names and /AcroForm elements of the Catalog object to execute code when the document is opened, instead of the /OpenAction element.
  • If the malicious content is stored in a string object, it's possible to hide it using octal encoding.
  • If the content is stored in a stream object instead, less common filters can be applied, such as /JBIG2Decode or /DCTDecode, avoiding the most frequently used ones, such as /FlateDecode and /ASCIIHexDecode. Avast researchers recently found that cybercriminals are already using this in the wild.
  • In the case of /FlateDecode and /LZWDecode filters it's possible to define some parameters in order to make the analysis more difficult.
  • Split up the malicious code into several parts and store them in different locations in the document. In the case of JavaScript code, the parts can be stored in the /Names element of the Catalog. Specific functions such as getAnnots and getPageNthWord can also be used to retrieve elements of the document.
  • Omit the endobj tag at the end of objects to fool parsers.
  • Put null bytes in the header of the document.
  • Compressing the malicious objects in the so-called object streams to add an additional obfuscation level.
  • Encrypt the document with the “default password”.
  • Embed the malicious file in a legitimate one. The malicious file can be opened automatically when the legitimate document is opened.
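
To make the octal encoding trick from the list above more concrete, here is a minimal sketch in Python. The function names octal_encode and octal_decode are hypothetical, chosen for illustration, and are not part of peepdf. It only handles \ddd octal escapes inside a PDF literal string, so it is a simplification and not a full string parser.

```python
import re

# Matches a backslash followed by one to three octal digits.
_OCTAL_ESCAPE = re.compile(rb"\\([0-7]{1,3})")


def octal_encode(data: bytes) -> bytes:
    """Return a PDF literal string with every byte written as a \\ddd escape."""
    escaped = "".join("\\%03o" % byte for byte in data)
    return b"(" + escaped.encode("ascii") + b")"


def octal_decode(literal: bytes) -> bytes:
    """Undo octal escapes only; other escapes (\\n, \\(, ...) are left untouched."""
    if literal.startswith(b"(") and literal.endswith(b")"):
        literal = literal[1:-1]
    # Values above 0o377 are truncated to a single byte.
    return _OCTAL_ESCAPE.sub(
        lambda m: bytes([int(m.group(1), 8) & 0xFF]), literal
    )


if __name__ == "__main__":
    hidden = octal_encode(b"app.alert(1)")
    print(hidden)                # no readable keyword survives in the raw bytes
    print(octal_decode(hidden))  # the reader still recovers the original string
```

A scanner that searches the raw file for a keyword such as "app" would find nothing in the encoded string, while a PDF reader decodes it back before using it.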


In the demo I modified a detected malicious PDF file (34/43) to decrease its detection rate; after the modifications it was detected by only one antivirus engine. The results of the tests performed in February were even worse: the file was totally undetected. Although bad guys are not using all these techniques yet, they will, so it's important to take them into account when developing analysis tools and antivirus products.
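
As a rough illustration of what taking these techniques into account could look like in an analysis tool, here is a small triage sketch in Python. It is not how peepdf works; the function name triage, the list of names and the report keys are all assumptions made for this example. It scans raw bytes only, so names obfuscated with hex escapes, content inside compressed or encrypted objects, and coincidental matches inside binary streams will all confuse it.

```python
import re

# Names related to the techniques listed in this post (an illustrative, incomplete list).
INTERESTING_NAMES = [
    b"/OpenAction", b"/AcroForm", b"/Names", b"/JavaScript",
    b"/JBIG2Decode", b"/DCTDecode", b"/LZWDecode",
    b"/ObjStm", b"/Encrypt", b"/EmbeddedFile",
]

# "N G obj" opens an indirect object; the digits keep this from matching "endobj".
_OBJ_START = re.compile(rb"\b\d+\s+\d+\s+obj\b")


def triage(data: bytes) -> dict:
    """Collect a few cheap indicators from the raw bytes of a PDF file."""
    header_pos = data.find(b"%PDF-")
    # Inspect everything up to and including the "%PDF-x.y" marker.
    head = data[: header_pos + 8] if header_pos >= 0 else data[:1024]
    obj_count = len(_OBJ_START.findall(data))
    endobj_count = data.count(b"endobj")
    return {
        "header_offset": header_pos,
        "null_bytes_in_header": b"\x00" in head,
        "obj_count": obj_count,
        "endobj_count": endobj_count,
        "missing_endobj": max(obj_count - endobj_count, 0),
        "names_found": [n.decode("ascii") for n in INTERESTING_NAMES if n in data],
    }


if __name__ == "__main__":
    sample = (
        b"\x00\x00%PDF-1.4\n"
        b"1 0 obj\n<< /Type /Catalog /AcroForm 2 0 R >>\n"  # endobj left out on purpose
        b"2 0 obj\n<< >>\nendobj\n"
    )
    for key, value in triage(sample).items():
        print(key, "=", value)
```

A non-zero missing_endobj count or null bytes around the header do not prove that a file is malicious; they are only hints that it deserves a closer look.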

Ironically, your slide link...

Ironically, your slide link is to a pdf :D

Yes, that's a bit ironic, but I swear it's clean!! ;) If you don't trust it, you can take a look at it with peepdf!! :)