Tools

Adding a scoring system in peepdf

Just before the summer I announced that the student Rohit Dua would dedicate his time to improve peepdf and add a scoring system to the output. This was possible thanks to Google and his Google Summer of Code (GSoC) program, where I presented several projects as a member of The Honeynet Project. A beta version was presented during Black Hat Europe Arsenal 2015 last November, where I introduced the new functionalities.

The scoring system has the goal of giving valuable advice about the maliciousness of the PDF file that’s being analyzed. The first step to accomplish this task is identifying the elements which permit to distinguish if a PDF file is malicious or not, like Javascript code, lonely objects, huge gaps between objects, detected vulnerabilities, etc. The next step is calculating a score out of these elements and test it with a large collection of malicious and not malicious PDF files in order to tweak it.

The scoring is based on different indicators like:

  • Number of pages
  • Number of stream filters
  • Broken/Missing cross reference table
  • Obfuscated elements: names, strings, Javascript code.
  • Malformed elements: garbage bytes, missing tags…
  • Encryption with default password
  • Suspicious elements: Javascript, event triggers, actions, known vulns…
  • Big streams and strings
  • Objects not referenced from the Catalog

Black Hat Arsenal peepdf challenge solution

One week before my demo at the Black Hat Arsenal I released a peepdf challenge. The idea was solving the challenge using just peepdf, of course ;) This post will tell you how to solve the challenge so if you want to try by yourself (you should!) STOP READING HERE! The PDF file can be downloaded from here and it is not harmful. No shellcodes, no exploits, no kitten killed. In summary, you can open it with no fear, but do it with a version of Adobe Reader prior to XI ;)

 

Let's start! :) This is what you see with the last version of peepdf:

 

Peepdf Black Hat Arsenal Challenge

 

In a quick look you can spot some Javascript code located in object 13 and also an embedded file in the same object. Checking the references to this object and some info about it we see that it is an embedded PDF file:

 

Black Hat Arsenal peepdf challenge

In one week I will be traveling to Las Vegas to show how peepdf works in the Black Hat USA Arsenal. My time slot will be on Wednesday the 5th from 15:30 to 18:00, so you are more than welcome to come by and say hi, ask questions or just talk to me. I will also be presenting some of the work Rohit Dua is doing during the Google Summer of Code (GSoC), adding a scoring system for peepdf.

 

Black Hat Arsenal Peepdf

 

peepdf news: GitHub, Google Summer of Code and Black Hat

Two months ago Google announced that Google Code was slowly dying: no new projects can be created, it will be read only soon and in January 2016 the project will close definitely. peepdf was hosted there so it was time to move to another platform. The code is currently hosted at GitHub, way more active than Google Code:

 

https://github.com/jesparza/peepdf

 

If you are using peepdf you must update the tool because it is pointing to Google Code now. After executing peepdf.py -u the tool will point to GitHub and it will be able to be up to date with the latest commits. The peepdf Google Code page will also point to GitHub soon.

 

Another important announcement is that Rohit Dua will be the student who will work with peepdf this summer in the Google Summer of Code (GSoC). I initially presented three ideas to improve peepdf through The Honeynet Project:

 

Released peepdf v0.3

After some time without releasing any new version here is peepdf v0.3. It is not that I was not working in the project, but since the option to update the tool from the command line was released creating new versions became a secondary task. Besides this, since January 2014 Google removed the option to upload new downloads to the Google Code projects, so I had to figure out how to do it. From now on, all new releases will be hosted at eternal-todo.com, in the releases section.

 

The differences with version 0.2 are noticeable: new commands and features have been added, some libraries have been updated, detection for more vulnerabilities have been added, a lot of bug fixes, etc. This is the list of the most important changes (full changelog here):

 

  • Replaced Spidermonkey with PyV8 as the Javascript engine (see why here).

Analysis of a CVE-2013-3346/CVE-2013-5065 exploit with peepdf

There are already some good blog posts talking about this exploit, but I think this is a really good example to show how peepdf works and what you can learn next month if you attend the 1day-workshop “Squeezing Exploit Kits and PDF Exploits” at Troopers14 or the 2h-workshop "PDF Attack: A Journey from the Exploit Kit to the Shellcode" at Black Hat Asia (Singapore).  The mentioned exploit was using the Adobe Reader ToolButton Use-After-Free vulnerability to execute code in the victim's machine and then the Windows privilege escalation 0day to bypass the Adobe sandbox and execute a new payload without restrictions.

This is what we see when we open the PDF document (6776bda19a3a8ed4c2870c34279dbaa9) with peepdf:
 

cve-2013-3346_info

 

PDF Attack: A Journey from the Exploit Kit to the Shellcode (Slides)

As I already announced in the last blog post, I was in Las Vegas giving a workshop about how to analyze exploit kits and PDF documents at BlackHat. The part related to exploit kits included some tips to analyze obfuscated Javascript code manually and obtain the exploit URLs or/and shellcodes. The tools needed to accomplish this task were just a text editor, a Javascript engine like Spidermonkey, Rhino or PyV8, and some tool to beautify the code (like peepdf ;p). In a generic way, we can say that the steps to analyze an exploit kit page are the following:
 

  • Removing unnecessary HTML tags
  • Convert HTML elements which are called in the Javascript code to Javascript variables
  • Find and replace eval functions with prints, for example, or hook the eval function if it is possible (PyV8)
  • Execute the Javascript code
  • Beautify the code
  • Find shellcodes and exploit URLs
  • Repeat if necessary

 

 

PDF Attack: A Journey from the Exploit Kit to the Shellcode

 
BlackHat USA 2013 is here and tomorrow I will be explaining how to analyze exploit kits and PDF documents in my workshop “PDF Attack: From the Exploit Kit to the Shellcode” from 14:15 to 16:30 in the Florentine room. It will be really practical so bring your laptop and expect a practical session