peepdf

Static analysis of a CVE-2011-2462 PDF exploit

CVE-2011-2462 was published more than one month ago. It's a memory corruption vulnerability related to U3D objects in Adobe Reader and it affected all the latest versions from Adobe (<=9.4.6 and <= 10.1.1). It was discovered while it was being actively exploited in the wild, as some analysis say. Adobe released a patch for it 10 days after its publication. I'm going to analyse a PDF file exploiting this vulnerability with peepdf to show some of the new commands and functions in action.

As usual, a first look at the information of the file:

I've highlighted the interesting information of the info command: one error while parsing the document, one object (15) containing Javascript code, one object (4) containing two ways of executing elements (/AcroForm, /OpenAction) and one U3D object (10), suspicious for its known vulnerabilities, apart of the latest one.

So we have several objects to explore, let's start from the /AcroForm element (object 4):

Submitted by jesparza on Mon, 2012/01/16 - 18:22

Analysis of a malicious PDF from a SEO Sploit Pack

According to a Kaspersky Lab article, SEO Sploit Pack is one of the Exploit Kits which appeared in the first months of the year, being PDF and Java vulnerabilities the most used in these type of kits. That's the reason why I've chosen to analyse a malicious PDF file downloaded from a SEO Sploit Pack. The PDF file kissasszod.pdf was downloaded from hxxp://marinada3.com/88/eatavayinquisitive.php and it had a low detection rate. So taking a look at the file with peepdf we can see this information:

In a quick look we can see that there are Javascript code in object 8 and that the element /AcroForm is probably used to execute something when the document is opened. The next step is to explore these objects and find out what will be executed:

Submitted by jesparza on Mon, 2011/11/14 - 01:03

Read more

Analysing the Honeynet Project challenge PDF file with peepdf (II)

After the "useless" analysis of the fake objects now we can focus on the objects which will be parsed by the PDF reader:

/Catalog (27)
    dictionary (28)
        dictionary (22)
            dictionary (23)
                dictionary (22)
                /Annot (24)
                    dictionary (23)
                    /Page (25)
                        /Pages (26)
                            /Page (25)
        stream (21)
    /Pages (26)

If we take a look at the Catalog object...

PPDF> object 27

<< /AcroForm 28 0 R
/MarkInfo << /Marked true >>
/Pages 26 0 R
/Type /Catalog
/Lang en-us
/PageMode /UseAttachments >>

There is no presence of any triggers here (/OpenAction) or in the rest of the objects (/AA) so it seems that the /AcroForm element has something to say. Also, the suspicious object 21 (/EmbeddedFile) is related with this interactive form:

PPDF> references to 21

[28]

PPDF> object 28

<< /DA /Helv 0 Tf 0 g
/Fields [ 22 0 R ]
/XFA [ template 21 0 R ] >>

In the dictionary of the form we can see that object 21 is a template and that there is a reference to a field object (object 22). So we continue analysing the field objects:

PPDF> object 22

<< /V

Submitted by jesparza on Tue, 2011/07/26 - 21:30

Analysing the Honeynet Project challenge PDF file with peepdf (I)

In past November The Honeynet Project published a new challenge, this time related to PDF files. Although it's quite old I'm going to analyse it with my tool because I think it has some interesting tricks and peepdf makes the analysis easier. The PDF file can be downloaded from here.

If we launch peepdf we obtain this error:

$ ./peepdf.py -i fcexploit.pdf

Error: parsing indirect object!!

It seems that there is an error in the parsing process. Talking about malicious PDF files it's recommended to add the -f option to ignore this type of errors and continue with the analysis:

$ ./peepdf.py -fi fcexploit.pdf

File: fcexploit.pdf
MD5: 659cf4c6baa87b082227540047538c2a
Size: 25169 bytes
Version: 1.3
Binary: True
Linearized: False
Encrypted: False
Updates: 0
Objects: 18
Streams: 5
Comments: 0
Errors: 2

Version 0:
	Catalog: 27
	Info: 11
	Objects (18): [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 22, 23, 24, 25, 26, 27, 28]
		Errors (1): [11]
	Streams (5): [5, 7, 9, 10, 11]
		Encoded (4): [5, 7, 9, 10]
	Objects with JS code (1): [5]
	Suspicious elements:
		/AcroForm: [27]
		/OpenAction: [1]
		/JS: [4]
		/JavaScript: [4]
		getAnnots (CVE-2009-1492): [5]

Now we can see some statistics and information about the document. We can see some errors too, proof that it's not a normal PDF file:

Submitted by jesparza on Tue, 2011/07/26 - 00:19

Obfuscation and (non-)detection of malicious PDF files

More than two months ago I talked at Rooted CON (Madrid) about some techniques to obfuscate and hide malicious PDF files. I gave the same speech at CARO 2011 (Prague) some days ago with updated slides and a demo of peepdf.

The idea is that it's possible to use some malformations in the documents, like those commented by Julia Wolf, and the PDF specification itself in order to keep the files hidden from Antivirus engines and parsers. Bad guys can effectively use it to create an undetectable exploit and use it as an attacking vector. Some of the techniques are the following:

Submitted by jesparza on Fri, 2011/05/13 - 15:09

peepdf v0.1 released: a tool to analyse/modify malicious PDF files

After some time of inactivity in the blog I return with good news. I released the first version of peepdf last Friday. peepdf is a Python tool to explore PDF files in order to find out if the file can be harmful or not. The aim of this tool is provide all the necessary components that a security researcher could need in a PDF analysis without using three or four tools to make all the tasks. With peepdf it's possible to list all the objects in the document showing the suspicious elements, supports all the most used filters and encodings, it can parse different versions of a file, object streams and encrypted files. With the installation of Spidermonkey and Libemu it provides Javascript and shellcode analysis wrappers too. It is also able to create new PDF files and to modify existent ones. Thanks to the BackTrack team peepdf is included in the last version of this security distribution:

Submitted by jesparza on Thu, 2011/05/12 - 19:48

peepdf - PDF Analysis Tool

Whats is this?

Usage

How does it work?

More info

Releases

GitHub project

Download it!

Follow peepdf on Twitter!

What is this?

peepdf is a Python tool to explore PDF files in order to find out if the file can be harmful or not. The aim of this tool is to provide all the necessary components that a security researcher could need in a PDF analysis without using 3 or 4 tools to make all the tasks. With peepdf it's possible to see all the objects in the document showing the suspicious elements, supports the most used filters and encodings, it can parse different versions of a file, object streams and encrypted files. With the installation of PyV8 and Pylibemu it provides Javascript and shellcode analysis wrappers too. Apart of this it is able to create new PDF files, modify existent ones and obfuscate them.

Read more

Security PDF-related links in 2010: analyses and tools

After one year full of security issues related to the Portable Document Format I've made a little compilation of useful links to analyses and tools:

Analysis

2010-01-04: Sophisticated, targeted malicious PDF documents exploiting CVE-2009-4324 (embedded binaries)
2010-01-07: Static analysis of malicous PDFs (Part #2) (getAnnots, arguments.callee)
2010-01-09: PDF Obfuscation (variable substitution, LuckySploit, CVE 2008-2992)
2010-01-13: Generic PDF exploit hider. embedPDF.py and goodbye AV detection
2010-01-14: PDF Obfuscation using getAnnots() (getAnnots, arguments.callee, Neosploit)
2010-02-15: Filling Adobe's heap (Javascript, ActionScript and PDF Images)
2010-02-18: Malicious PDF trick: getPageNthWord
2010-02-21: Analyzing PDF exploits with Pyew
2010-03-01: Analyzing PDF Files (getPageNthWord, getPageNumWords)
2010-04-08: JavaScript obfuscation in PDF: Sky is the limit (getAnnots,arguments.callee)

Submitted by jesparza on Fri, 2011/01/28 - 18:48

More about the JailbreakMe PDF exploit

Today has been released the source code of the Jailbreakme exploit, so maybe this explanation comes a bit late. In the update of the previous post about this subject I knew that I was right about the overflow in the arguments stack when parsing the charstrings in the Type 2 format, so here is a little more info.

After decoding the stream of the object 13 we can see the following bytes (talking about this file):

The selected bytes are the important ones for this exploit because the overflow occurs when parsing them. Like I mentioned, the Type 2 format is composed of operands, operators and numbers, and use the stack to push and pop values. This stack has a maximum size of 48 elements. We can understand better the meaning of these bytes with this tips:

Submitted by jesparza on Thu, 2010/08/12 - 18:49

About the JailbreakMe PDF exploit

Some days ago Comex published his JailbreakMe for the new iPhone 4 in the Defcon 18. The interesting thing is that in order to root the device he used a PDF exploit for Mobile Safari to execute arbitrary code and after this another kernel vuln to gain elevated privileges. I've being taking a look at the PDF files with peepdf and these are my thoughts about it.

The PDF file itself has no many objects and only one encoded stream:

The stream is encoded with a simple FlateDecode filter, without parameters, and if we decode its content we can see this strings, related to the JailbreakMe stuff:

As this object seems to contain the vulnerability we are looking for we'll take a closer look to this stream and what this is for:

Submitted by jesparza on Wed, 2010/08/04 - 00:40