peepdf v0.1 released: a tool to analyse/modify malicious PDF files

After some time of inactivity on the blog, I return with good news: I released the first version of peepdf last Friday. peepdf is a Python tool to explore PDF files in order to find out whether a file can be harmful. The aim of this tool is to provide all the components a security researcher could need for a PDF analysis, without using three or four tools to perform all the tasks. With peepdf it's possible to list all the objects in the document, showing the suspicious elements. It supports the most used filters and encodings, and it can parse different versions of a file, object streams and encrypted files. With Spidermonkey and Libemu installed, it also provides Javascript and shellcode analysis wrappers. It is also able to create new PDF files and to modify existing ones. Thanks to the BackTrack team, peepdf is included in the latest version of this security distribution.

The main functionalities of peepdf are the following:

Analysis

Decodings: hexadecimal, octal, name objects
Most used filters
References in objects and where an object is referenced
Strings search (including streams)
Physical structure (offsets)
Logical tree structure
Metadata
Modifications between versions (changelog)
Compressed objects (object streams)
Analysis and modification of Javascript (Spidermonkey): unescape, replace, join
Shellcode analysis (sctest wrapper, Libemu)
Variables (set command)
Extraction of old versions of the document
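
To make the "Decodings" item more concrete, here is a minimal Python sketch of the three encodings mentioned (hexadecimal strings, octal escapes and name objects) as the PDF format defines them. It is only an illustration and not peepdf's own code; the function names are made up for this example, and the octal helper ignores the other backslash escapes a real parser has to handle.

```python
import re


def decode_name(name: str) -> str:
    """Decode #xx hex escapes in a PDF name object, e.g. /J#61vaScript."""
    return re.sub(r"#([0-9A-Fa-f]{2})",
                  lambda m: chr(int(m.group(1), 16)), name)


def decode_hex_string(hex_string: str) -> bytes:
    """Decode a PDF hexadecimal string such as <4A53> (whitespace allowed)."""
    digits = re.sub(r"\s+", "", hex_string.strip("<>"))
    if len(digits) % 2:
        # With an odd number of digits, the last digit is assumed to be
        # followed by 0.
        digits += "0"
    return bytes.fromhex(digits)


def decode_octal_escapes(literal: str) -> str:
    """Decode \\ddd octal escapes found inside a PDF literal string."""
    return re.sub(r"\\([0-7]{1,3})",
                  lambda m: chr(int(m.group(1), 8) & 0xFF), literal)
```

For instance, decode_name("/J#61vaScript") gives back "/JavaScript", which is the kind of obfuscated name an analyst wants to see in clear text.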

Creation / Modification

Basic PDF creation
Creation of PDF with Javascript executed when the document is opened
Creation of object streams to compress objects
Embedded PDFs
Strings and names obfuscation
Malformed PDF output: without endobj, garbage in the header, bad header...
Filters modification
Objects modification
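
As an illustration of the "Strings and names obfuscation" item, the following Python sketch shows the general idea: every character of a name is rewritten as a #xx escape, and a string is rewritten as a hexadecimal string. This is an assumption about the technique in general, not peepdf's implementation; the function names are hypothetical and the input is assumed to be plain ASCII.

```python
def obfuscate_name(name: str) -> str:
    """Rewrite every character of a PDF name as a #xx hex escape."""
    body = name.lstrip("/")
    return "/" + "".join(f"#{ord(ch):02x}" for ch in body)


def obfuscate_string(text: str) -> str:
    """Rewrite an ASCII string as a PDF hexadecimal string (<...>)."""
    return "<" + text.encode("ascii").hex() + ">"
```

A reader that follows the PDF format treats /#4a#53 exactly like /JS, while a naive search for "/JS" in the raw file no longer matches.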

The available commands in this first version, related to the previous functionalities, are shown in the following help output:

PPDF> help

Documented commands (type help <topic>):
========================================
bytes           encrypt  js_analyse        modify     references    set  
changelog       errors   js_code           object     replace       show 
create          exit     js_join           offsets    reset         stream
decode          filters  js_unescape       open       save          tree 
embed           help     log               quit       save_version
encode          info     malformed_output  rawobject  sctest     
encode_strings  js       metadata          rawstream  search

Besides running it as an interactive console, it is also possible to run it in batch mode in order to perform automatic analysis. After creating a file with the commands we want to execute, we launch the tool this way:

./peepdf.py -s command_file.txt sample.pdf
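
If the batch analysis has to be launched from another Python script, something like the following sketch could be used. It assumes the command file contains one console command per line and that the chosen commands (tree, offsets and metadata, taken from the help output) work without arguments. The helper names are hypothetical and only the -s option comes from this post.

```python
import subprocess
from pathlib import Path


def render_command_file(commands):
    """Return the text of a command file, assuming one command per line."""
    return "\n".join(commands) + "\n"


def build_batch_command(script, pdf, peepdf="./peepdf.py"):
    """Build the argument list for: ./peepdf.py -s <script> <pdf>."""
    return [peepdf, "-s", str(script), str(pdf)]


def run_batch(commands, pdf, script="command_file.txt"):
    """Write the command file, run peepdf on it and return its output."""
    Path(script).write_text(render_command_file(commands))
    result = subprocess.run(build_batch_command(script, pdf),
                            capture_output=True, text=True)
    return result.stdout
```

Calling run_batch(["tree", "offsets", "metadata"], "sample.pdf") would then be equivalent to the command line shown above, provided peepdf.py is in the current directory.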

If we only want to see the information about objects, streams, vulnerabilities, etc., without executing any commands, the tool can be run this way:

./peepdf.py sample.pdf

In all these modes of execution it's possible to specify some parameters to ignore errors, continue with the analysis and deal with malformed objects:

-f: ignores errors and continues with the analysis of the document. It's useful to analyse malicious documents.
-l: does not search for the endobj tag during the parsing process, so it can be useful when the analysed document is malformed.
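
One possible way to use these two options from a script is to start with the strict parsing and only relax it when needed. The sketch below just builds the command lines for such a plan. The escalation order is an assumption made for this example, not a recommendation from the tool, and the function names are hypothetical.

```python
def peepdf_argv(pdf, force=False, loose=False, peepdf="./peepdf.py"):
    """Build a peepdf command line, optionally adding -f and/or -l."""
    argv = [peepdf]
    if force:
        argv.append("-f")  # ignore errors and continue with the analysis
    if loose:
        argv.append("-l")  # do not search for the endobj tag
    argv.append(pdf)
    return argv


def escalation_plan(pdf):
    """Command lines to try in order: strict, then -f, then -f and -l."""
    return [
        peepdf_argv(pdf),
        peepdf_argv(pdf, force=True),
        peepdf_argv(pdf, force=True, loose=True),
    ]
```

Each entry of escalation_plan("sample.pdf") can be passed to subprocess.run, stopping at the first attempt that produces a usable analysis.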

I will publish some analyses of malicious PDF files with peepdf to show how to use it. Meanwhile, if you are interested you can take a look at the webpage of the tool. All comments and bug reports are welcome! ;)

Submitted by jesparza on Thu, 2011/05/12 - 19:48

#1 Submitted by Anonymous (not verified) on Wed, 2011/05/25 - 22:45.

Great tool, it's all in one! But I have a question: how do I install Spidermonkey and Libemu to run the tool with all its features?

Thanks for your work!

#2 Submitted by jesparza on Thu, 2011/05/26 - 00:07.

Very easy!

Hi,

Thanks for the comments! ;) It's very easy to install these dependencies:

Spidermonkey

You can download the package from this URL and follow the instructions:

http://code.google.com/p/python-spidermonkey

Sctest (libemu)

It's not as easy as Spidermonkey, but it's not difficult. You can download the package from here:

http://sourceforge.net/projects/nepenthes/files/libemu%20development/

The next step is to compile it and copy the sctest executable (tools/sctest) to the peepdf directory.

With these steps you are done! :)

If you have any questions, just ask!

Regards! ;)

#3 Submitted by hamilton (not verified) on Wed, 2013/09/11 - 10:04.

Thanks for sharing this peepdf v0.1 tool! Being a business man, I was in search of a pdf editor tool for last few months and happy to land on this page. Really impressed by the features of the tool! You can modify the filters and objects in your document using this tool!