Tuesday, 4 October 2011

How can I ever trust a filename again?!

Recently there has been quite some discussion (e.g. by Brian Krebs) of malware using special Unicode characters to obfuscate the file type. The source of the problem is the Unicode character \u202E "Right-to-Left Override" (RLO), which changes the order in which characters are displayed. It is intended for languages that are read from right to left, but it can appear anywhere in a text to override how the text that follows is displayed. This is the example commonly used to describe the problem:
CORP_INVOICE_08.14.2011_Pr.phyl\u202Ecod.exe

which would display as:
CORP_INVOICE_08.14.2011_Pr.phylexe.doc

This example felt a bit unconvincing to me, since the actual file extension still appears in the displayed name, just before the dot. So, is the conclusion that you now need to pay attention to what comes before the dot? I decided to do some research of my own. The result is that I will never trust a filename again! What about you? Would you trust any of the following files?

Children.Of.Men-DVD‮iva.DIVX-RENEE‭.SCR
Chrome‮zg.rat.baT‭
Windows‮tnerrot.MOOD-noitidE.evituc‭.7.Exe

Another important, but less discussed, character is \u202D "Left-to-Right Override" (LRO), which has the opposite effect of RLO. Using combinations of RLO and LRO, we can switch back and forth between adding characters to the end or the beginning of the displayed string. As an example, the following obfuscated text

"\u202Et\u202Di\u202Eo\u202Dn\u202En\u202Dt\u202Er\u202De" (tionntre)

would display as

‮t‭i‮o‭n‮n‭t‮r‭e

With this technique you can completely obfuscate the file type by integrating the extension into what appears to be the file name (as can be seen in the file names above). This means you need to be wary whenever an executable file extension (or the reverse of one) appears next to a dot anywhere in the displayed file name.
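To make the nesting easier to follow, here is a minimal Python sketch of how such a string ends up on screen. It is a simplified model of my own, not the full Unicode Bidirectional Algorithm: it assumes a left-to-right base direction, handles only RLO and LRO, and treats every override as staying open until the end of the string. The function name `visual_order` is hypothetical.

```python
RLO = "\u202e"  # Right-to-Left Override
LRO = "\u202d"  # Left-to-Right Override


def visual_order(text):
    """Approximate the displayed order of text containing RLO/LRO.

    Simplified model (assumption): each override opens a nested run
    that lasts until the end of the string; the base run is LTR.
    """
    # Split the text into nested runs: (is_rtl, characters in that run).
    runs = [[False, ""]]
    for ch in text:
        if ch == RLO:
            runs.append([True, ""])
        elif ch == LRO:
            runs.append([False, ""])
        else:
            runs[-1][1] += ch

    # Render from the innermost run outwards.
    rendered = ""
    for is_rtl, chunk in reversed(runs):
        if is_rtl:
            # RTL run: nested content is shown first, own characters reversed.
            rendered = rendered + chunk[::-1]
        else:
            # LTR run: own characters first, then the nested content.
            rendered = chunk + rendered
    return rendered


print(visual_order("\u202Et\u202Di\u202Eo\u202Dn\u202En\u202Dt\u202Er\u202De"))
# internot
print(visual_order("CORP_INVOICE_08.14.2011_Pr.phyl\u202Ecod.exe"))
# CORP_INVOICE_08.14.2011_Pr.phylexe.doc
```

A real renderer also deals with neutral characters, digits, embedding limits and the other bidi controls, so treat this only as an aid for reading the examples.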

Here's a list of examples of what you need to look out for if they appear anywhere in the file name:
.bat or tab.
.com or moc.
.exe or exe.
.scr or rcs.
.pif or fip.
.jar or raj.
...

These are just the obvious examples. If we were to include any file extension which opens up in a vulnerable program, you'd quickly realize that just about any filename could be potentially harmful.
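The checks described above could be automated roughly as follows. This is only a sketch: the function names are hypothetical, the extension list is just the handful of obvious examples from the list, and the set of control characters is my own selection of Unicode directional formatting characters.

```python
# Unicode directional formatting characters (my selection, an assumption).
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # LRE, RLE, PDF, LRO, RLO
    "\u200e", "\u200f",                                # LRM, RLM
    "\u2066", "\u2067", "\u2068", "\u2069",            # LRI, RLI, FSI, PDI
}

# Only the obvious examples; a real list would be much longer.
EXECUTABLE_EXTENSIONS = ("bat", "com", "exe", "scr", "pif", "jar")


def has_bidi_control(name):
    """Return True if the raw file name contains a directional control."""
    return any(ch in BIDI_CONTROLS for ch in name)


def suspicious_fragments(name):
    """List '.ext' or reversed 'txe.' fragments found anywhere in the name."""
    lowered = name.lower()
    hits = []
    for ext in EXECUTABLE_EXTENSIONS:
        for pattern in ("." + ext, ext[::-1] + "."):
            if pattern in lowered and pattern not in hits:
                hits.append(pattern)
    return hits


name = "Chrome\u202ezg.rat.baT\u202d"
print(has_bidi_control(name), suspicious_fragments(name))
# True ['.bat']
```

Checking the raw name for control characters is the stronger test; the fragment search is there for cases where you only have the name as displayed.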


/internot

Wednesday, 4 May 2011

Invisible address, a security problem? Let me know your opinion!

Recently, while researching a completely different topic (which I will disclose in the near future), I came across a curious behaviour in Google Chrome. When visiting a URL longer than 32768 characters, the address bar displays only a fragment of the URL. What is displayed in the address bar depends on the protocol of the URL.


These are the behaviors for some different protocols:

http://victim.com/#aaaa...aaaa ==> victim.com

https://victim.com/#aaaa...aaaa ==> https://victim.com/

data:text/html,aaaa...aaaa ==> data:

view-source:http://victim.com/#aaaa...aaaa ==> view-source:


Even though the characters are not displayed in the address bar, they are correctly processed in the request. The issue is easily reproduced by creating a link longer than 32768 characters.
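A link of that length can be generated with a few lines of Python. This is only a sketch: victim.com is the placeholder host used in the list above, and the function name, the padding character and the default length are my own choices.

```python
LIMIT = 32768  # the length at which the address bar behaviour changed


def long_url(base="http://victim.com/", padding_char="a", total_length=LIMIT + 1):
    """Build a URL of exactly total_length characters by padding the fragment."""
    prefix = base + "#"
    return prefix + padding_char * (total_length - len(prefix))


url = long_url()
print(len(url))  # 32769

# Wrap it in a link that can be pasted into a test page.
html = '<a href="{}">Click me in Chrome</a>'.format(url)
```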


The number 32768 = 32*1024 = 2^15 suggests that this could be an overflow of a 16-bit signed integer, whose largest value is 32767. I haven't dug deeper into this, but that is my intuition.
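The arithmetic behind that intuition can be shown in Python. The helper `to_int16` is hypothetical and only demonstrates two's-complement wrap-around; it says nothing about how Chrome actually stores the length, which I have not verified.

```python
def to_int16(value):
    """Interpret value as a 16-bit signed integer with wrap-around."""
    return (value + 2**15) % 2**16 - 2**15


print(to_int16(32767))  # 32767, the largest value that still fits
print(to_int16(32768))  # -32768, one more wraps around to negative
```

If a length were kept in such a variable, any URL of 32768 characters or more would suddenly appear to have a negative length, which would fit the observed cut-off.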


The issue was reported to the Chromium issue tracker, but it is not considered a security issue and won't be fixed, at least not until someone comes up with a scenario where this is a major problem. So, what do you think? Is this a problem or not?

Friday, 1 April 2011

Further Optimizing Blind MySQL Injections

Just a short post, since 140 chars are sometimes not enough.

After reading websec's brilliant post on optimizing data retrieval from MySQL, I thought of further optimizations. The advantage of websec's "find_in_set" method is that the binary encoding of ASCII characters is effectively reduced from 7 bits to 1-6 bits. The reason is that the "find_in_set" function returns the 1-based index at which the character appears in a comma-separated set, meaning that the character 'd' in the set 'a,b,c,d' is encoded as 100 (index 4) instead of 1100100 (ASCII code 100).
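The saving is easy to check outside the database. The small Python function below only mimics the return value of MySQL's FIND_IN_SET (1-based position, 0 when the value is absent); it is an illustration of mine, not part of websec's technique.

```python
def find_in_set(needle, csv_set):
    """Mimic MySQL FIND_IN_SET: 1-based position in the set, 0 if absent."""
    items = csv_set.split(",")
    return items.index(needle) + 1 if needle in items else 0


index = find_in_set("d", "a,b,c,d")
print(index, format(index, "b"))        # 4 100
print(ord("d"), format(ord("d"), "b"))  # 100 1100100
```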

It is possible to optimize this further by ordering the characters in the set by frequency, so that the most commonly used characters are found with the fewest requests. Let's redo the example from websec's article using the letter frequency of the English alphabet as a guideline:

FIND_IN_SET(MID(table_name,1,1), 'e,t,a,o,i,n,s,h,r,d,l,c,u,m,w,f,g,y,p,b,v,k,j,x,q,z,_,0,1,2,3,4,5,6,7,8,9,$, ,[,],!,@,#,%,^,&,*,(,),-,+,=,\,,",\',~,`,|,{,},:,;')






String: character_set

Character:       c   h   a   r   a   c   t   e   r   _   s   e   t
Alphabetical:    3   8   1  18   1   3  20   5  18  37  19   5  20
By frequency:   12   8   3   9   3  12   2   1   9  27   7   1   2
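The numbers in the comparison can be reproduced with a short Python sketch. Two things here are assumptions of mine: the alphabetical set is taken to be a-z, then 0-9, then '_' (which is what the alphabetical values imply), and the bit length of each index is used as a rough proxy for the number of requests. Only the first 37 characters of the frequency-ordered set are included, since the example string needs no others.

```python
import string

# Assumed alphabetical set: a-z, 0-9, '_'.
ALPHABETICAL = list(string.ascii_lowercase + string.digits + "_")
# Start of the frequency-ordered set from the query above.
BY_FREQUENCY = list("etaoinshrdlcumwfgypbvkjxqz_" + string.digits)


def indices(text, ordering):
    """1-based position of every character, as FIND_IN_SET would return it."""
    return [ordering.index(ch) + 1 for ch in text]


def total_bits(text, ordering):
    """Sum of the bit lengths of all indices (rough cost estimate)."""
    return sum(i.bit_length() for i in indices(text, ordering))


word = "character_set"
print(indices(word, ALPHABETICAL))
print(indices(word, BY_FREQUENCY))
print(total_bits(word, ALPHABETICAL), total_bits(word, BY_FREQUENCY))  # 47 38
```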


Even this can be optimized further by using individual frequencies for each position in the string. Even though 'e' is the most common letter in English text, it is not the most common first letter of a word. Another optimization is to gather the character frequencies from the kind of text we are attempting to extract (in websec's example, the name of a table). Also, when working with table names we can exclude disallowed characters ('/', '.', '\'...); otherwise we need to include every character that can be expected in the string.
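A per-position set could be built along these lines. Everything in this sketch is hypothetical: the sample table names are made up for illustration, the function names are mine, and a real corpus would need to be far larger to give meaningful frequencies.

```python
import string
from collections import Counter

# Made-up sample of table names, standing in for a real corpus.
SAMPLE_TABLE_NAMES = ["users", "user_roles", "orders", "sessions", "settings", "products"]
# Characters we expect in a table name (assumption).
ALPHABET = string.ascii_lowercase + string.digits + "_"


def ordering_for_position(samples, position):
    """Characters seen at the given position, most frequent first."""
    counts = Counter(s[position] for s in samples if len(s) > position)
    return [ch for ch, _ in counts.most_common()]


def build_set(samples, position, alphabet=ALPHABET):
    """Comma-separated set for FIND_IN_SET, tuned to one string position."""
    seen = ordering_for_position(samples, position)
    # Characters never seen at this position go last, in alphabet order.
    rest = [ch for ch in alphabet if ch not in seen]
    return ",".join(seen + rest)


first_char_set = build_set(SAMPLE_TABLE_NAMES, 0)
print(first_char_set)
```

The resulting string would replace the fixed set in the query for MID(table_name,1,1), with a separately built set for each following position.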