This is the sixth part of the Flare-On 6 CTF WriteUp Series.
6 - bmphide
The challenge reads
Tyler Dean hiked up Mt. Elbert (Colorado's tallest mountain) at 2am to capture this picture at the perfect time. Never skip leg day. We found this picture and executable on a thumb drive he left at the trail head. Can he be trusted?
We have a .NET executable named bmphide.exe and an associated BMP image, which looks like Figure 1. Running the binary, it quits with an IndexOutOfRangeException.
Let us load the binary in dnSpy and check the decompiled code.
In Main we can see that it accesses args[0], args[1] and args[2]. The IndexOutOfRangeException occurs because we hadn't provided any arguments. These three arguments are file paths. Among them, args[0] and args[2] are paths to BMP files, the former being an input file and the latter an output file. We can infer this from the use of the Bitmap class on the paths. The other argument, args[1], can be any existing file.
To test our understanding, let's try it on a sample BMP shown in Figure 4.
D:\> echo ABCDEFGH123456 > test.txt
D:\> dir /b *.txt *.bmp
test.txt
input.bmp
D:\> bmphide.exe input.bmp test.txt output.bmp
D:\> certutil.exe -hashfile input.bmp MD5
MD5 hash of file input.bmp:
ff 51 29 a5 26 01 6c 1f 5c c8 b5 63 8c f1 8a 88
CertUtil: -hashfile command completed successfully.
D:\> certutil.exe -hashfile output.bmp MD5
MD5 hash of file output.bmp:
db 8c 13 1c 33 cf f6 df ad 87 6a 43 d9 f1 d0 41
CertUtil: -hashfile command completed successfully.
output.bmp is visually identical to input.bmp although their MD5 hashes differ. This is a typical steganographic challenge where we need to recover the hidden information encoded in the image. Let's resume our analysis from where we left off.
From Main we have a call to Init, which looks like Figure 5. It loops over all the methods in the class A, calling RuntimeHelpers.PrepareMethod on each of them.
The CLR (Common Language Runtime) manages the execution of .NET programs. Unlike typical native executables, .NET binaries don't contain machine code. Instead, they contain a platform-independent representation of the instructions (bytecode in Java parlance) known as CIL (Common Intermediate Language), which is understood by the CLR. At runtime the CLR converts the CIL to machine code for the processor it is running on. This process is called JIT (Just In Time) compilation. For performance reasons, the CLR does not convert the entire CIL to machine code in one go. Instead it does so on demand, as more and more parts of the program are executed for the first time.
The PrepareMethod function provides a way to force JITing a method before it's executed. In normal programs calling this method is not necessary and it's best to leave the decision of when to JIT to the CLR. However, obfuscators, code protectors and malware may rely on it for their work.
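As a minimal sketch of what forcing the JIT looks like, the following walks the methods declared on a type and calls RuntimeHelpers.PrepareMethod on each of them. The PrepareDemo class and its Add and Twice methods are hypothetical stand-ins and are not taken from bmphide.exe.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

static class PrepareDemo
{
    // Hypothetical methods standing in for the members of class A.
    static int Add(int x, int y) { return x + y; }
    static int Twice(int x) { return x * 2; }

    // Force JIT compilation of every method declared directly on the type.
    // Returns how many methods were prepared.
    public static int PrepareAll(Type type)
    {
        const BindingFlags flags = BindingFlags.DeclaredOnly | BindingFlags.Static |
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;
        int count = 0;
        foreach (MethodInfo method in type.GetMethods(flags))
        {
            RuntimeHelpers.PrepareMethod(method.MethodHandle);
            count++;
        }
        return count;
    }

    public static void Main()
    {
        Console.WriteLine(PrepareAll(typeof(PrepareDemo)));
    }
}
```

With a compileMethod hook installed, a loop like this would push each of those methods through the hook right away instead of waiting for their first call.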
Next, from Init we have a call to CalculateStack. This method identifies the framework version on which it is running by querying the Environment.Version property.
The IdentifyLocals method does something interesting.
Based on the framework version, it loads a library and obtains the address of an exported function, which is then converted to a Delegate. Much of this code has been taken from the open source .NET protector ConfuserEx. The above snippet can be found in the AntiTamperJIT class. We can refer to that for better understanding.
Depending on the version of .NET, it loads clrjit.dll or mscorjit.dll and calls the exported function getJit as shown in Figure 8.
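The version check can be pictured roughly as below. This is only a sketch: the JitLibrary class and NameFor method are hypothetical names, and the mapping (clrjit.dll for CLR 4 and later, mscorjit.dll for older runtimes) is an assumption based on how the .NET Framework names its JIT library, not a copy of the decompiled code.

```csharp
using System;

static class JitLibrary
{
    // Assumed mapping: CLR 4 and later ship the JIT as clrjit.dll,
    // older CLRs (2.0 / 3.5) ship it as mscorjit.dll.
    public static string NameFor(Version clr)
    {
        return clr.Major >= 4 ? "clrjit.dll" : "mscorjit.dll";
    }

    public static void Main()
    {
        // Environment.Version reports the version of the running CLR.
        Console.WriteLine(NameFor(Environment.Version));
    }
}
```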
getJit returns a pointer to the ICorJitCompiler interface, as we can see from the coreclr source.
The returned pointer points to the vtable of the class (CILJit) implementing the interface. The first member in the vtable is the address of the compileMethod function. This function is called by the Execution Engine to convert the IL code of a method to native code.
Back in IdentifyLocals, it overwrites the vtable entry of compileMethod with the address of IncrementMaxStack. Thus, from now on any call to the former will call the latter instead. The original compileMethod function is saved in originalDelegate.
IncrementMaxStack is thus the compileMethod hook.
The hook will be called by the runtime for every method when JITing. The code above checks whether the MetadataToken of the method to JIT equals 100663317 or 100663316. If so, it proceeds to make some modifications to the IL. Finally, it calls the original compileMethod on the modified (or unmodified) IL. This is a clever way to modify the functionality of a program at run-time. Let's go ahead and make the necessary changes to the IL of the corresponding methods so we can do away with the hook.
Undoing the compileMethod hook
100663317 (0x06000015) is the token for method Program.h as shown in Figure 13.
The compileMethod hook writes the value 20 at offsets 23 (0x17) and 62 (0x3e). Let's check the offsets in the IL editor.
Both of the offsets point within a call instruction. The ECMA-335 specification says this instruction takes a single operand: the metadata token of the method to call.
Originally it calls Program.f, which has the token 0x06000013. Writing 20 (0x14) changes the token to 0x06000014, which is for Program.g (the operands are stored in little endian byte order, so the byte written is the lowest byte of the token). Let's go ahead and change the target of the call to Program.g in the IL editor. Our decompiled code of Program.h now looks like Figure 17.
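The arithmetic behind this patch can be checked with a few lines of C#. This is an illustrative sketch, not code from the binary: CallPatch and PatchLowByte are hypothetical names, and the method only models the single-byte write into a little endian token.

```csharp
using System;

static class CallPatch
{
    // Overwrite the lowest byte of a metadata token stored in little endian
    // order, which is what a one-byte write at the operand's first offset does.
    public static uint PatchLowByte(uint token, byte value)
    {
        byte[] operand =
        {
            (byte)token, (byte)(token >> 8), (byte)(token >> 16), (byte)(token >> 24)
        };
        operand[0] = value;
        return (uint)(operand[0] | operand[1] << 8 | operand[2] << 16 | operand[3] << 24);
    }

    public static void Main()
    {
        Console.WriteLine("0x{0:X8}", 100663317);                    // 0x06000015, Program.h
        Console.WriteLine("0x{0:X8}", PatchLowByte(0x06000013, 20)); // 0x06000014, Program.g
    }
}
```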
The second change made by the hook is in method Program.g, which has the token 100663316 (0x06000014).
At offsets 6 and 18 (0x12) it writes the Int32 values 309030853 and 209897853 respectively (Figure 12). At both of these offsets we find an ldc.i4 instruction.
As per the ECMA-335 standard, this instruction pushes an int32 numeric constant onto the stack.
By writing to offsets 6 and 18, the hook is thus modifying the operand of the ldc.i4 instruction. In the same way as we did before, we can make the changes in the IL editor to make the decompiled code look like Figure 21.
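To see what such a write does to the bytes, here is a small sketch that stores a 32-bit constant into a buffer in little endian order. The 10-byte buffer is hypothetical filler, not the real body of Program.g; only the ldc.i4 opcode (0x20) and the constant come from the analysis above.

```csharp
using System;

static class LdcPatch
{
    // Write a 32-bit constant into an IL buffer at the given offset, little endian.
    public static void WriteInt32(byte[] il, int offset, int value)
    {
        for (int i = 0; i < 4; i++)
            il[offset + i] = (byte)(value >> (8 * i));
    }

    public static void Main()
    {
        // Hypothetical buffer: ldc.i4 (opcode 0x20) at offset 5, so its
        // 4-byte operand starts at offset 6. The rest is nop (0x00) filler.
        byte[] il = new byte[10];
        il[5] = 0x20;
        WriteInt32(il, 6, 309030853);
        Console.WriteLine(BitConverter.ToString(il, 5, 5)); // 20-C5-6F-6B-12
    }
}
```

In hex, 309030853 is 0x126B6FC5 and 209897853 is 0x0C82C97D, which is how the operands appear in the IL editor once patched.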
The second "trick"
Let's go back to Init, from where we started. After the call to CalculateStack (Figure 5) the code loops over all the methods in Program. It obtains the method body as a byte array and calls d.a on it.
The d.a method calculates a hash value of the byte array passed to it as shown in Figure 23.
The returned hash value is then compared to 3472577156, 2689456752, 3040029055 and 2663056498. So all in all it's searching for 4 methods with the said hashes. Finally it calls VerifySignature.
VerifySignature forces the methods to be JIT compiled by calling RuntimeHelpers.PrepareMethod on the method handles.
The last three lines change the method pointer of m1 to point to m2. As a result, calling m1 from now on will call m2 instead. For details check this blog post.
At this point we don't yet know the two methods whose pointers are changed. Let's debug the binary in dnSpy as shown in Figure 27.
However it immediately crashes with a stack overflow exception.
The exception is caused by the compileMethod hook. To debug successfully, we need to remove the hook, which can be done by bypassing the call to A.CalculateStack in Init. Set a breakpoint on that line before debugging. When the breakpoint hits, right click on the next line and use "Set Next Statement" to bypass the call.
We can set another breakpoint on the call to A.VerifySignature. This will help us find out the arguments passed to the method.
m, m2, m3, m4 are methods a, b, c and d respectively. The method pointer of a is replaced with that of b, and c is replaced with d.
We can make all of the above changes in the same way using the IL editor, by changing the call target to b and d whenever we spot a call to a and c respectively. There are only two places where we need to make the change and both of them are in the method h.
The calls to a and c on lines 12 and 15 need to be changed to b and d respectively. That's all of the tricks in the binary. Let's analyze the main logic.
Reversing the encoding logic
From Main we have a call to method i, which takes a bitmap and a byte array as parameters. This method encodes the data within the given bitmap, hiding the data in the red, green and blue color values of the pixels. The logic is slightly obfuscated, with the constants replaced by a call to Program.j which calculates them at runtime. We can easily get back the original constants by debugging the binary. Ensure that the binary is free from the compileMethod hook and that all the IL modification "tricks" are correctly applied.
Set a breakpoint on the first line in the method Program.i and start debugging with the necessary command line parameters (input.bmp test.txt output.bmp). Once the breakpoint hits, we can go to the watch window and enter the expressions whose value we want to calculate as in Figure 33.
Our updated Program.i method, renamed encode below, now looks like
public static void encode(Bitmap bm, byte[] data)
{
    int num = 0;
    for (int i = 0; i < bm.Width; i++)
    {
        for (int j = 0; j < bm.Height; j++)
        {
            if (num > data.Length - 1)
            {
                break;
            }
            Color pixel = bm.GetPixel(i, j);
            int red = (pixel.R & 248) | (data[num] & 7);
            int green = (pixel.G & 248) | (data[num] >> 3 & 7);
            int blue = (pixel.B & 252) | (data[num] >> 6 & 3);
            Color color = Color.FromArgb(0, red, green, blue);
            bm.SetPixel(i, j, color);
            num += 1;
        }
    }
}
Studying the code, we can understand how it encodes the data in the pixels. It encodes the information in the least significant bits of the RGB values so that the change is not visible to the naked eye. Each pixel carries one byte of data, which comprises 8 bits. The lowest 3 bits (bits 0-2) go into red, the next 3 (bits 3-5) into green and the remaining 2 (bits 6-7) into blue.
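The 3 + 3 + 2 split can be sanity-checked on a single pixel. The sketch below reuses the masks from the encoder; the BitLayout class, its method names and the sample pixel values (200, 100, 50) are made up for illustration.

```csharp
using System;

static class BitLayout
{
    // Hide one data byte in the low bits of an (r, g, b) triple: 3 + 3 + 2 bits.
    public static byte[] Embed(byte r, byte g, byte b, byte data)
    {
        return new[]
        {
            (byte)((r & 248) | (data & 7)),
            (byte)((g & 248) | ((data >> 3) & 7)),
            (byte)((b & 252) | ((data >> 6) & 3)),
        };
    }

    // Reassemble the data byte from the low bits of the three channels.
    public static byte Extract(byte r, byte g, byte b)
    {
        return (byte)(((b & 3) << 6) | ((g & 7) << 3) | (r & 7));
    }

    public static void Main()
    {
        byte[] px = Embed(200, 100, 50, (byte)'A'); // 'A' = 0x41 = 01 000 001
        Console.WriteLine("{0} {1} {2}", px[0], px[1], px[2]); // 201 96 49
        Console.WriteLine((char)Extract(px[0], px[1], px[2])); // A
    }
}
```

Each channel moves by at most 7 (red, green) or 3 (blue) out of 255, which is why the output image looks unchanged.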
We have been provided with a bitmap image.bmp. We can recover the encoded data stored in it using the following cs-script.
//decodeimage.cs
using System;
using System.Drawing;
using System.IO;

class decodeimage
{
    public static void Main()
    {
        string image_file = "image.bmp";
        Bitmap bmp = new Bitmap(image_file);
        byte[] ct = new byte[bmp.Width * bmp.Height];
        int num = 0;
        for (int i = 0; i < bmp.Width; i++)
        {
            for (int j = 0; j < bmp.Height; j++)
            {
                Color pixel = bmp.GetPixel(i, j);
                byte bits_2_0 = (byte)(pixel.R & 7);
                byte bits_5_3 = (byte)(pixel.G & 7);
                byte bits_7_6 = (byte)(pixel.B & 3);
                ct[num] = (byte)((bits_7_6 << 6) | (bits_5_3 << 3) | bits_2_0);
                num++;
            }
        }
        File.WriteAllBytes("ct.bin", ct);
    }
}
The recovered data doesn't represent anything meaningful as it's encrypted as shown in Figure 34.
Reversing the encryption
To get back the original data we need to decrypt the recovered data. One way to do that is to study the encryption logic and develop a decrypter. However, we will take a shortcut and brute-force the original data instead. In other words, for each encrypted byte we will encrypt all possible values that a byte can take (256 values), reusing the logic from the decompiled code. If the encrypted value matches, then our byte is correct.
//decrypt.cs
using System;
using System.Drawing;
using System.IO;

class decrypt {
    public static byte b(byte b, int r) {/*snip*/}
    public static byte d(byte b, int r) {/*snip*/}
    public static byte e(byte b, byte k) {/*snip*/}
    public static byte g(int idx) {/*snip*/}

    public static void Main() {
        byte[] ct = File.ReadAllBytes("ct.bin");
        byte[] pt = new byte[ct.Length];
        int num_backup = 0;
        for (int i = 0; i < ct.Length; i++) {
            // xxx is an int: a byte counter would wrap from 255 back to 0
            // and the loop would never terminate if no candidate matched.
            for (int xxx = 0; xxx <= 255; xxx++) {
                // Encrypt the candidate exactly as the binary does.
                int num = num_backup;
                int num2 = g(num++);
                int num3 = xxx;
                num3 = e((byte)num3, (byte)num2);
                num3 = b((byte)num3, 7);
                int num4 = g(num++);
                num3 = e((byte)num3, (byte)num4);
                num3 = d((byte)num3, 3);
                if (ct[i] == (byte)num3) {
                    // Candidate matches: keep it and advance the key index.
                    pt[i] = (byte)xxx;
                    num_backup = num;
                    break;
                }
            }
        }
        File.WriteAllBytes("pt.bin", pt);
    }
}
To keep it readable, I have not shown the entire code above. Running our decrypter we get a file with a BMP header which looks like Figure 35.
Since the encoded data is a bitmap, it's possible that some data has been hidden inside it as well. In the same way as we did earlier, let's recover the encoded data in it.
This time we get another BMP containing the flag.