Reversing an exploit?
(To potentially save you some time: I'm not showing you anything useful about the vulnerability in question here. I'm showing the binary analysis process, some pretty common binary analysis stuff, and just using this scenario as the story.)
Somewhere in the last several days, while at BSidesSF or RSAC, I saw a tweet (an X?) to the effect that someone had discovered a vulnerability in Microsoft's Windows telnet server. Had to do with reversing the check or challenge for the account, and yadda yadda, you could log in with regular user creds, but it would give you admin.
I have to yadda yadda that, because I can't find the tweet anymore. Pretty sure I had bookmarked it, but it's no longer there. This tweet:
https://x.com/hackerfantastic/status/1917645961188229156
If you look at it and scroll up the chain, it suggests that the original has been removed or retracted. Which would explain why it's gone. And yet, here it is on github still?
https://github.com/gavz/hfwintelnet
Maybe? Since I can't find the tweet, I can't tell if this is the original github copy, and so on. I've lost the thread by not jumping on it quickly enough.
All of which is ok for my purposes today. I don't care about the exploit per se, I care about the binary. And that's still sitting there in that github repo. I'm using the binary as an example of a scenario.
telnetbypass.exe
SHA256 3DC981041B81A98AF7E78F74A27B3B55E4EDB4DC6685370BDAF12CF3E5946CFA
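If you grab the file yourself, it's worth confirming you have the same bytes before comparing notes. A minimal sketch using Python's standard hashlib; the function names are just for illustration, and the expected value is the hash quoted above:

```python
import hashlib

# Hash quoted above for telnetbypass.exe.
EXPECTED_SHA256 = "3DC981041B81A98AF7E78F74A27B3B55E4EDB4DC6685370BDAF12CF3E5946CFA"

def sha256_of_file(path, chunk_size=1 << 20):
    """Return the uppercase hex SHA-256 of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()

def matches_expected(path, expected=EXPECTED_SHA256):
    """True if the file's SHA-256 equals the expected hex string (case-insensitive)."""
    return sha256_of_file(path) == expected.upper()
```

Calling matches_expected("telnetbypass.exe") on your downloaded copy should return True if it is the same file discussed here.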
Now, one would be likely to care about this binary BECAUSE of the exploit supposedly contained inside. Here's a partial list of reasons why one might find themselves wanting to look at this in a timely manner:
This researcher is trying to "hide" a hack in a compiled binary and This Shall Not Stand
I'm a vulnerability researcher, and I want to port the exploit to my language and framework of choice. Or I need to write detections for it as a defender.
I suspect someone is trying to trick people into running a binary because it's malicious. (Yes, tricking security people into running malware masquerading as an exploit has worked many times for many years.)
In this particular case, probably the most expedient path would actually be to set up a supposedly vulnerable Windows telnet server victim, run the exploit in a sacrificial VM client, and capture the network traffic. Telnet, being a cleartext protocol, ought to lend itself well to that style of analysis.
But I'm here to disassemble.
Load it up in IDA Pro. First thing you see:
Not trying to fully explain PDB files in this particular article, but just looking to see if there's anything worth noting in the build path that is being displayed here. Only thing I'm curious about is maybe the \usa\ portion of the path. My assumption is that has to do with US builds of Windows as opposed to other country versions. And the i386 suggests 32-bit code, which it is. But it generally seems to match what one would expect for an exploit as described. There have been occasions where authors have been leaked via this kind of metadata.
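You don't need IDA to see that build path. The PDB path lives in a CodeView debug record inside the file, and a rough way to pull it out is to scan for the record signature. This is a heuristic sketch, not a proper walk of the PE debug directory; the function name is made up for illustration, and the record layouts in the comment are the commonly documented RSDS and older NB10 forms:

```python
import re

# CodeView records as commonly documented:
#   "RSDS" + 16-byte GUID + 4-byte age + NUL-terminated PDB path
#   "NB10" + 4-byte offset + 4-byte timestamp + 4-byte age + NUL-terminated PDB path
_CODEVIEW = re.compile(
    rb"(?:RSDS.{20}|NB10.{12})([\x20-\x7e]{4,260})\x00",
    re.DOTALL,  # the GUID/age bytes may contain newlines
)

def find_pdb_paths(data):
    """Return every printable PDB path that follows a CodeView signature in `data`."""
    return [match.group(1).decode("ascii") for match in _CODEVIEW.finditer(data)]
```

Feed it the raw file bytes, for example find_pdb_paths(open("telnetbypass.exe", "rb").read()). Because it is a blind scan, it can produce false positives on data that merely happens to contain "RSDS" or "NB10".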
Here is my biased expectation going in if it's just an exploit. Either I will be dealing with some packing, obfuscation, or other trickery since it is stated up front that something sensitive is being hidden in the form of an executable. Or, if they didn't bother to play binary games, I expect an exploit to generally be not a ton of code. A main and maybe some small number of subs. Often they're the minimum necessary to prove the concept.
Here's what we got:
32-bit code, appears to be well-formed, there's the start point.. and a large number of subroutines. The call graph looks pretty reasonable, and I see lots of pink, which means that our imports are probably all intact. More on that in a sec.
Speculation at this point: Maybe the Windows authentication stuff (which is on my list of things I least like to reverse engineer) has added a bunch of complexity, and therefore size, and therefore reverse engineering work.
Continuing, the first chunk of start code:
If you get used to reversing a lot of Windows code, especially if a lot of it is malware or binaries that are packed, obfuscated, and so on, then certain things jump out at you. In this case, I mean the 5A4Dh and 4550h hex numbers. Here, let me mark those as strings:
They now say ZM and EP, aka MZ and PE in Intel's little endian order. This means that the very first thing this binary is doing is parsing out a PE file header. That's weird. That's suspicious.
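The byte-order flip is easy to check outside the disassembler. A small sketch with Python's struct module (the helper names are just for illustration): packing the 16-bit constant little-endian gives the bytes as they sit in memory, while reading the hex digits left to right gives the reversed pair that shows up when the constant is rendered as a string.

```python
import struct

def word_in_memory(value):
    """Bytes of a 16-bit value as x86 stores them (little-endian), decoded as ASCII."""
    return struct.pack("<H", value).decode("ascii")

def word_as_written(value):
    """Characters in the order the hex digits are written (big-endian reading)."""
    return struct.pack(">H", value).decode("ascii")

print(word_as_written(0x5A4D), word_in_memory(0x5A4D))  # ZM MZ
print(word_as_written(0x4550), word_in_memory(0x4550))  # EP PE
```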
Side note: I am pretty much always immediately suspicious when I'm looking at a binary. That's because I self-select my files from a population of suspicious contexts, because I am looking for something bad, and go in predisposed to find it. And oh look, a lot of the time I'm right. I am aware on some level that I live inside this bias feedback loop. Moving on.
Let me take a little time to explain WHY this is suspicious. As I said, I spend a lot of time reversing malware as well as what I will broadly call "shellcode". For purposes of this discussion, I'll define "shellcode" as code segments that have arrived in some way lacking the usual executable file format, such as a PE file on Windows.
This might be something shoved into a buffer intended to exploit a buffer overflow, which is where we get the "shellcode" term from. And I'm also using it to mean code chunks that have been maybe decrypted, decoded, downloaded, etc. into memory. Under these circumstances, these chunks of code often don't know in advance what memory address they will be running at, and don't have the addresses for system and library functions they want to call to perform their job. Those things are normally all handled by the OS loader function when loading an executable file format such as PE.
So when a chunk of code finds itself executing in an unknown context, it has to figure out the information it needs. Usually, that's done by locating kernel32.dll in memory (which is always mapped into every running userland process in Windows), and finding the LoadLibrary functions, then the GetProcAddress function, and bootstrapping the rest from there. You will also often see things locating VirtualAlloc and VirtualProtect to make itself a chunk of memory to shove shellcode into, which then proceeds as described.
Every PE image (EXE or DLL) that gets loaded into Windows memory normally keeps these MZ and PE headers around in memory, which can then be parsed to find the locations of functions in them.
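The first step of that header parsing is the same whether it's done by shellcode, a loader, or the start code in this binary: check for MZ at offset 0, follow the e_lfanew field at offset 0x3C to the PE signature, and read on from there. A minimal sketch over a byte buffer, using offsets from the published PE format; the function name and the returned dictionary are my own choices, and the later step of walking the export table to resolve function addresses is left out:

```python
import struct

IMAGE_DOS_SIGNATURE = 0x5A4D       # "MZ" when read little-endian
IMAGE_NT_SIGNATURE = 0x00004550    # "PE\0\0" when read little-endian
E_LFANEW_OFFSET = 0x3C             # DOS header field holding the PE header offset

def parse_pe_headers(image):
    """Validate the MZ and PE signatures and return a few COFF header fields."""
    if len(image) < 0x40:
        raise ValueError("too short to hold a DOS header")
    (e_magic,) = struct.unpack_from("<H", image, 0)
    if e_magic != IMAGE_DOS_SIGNATURE:
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", image, E_LFANEW_OFFSET)
    if e_lfanew + 8 > len(image):
        raise ValueError("e_lfanew points outside the buffer")
    # Signature (4 bytes), then COFF Machine (2) and NumberOfSections (2).
    signature, machine, sections = struct.unpack_from("<IHH", image, e_lfanew)
    if signature != IMAGE_NT_SIGNATURE:
        raise ValueError("missing PE signature")
    return {"e_lfanew": e_lfanew, "machine": machine, "sections": sections}
```

On a 32-bit x86 binary like this one, the machine field should come back as 0x14C (IMAGE_FILE_MACHINE_I386).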
I haven't fully explained the process for this bootstrapping here yet. And I should, because I want to understand it better myself, too. But suffice it to say that when I see this, my brain immediately says "Hm, games are being played."
Note that I do not know in this case that anything untoward is actually taking place here, I'm writing this up more-or-less real time as I do this. But I've got my eye on it.
I said all that because I wanted to cover the MZ/PE thing. That said, having just spent 5 or 10 minutes looking at this function, I now think it's probably an initializer for the C RunTime (CRT). IDA will often identify library components like this for you and then drop you at WinMain instead, but it didn't in this case. Inside this start function I find the call that I believe is actually WinMain. That's actually a clue: if IDA drops you at something it has labelled start, it's starting with the program entry point, and it couldn't automatically identify the CRT. If it had, it would usually drop you at something it would label main or WinMain.
At this point in the exercise, I spend 10-20 minutes looking around the rest of the binary. I'm seeing a lot of functions that have to do with command options, keypress mapping, terminal types.. Oh, this is a whole telnet client.
At this point I leave it alone for a day or so, until the shower thoughts hit. This makes sense.. the exploit supposedly logs in via telnet as an admin user. You have to implement some portion of a telnet client. And unless it's just a hard-coded exploit that drops c:\pwned.lol (Why did LinkedIn turn that into a link? There's a .lol TLD? Dang, pwned.lol is taken.) then it has to give you a way to implement an interactive shell, right? So whole telnet client. There's probably no great way to connect via telnet, auth, and then hand the IO off to a different telnet client.
Well, but wait. You not only need to implement a telnet client, you need to implement one that knows how to speak whatever weird proprietary auth Microsoft added to telnet that enables this vulnerability in the first place. There's probably just one telnet client codebase that implements this: the one that Microsoft ships with Windows. I'm thinking maybe the exploit author got ahold of the leaked Windows source, and modified it, and compiled... no, wait. I think to look at the telnetbypass.exe properties:
Ok, that IS the Windows telnet client. As in, the binary. Meaning, this exploit author patched the Microsoft binary to do the job. It's a valid technique, I've done it myself. But it does put a certain spin on the claim as to why source code wasn't released.
Looking just slightly further into that, I look up what version of Windows generally goes with version 5.2.3790.0. This page:
https://en.wikipedia.org/wiki/List_of_Microsoft_Windows_versions
It says Windows XP Pro x64, from April 25 2005. (And it's also the x64 version of Windows Server 2003. XP x64 is just Server 2003 with an XP skin.) And then of course I want to see if I can find the matching unmodified binary. To avoid grabbing and unpacking many XP x64 ISOs looking for it, I used Discmaster to look up some:
And I find this one, which is pretty close:
Manually scanning the two binaries in IDA, I can see that, yes, I'm looking at the same program.
I also discovered that BinDiff doesn't work correctly anymore with IDA version 9.x, and the effort to make something work there easily exceeds my curiosity level in this instance.
If I were doing a task where it was super important to me that I know precisely what was changed in the binary, then I would hunt harder for the exact matching version, and then I could just compare bytes to find any changes. If you didn't notice it, the versions in the two above are 5.2.3790.0 for telnetbypass.exe versus 5.2.3790.1830 for the telnet.exe I found.
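With an exact-match original in hand, the byte comparison is only a few lines. A sketch, assuming the patch was made in place so both files are the same length; the function name is hypothetical, and this was not run against the binaries discussed here:

```python
def diff_ranges(original, patched):
    """Return half-open (start, end) offset ranges where two equal-length buffers differ."""
    if len(original) != len(patched):
        raise ValueError("buffers differ in length; an in-place patch was assumed")
    ranges = []
    start = None
    for offset, (a, b) in enumerate(zip(original, patched)):
        if a != b:
            if start is None:
                start = offset          # a run of differing bytes begins here
        elif start is not None:
            ranges.append((start, offset))  # the run ended at the previous byte
            start = None
    if start is not None:
        ranges.append((start, len(original)))  # run extends to end of buffer
    return ranges

print(diff_ranges(b"abcdefgh", b"abXXefgZ"))  # [(2, 4), (7, 8)]
```

The ranges are file offsets, so they still need to be mapped through the section table to get the virtual addresses you would jump to in IDA.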
Of course, those could be lies. Those fields are easily modified in unsigned binaries like these. For that matter, telnetbypass.exe could just be a completely unmodified copy of a random Windows telnet.exe client, designed to waste our time.
And no, I still don't know why the CRT for a legit Windows binary is looking at the PE headers like that.