Analysis of CVE-2021-40444

In September 2021, a widely publicized 0-day vulnerability was reported across many blogs and forums: CVE-2021-40444, an MSHTML remote code execution vulnerability whose attack chain uses Microsoft Word as the victim process that initiates the attack. Public information about 0-day exploits tends to be vague, so I decided to dig into this CVE myself. On 14 September 2021, Microsoft released a patch for it. Since I am running Windows 10 version 20H2 for x64-based systems, I downloaded the matching update, KB5005565, and the update immediately preceding it, KB5005101. Both can be found on the Microsoft Update Catalog.

I installed the latest update and found that two of the modified system files looked relevant: mshtml.dll and urlmon.dll. I extracted both, then installed the immediately preceding update and extracted the same two files. I disassembled the binaries in IDA Pro, computed the binary difference with the BinDiff plugin, and went through the matched and unmatched functions to find out exactly what the patch changed.

In the mshtml.dll diff, I found a few functions with lower similarity. I quickly suspected ShellExecURL(IUri *), because in CVE-2021-40444 Microsoft Word renders an HTML page through mshtml.dll, and this function looked relevant.

I looked at the control flow graph of this function and found that the patched mshtml.dll adds some extra conditional jumps.

I decompiled the code to understand the patch. Microsoft added two new functions, GetSchemeName and IsValidSchemeName. IsValidSchemeName decides whether a URI scheme is valid. First, it checks the first character of the scheme: if that character is not an ASCII letter, for example a “.”, the scheme is invalid. Second, it walks the remaining characters one by one and invalidates the scheme if it finds a character that is not an ASCII letter or digit. The second check did not make much sense to me at first, so I compiled the decompiled code in Visual Studio and dry-ran the URI from the malware sample through it.

That URI is rejected for more than one reason. Its first character is a “.”, which invalidates the scheme immediately. Suppose the attackers had used control.exe instead of .cpl: the first character would now be an ASCII letter, so the first check would pass. The URI would still be rejected, because the later loop inspects every character and a “.” is still present (the original sample executes an .inf file), so the loop exits early and the scheme is invalidated again.
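
Here is a minimal Python sketch of the check as described above. The function names mirror GetSchemeName and IsValidSchemeName, but the bodies are a reconstruction from this description, not Microsoft's code, and the URIs in the usage lines are illustrative placeholders rather than the real sample. Note that RFC 3986 also allows “+”, “-” and “.” after the first character of a scheme; the sketch follows the stricter behavior described in this post, which is an assumption.

```python
import string

ASCII_LETTERS = set(string.ascii_letters)
ASCII_ALNUM = set(string.ascii_letters + string.digits)


def get_scheme_name(uri: str) -> str:
    """Return the text before the first ':' (hypothetical stand-in for GetSchemeName)."""
    scheme, sep, _rest = uri.partition(":")
    return scheme if sep else ""


def is_valid_scheme_name(scheme: str) -> bool:
    """Model of the two checks described above (assumed, not the decompiled code)."""
    # Check 1: the first character must be an ASCII letter.
    if not scheme or scheme[0] not in ASCII_LETTERS:
        return False
    # Check 2: every remaining character must be an ASCII letter or digit.
    for ch in scheme[1:]:
        if ch not in ASCII_ALNUM:
            return False
    return True


# Illustrative placeholder URIs, not the real malware sample.
for uri in (".cpl:../../example.inf", "control.exe:example.inf", "http://example.com"):
    print(uri, is_valid_scheme_name(get_scheme_name(uri)))
```

The first URI fails the first-character check, the second passes it but fails inside the loop because of the dot, and the third is accepted.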

In the urlmon.dll diff, I found something else interesting. Only a single function had changed, CatDirAndFile; every other function in the binary was a 100% match. I viewed its control flow graph and found many new conditional jumps in the assembly.

The patched function's CFG contains many additional conditional jumps, which suggested that I had found the relevant change. To understand what had changed, I decompiled the function in IDA Pro. The patch adds an extra do-while loop that simply replaces every forward slash “/” with a backslash “\” in the path.
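
The added loop can be sketched in a few lines of Python. This is only a model of the behavior described above; the function name backslash_path is hypothetical, and the real code works in place on a C string buffer.

```python
def backslash_path(path: str) -> str:
    """Replace every forward slash with a backslash, one character at a time."""
    chars = list(path)
    i = 0
    # Walk the buffer character by character, as the decompiled loop appears to do.
    while i < len(chars):
        if chars[i] == "/":
            chars[i] = "\\"
        i += 1
    return "".join(chars)


print(backslash_path("../example/file.inf"))
```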

How did this simple conversion of forward slashes to backslashes patch a 0-day vulnerability? To understand this, I looked at the original malware sample and the public PoCs for CVE-2021-40444. In the original sample, Microsoft Word renders an HTML file that loads JavaScript, which downloads a .cab file to a temporary location on the system using urlmon.dll. The .cab file extracts an .inf file containing the attacker's arbitrary code to a fixed path, /AppData/Local/Temp/. This extracted .inf file is then executed by mshtml.dll using the .cpl scheme, as shown in the original sample's JavaScript file.

So what Microsoft patched is a directory traversal vulnerability: by using “../”, an attacker can traverse to a parent directory, and urlmon.dll will extract the .inf file to the path the attacker specifies. Once forward slashes are converted to backslashes, this traversal is no longer possible, so no file is extracted to the attacker-chosen folder for ShellExecURL(IUri *) in mshtml.dll to execute. In addition, the mshtml.dll change described above adds a second layer of defense: it invalidates any URI scheme that contains a “.”, so neither a URI that starts with a dot, as in .cpl, nor one that carries a dotted file extension, as in .inf, can be executed by ShellExecURL(IUri *).
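
To make the traversal concrete, here is a small Python sketch using the standard ntpath module, which applies Windows path rules on any platform. The directory and file names are hypothetical. The contains_dotdot_backslash check is an assumption for illustration: one plausible reading of the fix is that an existing check only recognized “..\”, so normalizing the slashes first lets it catch “../” too. I have not confirmed that this is how CatDirAndFile is structured.

```python
import ntpath


def resolve(base_dir: str, member_name: str) -> str:
    """Join a cabinet member name onto the extraction directory, Windows-style."""
    return ntpath.normpath(ntpath.join(base_dir, member_name))


def contains_dotdot_backslash(name: str) -> bool:
    """Hypothetical traversal check that only knows the backslash form."""
    return "..\\" in name


def is_rejected(name: str, normalize: bool) -> bool:
    """Apply the hypothetical check, optionally after slash normalization."""
    if normalize:
        name = name.replace("/", "\\")
    return contains_dotdot_backslash(name)


# A "../" member name escapes the intended directory (names are placeholders).
print(resolve("C:\\Temp\\Cab1", "../evil.inf"))

# Without normalization the forward-slash form is missed; with it, it is caught.
print(is_rejected("../evil.inf", normalize=False))
print(is_rejected("../evil.inf", normalize=True))
```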

To summarize: Microsoft had a 0-day vulnerability that was weaponized into an attack delivered through Microsoft Word. When the victim opens the document, an ActiveX object in the document refers to an HTML page. That page, rendered by mshtml.dll, contains JavaScript that downloads a .cab file to a temporary location using urlmon.dll. urlmon.dll then extracts the .cab file to a fixed path by traversing parent directories. The extracted file is a driver file, .inf, which contains malicious code. mshtml.dll then executes the .inf file through ShellExecURL using the .cpl scheme; in the patched version, this function now checks whether the URI scheme is valid. With the directory traversal fixed, an attacker can no longer extract files onto the victim's system using a relative path with “../”, and if no file is extracted, no malicious code runs. Even if an attacker somehow bypassed this limitation, Microsoft added another layer of defense by invalidating every URI scheme that contains a “.”, so the file would not be executed.

Note: While diffing some patched systems, I found that even though the update had been installed, urlmon.dll had not been updated and the loophole still existed. This is unlikely, but it does happen. It is therefore worth verifying that the patch has actually been applied to a system.
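
One simple way to verify this is to hash the DLL on the system and compare it with the hash of the copy extracted from the patched update. The Python sketch below assumes you have already obtained that reference hash yourself. The helper names are hypothetical, and no real hash value is given here.

```python
import hashlib


def sha256_of(path: str) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_known_hash(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with a reference digest supplied by the caller."""
    return sha256_of(path) == expected_hex.strip().lower()
```

On a typical installation the file to check would be C:\Windows\System32\urlmon.dll, with the reference digest taken from the urlmon.dll extracted out of the patched update.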
