This information is provided exclusively for the purposes of legitimate penetration testing, education, and further security research. The only way to improve security is by testing it. It's only once we acknowledge a problem (e.g. in a security solution) that we can take steps towards fixing it. Being aware of a potential false sense of security is equally important. In addition, this article contains nothing unique from an offensive perspective that hasn't already been covered elsewhere. Leveraging our newfound knowledge and some methodical thinking, we come up with a strong new defense for a common attack.
In this article, we will take a look at the security that network appliances (sometimes referred to as next-generation firewalls or NGFWs) provide: what they're good at and where their cracks start to show. We will set up a small experiment in which we create a JavaScript payload that won't be picked up by these network appliances. Then we'll talk about the current state of defense for these kinds of attacks and where there's room for improvement.
A Common NGFW Goal
File introspection with the goal of blocking/detecting EXEs, scripts, and other downloads is a common feature of the next-generation firewalls (NGFWs) prevalently deployed by enterprises. To sidestep this security measure, attackers commonly deploy a technique known as HTML smuggling. It works by hiding the desired file in JavaScript code, which is downloaded by the target and then decoded into the final payload. A target will simply see that a file is being downloaded while being none the wiser as to how it got there.
Image courtesy of Microsoft Threat Intelligence.
Email attachments from sources external to an organization are, of course, often stripped. This is why a website, linked from the email, is usually used to deliver a payload in the first place. Most large email providers like Gmail or Outlook also always strip attachments with troubling extensions.
As a defense against this, some security solutions started incorporating signatures for these HTML smuggling JavaScript tools so they could be detected over the network (as opposed to the final payloads themselves):
Above are the detections of one such HTML smuggling project: EmbedInHTML. In practice, an NGFW's detection rate for this type of payload would be much higher. The anti-malware engines on VirusTotal aren't as geared towards detecting this type of threat. Even so, there are some detections.
Where They Fail
At their core, NGFWs work by looking for specific byte sequences, which is typically implemented as a YARA signature. Despite the marketing hype of most NGFWs (a term which some may in itself regard as a buzzword) and what some non-technical salespeople might claim, this is (as I and others know very well from experience) what most detections come down to, especially for a network appliance. There are some more impactful protections in terms of host reputation filtering; it's common for NGFWs to disallow direct IP access and restrict access to newly registered or cheap domain names (also dependent on some other heuristics). But beyond that, there's not much more a network appliance can do.
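To make the idea of byte-sequence matching concrete, here is a minimal sketch in Python. It is not how any particular NGFW or YARA engine is implemented; the rule names and patterns are made up for illustration, and a real rule set would be far larger and more nuanced.

```python
# Toy byte-sequence scanner. The rule names and patterns below are
# hypothetical examples, not signatures taken from any real product.
SIGNATURES = {
    "hypothetical_blob_download": b"new Blob(",
    "hypothetical_ms_save_blob": b"msSaveBlob",
}


def scan(body: bytes) -> list[str]:
    """Return the names of all signatures whose bytes occur in the body."""
    return [name for name, pattern in SIGNATURES.items() if pattern in body]


# A response body containing one of the byte sequences is flagged.
sample = b"var b = new Blob([data]); a.href = URL.createObjectURL(b);"
print(scan(sample))

# A body without any of the byte sequences produces no matches.
print(scan(b"console.log('hello');"))
```

The scanner only ever answers one question: do these exact bytes appear in the traffic? Everything that follows in this article hinges on that limitation.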
With this in mind, there are countless ways (infinite, really) for an attacker to throw off such a naive signature. The first one that comes to mind is throwing the JavaScript payload into obfuscator.io, enabling RC4 encryption, and clicking "Obfuscate".
Important Tip: Ensure you select the RC4 encryption option under String Array Encoding. It's this encryption which causes the payload to come out undetected by all scanners. Most scanners can see right through obfuscation or detect on obfuscation itself (VirusTotal scanners aren't configured to do that). Although, I find this practice questionable because there are legitimate reasons to obfuscate. In actual fact, it would be better to drop the obfuscation entirely as it serves no purpose but to draw unnecessary attention. Instead, build your own XOR/RC4 decryptor and dynamically execute your code in a few lines of JS. I'll leave this as an exercise to the reader to prevent potential skiddies who hardly know how to program from getting their hands on it.
We should surely strive to make evading detections more difficult than this...
Detection
As we saw previously, detections already exist for this JavaScript obfuscator. Additionally, signatures are typically written in a way that persists through code transformations. But all of that changes when you use encryption to cryptographically (read: mathematically) ensure no network appliance could possibly introspect the contents of what's being transmitted. Okay, here it's only weak RC4 encryption, but the point stands: no network appliance can afford to brute-force RC4 when people expect a responsive Internet. To clarify, I'm talking about application-level encryption, not transport layer encryption (referring to the OSI model) like what TLS/SSL provides. NGFWs have no problem getting insight into the latter because enterprises typically install a self-signed digital certificate, created by their on-premises certificate authority, onto all their devices.
Signatures could also be written specifically for the encrypted output of an obfuscator like obfuscator.io. However, that's really beside the point when someone could easily write their own incredibly simple JS XOR decryptor in what amounts to a few lines of code. I just used that website as a quick and easy example but, like I said, there's no shortage of ways of accomplishing the same goal.
To me, detecting the use of an obfuscator on its own isn't sufficient to classify a payload anyway, because there are legitimate reasons organizations might choose to obfuscate their code. For instance, they may seek to protect their proprietary code by slowing down people trying to reverse engineer it. This practice is especially normalized on the web, where code transformations such as minification (which naturally also serves to somewhat obfuscate) are practically expected in order to keep file sizes as low as possible.
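The following Python sketch illustrates, from the defender's side, why a static signature has nothing to match once content is encoded at the application layer. It uses a trivial repeating-key XOR rather than RC4, and the signature, key, and sample string are all arbitrary values chosen for illustration.

```python
from itertools import cycle

# Hypothetical byte sequence that a static rule might look for.
SIGNATURE = b"new Blob("


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with a repeating key (its own inverse)."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


plain = b"var b = new Blob([data]);"
key = b"k3y"  # arbitrary key, unknown to anything inspecting the traffic
encoded = xor_bytes(plain, key)

print(SIGNATURE in plain)               # True: the plain text matches the rule
print(SIGNATURE in encoded)             # False: the same content no longer matches
print(xor_bytes(encoded, key) == plain) # True: only the key holder recovers it
```

An appliance inspecting the wire sees only the encoded bytes. Without the key, or without actually executing the page's script, there is no fixed byte sequence left for a rule to anchor on.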
It's worth noting that my ideas regarding the usefulness of NGFWs from a security perspective aren't at all new. The latest hot topic with NGFWs is, to no surprise, all about AI and machine learning. I gave it a chance by reading Palo Alto's "Inline Machine Learning Solution Brief" document and came out utterly unconvinced that this is a "paradigm shift in cybersecurity". I'm not saying NGFWs are completely useless when you start talking about analyzing different protocols within the local area network and places where hiding within application layer encryption isn't so doable, but in the case of protecting against HTML smuggling, they have no utility beyond host reputation filtering (which can be made to not really matter in a targeted attack).
As you can see, common web practices coupled with people's high expectations for network performance greatly limit what a network appliance can do from a detection standpoint. Common detection methods that may be applicable to executable files (e.g. EXE/DLL) don't transfer well.
Response (UPDATE)
Florian Roth (@cyb3rops), a well-known detection engineer in the security space, acknowledged this hole in detection two days after this post went live.
Okay, I just checked and so far we have postet 96,387,836 comments on @Virustotal
- Florian Roth (@cyb3rops) August 17, 2023
( and we have deactivated some very good but very noisy rules, e.g. for JavaScript obfuscation, which would have triggered on this sample https://t.co/3niDn2OQLf ) pic.twitter.com/3eos9Gyv1a
A few years ago, when I first came across this method for evading detection, the original output from obfuscator.io did indeed get caught by VirusTotal. However, after enabling RC4 encryption as I said in my tip, the sample had zero detections.
Trying this method out again today, I wondered why the sample was undetected by VirusTotal even without RC4 encryption. After all, detection is supposed to get better over time, not worse! I'm glad to have this question answered now.
Regardless, the point I was trying to make wasn't about obfuscators; those are a dime a dozen (although this JS-compatible esoteric language is pretty interesting). My point is that anyone can easily utilize application layer encryption by whipping up a custom, tiny XOR decryptor to fly straight past current detections with certainty. I probably shouldn't have diluted my point in the original work by introducing obfuscator.io to prove this; I just thought it was a cool open source project, and I like open source, so I used it as an example.
I respect Florian's goal of improving the state of JavaScript detection. However, it appears that the current approach for achieving this is through static detection, which is unlikely to yield a salient defense given everything I've put forth so far. So, in the next section I'll contribute some of my own ideas toward closing the HTML smuggling blind spot.
A Different Defensive Approach (UPDATE)
As we know, HTML smuggling is an attack that occurs on the application layer of the OSI model. The best way to secure against a threat on the application layer is to implement a defense that also exists on the application layer (i.e. not statically scanning HTTP responses on the network layer).
This is currently a work in progress. Please check back later!
Detection software (such as antivirus, endpoint detection and response, and next-generation firewalls) should only be used as part of a more holistic defense-in-depth security strategy. This is primarily because security through detection is a cat-and-mouse game. If 'real' security is what you want, then have a look at my blue team Qubes OS projects on GitHub or otherwise.