Next.js middleware was completely optional until 2 weeks ago
I break down a CVE where an attacker could bypass middleware in every Next.js version. What was it trying to do? How did it break? Why do we use middleware at all?
Two weeks ago, Next.js had a critical vulnerability that affected every version that has ever been released. It has a 9.1/10 CVSS score, meaning that the attack is easy to perform and has severe consequences. An attacker can craft an HTTP request that causes Next.js to skip running any middleware that should have run.
Researchers Yasser Allam and Rachid Allam published an in-depth explanation of the vulnerability they discovered. I highly recommend reading it.
They identified three different sets of vulnerabilities in Next.js’ handling of recursive calls. It looks like the affected requests are subrequests that are intended to be made after the middleware has already executed, so Next.js adds a header to notify the handler that the middleware can be skipped. However, the system didn’t account for the header being supplied by an untrusted user, so it blindly honored the attacker’s request to disable middleware.
In versions of Next.js prior to 12.2, simply passing the header “x-middleware-subrequest: pages/_middleware” causes the middleware to be skipped and the page to be executed.
After version 12.2, it gets a little more complicated. Now the header could be either “x-middleware-subrequest: middleware” or “x-middleware-subrequest: src/middleware”, depending on whether the application placed the middleware at the project root or under src/.
In recent versions, Next.js allows calls to execute recursively up to 5 times before skipping the middleware. So now you can simply send “x-middleware-subrequest: middleware:middleware:middleware:middleware:middleware” to trick Next.js into believing that the middleware has already recursed 5 times, and it skips the middleware entirely.
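To make the three header variants concrete, here is a minimal TypeScript sketch of the kind of check being fooled. It is a simplified model based only on the behavior described above, not the actual Next.js source; the function names and the HeaderMap type are my own inventions for illustration.

```typescript
// Simplified model of the header checks described above.
// This is an illustration, NOT the actual Next.js source.
const SUBREQUEST_HEADER = "x-middleware-subrequest";
const MAX_RECURSION_DEPTH = 5;

// Hypothetical stand-in for a request's headers.
type HeaderMap = Record<string, string | undefined>;

// Split the colon-separated header value into middleware names.
function parseSubrequests(headers: HeaderMap): string[] {
  const raw = headers[SUBREQUEST_HEADER];
  return typeof raw === "string" ? raw.split(":") : [];
}

// Older style: skip as soon as the middleware's own name appears in the header.
function skipsMiddlewareLegacy(headers: HeaderMap, middlewareName: string): boolean {
  return parseSubrequests(headers).indexOf(middlewareName) !== -1;
}

// Recent style: skip once the name appears MAX_RECURSION_DEPTH times.
function skipsMiddlewareRecent(headers: HeaderMap, middlewareName: string): boolean {
  const depth = parseSubrequests(headers).filter((n) => n === middlewareName).length;
  return depth >= MAX_RECURSION_DEPTH;
}

// Neither check asks who set the header, so an attacker can set it themselves.
const legacyAttack: HeaderMap = { "x-middleware-subrequest": "pages/_middleware" };
const recentAttack: HeaderMap = {
  "x-middleware-subrequest": "middleware:middleware:middleware:middleware:middleware",
};

console.log(skipsMiddlewareLegacy(legacyAttack, "pages/_middleware")); // true
console.log(skipsMiddlewareRecent(recentAttack, "middleware")); // true
console.log(skipsMiddlewareRecent({}, "middleware")); // false
```

The point of the sketch is that the check is purely a string comparison on attacker-controlled input. There is nothing in it that distinguishes an internal subrequest from an external one.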
This bug checks all of my boxes for “is this a fun CVE?” As someone who often works on hidden load-bearing layers of applications, I love it when hidden infrastructure layers of an application turn out to have juicy bugs. I love security vulnerabilities that are simple enough to explain in a casual blog post. And who doesn’t love a complete bypass?
Why is the impact so high?
Many news publications advertised this as an “auth bypass.” It’s a lot worse than that, but it’s also true that this can be an auth bypass. It’s common for authorization and authentication middleware to redirect to login screens when the user is not logged in, or to read the user ID from a secure cookie and provide it within the request’s context. Normally you’d expect the endpoint to try to access this user object and then break because of a null pointer exception, but it depends on how the endpoint is written.
You can imagine the appeal of auth libraries that work exclusively with middleware, especially in the frontend environment where Developer Experience is king. You can just tell users “Installation is soooo easy. Just install this middleware and it runs everywhere! Add a React login form in minutes!” And then you hope nobody notices that the Next.js docs have heavy-handed recommendations that put middleware as an optional up-front step, backstopped by architectural decisions.
For both cases, we recommend:
Creating a Data Access Layer to centralize your authorization logic
Using Data Transfer Objects (DTO) to only return the necessary data
Optionally use Middleware to perform optimistic checks.
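As a rough idea of what that recommendation looks like in practice, here is a small TypeScript sketch of a data access layer that re-checks the session itself and returns a DTO. Everything in it (Session, Invoice, getInvoiceForUser, the in-memory table) is hypothetical and made up for this post; it is not an API from Next.js or any auth library.

```typescript
// Hypothetical data access layer: every name here is illustrative.
interface Session { userId: string }
interface Invoice { id: string; ownerId: string; amountCents: number; internalNotes: string }

// DTO: only the fields the caller is allowed to see.
interface InvoiceDTO { id: string; amountCents: number }

type InvoiceResult =
  | { kind: "ok"; invoice: InvoiceDTO }
  | { kind: "denied"; status: 401 | 403 | 404 };

// Stand-in for a database table.
const invoicesById: Record<string, Invoice | undefined> = {
  inv_1: { id: "inv_1", ownerId: "u1", amountCents: 1200, internalNotes: "VIP" },
};

// The authorization check lives next to the data, so it still runs
// even if whatever middleware sits in front of it never executed.
function getInvoiceForUser(session: Session | null, invoiceId: string): InvoiceResult {
  if (session === null) return { kind: "denied", status: 401 };
  const invoice = invoicesById[invoiceId];
  if (invoice === undefined) return { kind: "denied", status: 404 };
  if (invoice.ownerId !== session.userId) return { kind: "denied", status: 403 };
  // Copy only the public fields; internalNotes never leaves this layer.
  return { kind: "ok", invoice: { id: invoice.id, amountCents: invoice.amountCents } };
}

console.log(getInvoiceForUser(null, "inv_1").kind); // "denied"
console.log(getInvoiceForUser({ userId: "u1" }, "inv_1").kind); // "ok"
```

With this shape, a middleware bypass costs you the nice redirect to the login page, but the request still gets rejected before any data is returned.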
If the CVE were just an authentication or authorization bypass, it would still be a severe bug. But this can bypass all middleware. The researchers give another motivating example: is a pesky Content Security Policy getting in your way? Well, run the bypass and that CSP goes away. My own motivating example is more fun-oriented: is a middleware script enforcing a max field size and preventing you from stuffing the entire contents of the Bee Movie script into every text field in your app? Put a bypass on it.
There was another interesting bit of drama: Cloudflare attempted to block requests in this format, and then had issues with their fix. It turned out that a lot of requests had valid reasons to include the header, so there was no way for them to distinguish attacker traffic from valid use cases. So at the moment, Cloudflare’s implementation is opt-in.
The internet is held together with magic intercepting scripts. I’ve worked for several web companies, and every company has a magic and hidden implementation of critical functionality that completely abstracts all of the hard stuff away from their endpoints. It turns out that you just don’t want to repeatedly write the same integration code.
In practice, writing endpoints is tedious. You need to declare the route and the HTTP method used to reach the endpoint. Any authentication code needs to run. Any authorization code needs to run. If the user is not authenticated they need to get redirected to /login and if they’re not authorized they get an error. Then you need to unpack and validate the request. If there are any JWTs, gotta check those. Was this larger than the max request size? That sounds important. Let’s also make sure that exceptions get caught and logged somewhere. Maybe something with distributed tracing. And once you’ve performed all of these steps, you can finally start serving the request.
Or, hear me out: you can just write middleware that lets you handle it in every endpoint that needs it.
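Here is a toy TypeScript sketch of that idea: a few cross-cutting steps written once and wrapped around a plain handler. The AppRequest and AppResponse types, the middleware names, and the "token is the user ID" shortcut are all assumptions made for illustration. This is the general wrapping pattern, not how Next.js middleware is implemented.

```typescript
// Toy request/response types, invented for this sketch.
interface AppRequest {
  path: string;
  headers: Record<string, string | undefined>;
  body: string;
  userId?: string;
}
interface AppResponse { status: number; body: string }

type Handler = (req: AppRequest) => AppResponse;
type Middleware = (next: Handler) => Handler;

// Catch anything the inner layers throw, log it, and return a 500.
const catchErrors: Middleware = (next) => (req) => {
  try {
    return next(req);
  } catch (err) {
    console.error("unhandled error on", req.path, err);
    return { status: 500, body: "internal error" };
  }
};

// Redirect unauthenticated users. Treating the token as the user ID is a toy shortcut.
const requireAuth: Middleware = (next) => (req) => {
  const token = req.headers["authorization"];
  if (token === undefined) return { status: 302, body: "/login" };
  return next({ ...req, userId: token });
};

// Reject oversized bodies (sorry, Bee Movie script).
const limitBody = (maxLength: number): Middleware => (next) => (req) =>
  req.body.length > maxLength ? { status: 413, body: "payload too large" } : next(req);

// Wrap a handler so the first middleware listed runs first.
function withMiddleware(handler: Handler, ...middlewares: Middleware[]): Handler {
  return middlewares.reduceRight((next, mw) => mw(next), handler);
}

// The endpoint itself only contains the interesting part.
const hello = withMiddleware(
  (req) => ({ status: 200, body: "hello " + req.userId }),
  catchErrors,
  requireAuth,
  limitBody(10),
);

console.log(hello({ path: "/hello", headers: { authorization: "u1" }, body: "hi" }).body); // "hello u1"
```

The flip side is exactly what the CVE exposed: if the wrapper never runs, the endpoint at the bottom has none of these protections.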