Speed at Scale: Optimizing the Largest CX Platform Out There

Summary

Matheus Albuquerque shares strategies for optimizing a massive CX platform, moving from React 15 and Webpack 1 to modern standards. He discusses using AST-based codemods for large-scale migrations, implementing differential serving with module/nomodule, and leveraging Preact to shrink footprints. He explains how to balance cutting-edge performance with strict legacy browser constraints.

Bio

Matheus Albuquerque is a Staff Front-End Engineer at Medallia, building their surveys platform and helping them shape the customer experience market with React. He's also a Google Developer Expert specializing in Web Performance.

About the conference

Software is changing the world. QCon London empowers software development by facilitating the spread of knowledge and innovation in the developer community. A practitioner-driven conference, QCon is designed for technical team leads, architects, engineering directors, and project managers who influence innovation in their teams.

Transcript

Matheus Albuquerque: This is the title of the session, "Speed at Scale: Optimizing the Largest CX Platform Out There". CX as in Customer Experience. I'm Matheus Albuquerque. I'm a staff software engineer at Medallia. I'm a GDE in performance. I'm part of the program committee for React Summit New York City. You can find me pretty much everywhere as ythecombinator, including X. Now that you know me, I want to know a little bit more about you, so, who here is doing frontend? Who here works with React?

Preamble

I wanted to start by sharing a bit of my motivation for putting together this talk, so a little bit of a preamble. There's this post by a guy called Vadim, it's called Silent Engineers, and it's a very interesting one. There's this paragraph I really like that goes like this, "While browsing Hacker News, I sometimes get the feeling that every developer out there is working for a FAANG, as there's always posts from those people doing some hyped stuff. Or you might think that PHP is never used nowadays because whenever it's mentioned, everyone is hating on it in the comments". He goes on, "But let's be straight, that's like 1% of all the developers out there. The rest of them are just lurking and coding with their language of choice and being content about it. Be it Fortran, COBOL, Perl, or PHP".

One thing I wanted to share is that this talk is not about the last version of React, or the last version of Svelte, or all of those new cool ways of rendering things. I do have a talk about all of these topics that you can check online, but that's not what we're here for. We're here to discuss the challenges of optimizing performance at work, because I love this analogy when they say that optimizing performance is just as simple as drawing a horse. You have those four steps, and then you just got to add some details and you're good to go. This was supposed to be the funny part of the session. Talking about performance at work, the one last thing I got to share with you is what I work with. Medallia is a customer/employee experience company. We are one of the biggest players in this area. I specifically work in their surveys platform. That's basically what you get if you booked a stay at a hotel, or if you're flying some airline, or if you're banking and they want to know how your experience was.

Then you have different ways of inputting your answers. That's what this talk is about: modernizing dependencies within a legacy frontend codebase. We'll discuss some techniques like code splitting. We'll talk about Preact. We'll talk about how to do all of that while you still have to target old browsers, because your users could be running Internet Explorer 10. All of these challenges with legacy codebases, plus our experiences, what we're still playing with, and some lessons learned.

Act I: Modernizing Dependencies

Starting with modernizing dependencies, just to give you an idea, this is where we were at the beginning of this whole thing: React 15, Node 10, and webpack v1, so not really great. We started with webpack because the web is flooded with all these case studies of, "we started using webpack 5 and we got smaller chunks, we got all of these caching strategies, and a lot improved." The thing is, we were on version 1, so a lot of things that, for example, used to be plugins were turned into built-in configuration. A lot of things that were configuration were turned into different sorts of configuration. Not to mention some plugins that died, or some plugins that didn't have webpack 5 support because they were superseded by different plugins. We faced a bunch of gotcha moments, and the first of them was that we tend to think you just need to make your plugins compatible with a certain version of webpack, but actually you have to ensure that they're compatible among themselves at the versions that you have.

The second thing, and it can be obvious to a lot of people, but you cannot go straight from v1 to v5. You go from v1 to v2, and then v2 to v3, and then sequentially all the way to v5. That's a lot of effort, and that was before LLMs were a thing as they are these days. The third thing is that you might have a minimum requirement, like Node 10 was the minimum requirement for webpack v5, but your different plugins are going to have different requirements, so you also need to make sure that you're meeting, ecosystem-wise, all of the requirements from all of your plugins, which in our case applied. We were then upgrading Node from 10 to 18, which was the LTS at the time. The thing is, some of you probably went through this, but there's something called node-fibers, and node-fibers was a key part of a lot of the Node.js ecosystem, including node-sass. A lot of things were just written on top of node-fibers, and there was this change in V8 a couple of years ago that basically made node-fibers deprecated from Node 14 onwards. For a lot of the things we were using, like sass, we also had to adapt and find new dependencies. The sass team has a really interesting post about the discontinuation of node-fibers.

Now we're getting to the interesting part, which is going from React 15 to 16. You might be wondering why this is really a thing if you're thinking just about performance; I would understand if the case was maintenance. It turns out that sometimes, going from one major to another, the React team introduces performance improvements besides the obvious ones, like bundle size reductions. This is the post where they announced React 16 a couple of years ago, and you can see that they, for example, had a 32% decrease in the bundle footprint if you're counting React plus react-dom. This was exactly the kind of thing we were looking for. Then there are some tricks. This codebase, for example, was not 100% TypeScript, so we also had prop-types all over the place. The trick is, prop-types is part of the React package in React 15, but it's a whole different package in React 16. We needed to revisit all the components where we had different usages of prop-types, and we started thinking of strategies for doing that.

The first one was, we could use some diff library for Node, and I started writing functions that would go and try to find occurrences of prop-types; then we had some predefined diffs, and these functions would make changes to those files. Suddenly, it got super complex, and there were a bunch of edge cases we were not covering. For example, when you're importing with a different name, or when you import but then reassign it, or when you destructure, and so on.

Then the second thing was patch-package. How many of you know patch-package? I bet you've done some React Native in the past; it was super common if you were doing React Native. patch-package basically allows you to write git-style diffs against your dependencies, and then apply them on npm install, for example, so you can have your own versions of third-party dependencies. What I wanted to do with patch-package was use some of its internal diffing primitives and write something that could handle our case, but the thing is, first of all, I saw myself using a lot of their internals, not their APIs, so not really a great practice. The second thing is, soon enough, I was rewriting a lot of patch-package.

Then I realized, this is not the first time I'm going from React 15 to 16. At the time React 16 was released, I was working at a different company, and we had to go through this whole migration story. I even had a session at the time back in 2019. In this session, I was talking about jscodeshift. This is the answer, jscodeshift. jscodeshift is a tool by Facebook or Meta, and the short definition is, jscodeshift is a tool that runs a transformation script over one or more JavaScript or TypeScript files.

Then if you expand that, you're going to see that it reads all the files you provide to it at runtime, and then analyzes and parses the source into an AST, or abstract syntax tree, as part of the transformation. It then looks for the matches specified in the transformation script and deletes them, or replaces them with the required content. Then it regenerates the file from the modified AST. If you don't usually work with compilers and static analysis, it might sound a little bit mind-blowing, but it turns out that it plays really well with other open-source libraries we have out there, like react-codemod. The React team maintains this thing called react-codemod, and a bunch of other open-source teams maintain their own codemods, and basically you can plug them into jscodeshift to do these massive migrations.

One of the things they had covered out of the box was exactly the prop-types change we needed. Here comes the second challenge. We needed to do all of that behind a feature flag, or some on-and-off mechanism, because we could not simply roll out such a big change to more than 3,000 tenants with millions of survey takers out there. We needed to let clients decide: there's this performance enhancement, do you want to have it or not? There could be risks.

That's when we started digging more into the codegen thing. To do a little bit of a recap: we had to support, in parallel, different versions of developer dependencies like Node.js, webpack, Babel, and so on. We had to support different versions of the app dependencies themselves, like react and react-dom, and different strategies for bundling and serving. Different config files for Babel, different config files for webpack, all toggleable on and off. This is what it looked like. We had this monorepo where we had, for example, the core of the application, and then another package with a suite of functional tests, and so on. What we got was this new next box, next as in next iteration, not to be confused with Next.js. Basically, if we expand it, under what's called legacy we had the default bundle that was served if you don't have the feature flag. That's where the active development happened, so if we needed to fix bugs, or introduce new features, and all of that. This was the code that was versioned in git; this is where everything was pushed. This was the code that was built with the legacy dependencies: React 15, Node 10, webpack 1, and so on.

Then, under this next wrapper, hidden behind a feature flag, we had all the modern versions of the dependencies. It was mostly composed of code transformers using jscodeshift; other automation scripts that would copy and move files to different places; the new versions of the config files, like the new webpack config, the new Babel config, and so on; and also some parts of the code that we did not manage to code generate, like some very specific unit tests or functional tests. Then, we had the generated code itself. It was generated at build time, which means in our CI, Jenkins, but if you wanted you could also run the scripts locally. That included all the React components, utilities, and other business logic that we had in the legacy version, rewritten in a way that it could be bundled with React 16. This is what it looked like if you ran it locally: we had a bunch of nice messages describing each step.

Then, exploring the transformers more. There is this site called AST Explorer. It's super helpful if you need to do anything that has to do with static analysis. You can just paste your code there. This is one example where I'm pasting just four import statements. If you select one of them, like the first one, you can see the whole tree, how these are turned into import declarations and so on. This is the basic anatomy of a jscodeshift transformer, kind of a hello world. These transformers take a file, and then you read the file with jscodeshift. You get a root by calling jscodeshift on file.source. Then, what we're doing here is finding variable declarations that are called foo and renaming them to bar.

Then, you just return the new source. That's what a very basic transformer would look like. Then, we created this abstraction that we were calling transformer groups. Each group had a label describing what it was doing, a pattern of files that we would match using globby, and the path to a transformer that we could then run on the files matching that pattern. There are a bunch of posts out there about jscodeshift and conducting massive migrations using transformers and codemods.

Another thing we also needed to codegen was the gitignore file, because we didn't want to have two very similar versions of the same app duplicated on GitHub. Basically, we needed to code generate the gitignore file as well, because it's impossible to maintain more than a thousand lines of gitignore by hand. That also meant going through a process of understanding how inclusions and exclusions work in gitignore. Even though they sound super simple, they can be tricky. We also had this automated: don't ignore this folder, but do ignore this file, and so on at all the levels of the project. It was actually just a few lines of code to do that, again using globby to match patterns and all that.
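A hypothetical fragment of what such "ignore this, but not that" rules can look like (the folder names here are made up, not the actual project layout). The tricky part is that git never descends into an ignored directory, so to re-include a subfolder you have to ignore with a pattern like `next/*` rather than `next/**`, and then negate the subfolder itself:

```gitignore
# Ignore everything directly under next/ (the generated code)...
next/*
# ...but keep the hand-written parts: transformers and config files.
!next/transformers/
!next/config/
```

Per the gitignore documentation, it is not possible to re-include a file if a parent directory of that file is excluded, which is exactly the kind of subtlety that makes generating these rules worthwhile.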

One last thing was WebdriverIO, a tool for running end-to-end tests. They also have codemods. We were using version 4 and we needed to go to version 7 because of the same node-fibers thing, because we were using all of their callback-based sync APIs. After the Node change, we needed to use the async, promise-based APIs. Not to mention the whole version 4 to 5, 5 to 6, and so on; we had the same issue we had with webpack: we had to go sequentially. They had codemods for that as well, officially supported by the WebdriverIO team. Running them was super smooth. We had the transformers, for example, for the config files, and also the transformers for the step definitions that you create when you're automating tests.

Then, analogously for version 7. They claim in their docs that no further changes are necessary; we know it's not always like that. Because of the Node thing, we had to do the whole sync-to-async API migration. They also had a codemod for that, which would basically go through your functions that were calling the sync APIs, turn them into async functions, and turn all the WebDriver API calls into await calls. This was super helpful, because otherwise we would have had to do this by hand. Again, this whole thing happened in the pre-LLM era. They had a bunch of edge cases documented; those we needed to address manually.

This whole codegen thing is actually quite a thing for a lot of projects. For this specific project, you're probably wondering, why did you go from 15 to 16, why not 17 or 18, or even 19 these days? We were blocked by Enzyme, which tells you how old the project is. All of our unit tests for components use Enzyme, and we just didn't want to throw away thousands of tests. We are a huge company and we have a lot of other codebases, like our dashboards, reporting tools, and all of that, that use different versions of React, mostly 18. For a lot of our internal tools, we've been migrating to 19.

If you've conducted a React 19 migration, you know that some APIs that were flagged as deprecated have now been removed; there's a huge list of APIs they just removed. They provided codemods for that. You can use those codemods in the same fashion as the prop-types one, but for those modern APIs. There's this company called Codemod. They're one of the main open-source maintainers of codemods. They have a bunch of interesting case studies with companies like Cal.com, and even large-scale Netlify migrations. It's super interesting to check these out. You will find this in other communities as well. Nuxt, which is super popular in the Vue ecosystem, also has codemods for migrations. Even pnpm has official codemods as well.

The company running most of these codemods even built a whole editor that they call Codemod Studio. It's like AST Explorer, but a very enhanced version of it. Of course, these days they've been adding AI support to pretty much everything. It's a very interesting thing to explore if you're into static analysis, or if you have a challenge you want to address with static analysis.

Act II: Code Splitting

At this point you're probably noticing that these are very different phases of continuous improvement to performance. The second part was catching up with modern stuff like code splitting. We had this one huge chunk that was loaded for all of our survey takers, which was quite unnecessary. We thought, we can split this into smaller JavaScript and CSS chunks and then load them as necessary. If you have worked with React, you probably know React.lazy and React Suspense. Code splitting with both of these has been supported for a lot of years at this point, since React 16.6. That's where we started our experiments, and we got really good results initially. We had this one baseline chunk, and then a bunch of other chunks that would be loaded depending on the kinds of questions you needed to answer. Because if you're answering a survey, that page of the survey probably doesn't have all the possible question types, so we don't need to load them all for you. That's what it looked like: you would see in the network tab that we're downloading only the chunks that are necessary to answer the questions being asked on that survey. Just like anything else you do with optimization, it might also have a cost.
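The per-question-type splitting can be sketched with React.lazy and Suspense roughly like this (the component names and paths are made up for illustration, not the actual platform code):

```jsx
import React, { lazy, Suspense } from 'react';

// Each question type becomes its own chunk, fetched only when first rendered.
const RatingQuestion = lazy(() => import('./questions/RatingQuestion'));
const FreeTextQuestion = lazy(() => import('./questions/FreeTextQuestion'));

const componentFor = {
  rating: RatingQuestion,
  freeText: FreeTextQuestion,
};

function SurveyPage({ questions }) {
  return (
    <Suspense fallback={<div>Loading…</div>}>
      {questions.map((q) => {
        const Question = componentFor[q.type];
        return <Question key={q.id} {...q} />;
      })}
    </Suspense>
  );
}
```

The bundler turns each dynamic import into a separate chunk, so a survey page with only rating questions never downloads the free-text chunk.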

In our case, the cost was that, as a platform, we allow customers to write arbitrary JavaScript and CSS to modify the look and feel and the whole experience of the survey. People can get really creative with the kinds of things they do. Suddenly we started code splitting and dynamically loading things, and some of the scripts that our clients had written were expecting DOM elements to be there when they actually were not. Runtime errors and regressions, which is not great. For that, we needed to take a couple of steps back for some of the things and roll them back, and also start sandboxing and come up with an orchestration strategy where those scripts would wait at least until everything is loaded and available, and only then run.

Act III: Preact

Another thing is Preact. Who here has heard of Preact? Who here has used Preact? Preact is a smaller alternative to React created by Jason Miller. It's way smaller for many different reasons. One of them is that they don't have as much legacy code as React does; they were simply born at a different point in time. Also, they don't have React's synthetic event system; they have a much lighter event system. You can see the bundle size difference is huge.

One thing they invested a lot in is the preact/compat layer. That's a stable, officially supported compatibility layer that is supposed to let you use Preact in an existing React codebase out of the box. The way you do this is by configuring your bundler, in our case webpack, to alias all React imports to preact/compat. The results were really impressive. This is with no code splitting, so you're seeing just one big chunk. We went from 205 kilobytes to 175 kilobytes just by using Preact instead of React. Then there are all the niceties: a way smaller bundle, a faster VDOM implementation, and more efficient memory usage. Again, we also had problems. This very funny thing started happening where elements would load out of order. Imagine question number five at the beginning of the page, and then question number two, and then four. This was super random, completely non-deterministic. I actually found a bunch of open issues on GitHub describing this very same behavior.

In all of them, what they were doing had nothing to do with what we were doing: same problem, but not the same context. After a lot of trial and error, I figured out that the issue was the interaction between Preact 10 and our code splitting with React Suspense and React.lazy. We experimented with Loadable Components, which is how we used to do code splitting before React.lazy and Suspense. The tradeoff here was actually worth it, because Loadable Components added 2 kilobytes to our bundle footprint while we were already saving 30-something kilobytes because of Preact. Also, we ran very extensive QA and had no regressions. That's what we went with.

Act IV: jQuery

jQuery in 2025. It's actually funny that if you check Wikipedia, you will see that a huge percentage of the web still runs on jQuery: 77% as of 2022. Now, we were not using jQuery ourselves. A couple of years ago, we thought it was a good idea to let you write jQuery code out of the box in your customizations that define the look and feel of surveys. It was just available, and you didn't have to bother importing it. Again, this was a decision taken a long time ago. The thing is, it's there, so we couldn't simply take it away, because people wrote a lot of customizations assuming it's there. We created a feature flag that would allow you to turn jQuery on and off for different tenants, different instances. First, because of the performance side of jQuery: you're loading a few extra dozen kilobytes just for the sake of having jQuery.

The other thing is, depending on which version of jQuery you're using, you also have a bunch of open CVEs that can be flagged during security audits. We needed to remove jQuery, but where and how exactly would we turn the feature flag on? For that, we used static analysis again. We built a complex pipeline that would go through those customizations written by clients in JavaScript. That pipeline was using Babel, mostly Babel's AST utilities, traversal utilities, and all of that. We covered, as far as we know, all the possible ways you could be using jQuery, from the very simple stuff like calling the base function with some selectors, to defining your own jQuery plugins or importing open-source jQuery plugins. We're talking hundreds of static analysis checks to find all these cases.

This allowed us to understand not only who was or wasn't using it, but how people were using it, because then we can see, for example, which are the most used jQuery methods and APIs, and who is using them and how. If we want to draw a migration path for different customers, we can tailor it to each customer's needs. This is build-time detection, but we wanted to be super safe. Who here knows about proxies from ES6? Who here has built something with proxies? Proxies, for me, are one of the most underrated things we got in ES6. They're super powerful. There's this session by Michel, the creator of MobX and Immer, from 2018, called The Wonderful World of Proxies, where he shares how he optimizes a lot of things in Immer using proxies.

In this case, we basically used proxies to extend jQuery at runtime, and each time a survey called a jQuery function, that would be reported to our observability and monitoring. Then we would be able to, for example, spot any issues from the previous step: any misdetected tenants, or people we detected as using jQuery who actually had zero jQuery calls in the last six months. Super powerful. The other kind of thing you can do, and this is just a demo, we never rolled this out, is go one step further and build a compiler that translates jQuery code into code that uses the native web APIs. One example is converting an ajax call to the Fetch API. We did this with a bunch of other use cases, like replacing selectors with document.getElementById, and this kind of thing.
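The runtime-reporting idea can be sketched with a Proxy that wraps a jQuery-like object and reports each use. Everything here is illustrative: `report` stands in for the real observability hook, and a fake jQuery stands in for the real library.

```javascript
// Sketch: wrap a jQuery-like callable in a Proxy so that every invocation
// and every method access can be reported to an observability backend.
function instrument(jqueryLike, report) {
  return new Proxy(jqueryLike, {
    apply(target, thisArg, args) {
      report({ kind: 'call', args: args.length }); // e.g. $('.selector')
      return Reflect.apply(target, thisArg, args);
    },
    get(target, prop, receiver) {
      if (typeof target[prop] === 'function') {
        report({ kind: 'method', name: String(prop) }); // e.g. $.ajax
      }
      return Reflect.get(target, prop, receiver);
    },
  });
}

// Hypothetical usage with a fake jQuery standing in for the real one:
const calls = [];
const fake$ = Object.assign((selector) => `matched:${selector}`, {
  ajax: (url) => `requested:${url}`,
});
const $ = instrument(fake$, (event) => calls.push(event));

$('.survey');   // reported as a call
$.ajax('/api'); // reported as a method access, then executed as usual
```

Because the Proxy forwards everything through Reflect, existing customization code keeps working unchanged while usage data flows out.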

Act V: Targeting Different Browsers

It's all super interesting and it's really nice to modernize things. At the same time, we need to move fast and make sure we're not breaking things. One thing I noticed, myself included, is that we often don't ask ourselves which browsers are using our apps. For example, this is the top 10 from a snapshot I took from our RUM solution a long time ago. You'll find the obvious things: Safari, Chrome, then Firefox, and the iOS and Android versions. Then, toward the bottom, there's Opera and Internet Explorer. The moment you start scrolling is when things get interesting, because you see BlackBerry devices, Nokia devices, Samsung Tizen devices, and even a PlayStation browser. I don't know how that happened. We need to think about polyfilling. We need to think about strategies for not breaking things.

At the same time, we don't want to get the kind of warning from Lighthouse that says, avoid serving legacy JavaScript to modern browsers. We cannot overdo it. Basically, if you're thinking about polyfilling: yes, you don't have to send polyfills for things that are already supported, and you should not add them preemptively. At the same time, you need to ensure that essential functionality is there, because you cannot risk runtime errors, especially in a survey. Let's say you open a survey and it crashes: you're not revisiting it; you weren't that interested in filling it out in the first place. You cannot risk this kind of thing. That's when you start thinking about polyfilling strategies.

The first one we considered at the time was Polyfill.io. It was actually a very popular service until quite recently. The way it works is it inspects the user agent of your visitor's browser, and then it injects only the necessary polyfills. The DX is really great because you're just adding one extra script in front of your main application bundle. It also allows you to pick a subset of polyfills. You could just head there and pick the ones you considered necessary, or you could be super safe and add everything. It would build a URL for you, and that's how you'd add it: you would have your bundle, and then you'd add the polyfill script with the subset you decided on.

The catch is, it's a different server. For some companies, this might be an acceptable risk; if you have complicated compliance requirements and things like that, it's probably a no-go. First, for performance: if it's a different server, you have to establish a connection to that server, and you might be adding some extra milliseconds to metrics like TTI. Second, if their server goes down, either your application is delayed, or it even crashes if you really depend on that polyfill. Some time ago, people would work around this by self-hosting it, because it is an open-source project. You will find posts on how to replicate this on Cloudflare, for example. For us, it was really complex to roll our own infra around this, and we were not going to use the external server. In the summer of last year, we were really happy that we didn't go this way, because there was actually a supply chain attack: a Chinese company acquired the domain and started injecting malicious code via the domain. If you were relying on Polyfill.io, you got attacked. That meant 100,000-plus sites. It was crazy.

Another thing you can do is use @babel/preset-env. Basically, if you're using Babel, you have this useBuiltIns option. It has two possible values: entry or usage. The trick with entry is that it might not be very useful if you're targeting really old browsers like Internet Explorer, which we were. It might remove some polyfills, but some of them will still be downloaded by everyone. Then you have a more aggressive approach, useBuiltIns: usage. The first thing is, it doesn't add polyfills for your dependencies unless you're piping your node_modules through Babel, which might not be a good idea. And because it's more aggressive, you might end up with runtime errors in legacy browsers.

At the same time, even though it is a more aggressive approach, you can still end up with excessive polyfills because of the strategy it uses. Let's say you have Array.prototype.includes and you want to polyfill that: it will add polyfills for pretty much anything that has an includes method, like String.prototype.includes, and so on. Use Symbol.toStringTag, and it will add polyfills for Math.toStringTag as well. Summarizing, if you're going the Babel way, you can go with entry mode to safely reduce things, or with usage mode as a more aggressive approach. This could work for you depending on the amount of regression testing you have, and that kind of thing.

Then there is a third approach called differential serving. I first heard of differential serving via a podcast interview with Jason Miller, the creator of Preact. It works on the idea that if you're creating two bundles to begin with, one of them doesn't have to ship polyfills, because the other bundle already includes them. Also, you don't have to transpile everything down to ES5. For example, if you're writing classes, they will remain classes; they're not going to be converted into big prototype-based ES5 equivalents. The same goes for async/await and generators: they will remain async/await, not converted into state machines built on ES5-compatible constructs. There are some open-source projects implementing differential serving and differential bundling for webpack and other bundlers. At the time, we ended up writing our own thing, which was actually super small.

Then, basically, you have to serve these two bundles. The way we went is based on the nomodule attribute and on the type attribute with the value module. This means that old browsers, the ones that do not support ES modules, will not load scripts marked with type="module"; they will load what's marked nomodule, and vice versa. You can use nomodule to serve the polyfills to the browsers that need them. Again, a bunch of gotchas. There is a differential serving project on GitHub that maps how this strategy behaves across different browsers. It's super tricky because some browsers will download both bundles and execute both bundles, so you're missing the whole point. Some of them will download both bundles but only execute one of them.
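The basic markup for the module/nomodule pattern looks like this (the bundle names are illustrative):

```html
<!-- Browsers with ES module support load this and ignore the nomodule script. -->
<script type="module" src="/static/app.modern.js"></script>
<!-- Legacy browsers don't understand type="module", so they skip it and load
     this one instead, which includes the polyfills and ES5-compiled code. -->
<script nomodule src="/static/app.legacy.js"></script>
```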

In one edge case, a browser actually downloads the modern bundle twice and the legacy bundle once, so you might end up downloading the same thing three times and nuking your whole optimization. So, definitely not going with just that. One way to work around this is user agent detection. There are plenty of examples using Node servers out there. This is a very basic implementation if you're using Express, or Koa, or Fastify: you get the user agent of the request, and then you have some decision logic to determine which bundle to send. The main issue is that you might not be in control of the server that's serving your frontend bundle. That's exactly what happened to us, because our React application was served by a huge Java monolith that was, and still is, code-owned by a whole different team, a whole different division. They didn't want to grow the monolith by adding complex user agent detection.
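A minimal sketch of the server-side decision. The UA check here is deliberately naive (a toy regex, not real UA parsing), and the bundle paths and wiring are hypothetical; a production version would use a proper UA parsing library:

```javascript
// Sketch: decide which bundle to serve from the User-Agent header.
// The regex is a toy stand-in: it treats Chrome 61+ and Firefox 60+ as
// "modern" (ES module support) and everything else as legacy.
function bundleForUserAgent(userAgent) {
  const modern = /Chrome\/(6[1-9]|[7-9]\d|\d{3})|Firefox\/(6\d|[7-9]\d|\d{3})/
    .test(userAgent || '');
  return modern ? '/static/app.modern.js' : '/static/app.legacy.js';
}

// With Express, it would be wired up roughly like:
//   app.get('/', (req, res) => {
//     const bundle = bundleForUserAgent(req.get('user-agent'));
//     res.send(`<script src="${bundle}"></script>`);
//   });
```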

The second thing is, if you load it the wrong way, you're missing some of the benefits it would otherwise have, like streaming parsing and off-the-main-thread compilation. The third thing, which might sound super hacky but turned out to be the best option for us, is runtime detection. Basically, you create a DOM script tag and use it to check whether the property you're looking for is actually there or not. Depending on the answer, you download one version or the other. This is the first thing that runs in your application, and it decides which bundle to download later.
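A minimal sketch of that runtime check, assuming hypothetical bundle URLs. The probe works because browsers that implement the `noModule` property on script elements also understand ES modules, so a single property check draws the modern/legacy line. It's written as a function of a document-like object so the logic can be exercised outside a browser:

```javascript
// Hedged sketch of the runtime-detection bootstrap. `doc` is the browser's
// `document` in real use; bundle paths are illustrative.
function loadAppBundle(doc) {
  // Probe a throwaway <script> element for the `noModule` property.
  const probe = doc.createElement("script");
  const supportsModules = "noModule" in probe;

  // Inject the matching bundle. This tiny inline snippet is the first
  // thing that runs; everything else loads afterwards.
  const script = doc.createElement("script");
  if (supportsModules) {
    script.type = "module";
    script.src = "/app.modern.mjs";
  } else {
    script.src = "/app.legacy.js";
  }
  doc.head.appendChild(script);
  return script.src;
}
```

In a browser you would call `loadAppBundle(document)` from an inline script in the page head.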

Summing it all up: Polyfill.io is easy and doesn't ship anything to modern browsers, and it can improve metrics like TTI and FCP; at the time this all happened, though, it could also mean a security problem in your application. You can use @babel/preset-env. It's also easy to set up, because at the end of the day you're just configuring Babel, but it can end up not being very helpful for very old browsers, or require you to run your node_modules through Babel. Then there's the whole module/nomodule approach, which is very easy, as you saw, and has very wide support, but the only line you're drawing is ES6 versus non-ES6. That's what we went with: module/nomodule, plus runtime detection. Different apps could benefit in different ways, so do check whether you have this kind of challenge and experiment a lot.

Results

I wanted to share some results. We ended up with some very large PRs; these are just the two poster boys. Yes, terrible, a maintenance nightmare, but it all worked out: all of our unit tests kept passing, and that's a huge number of tests. This is the bundle footprint reduction we got. For modern browsers we were delivering a 37% smaller bundle footprint, which meant going from 280 kilobytes to 176-ish kilobytes. For legacy browsers, mostly Internet Explorer, we went from the same 280 kilobytes down to 223 kilobytes, so 25% smaller. We drew the line like this: modern browsers meaning everything that supports ES modules. As of last September's snapshot, that means almost 96% of global usage.

Then legacy meaning everything that doesn't fit the previous criteria. If you check caniuse.com, this is what the line looks like: everything in green supports the smaller bundle, everything in red doesn't. When we cross this data with our RUM data, with our actual users (again, a very old snapshot), only 0.2% of survey takers would get the bigger version, and most people would benefit from the one that's 37% smaller. Some other metrics: most Core Web Vitals improved. This is where we were before, and this is where we got. If we delta that, FCP, Speed Index, LCP, and TTI all got a couple of seconds faster. We also checked things beyond Core Web Vitals, things measured with WebPageTest, for example. Visual completeness and metrics like that slightly improved with the feature flag on versus off. The same goes for the visual progress of building the page: with the flag on, everything renders a few milliseconds earlier. So, an improvement.

Our Radar

A couple of other things are also on our radar. Performance is a lot about internal culture and tooling. Small things like showing your team that they can use IDE extensions to measure bundle footprint impact, or having them check Bundlephobia before adding a new dependency to your app. Also, using all the resources we have in the tools we're already used to: the Coverage tab in Chrome DevTools is super useful for figuring out what can be taken out. Also, things like Million. Million is a very popular tool that started with Million.js, and now they're building Million Lint, which uses a lot of static analysis. They're even combining this with AI to give you insights as you're building your React app on things that can be optimized, like wasted renders. Windowing. We've been leveraging windowing a lot in some of our features.

For example, a question that has a dropdown, and some clients get creative and put 6, 10, 15 thousand items in the dropdown for people to scroll through. We've heavily optimized this with windowing. There are more things we want to experiment with in the future, like offloading work with Partytown, or using different compression strategies for text and images: we all know about Gzip and Brotli, but there's also Zopfli, Guetzli, Zstandard, and many other tools built by companies like Meta and Google. Even QUIC, for example. QUIC is the transport protocol underneath HTTP/3, built by Google a few years ago, and it fixes some of HTTP/2's issues, like reducing round-trip times. For me, that's all extremely mind-blowing.
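The core of windowing is a small range calculation: given the scroll offset, only the rows that intersect the viewport (plus a small overscan buffer) get rendered. Libraries like react-window do essentially this under the hood; the sketch below is a simplified standalone version with fixed-height rows.

```javascript
// Hedged sketch of the windowing calculation behind virtualized lists.
// Returns the inclusive index range of rows to render.
function visibleRange(scrollTop, viewportHeight, rowHeight, totalRows, overscan = 3) {
  // First and last rows that intersect the visible viewport.
  const first = Math.floor(scrollTop / rowHeight);
  const last = Math.ceil((scrollTop + viewportHeight) / rowHeight) - 1;
  // Pad with `overscan` rows on each side to avoid flicker while scrolling,
  // clamped to the valid index range.
  return {
    start: Math.max(0, first - overscan),
    end: Math.min(totalRows - 1, last + overscan),
  };
}

// A 15,000-item dropdown showing a 300px viewport of 30px rows renders
// only ~16 rows at a time instead of all 15,000.
const range = visibleRange(600, 300, 30, 15000);
console.log(range); // { start: 17, end: 32 }
```

Everything outside the range is replaced by empty spacer height, which is why a 15k-item dropdown stays responsive.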

Closing Thoughts

Now, finally, some closing thoughts. I really think that understanding the internals and the rationale behind the tools we use comes in really handy when we have to build our own abstractions. From this talk, as takeaways, I'd point to the jscodeshift tool for running massive codebase migrations, and even using Babel and static analysis to understand how people were using jQuery. There's this post by Tom Dale from 2017 called, Compilers are the New Frameworks. He says, "So here's my advice for anyone who wants to make a dent in the future of web development: time to learn how compilers work". I'm not even talking about the hundreds of languages that compile to JavaScript, and that's a non-exhaustive list. I'm talking about what we saw: using static analysis and compiler technology to build a DSL to solve actual problems at work. That's impressive.

A huge part of our tooling, our linters, formatters, bundlers, transpilers, type checkers, minifiers, CSS tools, all of that is a lot of static analysis. Still on those references, there's a post by Ryan Carniato, the creator of SolidJS, who used to be part of the Marko team at eBay. It's called, Compiling Fine-Grained Reactivity, from 2022. He says, "Reactive systems have existed even in this space for years. In fact, reactivity was seen as a bad thing for a while with the rise of popularity of React. The thing that has made reactive programming interesting, again, are compilers". He continues, "Static analysis and compilation let us take what we know of your code structure and optimize the creation path as we already know what you're trying to create. We can see what parts of the template is static. We can infer from where dynamic expressions are used how to run the most optimal code".

We've seen that with things like Prepack, a tool Facebook used to have, and now Million. We have Solid, we have Svelte, and now we have the React Compiler too. If you check the blog of Sathya, one of the core maintainers of the React Compiler, he has a lot of interesting posts on the programming language theory behind it. You would never imagine that amount of computer science in frontend. It's really interesting. Back to the whole polyfill thing: yes, you have to ship polyfills for all the browsers your users might have.

At the same time, it's a bad idea to ship everything they could theoretically need. Whenever possible, try to correlate business metrics with your performance metrics, because at the end of the day those matter way more than Lighthouse or WebPageTest scores. Don't take fast networks, CPUs, and RAM for granted. Test on real phones and real networks, and leverage RUM a lot. Bandwidth means different things across the globe. Even something like 4G and LTE, which we understand as one thing, actually means different things in different places: LTE can be very slow in some regions and very fast in others. Check things like The Performance Inequality Gap by Alex Russell; he publishes it every year, and it will give you a really good feel for things. The cliche is true: no silver bullet. Always identify the things that work for you.

 


 

Recorded at:

Apr 17, 2026
