The one thing that no one properly explains about React — Why Virtual DOM

I have been in love with computers and everything related since I played my first game of Dangerous Dave back in the 90s.

Graduating with a master's degree in Biology, and a bachelor's in Pharmacy; I have been fortunate enough to dabble my hands in things ranging from copywriting to computer networks.

I love pixel art!

I've written a few more things about me, here and here!

The other day a friend had this React question for me: “Composition through components, one-way data binding; I understand all that, but why Virtual DOM?”

I gave him the usual answer: "Because direct DOM manipulation is inefficient and slow."

“There’s always news on how JavaScript engines are getting performant; what makes adding something directly to the DOM slow?”

That is a great question. Surprisingly, I’ve not found any article that properly pieces it all together and makes a rock-solid case for the Virtual DOM.

It’s not just the direct DOM manipulation that makes the whole process inefficient. It is what happens after.

To understand the need for a Virtual DOM, let’s take a quick detour: a 30,000-foot view of a browser’s workflow, and of what exactly happens after a DOM change.


A Browser’s Workflow

NOTE: The following diagram and the corresponding explanation use the WebKit engine’s terminology. The workflow is similar across all browsers, save for a couple of nuances.

WebKit Main Flow

Creation of the DOM tree

  • Once the browser receives an HTML file, the render engine parses it and creates a DOM tree of nodes, which have a one-to-one relation with the HTML elements.

Creation of the Render tree

  • Meanwhile, the styles, both from external CSS files and from inline styles on the elements, are parsed. The style information, along with the nodes in the DOM tree, is used to create another tree, called the render tree.

Creation of the Render Tree — Behind the scenes

  • In WebKit, the process of resolving the style of a node and creating its renderer is called “attachment”. All nodes in the DOM tree have an "attach" method, which takes in the calculated style information and returns a render object (a.k.a. renderer).

  • Attachment is synchronous: inserting a node into the DOM tree calls the new node's "attach" method.

  • Building a render tree out of these render objects requires calculating the visual properties of each render object, which is done using the calculated style properties of each element.

The Layout (also referred to as reflow)

  • Once the render tree is constructed, it goes through a “layout” process. Every node in the render tree is given screen coordinates: the exact position where it should appear on the screen.

The Painting

  • The next stage is to paint the render objects: the render tree is traversed and each node’s “paint()” method is called (using the browser’s platform-agnostic UI backend API), ultimately displaying the content on the screen.
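As a rough illustration of the flow above, here is a toy model in JavaScript. It is only a sketch under stated assumptions: `createNode`, `attach`, `layout`, and `paint` are hypothetical helpers, not WebKit's actual API, and the layout rule (blocks stacked vertically, a default height of 20) is invented purely for the example.

```javascript
// Toy model of DOM tree -> render tree -> layout -> paint.
// Every name here is hypothetical; real engines are far more involved.

const DEFAULT_STYLE = { display: 'block', height: 20 };

// A DOM-like node: a tag, some style, and child nodes.
function createNode(tag, style = {}, children = []) {
  return { tag, style, children };
}

// "Attachment": resolve the node's style and create its render object.
// Nodes with display: none get no renderer at all.
function attach(node) {
  const style = { ...DEFAULT_STYLE, ...node.style };
  if (style.display === 'none') return null;
  const children = node.children.map(attach).filter((r) => r !== null);
  return { tag: node.tag, style, children, x: 0, y: 0 };
}

// Layout: assign coordinates to every render object.
// Simplified rule: children stack vertically inside their parent.
function layout(renderer, x = 0, y = 0) {
  renderer.x = x;
  renderer.y = y;
  let cursor = y;
  for (const child of renderer.children) {
    cursor += layout(child, x, cursor);
  }
  renderer.height =
    renderer.children.length > 0 ? cursor - y : renderer.style.height;
  return renderer.height;
}

// Paint: traverse the render tree and "draw" each object (here, record a string).
function paint(renderer, out = []) {
  out.push(`${renderer.tag}@(${renderer.x},${renderer.y}) h=${renderer.height}`);
  for (const child of renderer.children) paint(child, out);
  return out;
}

const dom = createNode('body', {}, [
  createNode('h1', { height: 40 }),
  createNode('p', { display: 'none' }), // never reaches the render tree
  createNode('p'),
]);

const renderTree = attach(dom);
layout(renderTree);
const painted = paint(renderTree);
console.log(painted);
```

Note how the hidden paragraph exists in the DOM tree but not in the render tree, and how a change to any node's height would shift the coordinates of everything stacked below it.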

Enter the Virtual DOM

So, as you can see from the above flow of steps, whenever you make a DOM change, the steps that follow it in the flow may all have to run again: rebuilding the affected parts of the render tree (which requires recalculating the style properties of the affected elements), then the layout, then the painting.

In a complex SPA, which often involves a large number of DOM manipulations, this means many avoidable computational steps, which make the whole process inefficient.

This is where the Virtual DOM abstraction truly shines: when there’s a change in your view, all the changes that would be made on the real DOM are first made on the Virtual DOM and then sent on to the real DOM together, reducing the number of subsequent computational steps.
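To make "first made on the Virtual DOM, then sent to the real DOM" a bit more concrete, here is a minimal sketch of diffing two virtual trees built from plain JavaScript objects. The helpers `h` and `diff` and the patch names are assumptions made up for this example; this is a naive index-based comparison, not React's actual reconciliation algorithm.

```javascript
// Build a virtual node: a plain object, cheap to create and compare.
function h(tag, props = {}, children = []) {
  return { tag, props, children };
}

// Compare an old and a new virtual tree and collect the list of changes
// that would need to be applied to the real DOM.
function diff(oldNode, newNode, path = 'root', patches = []) {
  if (oldNode === undefined) {
    patches.push({ type: 'CREATE', path, node: newNode });
  } else if (newNode === undefined) {
    patches.push({ type: 'REMOVE', path });
  } else if (typeof oldNode === 'string' || typeof newNode === 'string') {
    // Text nodes are represented as plain strings.
    if (oldNode !== newNode) {
      patches.push({ type: 'REPLACE', path, node: newNode });
    }
  } else if (oldNode.tag !== newNode.tag) {
    patches.push({ type: 'REPLACE', path, node: newNode });
  } else {
    // Same tag: compare props, then recurse into children by index.
    const keys = new Set([
      ...Object.keys(oldNode.props),
      ...Object.keys(newNode.props),
    ]);
    for (const key of keys) {
      if (oldNode.props[key] !== newNode.props[key]) {
        patches.push({ type: 'SET_PROP', path, key, value: newNode.props[key] });
      }
    }
    const length = Math.max(oldNode.children.length, newNode.children.length);
    for (let i = 0; i < length; i++) {
      diff(oldNode.children[i], newNode.children[i], `${path}/${i}`, patches);
    }
  }
  return patches;
}

const before = h('ul', { class: 'list' }, [
  h('li', {}, ['Apples']),
  h('li', {}, ['Pears']),
]);

const after = h('ul', { class: 'list' }, [
  h('li', {}, ['Apples']),
  h('li', { class: 'done' }, ['Pears']),
  h('li', {}, ['Plums']),
]);

const patches = diff(before, after);
console.log(patches);
```

Only two patches come out of this comparison (one changed attribute and one new list item); the untouched nodes generate no work for the real DOM.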

Update: The following comment from redditor ugwe43to874nf4 does more justice to the prominence of Virtual DOM 👏🏼

The real problem with DOM manipulation is that each manipulation can trigger layout changes, tree modifications and rendering. Each of them. So, say you modified 30 nodes, one by one. That would mean 30 (potential) re-calculations of the layout, 30 (potential) re-renderings, etc.

Virtual DOM is actually nothing new, but the application of "double buffering" to the DOM. You do each of those changes in a separate, offline DOM tree. This does not get rendered at all, so changes to it are cheap. Then, you dump those changes to the "real" DOM. You do that once, with all the changes grouped into 1. Layout calculation and re-rendering will be bigger, but will be done only once. That, grouping all the changes into one is what reduces calculations.

But actually, this particular behaviour can be achieved without a virtual DOM. You can manually group all the DOM modifications in a DOM fragment yourself and then dump it into the DOM.

So, again, what does a Virtual DOM solve? It automates and abstracts the management of that DOM fragment so you don't have to do it manually. Not only that, but when doing it manually you have to keep track of which parts have changed and which ones haven't (because if you don't you'd end up refreshing huge pieces of the DOM tree that may not need to be refreshed). So a Virtual DOM (if implemented correctly) also automates this for you, knowing which parts need to be refreshed and which parts don't.

Finally, by relinquishing DOM manipulation for itself, it allows for different components or pieces of your code to request DOM modifications without having to interact among themselves, without having to go around sharing the fact that they've modified or want to modify the DOM. This means that it provides a way to avoid having to do synchronization between all those parts that modify the DOM while still grouping all the modifications into one.
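The batching and "no synchronization between components" points from the comment can be sketched with a toy model. Everything here is an assumption for illustration: `FakeDom` stands in for the real DOM and charges one layout pass per commit (the worst case the comment describes; real browsers often coalesce work on their own), and `UpdateQueue` is a hypothetical scheduler, not an API of React or of any library.

```javascript
// Toy stand-in for the real DOM: each commit costs one layout pass.
class FakeDom {
  constructor() {
    this.nodes = new Map();
    this.layoutPasses = 0;
  }

  // Apply one or more writes, then pay for a single layout pass.
  commit(writes) {
    for (const { id, text } of writes) this.nodes.set(id, text);
    this.layoutPasses += 1;
  }
}

// Hypothetical scheduler: components request changes independently,
// and the queue sends them to the DOM together.
class UpdateQueue {
  constructor(dom) {
    this.dom = dom;
    this.pending = new Map();
  }

  // Later requests for the same id overwrite earlier ones.
  request(id, text) {
    this.pending.set(id, text);
  }

  // Flush everything gathered so far as one commit.
  flush() {
    if (this.pending.size === 0) return;
    const writes = [...this.pending].map(([id, text]) => ({ id, text }));
    this.pending.clear();
    this.dom.commit(writes);
  }
}

// Unbatched: 30 modifications, one by one.
const naive = new FakeDom();
for (let i = 0; i < 30; i++) {
  naive.commit([{ id: `item-${i}`, text: `Item ${i}` }]);
}

// Batched: 30 requests, grouped into a single commit.
const batched = new FakeDom();
const queue = new UpdateQueue(batched);
for (let i = 0; i < 30; i++) {
  queue.request(`item-${i}`, `Item ${i}`);
}
queue.flush();

console.log(naive.layoutPasses, batched.layoutPasses);
```

Both versions end up with the same 30 nodes; the difference in this model is only how many times the layout cost is paid. None of the callers of `request` needs to know about the others.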


Further Reading

The browser workflow above has been excerpted from this document on the internals of browser operations. It delves deep under a browser engine’s hood, explaining everything in detail; it is definitely worth your time to read from end to end. It helped me a great deal in understanding the “why” and in justifying the need for a Virtual DOM abstraction.

Hope this was of help. Let me know if you have any questions in the comments.

A A Karim, 8y ago

Easy enough to understand. Thanks for this.

Mev-Rael, 9y ago

Have seen this "explanation" and diagram many times. But still, useless and actually doesn't explain everything.

  • When a Virtual DOM wants to do what you call a "rerender", it needs to use the native DOM API itself, because otherwise it is just impossible to communicate with the browser. Why do I need a lot of extra layers here when I can do just A -> E instead of A -> B -> C -> D -> E, with document.createDocumentFragment() for example?

  • When exactly a browser is doing "rerendering"? What about "forced reflow", "read first and write second"? Why you think modern browsers are not doing a lot of optimization already.

  • Any real business code examples where everyone can compare good vanilla JS DOM manipulation and same code with Virtual DOM? Please no more useless innerHTML in the loop. I've been loading and rerendering 10000+ comments with plain JS easily.

  • Any benchmarks?

In normal situations a developer can answer simple questions like these quickly; however, for this "virtual" problem even the core author of React couldn't answer any of them.

DOM isn't slow, you are - https://korynunn.wordpress.com/2013/03/19/the-dom-isnt-slow-you-are/

S
  • A complex SPA involves a bunch of DOM changes whenever you perform an action on the UI. Batching all of them together as one and then sending them to the DOM, to reduce the browser computations (as explained in the story), is the Virtual DOM’s USP.

  • Of course, if only a single node change is mapped to each possible action in your UI, you are indeed adding more layers, and the concept of a Virtual DOM is useless. As far as benchmarking is concerned, you can validate this with the following benchmarking tool: https://localvoid.github.io/uibench/ For instance, you can see that the sort operations (which generally involve “more” DOM manipulations) are faster by a factor of at least 2x (the benchmark I performed was VanillaJS vs React).

  • The benefit of createDocumentFragment() is visible when you are appending new nodes to the DOM; but a general SPA use case calls not for adding new nodes to the DOM, but manipulating the ones that are already present in the DOM tree. If you say you would use it to dump the changes then you would have to write your own reconciler to know what old content to remove, to be replaced by this new fragment.

  • It is not that browsers aren’t doing a lot of optimisation; they are. Whenever a reflow/repaint happens, the browsers already optimise for it. Read the “How Browsers Work” document linked in the story for the specifics. But the fact that every DOM manipulation is followed by all the steps and calculations given above, along with the fact that you don't have to keep track of which piece requested what modification, is what calls for a Virtual DOM.

  • Lastly, the Virtual DOM is just a cherry on the React cake. What makes React (not the Virtual DOM, but React) prominent is the neat idea of having your UI as a function of your data, brought together with the concepts of unidirectional data flow and composition of your UI parts through components.

Hope this helps. Since you say you're "loading and rerendering 10000+ comments with plain JS easily" (you meant components?), may I know how exactly you are doing this? Is this a part of an app you are building? If yes, how about performing a benchmark on this and on a React alternative? That would be cool.

Mev-Rael, 9y ago
  • I understand the idea of vDOM. Any app today has changes in the UI, but for that you don't need to remove the original body/section and place a modified body/section instead all the time. With vDOM you have a bit less UI computation, but you now have new computations - you need to store in memory a huge document and each time search in it for changes. After that, when vDOM renders into the DOM, the browser will still do its job; you can't avoid it.
  • Can you provide a specification for the sorting use case - what do you have, what do you want to do and what results are you expecting? I will write a vanilla JS sorting example and then we can compare. It's hard for me to browse so many .ts files to find what exactly it is doing :) In any case, 1) in the real world you will never have so many operations at the same time and 2) screen size is very limited: you don't need to do anything with the 100 posts above that I have already scrolled past; only where I can see changes, +- some space, does the UI need to change.
  • As a developer I always know what exactly must be changed, when and where. Yes, createDocumentFragment() does the job when you add nodes, but when you modify... I again need real examples of these complicated SPAs to be able to answer this. For each use case there is a simple solution. What exactly must be done? ...and it's not sorting a huge table every 10ms on my screen. Average human reaction time is about 300ms. No more than one click per 300ms, no more than one action per 300ms.
  • What you call React's idea of separating an app into components is in no way related to or invented by React. It is a common and general approach in software architecture and has been for many, many years. It's hard to say who was first, but we can be sure that this idea is at least 25 years old, since it is one of the core concepts in what is called the UNIX Philosophy. In frontend we have actually had these components for at least 10 years of jQuery, where each jQuery plugin or subset of the jQuery UI library is a component. Talking about the data, every app today inherits from 3-tier architecture (MVC or MVwhatever), and again it is a very old principle and not connected to React. Every piece of software has separation of business logic and presentation logic. User clicks on a button, calls for an action, XHR is made, server reads the DB, returns data, client puts this data in the DOM. It was always unidirectional, and Angular's 2-way data binding just blew up the Internet with another buzzword and useless technique. In my vanilla apps I have a very small window.Api object; after that I have a models directory with small ajax/API-speaking objects composed with window.Api, and my usual vanilla reflow looks like:
    // Like a comment.
    // btn, PostComment and PostCommentUI are defined elsewhere in the app.
    btn.addEventListener('click', () => {
      const commentId = btn.dataset.id;
      // Waiting for the AJAX response; if there are errors,
      // PostComment (based on the Api object) will show a small alert.
      PostComment.like(commentId).then(() => {
        // If everything is OK, we do our rerendering here.
        // It's up to the PostCommentUI object how to do that;
        // in this case it could be just
        // 1) incrementing .textContent (and NEVER innerHTML) in the <span> where the like count is stored
        // 2) marking the like button as active, e.g. btn.classList.add('active'); that's all
        PostCommentUI.like(commentId);
      });
    });
    
A

you need to store in memory a huge document and each time search in it for changes.

That's not true, because it does not store the whole VDOM in memory forever. On each render (setState() call) it makes a lightweight DOM representation and efficiently compares it with the real DOM. Then, after the render, the garbage collector removes the VDOM from memory.

The main benefit of the VDOM: it does efficient DOM patches. The second benefit is scheduling DOM updates. The first DOM render will be as slow as innerHTML.

Why you think modern browsers are not doing a lot of optimization already.

We have legacy and mobile browsers that can't do work effectively.

K

Of course it stores the VDOM in memory. What would it compare against when it does a rerender?

The virtual dom will always be slower than manual dom updates. Just try making a list of 10,000 items and appending to the list. The virtual dom will have to do an expensive diff each time only to see that one element has changed. In the end it will call the same appendChild method that you would have called in Vanilla JS. A few times I've had to bail out of the virtual dom because it was just too slow.

The virtual dom is about making things easier for developers. It is not faster and it doesn't unleash some magical API that only the virtual DOM can access. If you don't believe me React core developers have said the same thing. Namely that doing what the virtual dom does in vanilla js will always be faster.

Batching updates doesn't make it faster either. Browsers wait for all tasks and microtasks to finish before rendering the next frame. So even if the virtual dom does dom updates all at once and you do them in vanilla JS mixed in with your business logic, it won't make a difference unless you are using certain properties and methods that trigger a reflow. If you use fragments you can beat the virtual dom every time.

T

So, does the vDOM do the WHOLE DOM? or just a specified singular element? I assume vDOM is based on something like MutationSummary (https://github.com/rafaelw/mutation-summary) for picking up changes to it? Furthermore is the vDOM essentially just a JSON/similar representation of the real DOM?
Is it possible to view/access the vDOM in react?

S

Yes, a copy of the whole DOM tree is maintained in the form of plain JavaScript objects. It seems MutationSummary is a library just to keep track of the DOM changes. VirtualDOM is different. Think of it as a proxy to your DOM.

It is more like this library here: https://github.com/Matt-Esch/virtual-dom

Instead of updating the DOM when your application state changes, you simply create a virtual tree or VTree, which looks like the DOM state that you want. virtual-dom will then figure out how to make the DOM look like this efficiently without recreating all of the DOM nodes.

Why would you want to access the virtual DOM? I'm not sure, but there is no direct access to it that I know of.

Also, I recommend reading this article for the "How-it-works" part of the Virtual DOM: http://calendar.perfplanet.com/2013/diff/

T

@saiki Essentially I want to serialise the DOM and send it to a server. However, what I also want to do is save any changes to the DOM and send only the changes too. But it feels like the vDOM isn't exactly what I'm looking for?
I say this because I need to access/select the virtual DOM in order to send it somewhere, whereas with React it feels like it does all this in the background and then just outputs the results for the browser engine to process.

I hope I explained myself well enough there.

S

@hipkiss91 Ah, I see! Probably the dom-serialize library along with MutationSummary, will be of help. This is a super interesting use case; let us know what solution you come up with, when you do. :)

T

@saiki Thanks for the pointer! Unfortunately I did actually already create a DOM serialiser! Thanks for the interest too :). I just thought I might be barking up the wrong tree (see what I did there). Thanks again
