Skip to content

[WIP][lexical] Feature: GenMap - A generational copy-on-write NodeMap implementation #7674

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

etrepum
Copy link
Collaborator

@etrepum etrepum commented Jul 6, 2025

Description

Replace the naïve Map implementation of NodeMap with a version optimized for structural sharing when the Map is over a size threshold (>= 1000 entries currently).

GenMap has two "modes" based on the _mutable flag:

  • !mutable - The nursery is considered readonly, on first mutation it will either do a Collection (create a new _old from the previous _old and _nursery) or make a Shallow Copy of _nursery before changing to mutable mode. This is the default state.
  • mutable - The nursery is considered writable, it has been modified since construction or clone.

Each GenMap contains two maps (both created on demand):

  • _old - A read-only snapshot of some previous state (equivalent to an old Map based NodeMap), generated by the last collection
  • _nursery - The working set of nodes (or tombstones) that have changed from _old. Tombstones are used to mark deletions of nodes that are still present in _old but will be removed at the next collection. Whether or not this is safe to mutate directly is based on the _mutable flag.

A GenMap can be cloned from another GenMap. This will mark the previous GenMap as not mutable (so its nursery must be copied before use) and reference its _old and _nursery directly.

  clone(): GenMap<K, V> {
    this._mutable = false;
    const clone = new GenMap<K, V>();
    // clone is already _mutable = false
    clone._old = this._old;
    clone._nursery = this._nursery;
    clone._size = this._size;
    return clone;
  }

Terms

Collection is when the next _old is generated. Pseudocode:

this._old = new Map(this.entries());
this._nursery = new Map();
this._mutable = true;

Shallow Copy is when a copy or the nursery is made (_old is not changed). Pseudocode:

this._nursery = new Map(this._nursery);
this._mutable = true;

Notes

This is an optimization because the cost of copying a Map is linear in both memory and time. In large documents, most nodes are not updated on each update cycle, so we are only paying the cost to copy the smaller nursery. Occasionally we pay the full cost to do a collection, but this is amortized since we don't do it often.

Related:

Test plan

All existing suites pass

TODO

  • Benchmarks
  • More specific tests

Copy link

vercel bot commented Jul 6, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
lexical ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jul 6, 2025 11:06pm
lexical-playground ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jul 6, 2025 11:06pm

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jul 6, 2025
@etrepum etrepum added the extended-tests Run extended e2e tests on a PR label Jul 6, 2025
Comment on lines 541 to 543
for (const [key, node] of editorState._nodeMap) {
nodeMap.set(key, $cloneWithProperties(node));
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't figure out why it was done this way, there aren't really any tests that exercise this properly. I could only guess that this was a bodge that would be better fixed by updating _cloneNotNeeded accordingly?

editor._dirtyLeaves = new Set();
editor._dirtyElements.clear();
} else {
editor._editorState = editorState;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code didn't do anything at all if there was no active state, this seemed like the right thing to do? Not really sure without good tests either way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. extended-tests Run extended e2e tests on a PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants