-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
[charts-pro] Add new SankeyChart
#18895
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Deploy preview: https://deploy-preview-18895--material-ui-x.netlify.app/ Updated pages: Bundle size report
|
CodSpeed Performance ReportMerging #18895 will not alter performanceComparing Summary
Footnotes |
Great speed, good job 💪
We can make it so that the links are the source of truth, and the const data: SankeyValueType = {
nodes: [
{ id: 'A', label: 'Node A' },
{ id: 'B', label: 'Node B' },
{ id: 'D', label: 'Node D' },
],
links: [
{ source: 'A', target: 'B', value: 5, color: 'red' },
{ source: 'A', target: 'C', value: 3 },
{ source: 'B', target: 'D', value: 3 },
{ source: 'C', target: 'D', value: 1 },
],
}; The benefit of such approach is that all sources and targets in
Another potential improvement is to define the nodes as an object, rather than an array: const data: SankeyValueType = {
nodes: {
'A': { label: 'Node A' },
'B': { label: 'Node B' },
'C': { label: 'Node C' },
'D': { label: 'Node D' },
},
links: [
{ source: 'A', target: 'B', value: 5, color: 'red' },
{ source: 'A', target: 'C', value: 3 },
{ source: 'B', target: 'D', value: 3 },
{ source: 'C', target: 'D', value: 1 },
],
}; That way we have O(1) access to nodes and we ensure there aren't multiple nodes with the same ID. Do you see any downside? Finally, what's the plan for dealing with cycles? It seems d3-sankey just throws. Do we just want to let that bubble up? |
I think the downside is that you can't have "nodes" without links. You could technically want that for, in the example of energy, display untapped potential. Though the current code doesn't support that too.
I thought about it, but the underlying d3 func needs nodes as array
I had a fix for it ready, it's just checking if the |
I see. How would that work, though? If the value is in the links, there isn't a way to define a "untapped potential", right? How would you imagine that working in terms of API?
We could convert it, right? IMO, the array API looks better, but I also like the idea of making impossible states impossible to represent, which in this case would mean using an object keyed by node ID.
There can also be indirect loops, right? Something like: const data: SankeyValueType = {
links: [
{ source: 'A', target: 'B', value: 5 },
{ source: 'A', target: 'C', value: 3 },
{ source: 'C', target: 'A', value: 3 },
],
}; Creates a link A -> B -> C -> A. Looking at a single link's I'm not saying we need to implement this cycle detection algorithm ourselves. My question was more like: do we want to catch d3's |
🤦 |
Node formatAbout getting nodes as an object instead of an array, I'm more in favor of the opposite direction. Because the array as an order and this order can be used to sort nodes. Just by using the D3 methods, you get
If you already know the fix order you're looking for, providing the CyclesOf course, D3 needs to compute the Topological ordering of DAG to slice nodes per column. I think we can raise the error. Because even if you catch the cycle, you don't know which element should be removed, so we don't know what to display |
// Calculate layout based on data and dimensions | ||
const layout: SankeyLayout = React.useMemo( | ||
() => calculateSankeyLayout(data, drawingArea, nodeWidth, nodeGap, iterations), | ||
[drawingArea, data, nodeWidth, nodeGap, iterations], | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is more a comment for latter. WHile playing with Echarts, it looke like the sa,key ayout is a state that get saved in a state, such that external modification can have different behaviors:
- drawing area update => reshape the current structure with linear interpolation between the previouse and next area available
- data update => do a full re-computation
This is especially visible when you play with the drag and drop
Capture.video.du.2025-08-01.11-36-24.mp4
About choice between array/Map strucutre, the internal frequently convert from array to either In case you wonder how the circular get find, It's a brutal |
An object also has consistent sorting. We can allow the provided object to serve as the order, though I would prefer we just accept a sorting function instead for custom orders.
We can just return the error for now I guess. |
<defs> | ||
<clipPath id={clipId}> | ||
<path d={`M${x0},${y0} L${x1},${y0} L${x1},${y1} L${x0},${y1} Z`} /> | ||
</clipPath> | ||
</defs> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you seen this alternative solution to avoid clipPath
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah but his solution creates this weird shape

It gets more pronounced the more the Y axis is moved apart.

I have current solution that creates a shape without these issues though.
I'm just not sure the styling drawbacks were worth it, though it is up for discussion. I still have the implementation in a stash.
Mainly the line can be dashed and the browser does all the work for us. The code to generate good lines is complex.
In short it:
- Gets the current link data
- Creates a curve based on the given parameters
- Split this curve into multiple paths
- Calculate the angle of each path
- Create a new point above+below the original point using the angle (extruding the line)
- Ensure no point/shape is crossing the boundaries
- Transform into a path.d
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about
- if link height > link width => D3+clip-path
- else => naive aproach
I spent to much time trying to find a formula that tells us when D3 link will do weird stuff or not without success
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is a "math solver"? 🤔
Thanks for the formula, will try it out. I also spent some time trying to figure it out, but couldn't.
I've noticed I forgot to share the repo I was trying stuff on, if you want to play. 😓 https://github.com/JCQuintas/sankey-link-path
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used sympy to run the maths. The exact terms might be "symbolic computation" since I did not use the solver method :)
If you want to have a look
from sympy import *
init_printing()
H, W, h, t = symbols('H W h t', real=True)
# Besier curve starting at (0, 0) going to (W, H) with control points at (W/2, 0), (W/2, H)
x = t**3*W - 3/2*W*t*t + 3*t*W/2
y = -t**3*2*H + 3*H*t*t
# The derivative of the besier curve
xd = diff(x, t)
yd = diff(y, t)
# The norm of the derivative
nd = sqrt(xd**2 + yd**2)
# The parallel curve at a distance h/2 from the initial curve
xp = x + h/2 * yd / nd
yp = y - h/2 * xd / nd
# compute x'(t) for the parallel curves
result = simplify(diff(xp, t))
# If a weird shape occurs, at the end the curve will go from right to left instead of left to right.
# So x'(t=1) will be negative
print(simplify(result.replace(t, 1)) )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've never seen this 😮
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you've simple maths to do you can try
https://www.wolframalpha.com/input?i=cos%28atan%28x%29%29
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ended up using the naive approach and adding a prop instead to better control the control points of the bezier curve. This offers nice looking curves, paths, and an option for the user to adjust the looks if they want the links to look nicer/fuller
Screen.Recording.2025-08-21.at.14.36.45.mov
packages/x-charts-pro/src/SankeyChart/SankeyTooltip/SankeyTooltipContent.tsx
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a minor concern about item identifier type. The rest looks good 👍
|
||
return { | ||
identifier, | ||
color: node.color!, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All those color!
and value!
are sign of a type issue. If you're sure those are with default value TS should be aware, because user doing their own tooltip will not know be able to guess it.
IMO the root cause is your bending of the notion of identifier.
The initial idea was to provide the minimal set of data needed to find an item. Most of the time the
- series type
- series id
- data index
such that if one value is different you know it's not the same item.
Your SankeyItemIdentifier
contains much more information. For example it can get the color, width, ..
For me the SankeyItemIdentifier
should only contains
- series type
- series id
- (type="node", node id) | (type"link", source id, target id)
And from this identifier the tooltip should get the full data.
For example when user will want to control the highlight, the highlightedItem
prop will get type SankeyItemIdentifier
which allow them to provide color
and other useless properties.
Providing a more complete object for onClick
events looks good, but we should have two distinct types:
SankeyItemIdentifier
: A dry type that only focus on the necessary data. Used for our internal comparisonsSankeyItemIdentifierWithData
: A type that also contains all the data we know about the link/node related to this item. Used for user facing callback
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The root cause is the sankey types in d3, the identifier types have nothing to do with the !
assertions. The d3 sankey types assume the link/node are both initialised and not initialised, so the types are always the "non-defaultized" way, to put it into our internal language.
This causes this issues, I can change that by making everything required, and it should fix the types here, I just didn't think much of that at the time.
But regarding the identifier overloading, I'm not sure you suggestion is a good approach for sankey.
Both us and the user would have to traverse both the node and links array and re-create the data back in order to display the tooltip, while all these values are already calculated. We could store this and provide a hooks for that, but for what reason if the data needed can be readily available?
IMO it adds a lot more overhead for no real value except the idea that the identifier should be "simple".
It works for the other types because the data is simple, but as we start to work towards more complex visualisations, I believe we will need to change this concept anyways.
Eg: how would an identifier of node in a treeview look like? We could argue a path of ids like ['451ad', '1212b', 'jki40']
, or the node entity itself with a parent ref node.parent
are both valid, but which one would be best?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both us and the user would have to traverse both the node and links array and re-create the data back in order to display the tooltip, while all these values are already calculated. We could store this and provide a hooks for that, but for what reason if the data needed can be readily available?
I think there is a miss-understanding. If the data is available, I'm in favor of providing it to the user. Having an onClick?: (event, fullNodeData) => void
is ok. I even think it could be an improvement for other charts in next majors.
My issue is with the typing. If you reintroduce the highlight feature, you will get itemHighlighted?: SankeyItemIdentifier | null
. And for me it feels wrong that the autocomplet in itemHighlighted
proposes to provide the node value, color, and other fields.
For example you have a dashboard about energy with multiple charts, and want you have a feature allowing you to select between electricity, gaz, or fuel to highlight/modify charts. In that case I would expect to write something as follow, and don't see any other property supported by TS
<SankeyChart highlighterItem={{type: 'sankey', subType: 'node', id: energyType }} />
Idealy we would have SankeyItemIdentifierWithData extends SankeyItemIdentifier
to simplify controll
const [nodeItem, setNodeItem] = React.useState<SankeyItemIdentifierWithData | null>(null);
<SankeyChart
highlighterItem={nodeItem} // I pass more data than needed but that's ok
onNodeClick={(_, newNode) => setNodeItem(newNode)} // I get all the data available
/>
It works for the other types because the data is simple, but as we start to work towards more complex visualisations, I believe we will need to change this concept anyways.
Eg: how would an identifier of node in a treeview look like? We could argue a path of ids like ['451ad', '1212b', 'jki40'], or the node entity itself with a parent ref node.parent are both valid, but which one would be best?
I don't get your example. There are multiple ways to represent data, but that's not an issue. We can pick the one we prefer.
In your case have an internal representation with
/** The unique identifier **/
interface TreeMapItemIdentifier {
type: 'tree-map';
seriesId: SeriesId;
path: string[];
}
/** The item identifier used for callback**/
interface TreeMapItemIdentifierWithData extends TreeMapItemIdentifier {
value: number;
color: string;
formattedValue: string;
...
}
interface TreeMapProps {
highlightedItem?: TreeMapItemIdentifier | null;
onItemClick?: (event, item: TreeMapItemIdentifierWithData) => void;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SankeyItemIdentifierWithData extends SankeyItemIdentifier
There is no way to implement this right now I guess. Since we require the itemIdentifier
from the ChartSeriesConfig
.
I can make the type have an optional node/link, so the identifier becomes the nodeId/targetId/sourceId
export type SankeyItemIdentifier = {
type: 'sankey';
/**
* Unique identifier for the series
*/
seriesId: SeriesId;
} & (
| {
/**
* Subtype to differentiate between node and link
*/
subType: 'node';
/**
* The id of the node
*/
nodeId: SankeyNodeId;
/**
* The node object with all the calculated properties
*/
node?: SankeyLayoutNode;
}
| {
/**
* Subtype to differentiate between node and link
*/
subType: 'link';
/**
* The id of the source node
*/
sourceId: SankeyNodeId;
/**
* The id of the target node
*/
targetId: SankeyNodeId;
/**
* The link object with all the calculated properties
*/
link?: SankeyLayoutLink;
}
);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This way the link/node are optional in the identifier if you are passing that in. At least for now 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This way the link/node are optional in the identifier if you are passing that in. At least for now 🤷
Looks like the worst of both worlds 😿
- for controlled values, TS tells you to pass some data that will not be used
- for custom components, TS tells you something is optional, whereas we know it's populated
There is no way to implement this right now I guess. Since we require the itemIdentifier from the ChartSeriesConfig.
Technically speekiing it's feasible. You introduce an attribute itemIdentifierWithData
in the config with same type as the itemIdentifer
for all series except the sankey.
Then you define this helper
export type ChartItemIdentifierWithData<T extends ChartSeriesType> =
ChartsSeriesConfig[T]['itemIdentifierWithData'];
And now you can update the definition of the tooltip getter to be
export type TooltipGetter<TSeriesType extends ChartSeriesType> = (params: {
series: ChartSeriesDefaultized<TSeriesType>;
axesConfig: TooltipGetterAxesConfig;
getColor: ColorGetter<TSeriesType>;
- identifier: ChartItemIdentifier<TSeriesType> | null;
+ identifier: ChartItemIdentifierWithData<TSeriesType> | null;
}) =>
The TS passes if in SankeyItemIdentifier
the node
can be undefined. If we mark it as mandatory (which is the interesting part) it fails for a more interesting reason. The result of the selector can not be assigned to the tooltipGetter.
If I resume types in a scheme, I get
- The user provides two distinct source of truth. The data and the controlled identifier.
- We can store the identifier with or without data:
- identifier without data comes from controlled values
- identifier with data comes from our own interaction management
- In case the identifier has no data, the selector (or another piece of code) should add it

Proposal
Since the charts has no controlled value for now, we can have an identifier with node
and link
required.
Before adding a controlled highlight or tooltip, we should introduce this distinction between identifiers with/without data and the piece of code that transforms an identifier without data to an identifier with data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've implemented it on this PR, I was under the impression it would need more changes, but was quite simple
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually moved the changes here so it is easier to review: #19295
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks clean. I left minor comments
key={`${link.source.id}-${link.target.id}`} | ||
link={link} | ||
opacity={linkOptions?.opacity} | ||
onClick={onItemClick} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replacing onItemClick
by onNodeCLick
/onLinkClick
would allow to simplify the type of the context (i.e. onNOdeCLick
only getting node context).
Plus the cursor={onClick ? 'pointer' : 'default'}
would only be applied when needed. No pointer on link if only nodes have an onclick interaction
Co-authored-by: Alexandre Fauquette <[email protected]> Signed-off-by: Jose C Quintas Jr <[email protected]>
Are dropoffs included in the first release? |
@dberardo-com they are in our todo list as a sub-issue from the main issue, though there is no timeline for it yet: #7930 |
Thanks for the fast reply, I couldn't find the list of requested features that's why i asked (I guess notions having troubles at the moment). I believe dropoffs are quite a differentiating factor, but that depends on what your typical user's use cases are. Looking forward for the official docu to come up to see what the current API can achieve. Cheers |
Resolves #7930
I've changed the API to only need links array to be defined. The node is an object that can be used to provide configuration to the nodes if needed.
By default the link/source/target order is used as the node's order. Otherwise, the node array is used when present.
To improve
There are two props used for styling.
nodeOptions
andlinkOptions
. Currently they have a mix of styling and configuration. Eg:nodeOptions.color
style,nodeOptions.padding
configuration (not real padding, used in the sankey layout fn). It might not be the clearest 🤔Another possible option is to have a
nodeStyle/linkStyle
and move configuration to the root series, eg:{ nodeStyle: { fill: 'blue', ...(accept css props maybe?) }, nodePadding: 10 }
Questions
Usage
Examples
A more complex sankey, with custom color for each node/link.
Changelog
SankeyChart
under theUnstable_
key.