The key challenge in integrating D3 code into a React app is how to break up responsibility for DOM manipulation, so that the two frameworks don’t step on each other’s toes yet updates are efficient and reliable. There are a lot of ways to go about this, and which is best may depend on the specifics of your application. The key is to draw a very clear line between the responsibilities of React and D3 and never let either one cross into the other’s territory. React will always provide the overarching structure and D3 the details of the chart, but the exact boundary can be drawn in several places.
One other note - most of the discussion below (except, for example, react-faux-dom, which is tailored to D3) applies just as well to integrating other packages or JS components inside a React app.
- Have React create a container element but put nothing in it
- Attach D3 (or any other javascript) code to React lifecycle methods to manipulate the contents of the container
- This is what the official react-plotly.js component does
- It's good practice to create your D3 code separately, with an API you can call from the React component. This isn't strictly necessary though, a simple chart could be coded entirely within the React component.
- Example: http://nicolashery.com/integrating-d3js-visualizations-in-a-react-app/
Pros:
- Can use arbitrary code & packages from outside the React ecosystem
- Can reuse the D3 code outside React
- Easy for developers already familiar with D3
- Good performance - potentially the best especially for partial updates, but at a complexity cost
Cons:
- Significant code in lifecycle methods. For simple use this is essentially boilerplate, but once you start worrying more about performance and complex interactions it's more than boilerplate, it can require tricky logic and operations.
- Not React-idiomatic - doesn’t benefit from React diffing inside the plot
- No Server-side rendering (SSR)
- Have D3 manipulate a fake DOM, let React render it, using the react-faux-dom package
- Example: https://medium.com/@tibotiber/react-d3-js-balancing-performance-developer-experience-4da35f912484
Pros:
- Can use D3 idioms
- Can use D3 code built outside of React (mostly - some references to the faux DOM end up sprinkled in with the D3 code)
- Allows SSR
Cons:
- Slower (two fake DOMs) although some clever usage can mitigate this at least partially.
- Only pure D3 is intended to work - not all of the DOM API is supported, so arbitrary JS may or may not succeed.
- React owns the elements
- D3 owns their attributes
- Example: https://medium.com/@sxywu/on-d3-react-and-a-little-bit-of-flux-88a226f328f3
Pros:
- Managing element creation/deletion is often easier with JSX than D3
- But you can use more of D3 than just the math - in particular transitions (with caveat about exit transitions).
- Good performance
Cons:
- Hard to separate code cleanly - React & D3 mixed together
- Can be tricky to know which parts of D3 you can/can’t use
- No SSR
- Use the mathematical parts of D3 to calculate attributes
- Then pass those attributes to React for actual rendering
- It's completely orthodox to use D3 this way. This is one of the big reasons D3 reorganized from one big package in v3 to many subpackages in v4, not just to reduce your bundle size. In fact the great majority of D3's subpackages don't touch the DOM, they're just there to help with all the little manipulations and edge cases needed to turn data into visual attributes.
- Example: https://www.smashingmagazine.com/2018/02/react-d3-ecosystem/ (which also contains examples of several other of these strategies, as well as an appraisal of a few react-specific charting libraries)
Pros:
- Pure React output
- Allows SSR
- Good performance
Cons:
- No reuse of outside D3 code, unfamiliar to D3 devs
- Need to use D3 at a fairly low level
- Need to reimplement the pieces of D3 that do create/manipulate DOM elements (which are some of the toughest pieces, like drawing axes)
The rest of this discussion will delve into the first approach, lifecycle method wrapping, as it’s the most general and flexible (and the only real option for incorporating packages like plotly.js that use generic JS as well as D3). The articles above do a thorough job explaining the other options, and in particular the D3-only-for-the-math approach should already be quite familiar to a React developer.
The general pattern for this approach is:
- Create a container element in render, that the D3 operations will be constrained to operate within.
- Use ref to pass this element to D3.
- Create the D3 visualization in componentDidMount
- Tear it down in componentWillUnmount
- Update it in componentDidUpdate
- Make sure the D3 component has its dynamic appearance (including all user interactions except maybe transients like hover effects that you never want to impact any other components) fully specified by its state object(s)
- Pass these state objects down from the React props
On the React side this looks like:
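import React, { Component } from 'react';
// RadarPieD3 is the D3 module shown below; import it from wherever you keep it

class RadarPie extends Component {
    constructor(props) {
        super(props);
        this.getRef = this.getRef.bind(this);
    }

    componentDidMount() {
        RadarPieD3.create(this.el, this.props.figure);
    }

    componentWillUnmount() {
        RadarPieD3.destroy(this.el);
    }

    componentDidUpdate() {
        RadarPieD3.update(this.el, this.props.figure);
    }

    // Keep a reference to the container so the lifecycle methods can pass it to D3
    getRef(el) {
        this.el = el;
    }

    render() {
        return (
            <div ref={this.getRef} />
        );
    }
}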
And on the D3 side, something like:
import d3 from 'd3'; // written against the d3 v3 API

const RadarPieD3 = {};

RadarPieD3.create = (el, figure) => {
    // Create any structure and attributes that are independent
    // of the chart's attributes
    const svg = d3.select(el).append('svg');
    svg.append('text')
        .classed('title', true)
        .attr({
            'text-anchor': 'middle',
            y: 30
        });

    RadarPieD3.update(el, figure);
};

RadarPieD3.update = (el, figure) => {
    const width = figure.width || 400;
    const height = figure.height || 500;
    const title = figure.title || '';
    const xCenter = width / 2;
    // Shift the center down to leave room for the title, if there is one
    const yCenter = (height + (title ? 50 : 0)) / 2;
    const maxRadius = Math.min(xCenter, height - yCenter);

    const svg = d3.select(el).select('svg')
        .attr({
            width: width,
            height: height
        });
    svg.select('.title')
        .attr('x', xCenter)
        .text(title);

    const len = figure.data.length;
    // No key function: elements are matched to data by array index
    const slices = svg.selectAll('path').data(figure.data);
    slices.enter().append('path');
    slices.exit().remove();

    const arc = d3.svg.arc() // this is for d3v3, it moved to just d3.arc in d3v4
        .innerRadius(0);
    const colors = d3.scale.category20();
    const angularScale = d3.scale.linear()
        .domain([0, len])
        .range([0, 2 * Math.PI]);
    const radialScale = d3.scale.sqrt()
        .domain([0, d3.max(figure.data)])
        .range([0, maxRadius]);

    // Each slice spans an equal angle; its radius encodes the data value
    slices.each(function(d, i) {
        d3.select(this).attr('d', arc({
            startAngle: angularScale(i),
            endAngle: angularScale(i + 1),
            outerRadius: radialScale(d)
        }))
        .attr('fill', colors(i));
    })
    .attr('transform', 'translate(' + xCenter + ',' + yCenter + ')');
};

RadarPieD3.destroy = (el) => {
    // Nothing to do in this case, but if you create something disconnected,
    // like a WebGL context or elements elsewhere in the DOM (plotly.js does
    // this as an off-screen test container for example) it should be
    // cleaned up here.
};

export default RadarPieD3;
Now we can call our RadarPie component with something like props: {figure: {data: [5, 1, 3, 4, 10], title: 'Sectors'}}, and updates to any of the figure options will be reflected on screen.
All of that is fairly straightforward. The challenges come from performant incremental updates to the D3 component, events generated inside the D3 component, and large data sets vis-a-vis mutable/immutable data structures.
The normal D3 enter/exit/update pattern is already a good start at ensuring high performance. Reusing elements efficiently is important. If you’re using animation for object constancy you don’t have a choice about this: you must use a .data key function that uniquely identifies the object for modification. But if you aren’t animating, you can often see gains by omitting the key function entirely (which results in the array index being used as the key, i.e. maximal element reuse) and just restyling the same elements with each update.
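To make the difference concrete, here is a plain-JavaScript model of what a data join decides. It only illustrates the bookkeeping and is not D3's implementation; `joinByKey` and the sample records are hypothetical.

```javascript
// Simplified model of a data join: which data points keep an existing element
// (update), which need a new one (enter), and which old elements no longer
// have a matching datum (exit). The default key is the array index.
function joinByKey(oldData, newData, key = (d, i) => i) {
    const oldByKey = new Map(oldData.map((d, i) => [key(d, i), d]));
    const enter = [];
    const update = [];
    newData.forEach((d, i) => {
        const k = key(d, i);
        if (oldByKey.has(k)) {
            update.push(d);
            oldByKey.delete(k);
        } else {
            enter.push(d);
        }
    });
    return { enter, update, exit: Array.from(oldByKey.values()) };
}

const oldPoints = [{ id: 'a' }, { id: 'b' }, { id: 'c' }];
const newPoints = [{ id: 'b' }, { id: 'c' }, { id: 'd' }];

// Keyed by id: 'a' exits, 'd' enters, 'b' and 'c' keep their own elements,
// which is what object constancy during a transition needs.
const keyed = joinByKey(oldPoints, newPoints, d => d.id);

// Keyed by index: all three elements are reused and simply restyled.
const byIndex = joinByKey(oldPoints, newPoints);
```

With the index key nothing is created or removed, at the price of an element that used to show 'a' now showing 'b'.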
The next level of performance improvements comes from short-circuiting updates or pieces of the update that won’t do anything. For example if you’re just changing color there’s no need to resize the elements; if you have several sets of bars and only one has new data, only that one needs to be updated. plotly.js accomplishes this by running its own diffing algorithm (within the Plotly.react method) that determines the minimal update path needed.
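A much-simplified sketch of this idea follows. It is not how plotly.js implements its diff; the figure fields (`width`, `height`, `colors`, `data`) and the step names `resize` and `restyle` are assumptions made for illustration.

```javascript
// Decide which parts of a hypothetical chart's update routine need to run
// by comparing the previous and next figure objects field by field.
function planUpdate(prev, next) {
    const sizeChanged = prev.width !== next.width || prev.height !== next.height;
    // Identity check: assumes data arrays are replaced, not mutated in place
    const dataChanged = prev.data !== next.data;
    return {
        // New data or a new size moves elements, so geometry must be recomputed
        geometry: sizeChanged || dataChanged,
        // A pure color change only needs the cheap restyle step
        style: prev.colors !== next.colors
    };
}

function update(prev, next, steps) {
    const plan = planUpdate(prev, next);
    if (plan.geometry) steps.resize(next);
    if (plan.style) steps.restyle(next);
    return plan;
}

// Usage: only the color changes, so only the restyle step runs
const prevFigure = { width: 400, height: 500, colors: 'blue', data: [5, 1, 3] };
const nextFigure = { ...prevFigure, colors: 'red' };
const calls = [];
update(prevFigure, nextFigure, {
    resize: () => calls.push('resize'),
    restyle: () => calls.push('restyle')
});
// calls is now ['restyle']
```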
There are caveats to this, particularly with charts: often a change in one object will have ripple effects on the others that aren’t really apparent in the data structure. For example, with an autoranged axis, adding a new high point to one data series will require rescaling all points in all series. Or, in a stacked bar chart, changing data in a series in the middle of the stack will require shifting all the higher series but not the lower ones. This kind of coupling is much less common in the regular HTML portions of a React app. This is partly a result of the explicit layout required for SVG, but largely it’s inherent in the fact that encoding data in visual attributes needs to place that data in the context of all other related data.
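The stacked-bar case can be illustrated in a few lines of plain JavaScript. `stackOffsets` is a hypothetical helper that computes the baseline of each series at a single x position.

```javascript
// Baseline (starting height) of each series in a stack at one x position.
function stackOffsets(values) {
    const offsets = [];
    let total = 0;
    for (const v of values) {
        offsets.push(total);
        total += v;
    }
    return offsets;
}

const offsetsBefore = stackOffsets([2, 3, 4, 1]); // [0, 2, 5, 9]
// Change only the second series (index 1) from 3 to 5
const offsetsAfter = stackOffsets([2, 5, 4, 1]);  // [0, 2, 7, 11]

// Indices of the series whose baseline moved
const moved = offsetsBefore
    .map((offset, i) => (offset !== offsetsAfter[i] ? i : -1))
    .filter(i => i !== -1);
// moved is [2, 3]: only the series above the edited one shift, even though
// their own data did not change. Series 1 keeps its baseline but needs a
// new height.
```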
A D3 chart component can generate a lot of internal events: hover, selection, zoom/pan, toggling visibility… the list goes on. There are broadly speaking three approaches to dealing with these in a React app. In all cases you want to keep the state changes associated with these events in sync with the app state:
- Have React bind to these events without the D3 component doing any DOM manipulation in the event handler, and use them to update the props at the app level, which then passes them back down to the D3 component, which then updates through componentDidUpdate. This is the most React-idiomatic method, but it can take some extra effort to ensure adequate performance of the D3 component’s update. It’s also generally not possible to do this with 3rd-party components or those written for use outside React.
- Have React bind to these events without the D3 component doing any DOM manipulation in the event handler, use them to update props that are then applied to a React (non-D3) sibling element, so the D3 component does not update at all. This can be a good solution for high-rate updates like hover effects that can be overlaid on the D3 output rather than integrated with it, but again is generally only possible in D3 components that are purpose-built for integration with React.
- The D3 component updates its own state (and resulting DOM) and then emits an event. The React component binds to the event and reads or calculates the updated state. This gets incorporated into the React app state and passed back down to the D3 component via componentDidUpdate. The trick then is to ensure the D3 component recognizes this state as unchanged from the state it already prepared for itself, so it doesn’t re-render (in the worst case leading to an infinite loop of events and DOM updates - it can be necessary to include some basic identity checks in shouldComponentUpdate to prevent this).
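For the third approach, the guard against feeding the component's own change back into it might look like the following sketch. The `createChart` factory, its `onChange` callback and the `range` state field are all hypothetical, and the render counter stands in for real DOM work.

```javascript
// Minimal stand-in for a D3 component that owns its state, emits changes,
// and ignores updates that merely echo the state it has already rendered.
function createChart(onChange) {
    let current = null;   // last state the chart rendered
    let renderCount = 0;

    function render(state) {
        current = state;
        renderCount += 1; // a real chart would touch the DOM here
    }

    return {
        // Called from a D3 event handler, e.g. at the end of a zoom
        userZoom(range) {
            const state = { ...current, range };
            render(state);
            onChange(state); // tell the React side what happened
        },
        // Called from componentDidUpdate with the state taken from props
        update(state) {
            if (state === current) return; // our own change coming back
            render(state);
        },
        getRenderCount: () => renderCount
    };
}

// Stand-in for the React app: stores the state and passes it straight back
let appState = { range: [0, 10] };
const chart = createChart(state => {
    appState = state;
    chart.update(appState);
});

chart.update(appState); // initial draw: first render
chart.userZoom([2, 5]); // second render; the echoed update is skipped
```

The identity check only works if the app hands back the very same object the chart emitted. If the app copies or normalizes the state on the way, the comparison has to look at the relevant fields instead.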
Approaches 1 & 2 are great for new D3 code you are writing explicitly for a React app - they fit the one-way data flow paradigm, making it easy to clearly and predictably update both the component that generated them and any other coupled components, for example down-selecting the data in one chart based on selection in another. For a non-React-specific integration like Plotly.js though these approaches won’t work, so we use the third approach and put significant effort into ensuring the component knows what constitutes a real change vs its own change feeding back in. Which brings us to our last point:
Immutable data structures, or at least immutable usage, are very common in React apps, because they make the diffing process - central to efficient DOM updates - a simple identity check at each node of the state tree. But if you have large data sets that change quickly (streaming data or user edits, for example) immutable updates may not be feasible, either for speed or memory reasons. Nor can you do a full element-by-element diff of these large data arrays with each update. This concern isn’t really specific to D3 at all, it could come up in a pure React app, but it’s more likely when D3 or other data visualization packages get involved since in pure HTML it’s difficult to display this much data on a single screen.
React’s declarative data model doesn’t allow us to annotate specific changes - all you have is the old state and the new state, so you can’t insert a flag like “the y data changed” or repeated updates with that same state would erroneously tell us to keep updating. The solution that plotly.js uses is a datarevision property. The value of this property is arbitrary - it could be a hashed version of the data or a serial number that gets incremented whenever the data changes for example. We just know that if this property changed there is an update somewhere in one of the data arrays in the plot, and if it didn’t change, the data arrays are the same as in the previous state.
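A minimal sketch of such a check, assuming a figure object that carries a `datarevision` counter next to a mutable `y` array. The helper `dataChanged` is hypothetical and is not plotly.js code.

```javascript
// Compare revision markers instead of walking the (possibly huge) arrays.
function dataChanged(seenRevision, figure) {
    return figure.datarevision !== seenRevision;
}

const figure = { y: [1, 2, 3], datarevision: 0 };
let seenRevision = figure.datarevision;

// Mutate the array in place: an identity check on figure.y cannot see this
const sameArray = figure.y;
figure.y.push(4);
const identityChanged = figure.y !== sameArray; // false

// Bumping the revision alongside the mutation is what signals the update
figure.datarevision += 1;
const needsRedraw = dataChanged(seenRevision, figure); // true
seenRevision = figure.datarevision;

// Applying the same state again does not trigger another redraw
const needsRedrawAgain = dataChanged(seenRevision, figure); // false
```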
This concept can be extended to whatever optimized update pathways your component makes available. If, for example, your component can update more efficiently when new data is appended to the end of the data arrays than if existing data have been altered, you could make two properties like datarevision and dataextent. You would increment datarevision only when previously existing data is changed, and dataextent when appending new data. If each data series has its own update pathway, give each series a separate revision property.
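As a sketch of how a component might act on the two counters, the function below picks an update path. The path names and the `chooseUpdatePath` helper are hypothetical.

```javascript
// Pick the cheapest update path from the two revision counters.
function chooseUpdatePath(prev, next) {
    if (next.datarevision !== prev.datarevision) {
        return 'full';   // existing points changed: redraw everything
    }
    if (next.dataextent !== prev.dataextent) {
        return 'append'; // only new points at the end: draw just those
    }
    return 'none';       // data untouched
}

const initial = { datarevision: 3, dataextent: 10 };
const appended = { datarevision: 3, dataextent: 11 };
const edited = { datarevision: 4, dataextent: 11 };

const pathAfterAppend = chooseUpdatePath(initial, appended); // 'append'
const pathAfterEdit = chooseUpdatePath(appended, edited);    // 'full'
const pathUnchanged = chooseUpdatePath(edited, edited);      // 'none'
```

Checking datarevision first means that an update which both edits and appends falls back to the full redraw, which is the safe choice.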