Data join only works when data is in separate cell?

I’m working on building my understanding of d3, and right now I’m having trouble understanding some behavior of selections (especially in Observable).

In this notebook, I try to define my data in the same cell where I join that data to a selection, but it doesn’t work, and I’m not sure why.

You need to do the data-join after the data has changed.

It’s due to Observable’s built-in reactivity (the dataflow runtime). It’s described here:

The second example works because the entire chart is discarded and recreated from scratch when its data changes (because the data is defined as a generator cell, and the chart cell references it).

In contrast, the first example mutates the data (by pushing onto the array), but that data is a local variable, and cells only run (react) in response to changes in other cells, not local variables.

Ah! Thank you both for the help!

After struggling a bit longer, I realized that I was thinking about the data join incorrectly. I had thought that a joined selection watched for changes to the data, and updated the DOM accordingly.
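To make that concrete, here is a minimal plain-JavaScript sketch of what a join does conceptually. It is not d3’s implementation: joinByKey is a hypothetical helper, and the Set of keys stands in for the elements already in the DOM. The point is only that the join is computed once, from the data as it exists at the moment of the call, so pushing onto the array afterwards changes nothing until you join again.

```javascript
// Hypothetical helper, not part of d3: compares the keys already "on screen"
// with the keys in the data and returns a one-time snapshot of the difference.
function joinByKey(existingKeys, data, key = d => d.id) {
  const dataKeys = new Set(data.map(key));
  return {
    enter: data.filter(d => !existingKeys.has(key(d))),    // data with no element yet
    update: data.filter(d => existingKeys.has(key(d))),    // data with an existing element
    exit: [...existingKeys].filter(k => !dataKeys.has(k))  // elements with no data left
  };
}

const data = [{id: "a"}, {id: "b"}];
const onScreen = new Set(["a"]); // stand-in for elements already in the DOM

// The join sees the data as it is right now.
const first = joinByKey(onScreen, data);

// Mutate the array after the join.
data.push({id: "c"});

// `first` did not observe the push; only a new join picks up the added datum.
const second = joinByKey(onScreen, data);
```

Here first.enter holds only the datum with id "b", while second.enter holds both "b" and "c". Nothing watched the array in between.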

I think where I got confused was with the d3-force example. Specifically, I lost track of which d3 APIs maintain immutability. Next time I live this life, I’ll have to try learning d3 selection before I jump on d3-force.

I’m curious about how the organization of the d3-force example would have to be modified in the face of changing data, and I plan on updating this topic with a version of the notebook that explores that.

Potentially helpful: here’s a notebook I wrote (based on this block by Harry Stevens) when I was learning d3 and Observable

Using d3.forceSimulation does present some additional concerns: the simulation starts a timer that runs for 300 ticks by default (about 5 seconds), and the simulation mutates the passed-in nodes and links (to assign the x and y positions, and also to replace link.source and link.target with the node objects if they are specified by id rather than object reference).

This is why the example you linked makes a shallow copy of the nodes and links using Object.create, and uses the invalidation promise to stop the simulation.
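As a rough illustration of why the copy matters, the sketch below uses a stand-in for the simulation. The assignPositions function is hypothetical and only mimics the fact that d3-force writes x and y onto each node it is given. With Object.create, each copy is a new object whose prototype is the original datum, so reads fall through to the original while writes land on the copy.

```javascript
// Hypothetical stand-in for a force simulation: like d3.forceSimulation,
// it writes position properties directly onto the node objects it receives.
function assignPositions(nodes) {
  nodes.forEach((node, i) => {
    node.x = i * 10;
    node.y = i * 20;
  });
}

const original = [{id: "a"}, {id: "b"}];

// Each copy inherits from its original via the prototype chain.
const copies = original.map(d => Object.create(d));

assignPositions(copies);

// Reads of untouched properties fall through to the original...
console.log(copies[1].id); // "b"
// ...while the positions were written on the copies only.
console.log(copies[1].x, copies[1].y); // 10 20
console.log("x" in original[1]); // false
```

Because the original data is never touched, a later run of the cell starts from clean input instead of positions left over from a previous simulation.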

Hence, that example is designed to be “pure” in the sense that if the data changes, it creates a new simulation and discards the old simulation (and all its effects). That is the cleanest approach in Observable. It’s the easiest to reason about because there’s no hidden state that’s affecting the behavior of cells.

If you want to do animated transitions or incremental updates without discarding the old state, you’ll need to opt-in to some additional complexity. There are a variety of ways of doing that, but the one that I recommend is exposing an update function as in these examples:

The skeleton of that approach has two parts. Your chart cell (or whatever cell you want to be responsible for the state…) exposes an update method:

chart = {
  const svg = d3.create("svg");

  // Attach an update method to the SVG node so other cells can call it.
  return Object.assign(svg.node(), {
    update(data) {
      // Animate here! 💃
    }
  });
}

And then a cell that calls chart.update whenever the data (or whatever other state) changes:

chart.update(data)

If you have more than one thing you want to change, you can have multiple update methods with different names that take different arguments. (The “update” name is just a convention and isn’t required.)

The principle here is that the code that mutates state should live in the same cell that defines the state.
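A minimal sketch of that principle outside Observable, in plain JavaScript. The names createChart, resize, and state, as well as the plain object standing in for the SVG node, are assumptions for illustration; in a notebook the body of createChart would be the chart cell and the node would come from d3.create("svg").node().

```javascript
// Hypothetical factory: the state lives in this closure, and the only code
// that mutates it is the set of methods defined right next to it.
function createChart() {
  const node = {tagName: "svg"}; // stand-in for svg.node()
  let currentData = [];
  let currentWidth = 640;

  return Object.assign(node, {
    // Called when the data changes; a real chart would re-join and transition here.
    update(data) {
      currentData = data.slice();
      return node;
    },
    // A second, differently named method for a different kind of change.
    resize(width) {
      currentWidth = width;
      return node;
    },
    // Read-only copy of the state, for inspection.
    state() {
      return {data: currentData.slice(), width: currentWidth};
    }
  });
}

const chart = createChart();
chart.update([1, 2, 3]); // the "data changed" path
chart.resize(800);       // the "width changed" path
```

Callers can ask for a change but cannot reach currentData or currentWidth directly, which keeps every mutation in one place.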

(Also: I would avoid using Observable’s this! I now consider this an antipattern because it’s too dangerous. With animated transitions or incremental updates you typically only expect one thing to change, such as data here; with this, the old value is preserved regardless of what changes, making it much easier to write buggy code. For example—and sorry to pick on @bgchen, it’s not your fault, it’s mine for designing this—if you resize the window in @bgchen’s example the SVG element does not resize as expected because it ignores changes to width.)