How to get the bin values from a Plot?

When I apply a bin transform, I know I can access the elements of each bin from the outputs object as shown in this plot:

Plot.rectY(
  olympians,
  Plot.binX(
    {
      y: "count",
      fill: (bin) => bin.some(d => d.name === "Aaron Brown")
    },
    { x: "weight" }
  )
)

If I just want to draw a line at the median frequency across all bins, like in this histogram, I can use groupZ and set x, x1, and x2 to undefined (although I’m not yet sure why that works :sweat_smile:):

Plot.ruleY(data, {
  ...Plot.groupZ(
    { y: "median" },
    Plot.binX(
      { y: "count" },
      {
        x: "date",
        // strokeDasharray: [5, 10],
        // strokeWidth: 0.25,
        thresholds: 100
      }
    )
  ),
  x: undefined,
  x1: undefined,
  x2: undefined
})

But what if I want to get the actual bins to use them in a different cell?

If I can’t extract them from the plot, I thought I could do it manually, but I must be doing something wrong because I don’t get the right value (I get 20 instead of 14):

d3.median(
  d3
    .bin()
    .value((d) => d.date)
    .thresholds(100)(data),
  (d) => d.length
) // 20

One reason might be that d3.bin is not using temporal bins: the dates are converted to numbers (milliseconds since the Unix epoch), and those numbers are what gets binned.
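As a rough illustration of that guess, here is a plain JavaScript sketch (no d3; the dates are made up for the example). A Date coerces to a millisecond count, and calendar months are not equally wide in milliseconds, so equal-width numeric bins cannot line up with calendar boundaries:

```javascript
// Hypothetical sample dates, only for illustration.
const jan = new Date(Date.UTC(2021, 0, 1));
const feb = new Date(Date.UTC(2021, 1, 1));
const mar = new Date(Date.UTC(2021, 2, 1));

// Coercing a Date to a number yields milliseconds since the Unix epoch.
const janMs = +jan; // same value as jan.getTime()

// Calendar months have different widths once expressed in milliseconds.
const DAY = 24 * 60 * 60 * 1000;
const januaryDays = (feb - jan) / DAY; // 31
const februaryDays = (mar - feb) / DAY; // 28

console.log(janMs, januaryDays, februaryDays);
```

Whether this fully explains the mismatch depends on how the thresholds are chosen, but it shows why numeric and temporal bins can differ.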

It could be. I tried your thresholdTime suggestion, but I still get a different median:

Is this not possible, then?

Can you share a notebook with your experiment and expected results?

Is this not possible, then?

I haven’t said that :slight_smile:

Plot is not meant for that, but the mark’s initialize method is the one that computes these channels.

Here you go. I was just modifying the stargazer notebook.

To get 14, you need to ignore the empty bins:

d3.median(
  d3
    .bin()
    .value((d) => d.date)
    .thresholds(thresholdTime(100))(data),
  (d) => d.length || NaN // empty bins become NaN, which d3.median ignores
)
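To see why the `|| NaN` trick changes the result, here is a small sketch in plain JavaScript. The `median` helper is a hypothetical stand-in that mimics how d3.median skips NaN values, and the bins are invented for the example:

```javascript
// Hypothetical stand-in for d3.median: drops null/undefined/NaN, then takes the middle value.
function median(values, accessor = (d) => d) {
  const v = values
    .map(accessor)
    .filter((x) => x != null && !Number.isNaN(x))
    .sort((a, b) => a - b);
  if (v.length === 0) return undefined;
  const mid = Math.floor(v.length / 2);
  // Odd length: the middle value. Even length: the mean of the two middle values.
  return v.length % 2 ? v[mid] : (v[mid - 1] + v[mid]) / 2;
}

// Made-up bins: each bin is an array of items, and three of them are empty.
const bins = [[1, 2, 3], [], [4], [], [], [5, 6]];

// Counting every bin: the counts are 3, 0, 1, 0, 0, 2.
const withEmpty = median(bins, (b) => b.length);

// Mapping empty bins to NaN leaves only the counts 3, 1, 2.
const withoutEmpty = median(bins, (b) => b.length || NaN);

console.log(withEmpty, withoutEmpty); // 0.5 2
```

The zero counts from empty bins are real values as far as the median is concerned, so they have to be turned into NaN (or filtered out) to match a result that only considers non-empty bins.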

If you call:

init = Plot.rectY(
  data,
  Plot.binX({ y: "count" }, { x: "date", thresholds: 100 })
).initialize()

you can then inspect the bins in init.data, and their bounds in init.channels.x1.value and init.channels.x2.value.

That’s super useful! Thanks, Fil. I also found your excellent notebook where you go deeper on this:

Thanks! I had forgotten about it :slight_smile:

@Fil I’m having trouble with a more complex example.

I’ve been able to get the bin values from plot, but I’m not able to get the same bins when using d3.bin.

EDIT: Okay, I figured it out. But I’m still wondering if there’s a simpler way. :sweat_smile:

To answer this:

new Date("1854-01-01"), // how can I not hardcode this?

you should probably use:

d3.utcMonth.every(4).floor(d3.min(crimea, (d) => d.date))

(Time intervals are hard, in any case)
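For anyone curious what that floor amounts to, here is a plain JavaScript sketch without d3. It assumes (my understanding, not something stated above) that a 4-month UTC interval aligns on months whose zero-based index is divisible by 4, i.e. January, May, and September. `floorToUtcMonths` and the sample rows are hypothetical:

```javascript
// Hypothetical helper: floor a date to the start of its `step`-month block in UTC.
// With step = 4 the block starts are January, May, and September.
function floorToUtcMonths(date, step) {
  const month = Math.floor(date.getUTCMonth() / step) * step;
  return new Date(Date.UTC(date.getUTCFullYear(), month, 1));
}

// Made-up rows standing in for the dataset; only the date field matters here.
const rows = [
  { date: new Date(Date.UTC(1854, 3, 1)) },
  { date: new Date(Date.UTC(1854, 7, 1)) },
  { date: new Date(Date.UTC(1855, 0, 1)) }
];

// Earliest date in the data (Dates coerce to milliseconds inside Math.min).
const earliest = new Date(Math.min(...rows.map((d) => d.date)));

// Derive the first bin boundary from the data instead of hardcoding it.
const start = floorToUtcMonths(earliest, 4);

console.log(start.toISOString()); // 1854-01-01T00:00:00.000Z
```

The point is only that the starting boundary can be computed from the minimum date, so the hardcoded `new Date("1854-01-01")` is no longer needed.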