Sorry for the rookie question, but can anyone point me to a use of Plot that shows a count of items each hour from data that is rows of timestamp, item?
Here's a quick page with sample data and an explanation: Hourly binning question / kpm-at-hfi / Observable
Typically something like this:
Plot.rectY(data, Plot.binX({y: "count"}, {x: "timestamp"})).plot()
Or, if you explicitly want to bin by hour:
Plot.rectY(data, Plot.binX({y: "count"}, {x: "timestamp", thresholds: d3.utcHour})).plot()
If you have too many bins (in your case, 668 hours span the dataset), then it's possible you'll end up with zero-width invisible rects. You can mitigate this by setting inset: 0, which eliminates the gap between rects.
Plot.rectY(data, Plot.binX({y: "count"}, {x: "timestamp", thresholds: d3.utcHour, inset: 0})).plot()
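For intuition, here's a rough plain-JavaScript sketch of what binning by hour amounts to: floor each timestamp to the start of its UTC hour and count the rows per hour. This is not how Plot implements it; the `rows` sample and the `countByUtcHour` helper are made up for illustration.

```javascript
// Hypothetical sample rows; the real data lives in the notebook linked above.
const rows = [
  {timestamp: new Date("2021-05-01T08:15:00Z"), item: "a"},
  {timestamp: new Date("2021-05-01T08:45:00Z"), item: "b"},
  {timestamp: new Date("2021-05-01T09:05:00Z"), item: "c"},
  {timestamp: new Date("2021-05-02T08:30:00Z"), item: "d"}
];

const HOUR_MS = 60 * 60 * 1000;

// Count rows per UTC hour: floor each timestamp to the start of its hour
// and use that instant (as an ISO string) as the bin key.
function countByUtcHour(data) {
  const counts = new Map();
  for (const d of data) {
    const start = new Date(Math.floor(d.timestamp.getTime() / HOUR_MS) * HOUR_MS);
    const key = start.toISOString();
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}

const hourly = countByUtcHour(rows);
```

Note that 08:30 on the second day lands in its own bin here, separate from 08:00 on the first day: each bin is a specific hour on a specific date.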
Thanks, Mike. I did get close to that thanks to the cheatsheets, but (I probably didn't ask very well) I'm hoping to see all of the 8am occurrences (regardless of day) get counted up in one 8am bar.
Ah, I see, like an intraday histogram. You're pretty close already.
The main thing is that you'll have to be a little bit more explicit in telling the bin transform how to compute thresholds. If you specify the thresholds option as a number but don't specify the domain option, it'll compute the domain (the extent) of your data automatically, and then try to divide that into the specified number of bins. For your data, the natural domain is [5, 15]; if you ask for 24 thresholds, you'll get [5.5, 6, 6.5, 7, …, 14.5]. In other words, you'll get half-hour bins even though your data is defined on the hour.
To fix this, you can specify the domain explicitly to [0, 24]:
Plot.plot({
  marks: [
    Plot.rectY(data, Plot.binX({y: "count"}, {
      x: d => d.timestamp.getHours(),
      domain: [0, 24],
      thresholds: 24
    })),
    Plot.ruleY([0])
  ]
})
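As a sanity check on the half-hour bins mentioned above, the raw bin width is just the domain extent divided by the requested number of bins. A minimal sketch, with a hypothetical `rawStep` helper; it only does the division and doesn't reproduce whatever rounding the bin transform applies afterwards:

```javascript
// Raw bin width when a domain [lo, hi] is split into `count` equal bins.
// `rawStep` is an illustrative helper, not part of Plot or D3.
function rawStep([lo, hi], count) {
  return (hi - lo) / count;
}

// Domain inferred from the data (hours 5 through 15), 24 bins requested.
const inferredStep = rawStep([5, 15], 24);

// Domain set explicitly to the whole day, 24 bins requested.
const explicitStep = rawStep([0, 24], 24);
```

With the inferred [5, 15] domain the raw width comes out well under an hour, which is consistent with ending up at half-hour bins; with [0, 24] it is exactly one hour.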
You can alternatively specify the thresholds as [1, 2, 3, 4, …, 23]:
Plot.plot({
  marks: [
    Plot.rectY(data, Plot.binX({y: "count"}, {
      x: d => d.timestamp.getHours(),
      domain: [0, 24],
      thresholds: d3.range(1, 24)
    })),
    Plot.ruleY([0])
  ]
})
However, it looks like there's a bug with the last bin with this approach I need to investigate. (Probably the legacy of this d3-array bug.) (It's not a bug… you just have to specify the domain explicitly if you don't want the last bin to have zero width. I put up a PR for an interval option to make this easier.)
Lastly, another gotcha is that date.getHours will use your browser's local timezone. That's fine if all of your viewers are in the same timezone; but if they're not, you'll probably want to use UTC, so do some timezone conversion yourself.
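A small sketch of the difference, using only the built-in Date methods. The `hourAtOffset` helper is hypothetical and assumes a fixed offset in minutes, so it ignores daylight saving time; treat it as a starting point rather than a complete timezone solution.

```javascript
const t = new Date("2021-05-01T08:30:00Z");

// Same answer for every viewer, whatever their timezone.
const utcHour = t.getUTCHours();

// Depends on the timezone of the machine running the code.
const localHour = t.getHours();

// Hypothetical helper: hour of day at a fixed offset from UTC, in minutes
// (e.g. -300 for UTC-5). Does NOT account for daylight saving changes.
function hourAtOffset(date, offsetMinutes) {
  return new Date(date.getTime() + offsetMinutes * 60 * 1000).getUTCHours();
}

// In the Plot examples above, the accessor could then be written as
//   x: d => d.timestamp.getUTCHours()
// or
//   x: d => hourAtOffset(d.timestamp, -300)
```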
This is great… thanks! Good note about the time zone. I'm only doing some internal analysis so I get to be sloppy, but you're spot on if I was going to make this available more widely.
Just for learning's sake… is there any way to be able to display the bins for hours 0, 1, 2, etc that have a count of zero? It would be kind of nice/intuitive to know that I'm looking at the whole day's worth of hours. I feel like I've done this with Vega-Lite, but wanted to try it in the new shiny.
Yep, you can add filter: null if you don't want to suppress the empty bins.
Plot.plot({
  marks: [
    Plot.rectY(data, Plot.binX({y: "count", filter: null}, {
      x: d => d.timestamp.getHours(),
      domain: [0, 24],
      thresholds: 24
    })),
    Plot.ruleY([0])
  ]
})
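If it helps to see the same idea without Plot, here's a minimal sketch of an intraday count that keeps the empty hours: one slot per hour of day, all starting at zero. The `sample` rows and the `countByHourOfDay` helper are made up for illustration, and the sketch uses UTC hours so the result doesn't depend on the local timezone (the Plot examples above use getHours).

```javascript
// Hypothetical sample rows spanning three days.
const sample = [
  {timestamp: new Date("2021-05-01T08:15:00Z"), item: "a"},
  {timestamp: new Date("2021-05-01T09:05:00Z"), item: "b"},
  {timestamp: new Date("2021-05-02T08:30:00Z"), item: "c"},
  {timestamp: new Date("2021-05-03T08:59:00Z"), item: "d"}
];

// Count rows per hour of day (0-23), regardless of date.
// Hours with no rows stay at zero instead of being dropped.
function countByHourOfDay(data) {
  const counts = new Array(24).fill(0);
  for (const d of data) {
    counts[d.timestamp.getUTCHours()] += 1;
  }
  return counts;
}

const perHour = countByHourOfDay(sample);
```

Here all three 8am rows end up in the same slot, and the array always has 24 entries, which is the plain-array analogue of keeping the empty bins.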
Ooo nice… thanks!