How to filter a dataset based on multiple choices

G’day everyone,

I’m pretty new to all of this (I’ve used lots of Stata and some R but not JavaScript), so I hope this is an easy question. I’ve done a lot of googling and can’t find the answer (likely my bad searching!).

So I have a dataset with multiple variables (e.g. temperature, electricity demand and time) and can make a scatterplot of one variable vs another. (e.g. temperature vs demand - Excellent!)

I can then add a selection box to select something about a third variable and have it highlighted on the graph based on that selection. (e.g. temperature vs demand at 6 o’clock - Excellent!)

What I would like is to be able to make multiple (or no) selections in the selection input and have the graph highlight all of them (e.g. temperature vs demand at 6, 8 and 11 o’clock).

I think it doesn’t work at the moment because my filter for the data in the plot uses:
filter: (d) => d.hour == selectTimes

And I guess I need to know what to change the == into so that it’s filtering based on the array contained within selectTimes?

I’m sharing an example notebook here and would greatly appreciate any help! Thanks!

You’re very close!!

When you select just one hour, like 5, the value of selectTimes is [5]. In the plot’s filter, you’re checking if the current hour, something like 5, equals selectTimes.

The == in JavaScript can be pretty confusing: 5 == [5] is true, even though the left side is a number and the right side is an array! But 5 == [5, 6, 7] is false, so choosing multiple things means none of them show up.

(You can think of the JavaScript double-equals as casting the array to a string in this case; "5,6,7" == [5, 6, 7] is true. There are tables of examples here. People often recommend never using == and only ever using the triple equals ===, which is stricter and helps you catch things like this that only work in a sorta fragile way.)
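If you’d like to see those comparisons for yourself, here’s a small sketch you can paste into a browser console or a notebook cell. The variable names are just placeholders for illustration, not anything from your notebook:

```javascript
// Loose equality (==) turns the array into a string before comparing.
const oneHour = [5];
const manyHours = [5, 6, 7];

console.log(5 == oneHour);          // true:  [5] becomes "5", which becomes 5
console.log(5 == manyHours);        // false: [5, 6, 7] becomes "5,6,7"
console.log("5,6,7" == manyHours);  // true:  both sides end up as "5,6,7"
console.log(String(manyHours));     // "5,6,7"

// Strict equality (===) does no conversion, so a number never equals an array.
console.log(5 === oneHour);         // false
```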

For the filter to work in the more general case, where you want the filter function to return true for hour 5 whether selectTimes is [5] or [5, 6, 7], you can use the JavaScript array’s includes method:

filter: (d) => selectTimes.includes(d.hour),
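Here’s a minimal sketch of that same predicate running outside the plot, so you can see what it keeps. The rows and their values are made up for illustration; only the hour field and the selectTimes name match your notebook:

```javascript
// Hypothetical rows standing in for the notebook's dataset (values are made up).
const rows = [
  { hour: 5, temperature: 11, demand: 300 },
  { hour: 6, temperature: 12, demand: 340 },
  { hour: 8, temperature: 15, demand: 420 },
  { hour: 11, temperature: 19, demand: 390 },
];

// What the multi-select input would hold after choosing 6, 8 and 11.
const selectTimes = [6, 8, 11];

// Same predicate as the plot's filter option.
const isSelected = (d) => selectTimes.includes(d.hour);

const highlighted = rows.filter(isSelected);
console.log(highlighted.map((d) => d.hour)); // [6, 8, 11]
```

One thing to watch: includes compares without any type conversion, so if the hours in the data were stored as strings like "6" while the selector held numbers, nothing would match.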

Another consequence of the way you were using == is that 0 == [] is true, so, when you have nothing selected, it shows hour 0. With includes, that doesn’t happen anymore; the default highlights nothing. If you wanted the default state to show everything, you could do:

filter: (d) => !selectTimes.length || selectTimes.includes(d.hour),
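To check how the “show everything by default” version behaves, here’s a rough sketch that wraps the predicate in a helper. The name makeHourFilter and the sample rows are my own placeholders, not part of your notebook:

```javascript
// Build the filter predicate for a given selection (hypothetical helper name).
function makeHourFilter(selected) {
  // An empty selection has length 0, which is falsy, so !0 is true
  // and every row passes; otherwise only the selected hours pass.
  return (d) => !selected.length || selected.includes(d.hour);
}

// Made-up rows for illustration.
const sample = [{ hour: 0 }, { hour: 6 }, { hour: 8 }];

console.log(sample.filter(makeHourFilter([])).length);     // 3 (nothing selected: everything)
console.log(sample.filter(makeHourFilter([6, 8])).length); // 2 (just hours 6 and 8)

// For comparison, the old == check with nothing selected matches only hour 0.
console.log(sample.filter((d) => d.hour == []).length);    // 1
```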

I also noticed that the hours in your data go from 0 to 23, but the hours in your selector go from 1 to 12. Also, instead of writing out a big list like [0, 1, 2], you can write d3.range(0, 3) (it goes up to but not including the second number; more info).
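In case it helps to see what that produces, here’s a plain-JavaScript stand-in that mimics d3.range(start, stop) for whole-number steps. It’s only a sketch (the range helper below is hypothetical, and in your notebook you’d just call d3.range directly):

```javascript
// Plain-JavaScript stand-in for d3.range(start, stop) with a step of 1.
// (Hypothetical helper; the real d3.range also accepts a step argument.)
function range(start, stop) {
  return Array.from({ length: Math.max(0, stop - start) }, (_, i) => start + i);
}

console.log(range(0, 3)); // [0, 1, 2]

// All 24 hours, matching data that runs from 0 to 23.
const hours = range(0, 24);
console.log(hours.length, hours[0], hours[23]); // 24 0 23
```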

Here’s a suggestion (you can just click a button to merge it) with these tweaks:


Thank you so much!
Not just for the solution (which is great) but for taking the time to explain it to me, I really appreciate it!