sum of two arrays in D3?

How does one get the sum of two separate arrays? I have been trying d3.sum and d3.merge to no avail.

These are the two sum functions:

totalFruit = [d3.sum(fruits.map(function(d){ return d.quantity}))];
totalVeg = [d3.sum(veg.map(function(d){ return d.quantity}))];

I would like to merge these in a relatively straightforward way; something like this (although this code just reproduces the value of the second array):

totalBasket = [d3.sum(fruits.map(function(d){ return d.quantity}))],
              [d3.sum(veg.map(function(d){ return d.quantity}))]

or, more simply

totalBasket = d3.sum([totalFruit],[totalVeg])

Using d3.merge to group together two separate d3.sum operations, I hit errors about invalid array length.

As a digression, what is the (d) meant to signify in function(d) ? It seems one could use any letter, but typically it’s (d). Is this for data?

Thanks for any and all help!

I would tackle this without using D3 at all, with JS’s reduce function.

[Not having much experience with functional programming, I remember being quite confused by the syntax at first, but I think the examples on that doc page are really helpful.] Here’s a quick intro:

Typically whenever I have an array and I want to get a single value out of it in some way, I think about using reduce if I can. The way it works is as follows: you write a function which takes two inputs, the “accumulator” and the “current value”, and which returns the new value of the accumulator. This function is called once for each element of the array, starting with the element at index 0 and ending with the last element (if you want to go backwards, you can use reduceRight).

For instance, to sum a numerical array, we would write a function:

function sum(acc, val) {
  return acc + val;
}

and use this in reduce as follows (the 0 is the initial value of the accumulator variable):

[0,1,4,2,4,5].reduce(sum, 0)

We can also use “arrow notation” to avoid having to write a separate function:

[0,1,4,2,4,5].reduce((a,b) => a+b, 0)
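To make the accumulator mechanics concrete, here is a small sketch that records every step of the same reduction. The names values, steps, and total are just illustrative; the third callback argument (the index) is a standard part of reduce.

```javascript
// Record each call of the reducer to see how the accumulator evolves.
const values = [0, 1, 4, 2, 4, 5];
const steps = [];

const total = values.reduce((acc, val, index) => {
  const next = acc + val; // new accumulator value returned to reduce
  steps.push({ index, acc, val, next });
  return next;
}, 0); // 0 is the initial accumulator

console.table(steps); // one row per element, in index order
console.log(total);   // 16
```

Each row's next becomes the acc of the following row, which is all that "accumulating" means here.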

For your problem, where you want to sum the quantities of the objects, I’d write something like this for totalFruit and totalVeg:

function sumQuantities(acc, val) {
  // we only care about the quantity field of the object
  return acc+val.quantity;
}

totalFruit = fruits.reduce(sumQuantities, 0);
totalVeg = veg.reduce(sumQuantities, 0);

To get totalBasket, you can merge the arrays before applying reduce; remember that the .concat method concatenates arrays:

basket = fruits.concat(veg);

totalBasket = basket.reduce(sumQuantities, 0);

If you have more stuff to sum in the basket, note that you can do this (assuming the data format is the same):

basket = fruits.concat(veg, meat, fish, bread);

totalBasket = basket.reduce(sumQuantities, 0);
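Here is a self-contained sketch of the concat approach that can be pasted into a console. The sample data is made up for illustration and assumes quantity holds numbers; the real arrays in the notebook may be shaped differently.

```javascript
// Hypothetical sample data (numeric quantities assumed).
const fruits = [{ name: "apple", quantity: 3 }, { name: "pear", quantity: 2 }];
const veg = [{ name: "leek", quantity: 4 }, { name: "kale", quantity: 1 }];

function sumQuantities(acc, val) {
  // add only the quantity field of each object to the running total
  return acc + val.quantity;
}

const totalFruit = fruits.reduce(sumQuantities, 0);              // 5
const totalVeg = veg.reduce(sumQuantities, 0);                   // 5
const totalBasket = fruits.concat(veg).reduce(sumQuantities, 0); // 10

console.log(totalFruit, totalVeg, totalBasket);
```

Note that concat returns a new array and leaves fruits and veg untouched.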

And yeah, I’ve also always thought that the d convention stood for data.

EDIT: The following alternate approach which avoids the cost of concatenation might also be enlightening:

basketArr = [fruits, veg, meat, fish, bread];

totalBasket = basketArr.reduce(
  (acc, arr) => arr.reduce(sumQuantities, acc), 
0);

The inner reduce does the same thing as the ones above, except that the initial value is set to acc instead of zero. The outer reduce makes it so that acc contains a running total of the sums of arrays that have already been looked at.

Hi @bgchen, and thanks for always being so willing to help me and others learn. I just saw in another forum post today that you are new to JS as well – and yet you seem light years ahead!

I am still having trouble putting all of this together. I started a new notebook to try to implement your solution here, but I am still not seeing how the pieces are meant to connect. Specifically, I seem to be having trouble with the sumQuantities function:

Regarding the ‘reduce’ approach - is it required to always create a value of 0, add to it, and then subtract? That is, is it not appropriate to, say, start with the calculated value of X and subtract Y?

Thanks again for all your help!

Hi @aaronkyle, in Observable, functions can be used as cells directly without any wrapping:

While it’s true that I’m fairly new to JS and the web, I spent quite a bit of time programming in various other languages, so don’t feel discouraged!

Regarding the ‘reduce’ approach - is it required to always create a value of 0, add to it, and then subtract? That is, is it not appropriate to, say, start with the calculated value of X and subtract Y?

I’m afraid I don’t understand your question – what are X and Y? The second parameter to reduce is the initial value to the accumulator variable in the reducer function. To get the sum of elements of a single array, it’s most appropriate to start with zero and then add to that. If, on the other hand, you wanted to multiply the elements of an array then you would want to start with an initial value of 1 instead.
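As a quick illustration of why the initial value depends on the operation, here is a sketch comparing a product that starts from 1 with one that mistakenly starts from 0. The array factors is just made-up example data.

```javascript
const factors = [2, 3, 4];

// 1 is the identity for multiplication, so it is the natural starting point.
const product = factors.reduce((acc, val) => acc * val, 1); // 24

// Starting from 0 "poisons" the product: 0 times anything is 0.
const poisoned = factors.reduce((acc, val) => acc * val, 0); // 0

console.log(product, poisoned);
```

In the same way, 0 is the natural starting point for a sum because adding 0 changes nothing.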

If you haven’t already, have a look at this worked out example in MDN’s page on reduce. When I was first learning this I spent some time working out a few examples on paper as well and comparing to what I got from JS just to make sure I had the right “picture” in my head.

Thanks again, @bgchen.

Thanks for the working example! It seems that there’s still something a bit off with the calculations, as totalFruits seems to be concatenating the values in the array rather than adding them. Looks like it works with d3.sum wrapped around the reduce function 🙂

I’ll work over the different links you shared. Thank you. For me, going through these tutorials helps a lot indeed.

As for the question I raised on the ‘reduced’ approach - You may have already answered it, but it looks like the approach you use (and also used in Mike’s tutorial on manipulating flat arrays) is to say something like this (please excuse me for not really knowing the JS vocabulary) :

let desired_sum = 0
add variable_1
add variable_2
return desired sum

(I wrote subtract b/c my mind is on reduce, but same concept)

My question was why we start with 0. If we already have two calculated values (variable 1 = 12 and variable 2 = 6), why not just directly add and or subtract. Something like:

add variable_1, variable_2

or

subtract variable_2 from variable_1

Looks like still more reading for me!

Ah, oops! The concatenation issue is because the values in the quantity field are strings, not integers, so JS “helpfully” interprets + as string concatenation. (This is what happens when I don’t test example code or pay attention to the actual output!) The d3-free fix is to replace sumQuantities with:

function sumQuantities(acc, val) {
  // we only care about the quantity field of the object
  // ... but make sure it's an integer before we sum it!
  return acc + parseInt(val.quantity, 10); // radix 10: always parse as decimal
}
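To see the concatenation problem in isolation, here is a minimal sketch with made-up data whose quantity values are strings, as they turned out to be in the notebook. Number is shown as an alternative to parseInt in case quantities can be fractional; that is an assumption about the data, not something the notebook requires.

```javascript
// Hypothetical data with string quantities.
const fruits = [{ quantity: "3" }, { quantity: "4" }];

// number + string is string concatenation: 0 + "3" is "03", then "03" + "4" is "034".
const concatenated = fruits.reduce((acc, val) => acc + val.quantity, 0);

// Converting first gives a numeric sum.
const summed = fruits.reduce((acc, val) => acc + parseInt(val.quantity, 10), 0);

console.log(concatenated); // "034"
console.log(summed);       // 7

// parseInt truncates fractions; Number keeps them.
console.log(parseInt("2.5", 10)); // 2
console.log(Number("2.5"));       // 2.5
```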

Coming back to your question, it looks to me that what you’re saying is on the right track as well: it absolutely does make sense to use other starting values if you already have “partial progress” on a sum or other operation. I guess the key point is that while the overarching “shape” of a solution using reduce won’t vary too much, the precise details (what reducing function you use, whether the accumulator is a number, string, array, or other object, what the initial value of the accumulator will be) will depend on what you are starting with and where you’re trying to go. It gets easier to know what to try with experience, I promise!
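Here is a sketch of the “partial progress” idea using the numbers from the question (12 and 6). The veg data is invented so that it sums to 6, and numeric quantities are assumed.

```javascript
const totalFruit = 12; // a value that has already been computed

// Hypothetical data that sums to 6.
const veg = [{ quantity: 4 }, { quantity: 2 }];

// Start the accumulator at the existing total instead of 0.
const totalBasket = veg.reduce((acc, val) => acc + val.quantity, totalFruit); // 18

// If both totals are already plain numbers, no reduce is needed at all.
const totalVeg = 6;
const direct = totalFruit + totalVeg; // 18

console.log(totalBasket, direct);
```

So reduce is only needed for the step that collapses an array into one value; combining two finished values is ordinary arithmetic.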

Thanks for the encouragement, @bgchen. A lot of my challenges are understanding all the pieces and when and why they are needed. For example, in your code, although I get that we’re building a function to run on all the arrays in my ‘basket’ to derive the value of ‘quantity’, I don’t quite understand how this relates to the ‘reduce’ function, whether the acc in sumQuantities refers to the same thing as the acc in the outer reduce call, or why we introduce arr.reduce within basketArr.reduce. Just seems like a lot of manipulation 😅

Also strange to me (discovered while building from your example): I learned that I can specify each data ‘column’ and ‘row’ individually to get the ‘correct’ value

d3.sum(fruits[1].quantity+fruits[0].quantity+veg[1].quantity+veg[0].quantity)

But it doesn’t seem to be so straightforward to calculate all ‘rows’ for a specific ‘column’… i.e., this results in 0:

d3.sum(fruits.quantity+veg.quantity)

also

d3.sum(parseInt(fruits.quantity)+parseInt(veg.quantity))
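A likely reason the last two attempts fail, sketched without d3 and with made-up data: quantity lives on each object inside the array, not on the array itself, so fruits.quantity is undefined. Pulling out a whole ‘column’ needs map (or an accessor function).

```javascript
// Hypothetical data with string quantities.
const fruits = [{ quantity: "3" }, { quantity: "4" }];
const veg = [{ quantity: "5" }];

console.log(fruits.quantity);                // undefined: arrays have no quantity property
console.log(fruits.quantity + veg.quantity); // NaN: undefined + undefined
console.log(parseInt(fruits.quantity, 10));  // NaN

// Extract the column from every row, converting each value to a number.
const fruitQuantities = fruits.map((d) => Number(d.quantity)); // [3, 4]
const vegQuantities = veg.map((d) => Number(d.quantity));      // [5]

const total = fruitQuantities.concat(vegQuantities).reduce((a, b) => a + b, 0);
console.log(total); // 12
```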

I’ll keep playing. Next step will be to try to average a range of filtered values. Let’s see how it goes. Thanks again for all your time and for pointing me to some great references!!

One last note - and thanks again @bgchen for leading me to this!

fruits_sum = d3.sum(fruits, function(d) { return d.quantity; })
veg_sum = d3.sum(veg, function(d) { return d.quantity; })
basket_sum = fruits_sum + veg_sum

Totally works!

So out of curiosity, what’s wrong with plain for loops in this case? It saves on memory allocation overhead, JIT compilation, etc, and doesn’t look all that confusing to me:

totalBasket = {
  let sum = 0;
  for (let i = 0; i < fruits.length; i++) sum += fruits[i].quantity;
  for (let i = 0; i < veg.length; i++) sum += veg[i].quantity;
  return sum;
}

EDIT: Alternatively, you could try for...of, although that is currently a bit slower in most JS engines:

totalBasket = {
  let sum = 0;
  for (let f of fruits) sum += f.quantity;
  for (let v of veg) sum += v.quantity;
  return sum;
}
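The block-cell syntax above is specific to Observable. For use outside a notebook, a plain-JavaScript sketch of the same loop idea could look like this; sumBasket is a hypothetical helper name, the data is made up, and the Number conversion is added on the assumption that quantity may be stored as a string.

```javascript
// Hypothetical helper: sums the quantity field across any number of arrays.
function sumBasket(...arrays) {
  let sum = 0;
  for (const arr of arrays) {
    for (const item of arr) {
      sum += Number(item.quantity); // convert in case quantity is a string
    }
  }
  return sum;
}

// Made-up sample data.
const fruits = [{ quantity: "3" }, { quantity: "4" }];
const veg = [{ quantity: "5" }];

const totalBasket = sumBasket(fruits, veg);
console.log(totalBasket); // 12
```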
Glad you were able to come up with a solution you’re happy with. My preference for this would still be for a solution not relying on d3, but I’ll not go on about it.

Also, I agree that @Job’s for solution is better if performance is really an issue! (I certainly wouldn’t say that anything is wrong with plain for loops.)

Last few bits:

First, by the way that variable scopes work in JS, the two accs above end up referring to different things. I try to think of functions as their own little worlds, but it’s true that things can get hairy when they’re nested.
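One way to convince yourself that the two accs are independent is to rename them: the result does not change. This is a sketch with made-up nested data and numeric quantities; outerTotal and innerTotal are just illustrative names.

```javascript
// Hypothetical basket: an array of arrays of objects.
const basketArr = [[{ quantity: 1 }, { quantity: 2 }], [{ quantity: 3 }]];

// Both callbacks call their first parameter acc; the inner one shadows the outer one.
const sameNames = basketArr.reduce(
  (acc, arr) => arr.reduce((acc, val) => acc + val.quantity, acc),
  0
);

// Identical logic with distinct names makes the hand-off explicit:
// the outer total is passed in as the inner reduce's initial value.
const distinctNames = basketArr.reduce(
  (outerTotal, arr) =>
    arr.reduce((innerTotal, val) => innerTotal + val.quantity, outerTotal),
  0
);

console.log(sameNames, distinctNames); // 6 6
```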

The double reduce code I posted was indeed a bit dense! If I unwind the arrow notation, it’s equivalent to this expanded version:

basketArr = [fruits, veg, meat, fish, bread];

function sumQuantities(acc, val) {
  // add the quantity field of the object to the accumulator
  // ... but make sure it's an integer before we sum it!
  return acc + parseInt(val.quantity, 10); // radix 10: always parse as decimal
}

function sumArrays(acc, arr) {
  // sum over the entries in arr, and add it to acc
  // do this by calling reduce with sumQuantities
  return arr.reduce(sumQuantities, acc);
}

// sum over all arrays in basketArr
// do this by calling reduce with sumArrays
totalBasket = basketArr.reduce(sumArrays, 0);

The logic might seem a little convoluted, but if you think about how you would do this by hand, it may make more sense. Start with the totalBasket line and work upwards. We want to turn totalBasket into a number, and we do this by performing sums at two conceptual levels:

  1. We have to sum the quantities within each array (the inner reduce with sumQuantities does this).
  2. We have to add those per-array sums together (the outer reduce with sumArrays does this, by passing the running total acc in as the starting value of the inner reduce).

This solution has the advantage that if you want to add more stuff to your baskets, you just need to modify “data” by changing basketArr, and you don’t need to edit any of the code doing the actual summing. But this is a fine point which may not matter here.

Thanks to you both @bgchen and @job !

Just to clarify: I have no problem with not relying on D3.js and just using plain JavaScript, and I am happy for all advice and pointers on how to continue learning to code better.

I was happy to find the d3 solution that I posted mostly because I found it relatively easy to understand and because I could get it working :). I will go over all your replies more closely later tonight and in the coming days. Thanks to this thread, I started learning about reduce and other JS functions, and I’ve felt like I made some great steps today in understanding how JS works. Thank you so much for your time and patience!!
