From absolute values to percentages in a rollup?

I’m trying to get the syntax right to return percentages rather than absolute values in this example:
https://observablehq.com/@d3/d3-group#cell-165

So the result I’m looking for would look something like this:
"United States" => Map(3) {"Boxing" => 0.2, "Basketball" => 0.4, "Football" => 0.4}

Basically v => v.length needs to be divided by the total number of this group.

Any pointers in the right direction are much appreciated.

You might do it in two nested calls to rollup:

d3.rollup(
  athletes,
  byNation => d3.rollup(byNation, bySport => bySport.length / byNation.length, d => d.sport),
  d => d.nation
)

Thank you @Fil !

I’m just curious how byNation can be used both as the return result as well as the input parameter in the nested rollup.

A more verbose version of what Fil wrote that is (mostly) equivalent is this:

d3.rollup(
  athletes,
  function outerRollup(byNation) {
    return d3.rollup(
      byNation,
      function innerRollup(bySport) {
        return bySport.length / byNation.length;
      },
      function innerKey(datum) {
        return datum.sport;
      }
    );
  },
  function outerKey(datum) {
    return datum.nation;
  }
)

The function outerRollup introduces a new variable scope, which contains the variable byNation. That variable can be used anywhere inside its scope, and can be used multiple times.

The function innerRollup is contained inside of outerRollup, and so is contained in its variable scope, and can access any variable available to outerRollup.

Does that help understand why byNation can be used in both places?
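If it helps, here is the same scoping rule with d3 taken out of the picture. This is only a minimal sketch: the names outer and inner and the letters in the array are made up for illustration, and outer calls inner directly, where in the real code d3.rollup does the calling.

```javascript
function outer(group) {
  // `group` is a parameter of outer, so it lives in outer's scope.
  function inner(subgroup) {
    // inner is defined inside outer, so it can read `group`
    // in addition to its own parameter `subgroup`.
    return subgroup.length / group.length;
  }
  // Pass the first two elements to inner as a stand-in for one subgroup.
  return inner(group.slice(0, 2));
}

const share = outer(["a", "b", "c", "d"]);
console.log(share); // 0.5
```

Each call to outer gets its own group, so inner always divides by the size of the group it was created for.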


Yeah, that definitely helps to understand a bit better what’s going on under the hood. As byNation is passed as a variable into outerRollup it’s also available within the innerRollup, correct?

One more question about this. How is the value of byNation specified as the variable for the outerRollup?

As byNation is passed as a variable into outerRollup it’s also available within the innerRollup, correct?

Exactly, yes.

How is the value of byNation specified as the variable for the outerRollup?

That’s just how d3.rollup is defined. To get a bit more insight into that, let’s consider how d3.rollup is implemented. The actual source code is a bit intense, but here is a version that is close enough for the purposes of this discussion:

function toyRollup(array, rollupFunction, keyFunction) {
  // Separate the elements of the array into groups, based on the keyFunction
  let groups = new Map();
  for (let element of array) {
    let key = keyFunction(element);
    if (!groups.has(key)) {
      groups.set(key, []);
    }
    groups.get(key).push(element);
  }

  // For each group, use the rollupFunction to combine it into a single value
  let rolledUpData = new Map();
  for (let [key, group] of groups.entries()) {
    let combinedValue = rollupFunction(group);
    rolledUpData.set(key, combinedValue);
  }

  return rolledUpData;
}

If we then consider a single-layer version of the problem, where we just want to calculate the number of athletes in each nation:

function rollupFunction(athletesOfOneNation) {
  return athletesOfOneNation.length;
}
function keyFunction(athlete) {
  return athlete.nation;
}
let numberOfAthletesInEachNation = toyRollup(athletes, rollupFunction, keyFunction);

then we can substitute those two functions into the definition of toyRollup. This is what the solution to the question about the number of athletes in each nation would look like without d3.rollup:

function calculateNumberOfAthletesInEachNation(athletes) {
  let athletesGroupedByNation = new Map();
  for (let athlete of athletes) {
    let key = athlete.nation;
    if (!athletesGroupedByNation.has(key)) {
      athletesGroupedByNation.set(key, []);
    }
    athletesGroupedByNation.get(key).push(athlete);
  }

  let numberOfAthletesByNation = new Map();
  for (let [nation, group] of athletesGroupedByNation.entries()) {
    numberOfAthletesByNation.set(nation, group.length);
  }

  return numberOfAthletesByNation;
}

Hopefully that helps make sense of how the data moves around during the rollup process! I also hope that now you can see the appeal of having an API that can turn that complicated function into something as compact as this:

const numAthletesByNation = d3.rollup(athletes, ds => ds.length, d => d.nation);
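To tie this back to the original percentages question, here is a rough sketch that nests the toy rollup the same way Fil nested d3.rollup. The toy function is repeated in condensed form so the snippet runs on its own, and sampleAthletes is made-up data, not the dataset from the notebook.

```javascript
// Condensed copy of the toy rollup, not the real d3.rollup.
function toyRollup(array, rollupFunction, keyFunction) {
  const groups = new Map();
  for (const element of array) {
    const key = keyFunction(element);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(element);
  }
  const rolledUpData = new Map();
  for (const [key, group] of groups) {
    rolledUpData.set(key, rollupFunction(group));
  }
  return rolledUpData;
}

// Made-up sample data for illustration only.
const sampleAthletes = [
  { name: "A", nation: "United States", sport: "Boxing" },
  { name: "B", nation: "United States", sport: "Basketball" },
  { name: "C", nation: "United States", sport: "Basketball" },
  { name: "D", nation: "United States", sport: "Football" },
  { name: "E", nation: "United States", sport: "Football" },
  { name: "F", nation: "Canada", sport: "Hockey" },
];

// The outer call groups by nation; for each nation, the inner call groups by
// sport and divides each sport's count by that nation's total.
const shares = toyRollup(
  sampleAthletes,
  byNation => toyRollup(byNation, bySport => bySport.length / byNation.length, d => d.sport),
  d => d.nation
);

// United States: Boxing => 0.2, Basketball => 0.4, Football => 0.4
console.log(shares.get("United States"));
```

The outer rollup function is called once per nation, so byNation.length is always the total for the nation whose sports are being counted.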