how to achieve simple array wrangling - mapping two values together

I would like to play around with MA tax data. I’ve managed to isolate the two values I would like in my array, and I’d appreciate some help bringing them together.

Essentially, I have one array for ‘year’ and another for ‘amount’:

I would like to bundle them together to make an array that conforms to this shape/structure (not sure the term) from Mike’s Line Chart notebook:

I guess the question is: How do I map together only two array ‘columns’ / ‘fields’ while renaming their key values according to this specific format?

data = Array(n) [
  0: Object {
    date: 2007 /// indicative of one data point
    value: 193000
  }
]

In the past, @radames reminded me of the pattern to map out specific array values, which I used to show the year and amount variables (above).

I have tried to join these in some funny ways. E.g. Here I try to map values together, but only manage to nest an array in another array:

data_redux = tax_average_single_family.map(({date, value}) => ({date: tax_average_single_family.map(v => v["Year"]), value: tax_average_single_family.map(v => v["Single Family Tax Bill*"]),}))

Here’s my working notebook with other strange attempts ;):

As @mootari pointed out another time when I faced a very similar problem, I have a lot to learn about data structuring and re-structuring :frowning: [Apologies, I am still on the basics!] I fear a large part of my problem is not being able to research the problem using correct and concise terminology.

Any help and guidance is appreciated!


If your question amounts to “how do I index an array of objects by a (unique) property value?”, then I’d recommend creating either a Map or an object:

const map = new Map(tax_levy_by_class.map(o => [o['Fiscal Year'], o]));
// Access:
map.get(year)

Note that a Map does not convert key types, so you must look a value up with the same type of key you stored it under. If you’ve passed the years as strings, then map.get("2003") will work, but map.get(2003) won’t (and vice versa).

Objects don’t have the same “problem”, because object property keys are always converted to strings:

const obj = Object.fromEntries(tax_levy_by_class.map(o => [o['Fiscal Year'], o]));
// Access:
obj[year]
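
To make the key-type difference concrete, here is a minimal sketch. The rows array, its values, and the "Total" column are made up for illustration; only the 'Fiscal Year' column name comes from the notebook.

```javascript
// Hypothetical sample rows; the real data comes from the CSV in the notebook.
const rows = [
  { "Fiscal Year": "2003", "Total": 100 },
  { "Fiscal Year": "2004", "Total": 120 }
];

// Map keys keep their type, so a string key only matches a string lookup.
const byYearMap = new Map(rows.map(o => [o["Fiscal Year"], o]));
const mapString = byYearMap.get("2003"); // the 2003 row
const mapNumber = byYearMap.get(2003);   // undefined, because 2003 is not "2003"

// Object property keys are converted to strings, so both lookups succeed.
const byYearObj = Object.fromEntries(rows.map(o => [o["Fiscal Year"], o]));
const objString = byYearObj["2003"]; // the 2003 row
const objNumber = byYearObj[2003];   // the same row
```

If the years were parsed as numbers before building the Map, the situation would be reversed: only map.get(2003) would find the row.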

If you’re using D3 anyway, then how about d3.zip:

d3.zip([1, 2, 3], [4, 5, 6])
// Output: [[1,4],[2,5],[3,6]]
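
Zipping only produces pairs, so one more map step is needed to get the {date, value} objects the line chart expects. A rough sketch follows, assuming two equal-length arrays; the zip helper is a stand-in written here so the example runs without D3, and the sample values are hypothetical.

```javascript
// Stand-in for d3.zip so the sketch runs without D3 (assumes equal-length arrays).
function zip(a, b) {
  return a.map((item, i) => [item, b[i]]);
}

// Hypothetical sample values standing in for the year and amount arrays.
const yearColumn = ["2007", "2008", "2009"];
const amountColumn = ["193000", "199000", "201000"];

// Turn each [year, amount] pair into a {date, value} object.
const zippedData = zip(yearColumn, amountColumn).map(function (pair) {
  return { date: pair[0], value: pair[1] };
});
```

With D3 loaded, d3.zip(yearColumn, amountColumn) could replace the helper call.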

PS: Your use of embedding to convey what’s going on with code is a clever idea, but I think that the way input and output are structured in the notebook is not really ideal here.


Thanks @mootari!

Regrettably I think this isn’t my question. Following your map example, I can map the year to a value:

But this isn’t exactly it. Rather, I need to re-key these values somehow so that they’re returned as an array of objects:

data = Array(n) [
  0: Object {
    date: 2007 /// indicative of one data point
    value: 193000
  },
  1: Object {
    date: 2008 /// indicative of one data point
    value: 199000
  },
  n: Object { /// indicative end value
    date: 2009 /// indicative of one data point
    value: 201000
  }
]

You grabbed tax_levy_by_class, so for this one it’d be Fiscal Year by, I guess, Personal Property Levy.

The idea is that for each original array object, I would like to choose some number of values and return a new object with the data keys above (all of this just to transform my source data into the format expected by the line chart).

Thanks @mcmcclur!

This approach gets me a lot closer, but what I am looking for is slightly different.

re-keyed_data

Thanks - and yes, the notebook is a mess. It started by me pulling down the two CSV files I wanted to graph, and then me trial-and-error working my way through isolating the data I want to play with, and then trial-and-error experimenting how to re-format it so that it has the key values expected by the line chart. I note that the data in many of Mike’s examples get re-formatted in this way. I am still very far away from being able to ‘de-structure’ and ‘re-structure’ information to fit nicely.

Yes, I figured that pairing was the crux of the matter, but it looked like you knew where to go from there.


Ah ha! I know nothing of anything! :stuck_out_tongue:

This must mean something in my notebook gave you this idea. I’ll try again.

Reflecting on the problem, I also thought I should look back into filter methods, as that might be all I need.

It was also my impression that you’re trying to merge multiple data sets by year, and thus need a way to look up values.

If you simply want to transform the data, all you have to do is write

tax_levy_by_class.map(o => {
 return {
   date: o['Fiscal Year'],
   value: o['Personal Property Levy ']
 };
})

I very strongly suggest that you avoid all short form notations like () => ({}) for the time being. You may also want to avoid nested map calls and instead define your data transformations as named function callbacks. Example:

function transformTaxData(obj) {
  return {
    date: obj['Fiscal Year'],
    value: obj['Personal Property Levy ']
  };
}
tax_levy_by_class.map(transformTaxData)

Brevity can be detrimental if it hinders your understanding of your code.
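
For reference, here is how the long and short forms relate, as a small sketch. The sample row is hypothetical (only the column names come from the notebook), and the function name toPoint is made up for illustration.

```javascript
// Hypothetical sample row using the column names from the notebook.
const sampleRows = [{ "Fiscal Year": "2007", "Personal Property Levy ": "193000" }];

// Long form: a named function with an explicit return.
function toPoint(obj) {
  return { date: obj["Fiscal Year"], value: obj["Personal Property Levy "] };
}
const longForm = sampleRows.map(toPoint);

// Arrow function with a block body: the braces start a function body, so return is required.
const blockForm = sampleRows.map(obj => {
  return { date: obj["Fiscal Year"], value: obj["Personal Property Levy "] };
});

// Short form: the parentheses turn the braces into an object literal that is returned implicitly.
const shortForm = sampleRows.map(obj => ({ date: obj["Fiscal Year"], value: obj["Personal Property Levy "] }));

// Common pitfall: without the parentheses the braces are a function body with no return,
// so every element maps to undefined.
const pitfall = sampleRows.map(obj => { date: obj["Fiscal Year"] });
```

All three working forms produce the same array; the pitfall version is the kind of silent mistake the short notation makes easy.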


Thanks @mootari!

This is it. I was starting down that road, but was only managing to return a function:

tax_average_single_family_rename = tax_average_single_family, (d) => {
  return { 
    date: d["Year"],
    value: d["Single Family Tax Bill*"],
  }
}

…And thanks to your example, I can see that my ‘punctuation’ was at fault (defining this as a function of d).
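
A likely explanation for why that attempt returned a function: in JavaScript, "a, b" is a comma expression that evaluates both sides and keeps only the last one, so the array was discarded and the function itself became the result. The sketch below shows this in plain JavaScript rather than in an Observable cell, with a made-up sample row and a hypothetical function name.

```javascript
// Hypothetical sample row with the column names from the notebook.
const singleFamily = [{ "Year": "2007", "Single Family Tax Bill*": "193000" }];

function rename(d) {
  return { date: d["Year"], value: d["Single Family Tax Bill*"] };
}

// Comma expression: singleFamily is evaluated and discarded; the result is the function.
const attempt = (singleFamily, rename);

// Calling .map applies the function to each element instead.
const renamed = singleFamily.map(rename);
```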

Agreed, especially since I don’t yet know what they look like in the long form!

Awesome. Thank you - this also helps me understand the terms!

:+1: Agreed!

Shameless plug for a “dataflow” library (a pipeline concept using generators/iterables, to encourage folks to only iterate over the source data once): https://github.com/hpcc-systems/Visualization/tree/trunk/packages/dataflow

Solution using the above lib:
https://observablehq.com/@gordonsmith/dataflow-join-test


fwiw, another take

year.map(function(d,i) { return {year: year[i], amount: amount[i]} })

(see the [MDN map documentation](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/map) for the second argument, index.)


Awesome @minshall! This looks like exactly what I imagined might be possible. Super succinct.

To make sure that I understand your and @mootari’s solution, please allow me to elaborate how I am conceptually trying to read each.

In @mootari’s solution, it looks like I create a value obj by describing that value as a function that assigns the value of an object’s property (not sure if this is the right word; it’s like a ‘data category’, here ‘Fiscal Year’) to a corresponding key called date. When invoking the map method on my CSV dataset in tax_levy_by_class.map(transformTaxData), I am calling that function to run/iterate over each property in the dataset with that value. I can perform this operation for any number of defined data ‘properties’ / data categories (‘Fiscal Year’, 'Personal Property Levy ', etc.).

In your code (which I’ll adjust slightly to tax_average_single_family.map(function(d,i) { return {year: year[i], amount: amount[i]} })) I am doing a similar thing: running a data map on the result of a function that, for each data object (represented by i ?) in a data array (represented by the d ?), I am returning the corresponding value in another dataset–in this case year and amount ?

These are two quite different and helpful to understand methods! Thank you for taking the time to show and explain!

all of year, amount, tax_levy_by_class, and tax_average_single_family are javascript arrays. the first is an array of integers (as strings); the second of strings; and the latter two are arrays of objects.

all javascript arrays have a method called map, which iterates over the array, passing each element of the array, in turn, to a function. in addition to the element of the array, the function receives the index of the element of the array within the array (indices in javascript start at 0), and (though we don’t use it here), also the entire array (in case, for example, you wanted to do the difference operator of each numeric element and its predecessor in the array).
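
a small sketch of the three callback arguments, using the difference example mentioned above. the numbers are hypothetical, and returning null for the first element is just one possible choice.

```javascript
// Hypothetical numeric values.
const bills = [193000, 199000, 201000];

// The map callback receives (element, index, array).
const changes = bills.map(function (value, index, array) {
  // The first element has no predecessor, so report null for it.
  if (index === 0) return null;
  // Difference between this element and the one before it.
  return value - array[index - 1];
});
// changes is [null, 6000, 2000]
```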

@mootari stepped back and looked at your data source (Sandisfield …), and realized all the data was in the one array of objects, and he chose tax_levy_by_class. in his function transformTaxData(), each time it is invoked with an array element (which, in this case, is an object), he extracts two properties (“Fiscal Year”, “Personal Property Levy ”) from the passed object, and uses them to create, and return, an “object literal”, with the properties you were looking for: date, value.

@mootari’s answer pleases me more than mine, as he isn’t relying on two totally separate arrays somehow conforming in terms of number of elements, ordering of the elements, etc. and, he only needed to use the first argument to map (parsimony is good).

but, mine works in this case (as one would expect it to, given the way year and amount were produced). my code uses the index i, passed by the array map function, to index into both the year and the amount arrays and, as in @mootari’s code, create an object literal with the two properties you wanted.

note that i could have written my routine

year.map(function(d,i) { return {year: d, amount: amount[i]} })

as element i of the array year is passed, for each i, as the first argument to the map function. (i called this argument d, but that choice is basically arbitrary, just as was the choice of i for the index argument.)
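
to check that the two versions really agree, here is a sketch with hypothetical sample values standing in for the notebook's year and amount arrays. the length check is an added precaution, not part of the original one-liner.

```javascript
// Hypothetical sample values standing in for the notebook's year and amount arrays.
const year = ["2007", "2008", "2009"];
const amount = ["193000", "199000", "201000"];

// Index-based pairing only makes sense when both arrays line up.
if (year.length !== amount.length) {
  throw new Error("year and amount must have the same length");
}

// Version 1: use the index i for both arrays.
const byIndex = year.map(function (d, i) {
  return { year: year[i], amount: amount[i] };
});

// Version 2: use the element d directly, since d is year[i].
const byElement = year.map(function (d, i) {
  return { year: d, amount: amount[i] };
});
```

renaming the keys to date and value in the returned object would give the shape the line chart expects.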

hth.
