using array.splice non-destructively?

aaronkyle · September 10, 2019, 9:25pm

Having read the documentation on Array.prototype.splice()
I understand that it will delete the spliced data from an original array. How to avoid this?

I have loaded in a csv file that has several tables (and a bunch of junk header information) using d3.csvParseRows. I wish to create an array that has a subset of the full csv file… but to leave the original array alone so that I can pull out one of the rows repeatedly as a header / key row (namely, the row with labels for each year).

I’ve tried prepending new before using splice, but this results in a TypeError: data.splice is not a constructor.

Any suggestions?

Here’s my test notebook:

And for convenience, here’s a link to a PDF version of the original data set, which shows the formatting of the original .xlsx from which my csv is constructed.

bgchen · September 10, 2019, 9:43pm

You can copy the array using .slice() before applying .splice():

afghanistan_2019_population = afghanistan_2019.slice().splice(6,6)

aaronkyle · September 10, 2019, 9:49pm

Cool. Thank you. I had read about slice but didn’t quite get how it worked till now. Thanks again!

mootari · September 11, 2019, 10:35pm

Please be aware that Array.splice() returns the removed elements, so that

foo = arr.slice().splice(6, 6)

is effectively the same as

foo = arr.slice(6, 12)

If you want to “break out” the middle without a temp variable, you can use

foo = arr.slice(0, 6).concat(arr.slice(12))
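To make the equivalence concrete, here is a minimal sketch. The 14-element `arr` is a made-up stand-in for the parsed rows, and the names `viaSplice`, `viaSlice` and `withoutMiddle` are only illustrative:

```javascript
// Hypothetical stand-in for the parsed CSV rows: 14 entries, labelled by index.
const arr = Array.from({ length: 14 }, (_, i) => `row${i}`);

// Copy first, then splice the copy: splice returns the 6 removed elements.
const viaSplice = arr.slice().splice(6, 6);

// slice(start, end) takes an exclusive end index, not a length.
const viaSlice = arr.slice(6, 12);

// Everything except indices 6 through 11.
const withoutMiddle = arr.slice(0, 6).concat(arr.slice(12));

console.log(viaSplice);            // row6 … row11
console.log(viaSlice);             // same six elements
console.log(withoutMiddle.length); // 8
console.log(arr.length);           // 14, the original is untouched
```

Note that `arr.slice(6, 6)` returns an empty array, because start and end are the same index.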

aaronkyle · September 12, 2019, 2:29am

Thank you @mootari. After playing more with slice, I figured out this equivalence. (Before, I was trying array.slice(6,6) and getting an empty array; I had confused how slice and splice operate.)

As for breaking out the middle bit - I am not yet following so I’ll play around and learn. Thank you for this.

What led me to splice rather than slice in the first place was something that I read (and will need to find again) about giving additional variable to the excerpted data…something along the lines of

array = [1,2,3,4,5]

array.splice(1,2,'k2','k3');

would return something like:

k2	k3
2	3

… but I haven’t quite gotten this working yet.

mootari · September 12, 2019, 10:31am

I recommend always having the MDN docs open to look that stuff up (e.g., I had to look up if the second arg in .slice() is length or end index). Working off of wrong assumptions is an ideal way to waste lots of time and mental energy.

The snippet returns a copy of the array without the middle part (indices 6 through 11) by taking the start of the array (indices 0 through 5) and gluing it together with the last part (index 12 to the end).

The key fact to keep in mind is that Array.splice() modifies the array in place (Array.sort() does, too). It’s rare that you’ll actually have to use splice(), since most of the time you’ll want to instead recombine parts of arrays, or map them to produce new values.
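As a small illustration of the in-place behaviour, here is what the `splice(1, 2, 'k2', 'k3')` call from the earlier post does. The variable names `removed`, `numbers` and `sorted` are just for this sketch:

```javascript
const array = [1, 2, 3, 4, 5];

// Extra arguments to splice are inserted where the removed elements were.
// The return value is the removed elements, not a relabelled copy.
const removed = array.splice(1, 2, "k2", "k3");

console.log(removed); // [2, 3]
console.log(array);   // [1, "k2", "k3", 4, 5], the original has changed

// sort() also mutates, so copy with slice() first to keep the original order.
const numbers = [3, 1, 2];
const sorted = numbers.slice().sort((a, b) => a - b);

console.log(sorted);  // [1, 2, 3]
console.log(numbers); // [3, 1, 2]
```

So the extra arguments replace values in the array; they do not attach keys to the extracted values.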

In your example, what is your precise goal (or rather problem):

Do you want to extract a “vertical slice” from the rows (I assume that array represents just a single row)?
Or do you perhaps want to create objects where the labels are the property names?
Or something else entirely?
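For reference, the first two options might look roughly like this. It is only a sketch: `rows` imitates the array-of-arrays shape that d3.csvParseRows returns, and the labels and values in it are invented placeholders, not ADB data.

```javascript
// Hypothetical parsed rows: a junk first column, then a label and one value per year.
const rows = [
  ["", "Indicator", "2000", "2010", "2018"],
  ["", "Indicator A", "10", "11", "12"],
  ["", "Indicator B", "20", "21", "22"],
];

// Drop the junk first column without touching `rows`.
const header = rows[0].slice(1);
const body = rows.slice(1).map((row) => row.slice(1));

// Option 1: a "vertical slice", i.e. one column across all data rows.
const yearIndex = header.indexOf("2010");
const column2010 = body.map((row) => row[yearIndex]);

// Option 2: objects whose property names are the header labels.
const records = body.map((row) =>
  Object.fromEntries(header.map((key, i) => [key, row[i]]))
);

console.log(column2010);           // ["11", "21"]
console.log(records[0].Indicator); // "Indicator A"
console.log(records[1]["2018"]);   // "22"
console.log(rows[0].length);       // 5, the parsed rows are unchanged
```

Because only `slice` and `map` are used, the header row stays available for reuse with every table.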

aaronkyle · September 12, 2019, 1:54pm

Thanks @mootari.

So here’s the problem in a nutshell:

The Asian Development Bank put out a dataset of ‘key indicators’ for sustainable development covering their 86 member countries. I would like to have a look at how countries are performing relative to one another.

In my normal way of things, I would go about a long and tedious process of pulling out each table of interest as its own CSV file, but this practice is time intensive, prone to errors, and not reproducible. As the data for all 86 countries is relatively uniform, I figured this might be a great occasion to learn more JS and set up a notebook that extracts all these tables for me once, and then set up a selection input (like a dropdown), and then toggle between countries.

This line of questioning I’ve been asking is me trying to learn how this sort of thing is done. Here’s the first page of one country:

At the top is a header row, with values for the years in which each data point was collected. I’ll need to join this as a header across all tables.

After the header row is the name of the table. I’ll also need to pull this out and put it back somewhere.

Then come the table data. It’s unfortunately not all that well organized and there appear to be some conversion errors when I saved from Excel to CSV, but overall things are looking pretty well aligned using csvParseRows.

So now I am at the point where I am starting to extract and rebuild each table.

I think so - yes? The CSV files render pretty poorly, so I need to select out specific columns and rows (for instance, starting with column 2 rather than column one)

Yes also? I would like to create arrays that contain the specific table information, in such a way to eventually pipe 'em through Tom’s Table’s tool (or similar) to basically re-create ADB’s publication, but in a manner more amenable to comparing and visualising the data.

Maybe… if what I am describing above turns out to be completely different than what you were asking about under the first two points above.

By the way, ADB is nice as they do allow manipulation and re-presentation of their data, with appropriate attribution (under CC BY 3.0 IGO).

mootari · September 12, 2019, 2:17pm

Without getting into specifics, here’s what I would do:

1. Download the country tables xlsx and host it on glitch.com (as a project asset) for CORS access (or use this link).
2. Create a notebook and use sheetjs to read the file.
3. Define area coordinates (header cells, row labels) in a static configuration object.
4. Compare the labels of each datasheet (flatten hierarchy to simple “parent > child” path, then count each path occurrence, finally compare counts).
5. Figure out a way to deal with “…” and “{” (perhaps summarized data can be assumed if cells below are empty?).
6. Verify that the cleaned up data structure of all sheets is identical. This step is crucial, as all your follow-up work must be able to build on the assumption that the data is clean. You don’t want to debug structural issues when you’ve already filtered and transformed the data.
7. Go nuts on the data.

4, 5 and 6 will each be a tough nut, but are critical steps.
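The label comparison (flatten to paths, count, compare) could be sketched along these lines. Everything here is assumed for illustration: the `sheets` object, its `[depth, label]` pairs and the labels themselves are hypothetical, and reading the workbook with sheetjs is left out entirely.

```javascript
// Hypothetical input: per sheet, the row labels as [indentDepth, label] pairs.
const sheets = {
  countryA: [[0, "POPULATION"], [1, "Total population"], [1, "Urban population"]],
  countryB: [[0, "POPULATION"], [1, "Total population"]],
};

// Flatten the hierarchy to "parent > child" paths.
// Assumes depths are well formed (never skipping a level).
function toPaths(labels) {
  const stack = [];
  return labels.map(([depth, label]) => {
    stack.length = depth; // drop ancestors deeper than the current level
    stack[depth] = label;
    return stack.join(" > ");
  });
}

// Count in how many sheets each path occurs.
const counts = new Map();
for (const labels of Object.values(sheets)) {
  for (const path of new Set(toPaths(labels))) {
    counts.set(path, (counts.get(path) || 0) + 1);
  }
}

// Paths that are missing from at least one sheet need a closer look.
const sheetCount = Object.keys(sheets).length;
const inconsistent = [...counts]
  .filter(([, n]) => n !== sheetCount)
  .map(([path]) => path);

console.log(inconsistent); // ["POPULATION > Urban population"]
```

An empty `inconsistent` list would suggest that all sheets share the same label structure, which is the precondition for the verification step.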

aaronkyle · September 12, 2019, 3:46pm

Thanks for the guidance! I’ll keep at this. Hopefully I have some time this week and next. ADB has their own Tableau summaries of how well countries are performing relative to the Sustainable Development Goals and other global performance targets, but they’re pretty limited. I’d really feel accomplished if I can help make these data a bit more open and accessible.
