Type Conversion

I’m trying to convert a CSV file with numbers, strings, and currency data into something that I can work with. I’m having a hard time understanding how I should do this. I looked around in the documentation, but couldn’t really grasp the concepts there.

It’d be great if somebody can help me out a bit.

Here is the dataset -

Rank,Name,Net Worth,Country,Industry
1,Elon Musk,$203B,United States,Technology
2,Jeff Bezos,$195B,United States,Technology
3,Bill Gates,$135B,United States,Technology

And here is the function -
d3.csvParse(await FileAttachment("bloomberg_billionaires.csv").text(), d3.autoType)

d3.autoType converts the rank to a number for me, but I can’t figure out how to convert the net worth to a number.
Any help would be appreciated! And apologies for a noob question.

Thanks!


Have you found the documentation for d3.autoType? It might help.

Specifically, d3.autoType only coerces to a number if the input is coercible to a number according to the JavaScript specification:

https://www.ecma-international.org/ecma-262/9.0/index.html#sec-tonumber-applied-to-the-string-type
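
To make that rule concrete, here is a minimal sketch using the unary plus operator, which applies the same string-to-number conversion. The sample strings are made up for illustration, apart from "$203B", which comes from the dataset.

```javascript
// Unary plus converts a string with JavaScript's standard string-to-number rules.
// Strings that are not valid numeric literals become NaN.
const samples = ["1", " 42 ", "1e3", "203", "$203B", "203B"];

// Pair each input with its coerced value.
const coerced = samples.map((s) => [s, +s]);

for (const [input, value] of coerced) {
  console.log(JSON.stringify(input), "->", value);
}
```

Only the strings made up entirely of a numeric literal (optionally padded with whitespace) convert; anything with a currency symbol or a unit suffix comes back as NaN.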

In this case +"$203B" evaluates to NaN, so it’s not coercing to a number. You’d need to write your own number parser for that. For example:

function parseDollars(input) {
  let match;
  if ((match = /^\$(\d+(\.\d+)?)B$/.exec(input))) return match[1] * 1e9;
  if ((match = /^\$(\d+(\.\d+)?)M$/.exec(input))) return match[1] * 1e6;
  throw new Error(`unexpected input: ${input}`);
}

Then you can say:

data = {
  const data = await FileAttachment("bloomberg_billionaires.csv").csv();
  for (const d of data) {
    d["Rank"] = +d["Rank"];
    d["Net Worth"] = parseDollars(d["Net Worth"]);
  }
  return data;
}
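
If the regular expressions in parseDollars look opaque, the following sketch may help. It is a variation on the function above, not a replacement: it captures the B or M suffix instead of hard-coding it, and the names DOLLARS, MULTIPLIER, and parseDollarsSketch are made up for this example.

```javascript
// ^ and $ anchor the pattern to the whole string; \$ is a literal dollar sign.
// (\d+(?:\.\d+)?) captures one or more digits with an optional decimal part.
// ([BM]) captures the unit suffix: B for billions, M for millions.
const DOLLARS = /^\$(\d+(?:\.\d+)?)([BM])$/;
const MULTIPLIER = { B: 1e9, M: 1e6 };

function parseDollarsSketch(input) {
  const match = DOLLARS.exec(input);
  if (!match) throw new Error(`unexpected input: ${input}`);
  // match[0] is the full match; the captured groups follow it.
  const [, amount, suffix] = match;
  return Number(amount) * MULTIPLIER[suffix];
}

console.log(DOLLARS.exec("$203B").slice(1)); // [ '203', 'B' ]
console.log(parseDollarsSketch("$203B")); // 203000000000
console.log(parseDollarsSketch("$1.5M")); // 1500000
```

Throwing on unexpected input, as the original function does, makes a malformed row fail loudly instead of silently producing NaN.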

I think you have what you need (I was too slow to send my quick mock-up) but sharing here in case this example helps.


Thank you so much Mike!

I did look at the d3.autoType documentation, but I was getting stuck on two things -

  1. Parsing complex strings like $203B. I still can’t wrap my head around how these regular expressions work, so I’ll read up on them.
  2. Referencing the column names. I wasn’t specifying them correctly and was getting NaN and undefined, which threw me off.

But it’s a lot clearer now. I used the reference in the autotype documentation to write this more concisely -
parsedData2 = d3.csvParse(
  await FileAttachment("bloomberg_billionaires@1.csv").text(),
  ({rank, name, net_worth, country, industry}) => ({
    rank: +rank,
    name,
    net_worth: parseDollars(net_worth),
    country,
    industry
  })
)

Again, thanks for the prompt response. Much appreciated!
Also, love working in Observable, and excited to get better at d3.
:beers:

Thanks Cobus!

Even cooler: I can use ...d to make the function even more concise. Perfect!

parsedData2 = d3.csvParse(await FileAttachment("bloomberg_billionaires@1.csv").text(), d => ({...d, rank: +d.rank, net_worth: parseDollars(d.net_worth)}))
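
For anyone reading along, here is a small sketch of what the ...d row function does, runnable without d3 or a file attachment. The rows array is a hand-written stand-in for what d3.csvParse passes to the row function (every value is a string), and the lower-case column names are assumed to match the headers of the second file.

```javascript
// Compact stand-in for the parseDollars function from earlier in the thread.
function parseDollars(input) {
  const match = /^\$(\d+(?:\.\d+)?)([BM])$/.exec(input);
  if (!match) throw new Error(`unexpected input: ${input}`);
  return Number(match[1]) * (match[2] === "B" ? 1e9 : 1e6);
}

// Hypothetical unparsed rows: every column is still a string at this point.
const rows = [
  { rank: "1", name: "Elon Musk", net_worth: "$203B", country: "United States", industry: "Technology" },
  { rank: "2", name: "Jeff Bezos", net_worth: "$195B", country: "United States", industry: "Technology" }
];

// {...d} copies every column as-is; the keys listed after it overwrite the copies.
const convert = (d) => ({ ...d, rank: +d.rank, net_worth: parseDollars(d.net_worth) });

const parsed = rows.map(convert);
console.log(parsed[0]);
```

Columns that need no conversion (name, country, industry) pass through untouched, so only the converted ones have to be listed.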
