Type Conversion

I’m trying to convert a CSV file with numbers, strings, and currency data into something that I can work with. I’m having a hard time understanding how I should do this. I looked around in the documentation, but couldn’t really grasp the concepts there.

It’d be great if somebody can help me out a bit.

Here is the dataset -

Rank,Name,Net Worth,Country,Industry
1,Elon Musk,$203B,United States,Technology
2,Jeff Bezos,$195B,United States,Technology
3,Bill Gates,$135B,United States,Technology

And here is the function -
d3.csvParse(await FileAttachment("bloomberg_billionaires.csv").text(), d3.autoType)

d3.autoType converts the rank to a number for me, but I can’t figure out how to convert the net worth to a number.
Any help would be appreciated! And apologies for a noob question.



Have you found the documentation for d3.autoType? It might help.

Specifically, d3.autoType only coerces to a number if the input is coercible to a number according to the JavaScript specification:
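As a quick illustration of that coercion rule, here is a minimal sketch in plain JavaScript (it does not call d3.autoType itself; the `samples` and `coerced` names are made up for this example). The unary plus operator applies the standard string-to-number conversion, and anything that is not a valid numeric string comes out as NaN:

```javascript
// Strings taken from, or modelled on, the CSV columns in this thread.
const samples = ["1", "2.5", "$203B", "203B"];

// Unary plus applies JavaScript's standard string-to-number coercion.
const coerced = samples.map((s) => +s);

for (let i = 0; i < samples.length; i++) {
  console.log(JSON.stringify(samples[i]), "->", coerced[i]);
}
// "1" and "2.5" become numbers; "$203B" and "203B" become NaN,
// because the "$" prefix and "B" suffix are not part of a numeric literal.
```

So the Rank column converts cleanly, while the Net Worth column needs custom parsing.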


In this case +"$203B" evaluates to NaN, so it’s not coercing to a number. You’d need to write your own number parser for that. For example:

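function parseDollars(input) {
  let match;
  // "$203B" -> 203 billion; "$1.5M" -> 1.5 million
  if ((match = /^\$(\d+(\.\d+)?)B$/.exec(input))) return match[1] * 1e9;
  if ((match = /^\$(\d+(\.\d+)?)M$/.exec(input))) return match[1] * 1e6;
  // Fail loudly on anything else rather than silently returning NaN
  throw new Error(`unexpected input: ${input}`);
}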

Then you can say:

data = {
  const data = await FileAttachment("bloomberg_billionaires.csv").csv();
  for (const d of data) {
    d["Rank"] = +d["Rank"];
    d["Net Worth"] = parseDollars(d["Net Worth"]);
  }
  return data;
}

I think you have what you need (I was too slow to send my quick mock-up) but sharing here in case this example helps.


Thank you so much Mike!

I did look at the autotype documentation, but was getting stuck at two things -

  1. Parsing complex strings like $203B. I still can’t wrap my head around how these expressions work, so I will read up on them.
  2. Referring to the column names. I wasn’t specifying them correctly and was getting NaN and undefined, which threw me off.
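On the first point, a commented breakdown of the regular expression used in parseDollars may help. This is only a sketch for reading the pattern; the names `billions`, `match`, `value`, and `noMatch` are made up for illustration and are not part of d3 or Observable:

```javascript
// Reading the pattern piece by piece:
//   ^              start of the string
//   \$             a literal dollar sign (escaped, because a bare $ means "end of string")
//   (\d+(\.\d+)?)  capture group 1: one or more digits, optionally followed by "." and more digits
//   B              a literal "B" suffix
//   $              end of the string
const billions = /^\$(\d+(\.\d+)?)B$/;

// exec returns an array-like match object, or null when the string does not fit.
const match = billions.exec("$203B");
console.log(match[0]); // the whole matched text
console.log(match[1]); // capture group 1: just the numeric part, still a string

// Multiplying coerces the captured string to a number before scaling.
const value = match[1] * 1e9;
console.log(value);

// A string with a different suffix does not match this pattern at all.
const noMatch = billions.exec("$1.5M");
console.log(noMatch);
```

Because the pattern is anchored with ^ and $, it only accepts strings that have exactly this shape, which is why parseDollars tries a second pattern for the "M" suffix.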

But it’s a lot clearer now. I used the reference in the autotype documentation to write this more concisely -
parsedData2 = d3.csvParse(
  await FileAttachment("bloomberg_billionaires@1.csv").text(),
  ({rank, name, net_worth, country, industry}) => ({
    rank: +rank,
    name: name,
    net_worth: parseDollars(net_worth),
    country: country,
    industry: industry
  })
)

Again, thanks for the prompt response. Much appreciated!
Also, love working in Observable, and excited to get better at d3.

Thanks Cobus!

Even cooler: I can use ...d to make the function even more concise. Perfect!

parsedData2 = d3.csvParse(await FileAttachment("bloomberg_billionaires@1.csv").text(), d => ({...d, rank: +d.rank, net_worth: parseDollars(d.net_worth)}))
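For anyone who wants to see what the spread version does to a single row, here is a minimal sketch that runs without d3 or a file attachment. The `rows` array is a hand-written stand-in for the parsed CSV rows, and it assumes lowercase column names (rank, net_worth, and so on) as in the code above; `convert` and `parsed` are names invented for this example:

```javascript
// Stand-in for what a CSV parser hands to the row function: every value is a string.
// Column names are assumed to be lowercase, as in the snippet above.
const rows = [
  {rank: "1", name: "Elon Musk", net_worth: "$203B", country: "United States", industry: "Technology"},
  {rank: "2", name: "Jeff Bezos", net_worth: "$195B", country: "United States", industry: "Technology"}
];

// Same parser as earlier in the thread.
function parseDollars(input) {
  let match;
  if ((match = /^\$(\d+(\.\d+)?)B$/.exec(input))) return match[1] * 1e9;
  if ((match = /^\$(\d+(\.\d+)?)M$/.exec(input))) return match[1] * 1e6;
  throw new Error(`unexpected input: ${input}`);
}

// ...d copies every column as-is; the properties listed after it override the copies.
const convert = (d) => ({...d, rank: +d.rank, net_worth: parseDollars(d.net_worth)});

const parsed = rows.map(convert);
console.log(parsed[0]);
```

The order matters: putting ...d first lets the converted rank and net_worth replace the string versions, while name, country, and industry pass through untouched.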