Help with creating a grouped bar graph

I’m trying to create a grouped bar graph like this one below:


I’m looking to have the x axis be different US states, and the multiple bars to represent no_of_employees in a company (where the possible options are 1-5, 6-25, 26-100, 100-500, 500-1000 and over 1000)
For the y axis, I want to show the percentage of respondents from each state who answered “yes” to a question on the survey. So the general graph will show the relationship between company size and the likelihood to answer yes to our question (let’s assume the question is “do you work at a tech company”), and visualize that for multiple states.

Below is the code I have so far, plus an example of an object that we have in the data array. I created a class to hold each state, with variables that represent the number of respondents per category who answered yes to our question. I want to loop through our main array and create a class object for each state with the percentages for each category populated, then visualize each of these in a multigroup graph. I’m confused about how to properly write the loop, and what functions I can use to count and calculate the number of respondents for each company size. Any help on how to do this will be appreciated! Thanks.

Here are the images for the code and the sample object:

@mulumbi can you share the notebook where you are working on this? It’s a lot easier to collaborate that way than with screenshots.

yeah! here’s the link @mythmon : Grouped Bar Chart / Shirley Mulumbi | Observable

There is a lot to handle here, but I can give you a few pointers to get started.

First off, you may not need to do any of this data wrangling yourself. The data cell you have is already nicely tidy. By that I mean that it follows the ideals described in this blog post (this is for the R programming language, but it applies to JS too).

Taking advantage of the tidiness of the data, we can use Plot, which is Observable’s declarative tool for making charts. It is usually much simpler than D3, and I’d recommend using it wherever you can.

For conciseness, I’ve filtered the states to only show the 6 that were in your example.

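Plot.plot({
  // Keep the company sizes in their natural order on the x axis
  x: { domain: company_sizes, tickRotate: 25 },
  // percent: true shows the 0 to 1 mean as a percentage
  y: { percent: true, label: "% remote_work" },
  marks: [
    Plot.barY(
      // Only keep the six states from the example
      us_only.filter((d) =>
        ["CA", "TX", "FL", "NY", "IL", "PA"].includes(d.state)
      ),
      Plot.groupX(
        {
          // Average the 0/1 values within each group
          y: "mean"
        },
        {
          x: "no_employees",
          fill: "no_employees",
          // Map "Yes" to 1 and anything else to 0
          y: (d) => (d.remote_work === "Yes" ? 1 : 0),
          // One facet (group of bars) per state
          fx: "state"
        }
      )
    )
  ],
  width,
  marginBottom: 60
})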

One tricky thing here is generating a percentage from a group. I’ve mapped the answers Yes and No to 1 and 0, and then taken the average (mean) of each group. The mean of those 1s and 0s is exactly the fraction of respondents in the group who answered Yes.
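If the mean trick feels like magic, here is a minimal sketch of the same calculation in plain JS, outside of Plot. The rows below are made up for illustration; only the field names (state, no_employees, remote_work) come from the chart code above.

```javascript
// Hypothetical sample rows (invented data); field names match the chart code.
const rows = [
  { state: "CA", no_employees: "1-5", remote_work: "Yes" },
  { state: "CA", no_employees: "1-5", remote_work: "No" },
  { state: "CA", no_employees: "1-5", remote_work: "Yes" },
  { state: "CA", no_employees: "1-5", remote_work: "No" }
];

// Map each answer to 1 ("Yes") or 0 (anything else).
const indicators = rows.map((d) => (d.remote_work === "Yes" ? 1 : 0));

// The mean of the indicators is the fraction of "Yes" answers in the group.
const fraction =
  indicators.reduce((sum, v) => sum + v, 0) / indicators.length;

console.log(fraction); // 0.5, which the chart would display as 50%
```

Plot does this once per state and company size, which is why no manual counting is needed.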

For more details about Plot, you can see its documentation notebook and collection here: Observable Plot / Observable | Observable.

There are several other issues I can see with your code. I’ll point out a few here:

You wrote in one cell

company_sizes = {"1-5", "6-25" ,"26-100", "100-500","500-1000", "More than 1000"}

While this is valid JS, it doesn’t do what you want. In JS, curly braces ({}) are used to declare objects (mappings from keys, usually strings, to values) and to enclose blocks (groups of lines of code, usually seen in functions, if statements, and loops). Neither of those gives you a list of values. The fact that the Observable inspector shows undefined above your cell is a sign that something has gone wrong. To declare an array of values, you want square brackets:

company_sizes = ["1-5", "6-25" ,"26-100", "100-500","500-1000", "More than 1000"]
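As a rough illustration of the difference, here is a small sketch in plain JS (so with const, rather than Observable’s cell syntax). The names companySizes and sizeCounts, and the counts themselves, are made up for the example.

```javascript
// Square brackets: an ordered list of values (an array).
const companySizes = ["1-5", "6-25", "26-100", "100-500", "500-1000", "More than 1000"];

console.log(Array.isArray(companySizes)); // true
console.log(companySizes.length); // 6
console.log(companySizes[0]); // "1-5"

// Curly braces: an object, which maps keys to values.
// These counts are hypothetical, just to show the shape.
const sizeCounts = { "1-5": 3, "6-25": 7 };

console.log(Array.isArray(sizeCounts)); // false
console.log(sizeCounts["6-25"]); // 7
```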

You repeat this in the large state_totals cell. At the top you say var state_totals = {}, and then later you say state_totals.push(...). push is a method on JS arrays, not on objects. You’ll want to again use square brackets here:

let state_totals = [];
for (...) {
   ...
   state_totals.push(...);
}
return state_totals;

You may also notice that I used let instead of var. Using var is an older style of JS that isn’t often used anymore. The differences between the two are subtle, but I’d encourage you to always use let (or its sibling, const), and never use var.
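One of those subtle differences is scoping: var belongs to the whole function, while let belongs only to the block it is declared in. A small sketch (the function names are made up for the example):

```javascript
function withVar() {
  for (var i = 0; i < 3; i++) {}
  // var is scoped to the whole function, so i is still visible here.
  return typeof i;
}

function withLet() {
  for (let j = 0; j < 3; j++) {}
  // let is scoped to the loop, so j does not exist out here.
  return typeof j;
}

console.log(withVar()); // "number"
console.log(withLet()); // "undefined"
```

With let, a loop variable can’t accidentally leak out and be reused somewhere else in the function.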

Finally, I would gently guide you away from using classes right now, and recommend using plain objects. Instead of var c_state_obj = state_obj(), try let cState = {name: curr_state[0].state}. You can then fill in the rest of the fields in the further lines in the loop.
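If you do want to compute the percentages by hand, here is a minimal sketch of what the loop could look like with plain objects. It is not your notebook’s code: the data rows and the shortened sizes list are invented for the example, and I’m assuming the question is stored in the remote_work field, as in the chart above.

```javascript
// Hypothetical rows standing in for the real survey data.
const data = [
  { state: "CA", no_employees: "1-5", remote_work: "Yes" },
  { state: "CA", no_employees: "1-5", remote_work: "No" },
  { state: "CA", no_employees: "6-25", remote_work: "Yes" },
  { state: "TX", no_employees: "1-5", remote_work: "No" }
];
// Shortened list of company sizes, to keep the example small.
const sizes = ["1-5", "6-25"];

// Step 1: collect the rows for each state.
const byState = new Map();
for (const row of data) {
  if (!byState.has(row.state)) byState.set(row.state, []);
  byState.get(row.state).push(row);
}

// Step 2: build one plain object per state.
const stateTotals = [];
for (const [name, currState] of byState) {
  const cState = { name };
  for (const size of sizes) {
    const inSize = currState.filter((d) => d.no_employees === size);
    const yes = inSize.filter((d) => d.remote_work === "Yes").length;
    // Fraction of "Yes" answers, or null if nobody is in this category.
    cState[size] = inSize.length > 0 ? yes / inSize.length : null;
  }
  stateTotals.push(cState);
}

console.log(stateTotals);
```

In an Observable cell you would wrap this in curly braces and end with return stateTotals, as in the snippet above. Array filter and length cover the counting, so no special functions are needed.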

If you would like a fundamentals approach to learning about JS, I’d recommend reading through MDN’s JS tutorials. They are a bit wordy, but do a good job of covering the basics of JS.