How to schedule workers in a smart way?

The filters in my last notebook are pretty heavy and originally tended to completely freeze the tab while I was writing it.

So to avoid that I rewrote my notebook in such a way that the images would be rendered in workers instead. I read through and combined ideas from multiple other notebooks that used workers.

One trick that I’d like to highlight came from @Fil’s worker notebook, where he used [function].toString() to embed functions defined in cells in template strings that are used to generate web workers. This allows you to define functions in cells and then put them in workers like so:

// Assumes the filter will return a typed array
function makeFilterWorkerScript(filter) {
  const filterScript = filter.toString();
  const script = `
    // Inline all the generic functions used by the filter
    ${colorDelta.toString()}
    ${filterScript.indexOf("selectNearest") !== -1 ? selectNearest.toString() : ""}
    ${filterScript.indexOf("selectFurthest") !== -1 ? selectFurthest.toString() : ""}
    ${filterScript.indexOf("selectDipole") !== -1 ? selectDipole.toString() : ""}
    ${filterScript.indexOf("bellcurvish2D") !== -1 ? bellcurvish2D.toString() : ""}
    ${filterScript.indexOf("bellcurvish1D") !== -1 ? bellcurvish1D.toString() : ""}
    // Inline the filter
    ${filterScript}
    
    onmessage = event => {
      const {params} = event.data;
      const rgba = ${filter.name}(params);
      postMessage({rgba}, [rgba.buffer]);
      close();
    }
  `;
  const url = URL.createObjectURL(new Blob([script], {type: 'text/javascript'}));
  invalidation.then(() => URL.revokeObjectURL(url));
  return url;
}

Being able to write and present functions like normal in cells and still use them in web workers is really nice!
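
The repeated indexOf checks above could be folded into a loop. A minimal sketch, assuming a hypothetical helper named inlineWithHelpers (not part of any of the notebooks mentioned) that takes the filter plus a list of candidate helper functions. Like the original, it uses plain substring matching on function names and does not follow helpers that call other helpers:

```javascript
// Hypothetical helper: returns the source of `fn`, preceded by the source of
// every function in `helpers` whose name appears in the source text of `fn`.
// Matching is a plain substring test, the same as the indexOf checks above.
function inlineWithHelpers(fn, helpers) {
  const source = fn.toString();
  const used = helpers.filter(helper => source.includes(helper.name));
  return [...used.map(helper => helper.toString()), source].join("\n");
}
```

The result could then be interpolated into the worker script template in place of the individual `${... ? ... : ""}` lines.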

Then, to render the filters asynchronously, I wrote a function that takes a filter and its parameters as arguments, and returns a canvas that finishes rendering later:

function makeFilterWorker(filter, params) {
  const script = makeFilterWorkerScript(filter);
  
  const context = DOM.context2d(params.width, params.height, 1);
  const imageData = context.getImageData(0, 0, params.width, params.height);
  // use original image to start with
  imageData.data.set(params.source);
  context.putImageData(imageData, 0, 0);
  
  // emphasize that this is the unrendered image
  context.font = '40px serif';
  context.fillStyle = '#00000088';
  context.fillRect(0, 0, params.width, params.height);
  context.fillText('Rendering...', 24, 64);
  context.fillStyle = 'white';
  context.fillText('Rendering...', 20, 60);
  context.strokeStyle = 'black';
  context.strokeText('Rendering...', 20, 60);
  
  const worker = new Worker(script);
  const messaged = ({data: {rgba}}) => {
    imageData.data.set(rgba);
    context.putImageData(imageData, 0, 0);
  };
  invalidation.then(() => worker.terminate());
  worker.onmessage = messaged;
  worker.postMessage({params});
  context.canvas.style = "image-rendering: crisp-edges; image-rendering: pixelated";
  return context.canvas;
}

Overall it works quite well, but there are still some things I’d like to improve, and since they’d probably be useful in other notebooks too, I want to figure out how to get this right.

The current approach works OK when I first load the notebook, but it struggles quite a bit when anything causes all of these cells to re-evaluate at once. Aside from me changing code common to all filters, one example that triggers this is a phone switching between portrait and landscape, which resizes the source image data, so it is a relevant problem for end users. On top of that, the filters are rendered out of order.

I tried using await visibility, but it’s not quite what I want: the filters are slow enough that they take a couple of seconds to render, so I would like them to be done by the time someone scrolls down to them.

What I think would really work well is a way to queue up the creation of workers so that there are at most navigator.hardwareConcurrency workers running at once. Ideally they would then be scheduled either in order of closest-to-furthest from visible, or from top to bottom (not sure which would be best).
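
A minimal sketch of such a queue, under a few assumptions: makeTaskQueue and distanceFromViewport are hypothetical names, a "task" is any function returning a promise (for example one that creates a worker and resolves when its message arrives), and a lower priority value means "start sooner". Starting is deferred by one microtask so that tasks enqueued in the same tick are ordered by priority rather than by arrival.

```javascript
// Hypothetical helper: runs at most `limit` tasks at once.
// Returns an `enqueue(task, priority)` function that resolves with the task's result.
function makeTaskQueue(limit) {
  const pending = []; // entries: {task, priority, resolve, reject}
  let running = 0;

  function next() {
    while (running < limit && pending.length > 0) {
      // Priorities are evaluated when a slot frees up, so a priority
      // function can reflect the current scroll position.
      const scores = pending.map(entry => entry.priority());
      const best = scores.indexOf(Math.min(...scores));
      const [entry] = pending.splice(best, 1);
      running++;
      Promise.resolve()
        .then(entry.task)
        .then(entry.resolve, entry.reject)
        .finally(() => {
          running--;
          next();
        });
    }
  }

  // `priority` may be a number or a function returning a number.
  return function enqueue(task, priority = 0) {
    return new Promise((resolve, reject) => {
      const score = typeof priority === "function" ? priority : () => priority;
      pending.push({task, priority: score, resolve, reject});
      Promise.resolve().then(next); // defer so same-tick tasks get sorted
    });
  };
}

// Hypothetical priority: 0 if the element overlaps the viewport, otherwise its
// distance in pixels. `top` and `bottom` are viewport-relative, as reported by
// getBoundingClientRect().
function distanceFromViewport(top, bottom, viewportHeight) {
  if (bottom < 0) return -bottom;
  if (top > viewportHeight) return top - viewportHeight;
  return 0;
}
```

In makeFilterWorker, the worker creation could be wrapped in a task that resolves inside the message handler, with navigator.hardwareConcurrency as the limit. Passing a constant index as the priority gives top-to-bottom order, while passing a function built on distanceFromViewport gives closest-to-visible order.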

Does anyone have any ideas how I might try doing that? A quick search turns up a notebook for queuing workers, but it’s broken right now.

(tangent: I think if we spend some time exchanging ideas here we could write a nice “Tricks to make writing workers easier in Observable” notebook out of it)


I have a generic worker helper createWorker() in this notebook:

The cell after the helper, named worker, contains an example implementation that uses the helper and also implements a basic API.

Note that the helper is able to serialize functions and regular expressions and pass them on as data.
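
To illustrate the general idea (this is not the code from the createWorker() notebook, just a sketch with hypothetical serialize and deserialize functions): functions cannot be passed through postMessage directly, so one option is to replace them with tagged plain objects on the sending side and rebuild them on the receiving side. The sketch assumes the functions are self-contained (closures are lost) and are written as function expressions, declarations, or arrow functions.

```javascript
// True only for plain object literals, so typed arrays etc. pass through untouched.
const isPlainObject = value =>
  value !== null && typeof value === "object" &&
  Object.getPrototypeOf(value) === Object.prototype;

// Hypothetical sender side: turn functions and RegExps into tagged plain objects.
function serialize(value) {
  if (typeof value === "function") return {__type: "function", source: value.toString()};
  if (value instanceof RegExp) return {__type: "regexp", source: value.source, flags: value.flags};
  if (Array.isArray(value)) return value.map(serialize);
  if (isPlainObject(value)) {
    return Object.fromEntries(Object.entries(value).map(([k, v]) => [k, serialize(v)]));
  }
  return value;
}

// Hypothetical receiver side: rebuild the tagged objects.
function deserialize(value) {
  if (Array.isArray(value)) return value.map(deserialize);
  if (isPlainObject(value)) {
    // Evaluates the source text, so only use this with code you trust.
    if (value.__type === "function") return new Function(`return (${value.source})`)();
    if (value.__type === "regexp") return new RegExp(value.source, value.flags);
    return Object.fromEntries(Object.entries(value).map(([k, v]) => [k, deserialize(v)]));
  }
  return value;
}
```

The page would call serialize before postMessage and the worker would call deserialize in its onmessage handler, which means deserialize itself has to be inlined into the worker script.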

Edit: The notebook was published ahead of time and is a bit of a mess. This backup fork should be cleaner and work well:


Very nice! Going to study that in more detail later!

By the way, Fil’s notebook mentioned that iOS needs some help with generators, so your serialize function might want to copy the workaround he uses for that:

// On iOS, [generator].toString() doesn't give "function*" but "function". Fix this.
function function_stringify(f) {
  let g = f.toString();
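  if (f.prototype && f.prototype.toString() === "[object Generator]")
    // Anchor at the start and swallow whitespace so "function *gen" does not become "function* *gen"
    g = g.replace(/^function\s*\*?/, "function*");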
  return g;
}

@Fil I can’t reproduce this via browserstack, can you give an example snippet that will produce the wrong output? Also, in which version of iOS/Safari did you encounter this?

I don’t remember, it was a while ago (Aug 6, 2018 judging from the history of that notebook), on my iPhone and (from memory) using Safari. Maybe the bug is gone now.