Optimising many-to-many geo-contains

I have a map with hundreds of tiles and need to group thousands of points into these tiles.

This is a painfully slow operation, and I’ve tried the following, with no real performance improvement:

  1. Pre-filtering to local points before running the geo-contains test (sketched just after this list)
  2. Using @Fil’s geoContainsCache so that each tile’s geometry is computed just once (and not for every tile × point combination).
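
For context, here is a minimal sketch of what I mean by pre-filtering: a cheap bounding-box rejection before the exact point-in-polygon test. It assumes tiles is a GeoJSON FeatureCollection and points is an array of {Long, Lat} records; the cell name is illustrative, not my actual code.

assignTiles = (tiles, points) => {
  // Compute each tile’s [[west, south], [east, north]] bounds once.
  const bounds = tiles.features.map((f) => d3.geoBounds(f));
  return points.map((p) =>
    tiles.features.findIndex((f, i) => {
      const [[west, south], [east, north]] = bounds[i];
      // Cheap rejection: skip tiles whose bounding box misses the point...
      if (p.Long < west || p.Long > east || p.Lat < south || p.Lat > north)
        return false;
      // ...and only run the expensive exact test on the survivors.
      return d3.geoContains(f, [p.Long, p.Lat]);
    })
  );
}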

Is there anything else I could try? I could materialise the groupings in Python, but it’d be neat to have dynamic tile sizes in Observable.

Thanks,
Ben

Hello,

We can take advantage of the fact that the grid is very griddy (the same longitudes and latitudes repeat along each axis) and index it by the top-right corner of each tile. The search for a given pair of coordinates is then just a matter of bisecting these coordinates on x, then on y. The following function returns the index of the tile that contains the point, if any:

findTile = {
  // Top-right corner of each tile (here, the third vertex of its ring),
  // tagged with the tile’s index.
  const corners = tiles.features.map((d, i) => [
    ...d.geometry.coordinates[0][2],
    i
  ]);
  // Sorted arrays of the distinct corner longitudes and latitudes.
  const x = d3.sort(d3.group(corners, (d) => +d[0]).keys());
  const y = d3.sort(d3.group(corners, (d) => +d[1]).keys());
  // Two-level index: corner longitude → corner latitude → tile index.
  const I = d3.rollup(
    corners,
    (v) => v[0][2],
    (d) => +d[0],
    (d) => +d[1]
  );
  // For a point, bisect to find the nearest corner beyond it on each
  // axis, then look that corner up in the index.
  return ({ Long, Lat }) => {
    const x0 = x[d3.bisect(x, +Long)];
    const y0 = y[d3.bisect(y, +Lat)];
    return I.get(x0)?.get(y0);
  };
}
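
With this in hand, the grouping itself can be a one-liner. A sketch, assuming points is an array of objects carrying the Long and Lat properties the accessor above destructures (points outside the grid land under the key undefined):

byTile = d3.group(points, findTile)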

Note that, if you filter on the result being undefined, you will see that the grid does not cover the whole dataset.
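
For instance (a sketch, with the same hypothetical points array as above):

uncovered = points.filter((p) => findTile(p) === undefined)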

Notebook: Catches sites / recifs / Observable

Fantastic, thanks! I look forward to exploring this tomorrow.

When I try to make the tile sizing dynamic, your notebook seizes up.

Why would this be the case?

The tile algorithm itself is reasonably responsive in isolation.

Thanks,
Ben

The notebook works well with step: 0.5, but not with step: 0.1.

Why would the step size of the parameter impact performance so much?