Weighted rolling average with Arquero

harris · September 9, 2021, 3:23pm

I know that you can calculate a moving average with Arquero like so:

data.derive({
  sevenDayAvg: aq.rolling(d => d.average(d.dataField), [-6, 0])
})

but is there a way to calculate a weighted average where points are given less weight the further they are from the target point (for example, with a Gaussian kernel)?

Fil · September 9, 2021, 7:23pm

Not Arquero, but array-blur is a (fast) approximation of a Gaussian kernel applied to 1-d or 2-d data: GitHub - Fil/array-blur: blur an array of numbers in 1 or 2 dimensions

harris · September 9, 2021, 9:24pm

Oh, nice! Thank you!

harris · September 10, 2021, 12:43am

Okay, to answer my own question I’ve written an arquero window function that I believe does gaussian kernel density estimation. My stats is a little rusty, so I’d welcome any corrections in what I’ve done here!

normalDistributionGenerator = (location = 0, scale = 1) => {
  return (x) => {
    const left = 1 / (scale * Math.sqrt(2 * Math.PI))
    const rightTop = -((x - location) ** 2)
    const rightBottom = 2 * scale ** 2
    return left * Math.exp(rightTop / rightBottom)
  }
}

aq.addWindowFunction(
  'kde',
  {
    create: (scale = 1, distributionGenerator = normalDistributionGenerator) => ({
      init: state => state,
      value: (w, f) => {
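        // Gaussian centered on the current row, evaluated at each row index in the frame
        const normal = distributionGenerator(w.index, scale)
        // Start the sum at 0; without an initial value, reduce would seed the accumulator with the first index
        return d3.range(w.i0, w.i1).reduce((acc, i) => acc + w.value(i, f) * normal(i), 0)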
      },
    }),
    param: [1, 2],
  },
  { override: true }
)

Usage:

data.derive({
  // Math.Infinity is undefined in JavaScript; the global Infinity gives an unbounded frame
  newCasesKDE: aq.rolling(d => aq.kde(d['actuals.newCases'], 7), [-Infinity, Infinity])
})

harris · September 10, 2021, 1:05am

Here it is in action

Fil · September 11, 2021, 7:59am

I’ve made an example with array-blur here: Time Series Data Smoothing / Fil / Observable

You can see that there is a tiny difference in the last few days, where the KDE drops sharply because it treats the values to the right of the last data point as zeroes; array-blur takes the boundary into account.
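
For anyone who wants to reproduce that drop without Arquero, here is a rough plain-JavaScript sketch of the same weighted sum. The helper name smoothZeroPadded and the flat test series are made up for illustration; it assumes the window covers the whole series and that anything outside the array contributes nothing, which is the same as padding with zeroes.

```javascript
// Gaussian density centered at `location` with standard deviation `scale`.
const gaussian = (location, scale) => (x) =>
  Math.exp(-((x - location) ** 2) / (2 * scale ** 2)) /
  (scale * Math.sqrt(2 * Math.PI));

// Hypothetical helper: weighted sum over the whole series.
// Indices outside the array are simply absent, i.e. treated as zero.
function smoothZeroPadded(values, scale) {
  return values.map((_, center) => {
    const weight = gaussian(center, scale);
    return values.reduce((acc, v, i) => acc + v * weight(i), 0);
  });
}

// A flat series should stay flat, but the edges sag because
// roughly half of the kernel's weight falls outside the data there.
const flat = new Array(50).fill(100);
const smoothed = smoothZeroPadded(flat, 2);
console.log(smoothed[25]); // close to 100 in the interior
console.log(smoothed[49]); // noticeably lower at the right edge
```

In the interior the sampled Gaussian weights add up to very nearly 1, so the value is preserved; at the last index only one side of the kernel overlaps the data.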

harris · September 11, 2021, 9:08pm

Awesome! That’s super cool. Yeah, I noticed that drop at the end and wasn’t sure how that’s typically accounted for. Does array-blur adjust the amplitude of the kernel according to how close it is to the end, or is the calculation totally different from that? (I admit I haven’t even looked at the code.)
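
One way to read "adjusting the amplitude of the kernel" is to divide each weighted sum by the total weight that actually falls inside the data. The sketch below shows that idea only; smoothRenormalized is a hypothetical name, and this is not a claim about how array-blur works.

```javascript
// Gaussian density centered at `location` with standard deviation `scale`.
const gaussian = (location, scale) => (x) =>
  Math.exp(-((x - location) ** 2) / (2 * scale ** 2)) /
  (scale * Math.sqrt(2 * Math.PI));

// Hypothetical helper: weighted average where the weights are
// renormalized to sum to 1 over the rows that exist.
function smoothRenormalized(values, scale) {
  return values.map((_, center) => {
    const weight = gaussian(center, scale);
    let weightedSum = 0;
    let weightTotal = 0;
    values.forEach((v, i) => {
      const w = weight(i);
      weightedSum += v * w;
      weightTotal += w;
    });
    // weightTotal is always positive: the center row has nonzero weight.
    return weightedSum / weightTotal;
  });
}

// A flat series now stays flat all the way to both ends.
console.log(smoothRenormalized(new Array(10).fill(100), 2));
```

The trade-off is that near the ends the average is taken over a one-sided window, so the smoothed curve leans toward the most recent interior values.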

Fil · September 12, 2021, 9:17am

In array-blur we clamp the index to the range; in other words, when you reach the end of the data vector, the missing values are taken to be the last value instead of 0. (And similarly on the left-hand side, the missing values “before 0” are v[0] instead of being 0.) It happens here: array-blur/blur.js at master · Fil/array-blur · GitHub
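
The clamping idea can be sketched in plain JavaScript as follows. This is not array-blur's code, only the boundary rule described above applied to a direct Gaussian sum; the name smoothClamped and the default radius of three standard deviations are assumptions made for the example.

```javascript
// Hypothetical helper: Gaussian-weighted average with clamped indices.
// Offsets that fall before index 0 read values[0]; offsets past the end
// read the last value, so the edges are not pulled toward zero.
function smoothClamped(values, scale, radius = Math.ceil(3 * scale)) {
  const last = values.length - 1;
  // Offsets -radius..radius relative to the current row.
  const offsets = Array.from({ length: 2 * radius + 1 }, (_, k) => k - radius);
  // Unnormalized Gaussian weights; dividing by their total makes them sum to 1.
  const weights = offsets.map((o) => Math.exp(-(o ** 2) / (2 * scale ** 2)));
  const total = weights.reduce((a, b) => a + b, 0);

  return values.map((_, center) =>
    offsets.reduce((acc, o, k) => {
      const i = Math.min(last, Math.max(0, center + o)); // clamp to [0, last]
      return acc + (values[i] * weights[k]) / total;
    }, 0)
  );
}

// A flat series stays flat, including the first and last rows.
console.log(smoothClamped(new Array(10).fill(100), 2));
```

Because the kernel is truncated at a finite radius, the weights are normalized explicitly rather than relying on the density summing to 1.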

harris · September 12, 2021, 1:35pm

Oh, that makes sense! I think I can do that with arquero too. Going to give it a try.
