Framework: How to buffer consecutive data

Hello!
I have a simple app that gets a new set of data each day, which is great! I would like the same app to keep, say, 2-5 days of history in order to offer a better dashboard: trend analysis, for instance.
I am thinking of having GitHub Actions do this work. Something like:

  1. Upload the history.json artifact.
  2. Do an npm ci of Framework, which as a side effect merges today's data with history.json.
  3. Download the history.json artifact. The exact pathname of the history file is an issue.

I have very little experience with GitHub Actions. Do you think I can succeed? Are there better options, in your experience?

Thanks

Alain

You can use the official GitHub cache action (actions/cache) to persist data between workflow runs.
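A minimal cache step could look something like this (the path and key are placeholders, not specific to your project):

```yaml
# Illustrative only; adjust the path and key to your setup.
- uses: actions/cache@v4
  with:
    # The file (or directory) to persist between workflow runs.
    path: history.json
    # Using the run id means a fresh cache entry is saved on every run...
    key: history-${{ github.run_id }}
    # ...while restore-keys falls back to the most recent previous entry.
    restore-keys: |
      history-
```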

Alternatively, you could aggregate your data in an external database and pass the credentials via GitHub Actions secrets. If you don't want to manage additional costs, then this list of free database hosting options might help.

Thank you mootari,

actions/cache seems to work fine, but I am struggling with the pathname correspondence of my history file. Do I need to use the src/data path or the _file/data path? To write? To read? It looks like Framework generates a random key (history.XXXXXXX.json) to save the file. I am trying to write flexible code that handles every possible configuration, but so far no luck. Any advice?
An external database would be my plan B. Thank you for the Actions secrets tip.

Alain

Where is the data in your history.json coming from? Are you already generating it via a data loader? If so, I would recommend having your data loader write to and read from its own cache (which you persist and restore via the action), and simply generating a new history.json on every build.

Exactly,
My data loader does the following:

  1. read the history (readFileSync)
  2. read the fresh data (fetch)
  3. aggregate it into the history
  4. write the history (writeFileSync)
  5. provide the history to the app (standard output)

What do you mean by "its own cache"? A ./cache directory outside of src/data?
Will this simplify my pathname variation problem?

Thanks

Alain

From your description it seems that the only thing you'd need to cache is your sync file (assuming it's not versioned). The steps in your GitHub workflow would be:

  1. run actions/checkout
  2. run actions/cache with the path to your history cache (i.e. your sync file) - can be anywhere, but should be outside the dist directory so it doesn’t get removed
  3. run the build
  4. deploy/upload the build

(the action should automatically handle cache updates)
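Put together, something along these lines should work (a sketch only; the schedule, cache key, and deploy command are assumptions that depend on your project):

```yaml
name: build
on:
  schedule:
    - cron: "0 6 * * *"   # e.g. once per day
  workflow_dispatch:

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/cache@v4
        with:
          path: .cache/history.json   # your sync file, outside of dist
          key: history-${{ github.run_id }}
          restore-keys: |
            history-
      - run: npm ci
      - run: npm run build    # runs the data loader, which updates the sync file
      - run: npm run deploy   # or upload the dist directory however you publish
```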

After many tries, one solution is 🙂:

  1. In the data loader: aggregate the data into a file at the root level, ./history.json (don't try to put it into src or _file).
  2. In GitHub Actions: cache that exact same file, history.json.

But the history.json file at the root level can no longer be downloaded, for instance from a link in a .md page.
Never mind, there is a dynamic content solution:
How to download a file from framework?

Are you referencing the file as FileAttachment in any of your app’s pages? Your data loader should only output the file contents, not write the file itself.

If your data loader is called e.g. /data/history.json.js and you reference a FileAttachment(`/data/history.json`).json(), then Framework will automatically run the loader.
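For example, in one of your pages (a sketch; where the page lives and how you render the data is up to you):

```js
// Inside a fenced js block of a page such as src/index.md.
// Framework sees the reference below, finds no static file at src/data/history.json,
// and therefore runs the data loader src/data/history.json.js at build time.
const history = await FileAttachment("/data/history.json").json();
display(history);
```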

No FileAttachment, since my reads/writes happen in my data loader. I understood that FileAttachment is not available there, only fixed URLs and readFileSync/writeFileSync.
In my .md files there is only FileAttachment('/src/data/current.json'), which comes from the standard output.

Can you expand on why you’d want to download the cache file?

Yes. I would like to keep a record of the ‘5 latest dataset values’ at a given date. Why not?
It is very confusing that Framework renames files in a random way: file.XXXX.txt.

I would be very glad to have a clear explanation of what happens to file names during and after deploy.

Mootari, many thanks for your patience.

Your data loader will end up producing two files, but in two very different ways and with very different purposes:

  1. The cached file that it only reads and updates internally. This file is not meant to be exposed or served.
  2. The rendered file attachment that it produces by simply outputting the JSON.

I’ve put up an example that demonstrates a data loader with its own Actions cache: GitHub - mootari/dataloader-cache-example

On every build the data loader fetches a new entry and adds it to the cached file. It then outputs the last five entries, which the app page then renders.
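A stripped-down sketch of that pattern (this is not the exact code from the repository; the cache path and fetch URL are placeholders):

```js
// data/history.json.js - illustrative data loader sketch.
import {existsSync, mkdirSync, readFileSync, writeFileSync} from "node:fs";

const cacheDir = ".cache";                     // persisted/restored via actions/cache
const cacheFile = `${cacheDir}/history.json`;  // the loader's private cache file

// 1. Read the previously cached history, if any.
const history = existsSync(cacheFile)
  ? JSON.parse(readFileSync(cacheFile, "utf8"))
  : [];

// 2. Fetch today's entry (placeholder URL).
const entry = await (await fetch("https://example.com/api/today")).json();

// 3. Append it and update the cache file.
history.push({date: new Date().toISOString(), ...entry});
mkdirSync(cacheDir, {recursive: true});
writeFileSync(cacheFile, JSON.stringify(history));

// 4. Output only the five most recent entries; this is what the page's
//    FileAttachment receives as history.json.
process.stdout.write(JSON.stringify(history.slice(-5)));
```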

You can see the output here: Dataloader Cache

The example is great! It should be added to the other examples, in my opinion. Thank you very much.

What do you mean by “…leaving a trail of dead caches. Acceptable for small datasets, but possibly devastating for larger ones.”? GitHub Actions will fail after a while? I hope not.

Alain

GitHub applies two limits to Actions caches:

  1. It will remove caches that haven’t been accessed in the last 7 days.
  2. If the repository's overall cache size exceeds 10 GB, GitHub will start removing the oldest caches.

In practice you’ll likely have a hard time hitting that limit unless you update the cache very frequently and/or store very large files. I also plan to look into removing the old cache entry so that this would no longer be a problem.

I’ve updated the workflow file to automatically remove the old cache entry, and to handle the absence of a cache as well as workflow reruns gracefully:
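One way to do that cleanup is with the gh CLI; roughly like this (a sketch, not necessarily how the example workflow does it; it assumes the cache is restored via actions/cache/restore in a step with id "restore" and that the job has the actions: write permission):

```yaml
- name: Remove the previously restored cache entry
  if: steps.restore.outputs.cache-matched-key != ''
  continue-on-error: true   # the entry may already be gone on a workflow rerun
  run: gh cache delete "${{ steps.restore.outputs.cache-matched-key }}"
  env:
    GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
    GH_REPO: ${{ github.repository }}
```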

May I also suggest updating the selected solution to point to the example?