Observable’s Data Model

The “serverless,” lightweight, portable features of Observable are great, but I’m trying to understand the data model a little bit better. When I pull in data from another server, is it loaded once (or occasionally)? Or is it loaded from that server on every call?

The reason I ask is that we have an instance of our site hosted on AWS, and on that setup, we’re charged for data egress. I built an Observable notebook that pulls data from that site via API, then embedded a couple of the plots from that notebook on a site. Is data loaded from the origin on AWS every time the final endpoint is loaded (where the plots are embedded)? Is it loaded every time the notebook is loaded?

An easy way to picture what is happening is that (at the time of this writing) Observable only hosts and serves two things: the notebook’s code and its FileAttachments.

Any other data source you use is fetched by that code, which runs in your browser, so the data never passes through Observable’s servers[1].

So, in the case you’re describing, each new client (each new user who visits the page) will load the resource from AWS. If that resource is not cached, they will load it on each of their visits.
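To get a feel for what this means for egress, here is a rough back-of-the-envelope sketch. It assumes the simplest possible model: with browser caching, each unique visitor hits the origin once; without it, every visit does. The function name, its parameters, and the numbers in the usage lines are all hypothetical, and real caches expire, so treat the result as an approximation only.

```javascript
// Rough estimate of bytes served by the origin (e.g. your AWS-hosted API).
// Assumption: a cached resource is fetched once per unique visitor,
// an uncached one is fetched on every visit. All names are illustrative.
function estimateOriginEgressBytes({ uniqueVisitors, totalVisits, payloadBytes, browserCached }) {
  const originRequests = browserCached ? uniqueVisitors : totalVisits;
  return originRequests * payloadBytes;
}

// Hypothetical numbers: 100 visitors, 500 visits, a 2,000-byte response.
const uncached = estimateOriginEgressBytes({
  uniqueVisitors: 100, totalVisits: 500, payloadBytes: 2000, browserCached: false,
});
const cached = estimateOriginEgressBytes({
  uniqueVisitors: 100, totalVisits: 500, payloadBytes: 2000, browserCached: true,
});
console.log(uncached, cached);
```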

One way to avoid the egress costs is to put a caching proxy in front of your API; depending on your approach and your needs, this can range from moderately difficult to quite fancy. A simpler approach might be to create a notebook that retrieves your data (and maybe simplifies it), then manually download the result and re-upload it as a FileAttachment. The trade-off is that the data will not stay “fresh”.
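The core of such a caching proxy can be sketched in a few lines. This is only a minimal in-memory illustration, not a full proxy: `createCachedLoader`, `loader`, `ttlMs`, and `now` are names made up for this example, and a real setup would also need an HTTP server in front of it and probably a persistent or shared cache.

```javascript
// Hypothetical helper: wraps any async loader so that the origin is hit
// at most once per `ttlMs` milliseconds for a given key.
function createCachedLoader(loader, ttlMs, now = Date.now) {
  const cache = new Map(); // key -> { expires, promise }

  return function load(key) {
    const hit = cache.get(key);
    if (hit && hit.expires > now()) {
      return hit.promise; // served from the cache, no call to the origin
    }
    // Store the promise itself so concurrent requests share one origin call.
    const promise = Promise.resolve()
      .then(() => loader(key))
      .catch((err) => {
        // Do not keep failed requests in the cache.
        const entry = cache.get(key);
        if (entry && entry.promise === promise) cache.delete(key);
        throw err;
      });
    cache.set(key, { expires: now() + ttlMs, promise });
    return promise;
  };
}
```

In a proxy, `loader` would be something like `(url) => fetch(url).then((r) => r.json())` pointed at the AWS-hosted API, and `ttlMs` controls how stale the data may get before the origin is contacted again.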

[1] One exception, though: the script that creates the thumbnail image will download the same resources (if they are public) so that it can run the code.