Good data hosting options?

Can anyone recommend a simple and low-cost data hosting option that can be used for datasets that are too large to upload directly to Observable?

I have a ~500MB dataset that I’d like to analyse using an Observable notebook. In the past I would have stored this data in CSV format and analysed it using R Markdown notebooks, but I’d really like to use Observable to create an interactive web-based notebook instead. I’m a bit overwhelmed with all the different options available for hosting and would value any advice. The dataset is already in the public domain, so I am not concerned about keeping the data private.

Or, if you want a full-on platform, I think Firebase pairs well.

happy to realtime chat on

Depending on your requirements you could also keep your file local by either selecting it via a file input (which simply allows the notebook to access the file, without uploading anything anywhere), or by running a local web server (e.g. via npx http-server --cors, if you have npm installed) and accessing the file via its local URL.
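
For the local web server route, a minimal sketch of the notebook side might look like the following. The helper names `localUrl` and `loadLocalText` and the file name `data.csv` are made up for illustration. Port 8080 is assumed because it is the http-server default, so adjust it if your server reports a different one.

```javascript
// Build the URL of a file served from the current directory by
// `npx http-server --cors`. Port 8080 is an assumption (the http-server default).
function localUrl(filename, port = 8080) {
  return `http://localhost:${port}/${encodeURIComponent(filename)}`;
}

// Fetch the file as text. Nothing is uploaded; the request stays on your machine.
async function loadLocalText(filename, port = 8080) {
  const response = await fetch(localUrl(filename, port));
  if (!response.ok) {
    throw new Error(`Could not load ${filename}: HTTP ${response.status}`);
  }
  return response.text();
}
```

In a notebook cell, something like `loadLocalText("data.csv")` would give you the raw text, which you could then hand to a CSV parser such as `d3.csvParse`.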

@mootari’s recommendation will definitely be the fastest, since you eliminate the network fetch. If you do need to access the file from the network then you should also look at Git LFS.

One thing to consider… your final visualization may not need all the data. Oftentimes, it is better (for the end user!) if the large dataset is reduced to something that supports the presentation of the data. You may need the large dataset to discover the story/patterns (which would work well with local files). You can then reduce it to a much smaller set that drives the visualization, so end users benefit from the speed of the smaller dataset.
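
As a rough illustration of that reduction step, the sketch below collapses many raw rows into one summary row per group. The function name `summarizeBy` and the `region`/`value` fields are hypothetical; swap in whatever columns your dataset has.

```javascript
// Reduce raw rows to one summary row per group (count, sum, mean).
function summarizeBy(rows, keyOf, valueOf) {
  const groups = new Map();
  for (const row of rows) {
    const key = keyOf(row);
    const group = groups.get(key) ?? { key, count: 0, sum: 0 };
    group.count += 1;
    group.sum += valueOf(row);
    groups.set(key, group);
  }
  // Add the mean once all rows have been counted.
  return Array.from(groups.values(), (g) => ({ ...g, mean: g.sum / g.count }));
}

// Made-up sample data, standing in for the large dataset.
const rows = [
  { region: "north", value: 10 },
  { region: "north", value: 20 },
  { region: "south", value: 5 },
];
const summary = summarizeBy(rows, (d) => d.region, (d) => d.value);
```

The summary could then be saved as a small file and attached to the notebook, while the full dataset stays on your machine.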

Thanks for the suggestions!

I’ll use a local file for now and investigate Firebase once I’ve got the visualisations sorted.

You probably already have stuff set up but for future reference, Mike Bostock has this notebook with a wrapper that makes it easy to open a local file and treat it as an Observable FileAttachment:

It might be useful for swapping one out for the other more easily, without writing a lot of glue code.
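
To give a feel for the idea, here is a minimal sketch of such a wrapper. This is not the code from the notebook mentioned above. `asAttachment` is a hypothetical name, and it only mimics a few of the methods a FileAttachment offers (`text`, `json`, `arrayBuffer`, `blob`).

```javascript
// Wrap a File (e.g. one picked via a file input) so it can be read
// like a FileAttachment. Hypothetical helper for illustration only;
// it covers just a subset of the FileAttachment methods.
function asAttachment(file) {
  return {
    name: file.name,
    blob: async () => file,
    text: () => file.text(),
    arrayBuffer: () => file.arrayBuffer(),
    json: async () => JSON.parse(await file.text()),
  };
}
```

Downstream cells can then call `.text()` or `.json()` without caring whether the data came from a local file or from an attachment.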