Can anyone recommend a simple and low-cost data hosting option that can be used for datasets that are too large to upload directly to Observable?
I have a ~500MB dataset that I’d like to analyse using an Observable notebook. In the past I would have stored this data in CSV format and analysed it using R Markdown notebooks, but I’d really like to use Observable to create an interactive web-based notebook instead. I’m a bit overwhelmed by all the different options available for hosting and would value any advice. The dataset is already in the public domain, so I’m not concerned about keeping the data private.
Depending on your requirements, you could also keep your file local: either select it via a file input (which simply gives the notebook access to the file, without uploading anything anywhere), or run a local web server (e.g. `npx http-server --cors`, if you have npm installed) and fetch the file from its local URL.
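For the file-input route, a minimal sketch of what the Observable cells might look like (Observable notebook syntax, so `viewof` and `Inputs` come from the Observable runtime; the `{typed: true}` option and the cell names are just illustrative):

```js
// Cell 1: a file picker — nothing is uploaded, the browser just
// grants the notebook read access to the selected local file.
viewof file = Inputs.file({label: "CSV file", accept: ".csv"})

// Cell 2: parse the selected file as CSV; {typed: true} asks the
// parser to infer numbers/dates instead of leaving everything as strings.
data = file.csv({typed: true})
```

Note that the selection doesn’t persist: anyone viewing the notebook (including you, on reload) has to re-select the file, which is exactly why this works well for private exploration but not for sharing.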
@mootari’s recommendation will definitely be the fastest, since you eliminate the network fetch. If you do need to access the file from the network then you should also look at Git LFS.
One thing to consider… your final visualization may not need all the data. Oftentimes it is better (for the end user!) if the large dataset is reduced to something that supports the presentation of the data. You may need the full dataset to discover the story/patterns (which works well with local files), but you can then reduce it to a much smaller set that drives the visualization, so end users benefit from the speed of the smaller file.
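As a rough sketch of that reduction step (the field names `category` and `value` are made up for illustration — the idea is just to collapse many rows into a small summary you can attach to the notebook directly):

```javascript
// Hypothetical reduction: collapse per-row records into per-category
// totals, so the notebook loads a tiny summary instead of ~500MB of rows.
function summarize(rows) {
  const totals = new Map();
  for (const { category, value } of rows) {
    totals.set(category, (totals.get(category) ?? 0) + value);
  }
  // Map preserves insertion order, so output order follows first appearance.
  return [...totals].map(([category, total]) => ({ category, total }));
}

// Example input standing in for the parsed CSV rows:
const rows = [
  { category: "a", value: 1 },
  { category: "b", value: 2 },
  { category: "a", value: 3 },
];

console.log(summarize(rows));
// [ { category: 'a', total: 4 }, { category: 'b', total: 2 } ]
```

You’d run something like this once (in Node, R, or wherever you already have the data), write the result out as a small CSV or JSON, and attach that to the notebook.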