How does private data hosting work for teams?

I’m interested in Observable for teams, but first want to understand more clearly how Observable interacts with private data. Thanks!

Good morning!

Currently, Observable does not directly host private data.

We host the code in your team notebooks, which can securely connect to private data sources over HTTPS with the proper CORS headers. Authentication can go through Observable Secrets, OAuth, or any other existing web authentication, and notebooks can also read local files.

In this way, your code, data analysis and visualizations run directly in your team members’ browsers, with the private data never touching Observable servers (unless you choose to paste small amounts of private data directly into the notebooks themselves).

For example, let’s say that your Observable team is called the-a-team, and you already have a private API you’d like to connect to, using Secrets.

First, you would add a secret token (named PRIVATE_API in this example) on your team’s settings page.

You would then tell your private API to reject any request that didn’t include that token as a parameter.
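On the API side, the token check described above could look roughly like the following. This is only a sketch for a Node.js server: the function name `isAuthorized`, the `body.token` field, and the idea of passing the expected token in from an environment variable are all assumptions for illustration, not part of Observable.

```javascript
const crypto = require("node:crypto");

// Hypothetical helper: returns true only when the parsed request body
// carries the same token that was stored as the team secret.
function isAuthorized(body, expectedToken) {
  if (typeof expectedToken !== "string" || expectedToken === "") return false;
  if (!body || typeof body.token !== "string") return false;
  // Hash both values so the buffers have equal length, which
  // crypto.timingSafeEqual requires, then compare in constant time.
  const given = crypto.createHash("sha256").update(body.token).digest();
  const expected = crypto.createHash("sha256").update(expectedToken).digest();
  return crypto.timingSafeEqual(given, expected);
}
```

A request handler would call something like `isAuthorized(JSON.parse(rawBody), process.env.PRIVATE_API_TOKEN)` and respond with a 401 or 403 status when it returns false.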

You would also need to add a CORS header to responses from your private API, allowing access from your team’s Observable notebooks. In this case:

Access-Control-Allow-Origin: https://the-a-team.static.observableusercontent.com
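For a Node.js server, the CORS side might be sketched as below. The helper names `corsHeaders` and `handlePreflight` are hypothetical. The extra headers are assumptions based on how browsers treat a credentialed JSON POST: it triggers a preflight OPTIONS request, and the response must also allow credentials.

```javascript
// Hypothetical sketch: names and structure are illustrative, not part of Observable.
// An origin is scheme + host (+ port) only, so it carries no path or trailing slash.
const ALLOWED_ORIGIN = "https://the-a-team.static.observableusercontent.com";

// Build the CORS response headers for a request's Origin header.
// Returns an empty object for any other origin, so the browser blocks the read.
function corsHeaders(requestOrigin) {
  if (requestOrigin !== ALLOWED_ORIGIN) return {};
  return {
    "Access-Control-Allow-Origin": ALLOWED_ORIGIN,
    // Required when the notebook request uses credentials: "include".
    "Access-Control-Allow-Credentials": "true",
    // A JSON Content-Type triggers a preflight (OPTIONS) request.
    "Access-Control-Allow-Methods": "POST, OPTIONS",
    "Access-Control-Allow-Headers": "Content-Type",
    // Tell caches that the response depends on the Origin header.
    "Vary": "Origin"
  };
}

// Answer a preflight with 204 and the headers above; let other methods through.
function handlePreflight(req, res) {
  if (req.method !== "OPTIONS") return false;
  res.writeHead(204, corsHeaders(req.headers.origin));
  res.end();
  return true;
}
```

The same `corsHeaders(req.headers.origin)` result would also be attached to the actual POST response, since the browser checks both the preflight and the final response.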

With those two pieces in place, your team can issue requests for data to the private API from any team notebook:

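data = fetch("https://api.the-a-team.com/data", {
  method: "POST",
  // Send the query together with the shared team secret.
  body: JSON.stringify({query, token: Secret("PRIVATE_API")}),
  headers: {"Content-Type": "application/json"},
  credentials: "include"
}).then(response => response.json()) // assumes the API responds with JSON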
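If you want a rejected token to show up as an error in the notebook rather than as a silently empty result, one option is to wrap the request in a small helper. This is a sketch only: `fetchPrivateData` is a hypothetical name, the `fetchImpl` parameter exists just so the helper can be exercised without a network, and it assumes the API replies with JSON.

```javascript
// Hypothetical helper; the URL and payload shape mirror the request above.
async function fetchPrivateData(query, token, fetchImpl = fetch) {
  const response = await fetchImpl("https://api.the-a-team.com/data", {
    method: "POST",
    body: JSON.stringify({query, token}),
    headers: {"Content-Type": "application/json"},
    credentials: "include"
  });
  // A non-2xx status (for example, a rejected token) becomes a thrown error.
  if (!response.ok) {
    throw new Error(`Private API request failed: ${response.status}`);
  }
  // Assumes the API responds with a JSON body.
  return response.json();
}
```

In a notebook cell this could be used as `data = fetchPrivateData(query, Secret("PRIVATE_API"))`.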

If you prefer to use an OAuth-style setup, with individual accounts instead of a shared team secret, that works just as well.

We’re also working on hosted file attachments (a little less exciting) and hosted database connectors (very exciting) to make all of this much more convenient. With a secure database connector and the proper permissions set, notebooks truly become an exploratory tool for interactively designing queries and reactively connecting the (potentially streaming) results to tables, maps and visualizations. We have an early prototype of this, and it’s pretty neat.

If there’s any further explanation we can provide, please don’t hesitate to ask.

(And with apologies if that was all overlong, and you just wanted a brief reply.)

No apology needed; thanks for the long reply. My team has all of our data in BigQuery, so we would ideally like to connect to BigQuery and read the data in a really simple way. It sounds like that might be something in the pipeline?

What if I need data to be nonpublic but my org doesn’t have an API? Does anybody have a trick for this?

I could do a secret gist, but that’s probably not good enough for my org.

@shastabolicious — What format (or database) is your private data in currently?

We’re getting ready to start beta testing a new feature that may help with your use case…

Plain old .csv files

I’m afraid in that case you’re going to have to wait a bit longer for us to release a CSV upload feature — this upcoming one focuses on MySQL and Postgres databases to start…

Some of your current options are:

  • Paste the CSV contents into an Observable cell (only if the data is very small)
  • Serve via your own secure API (if you can host one)
  • Upload as a secret GitHub Gist (only if you accept that anyone who obtains the URL can read the data)
  • Upload as a Google Sheet, and access via the Google Sheets OAuth API — using Observable Secrets to store your OAuth key.

Out of those, I’d favor the last one. It’s a bit of a bear to set up, but it means that the data loading is secured with your Google login, and that the spreadsheet can be edited, with the data updates reflected live in your Observable notebook.

If you need any help with getting it set up, let me know. I just put one together the other day, and would be happy to turn it into an example notebook.

Glitch projects can be private, and even public Glitch projects can contain private data that will never be shown or shared.
A Glitch-hosted server could either expose the data via a heavily restricted API, or require an access key stored in Observable Secrets (or both).

Don’t be fooled by the playful appearance. Glitch projects are actually an incredibly easy way to set up hosted services, and the rate limits and restrictions are generous for a free service.

I’d be interested in seeing an example notebook for the Google Sheets option.

Thanks,
Evan

Hi Evan,

I put together an example notebook — but the steps involved in getting a private Google OAuth connection set up are so complicated and unfriendly that it’s not going to be a method that we want to officially recommend to anyone.

We’re going to work on a real Observable / Google Sheets connector that can handle the connection elegantly.

In the meantime, if you still want to see the walkthrough, email me, and I can share the link with you in private.

We’re definitely working on this type of feature (see my latest post in announcements). We currently only have support for PostgreSQL and MySQL, but BigQuery is on the list already.

Thanks. I don’t need it urgently, so I’ll wait for the official connectors.

Best,
Evan

We just deployed BigQuery support within our database client beta feature! If you want, I can add you (and your team?) to the beta so you can try it out. Connecting to BigQuery is now very convenient.
