Secrets for published notebooks?

Secrets work really well for private notebooks. It would be great to have an equivalent for public and shared notebooks so they can also access APIs, databases, etc. Similar to config files or environment variables for typical web apps.

Forgive me if I missed something - it looks like the only way to share notebooks that use secrets, right now, is to add the recipient to our team (which then shares all our notebooks). And no solution I can see for a public notebook.

Thanks for building such a great product!

1 Like

Hi @msb,

The problem is that public and shared notebooks are public: anyone who visits them would be able to read (and therefore steal) your secret.

If you want to list a bunch of shared configuration variables, so that you can use them in many different notebooks, but they aren't actual secrets, and you don't mind publishing them, I would recommend simply creating a "config" notebook that you can import values from. Like so:

Then, in my other notebooks, I can:

import {PUBLIC_API_KEY} from "@jashkenas/config"

... but your overall point is still well taken. We'll discuss adding a public version of the Secrets UI that can be used for public and non-sensitive values.

2 Likes

As a workaround for API keys you could create a proxy server on glitch.com that injects the key into the request. The key can be stored in an .env file which will only be visible to you (read more about private/public projects here).

1 Like

If you add it, please also add a big fat warning about public access of "secrets". Many people are oblivious to the fact that anything that runs client-side (without an authentication layer) should be considered publicly accessible (e.g., I observed similar misconceptions in Nuxt.js issues).

Definitely. The difference between Secrets and Public Environment Variables would need to be crystal clear.

1 Like

Thank you @jashkenas and @mootari for the quick replies!

Totally see your point from a tech standpoint.

Just thinking out loud, here are a few product ideas that may address the need:

  1. User-level notebook sharing. For internal sharing (within my org), I'm not worried about secrets - esp if they are not readily accessible via the Observable UI - but don't want everyone to be able to see every notebook.
  2. Rendered views. For external sharing, a static view of a notebook would be better than nothing.
  3. Server-side cells. Probably a crazy idea, but it would be awesome to have a general form of @mootari's suggestion available directly in Observable. Rather than setting up a proxy server somewhere else, you could just tag a cell / function to run on the server. It would be writable only by the notebook author + editors and would not have access to the client-side scope. Maybe too big a hammer for this particular problem, but just an idea :slight_smile:

@jashkenas I'd like to attach a related feature request: Please allow Secret() to be referenced in a shared and public notebook. Let it throw a catchable error, let it fail gracefully, I don't care - I simply want to add values that are only available when I am viewing my own notebook.

From a security standpoint: It appears that currently all secrets are fetched at once, which would be pretty terrible when happening on another author's notebook. I'd like to suggest two possible strategies:

  • If a user is not the owner of the current notebook, don't even bother to ask for permission - just fail immediately.
  • Request access to each secret individually, and fetch them as such.

Edit: Of course secrets must also be blocked when comparing two notebooks where at least one author is not the current user.

2 Likes

Hey, @mootari.

To make sure I understand... would this workaround let you use a public Observable notebook to access data from an API that requires an API key, without exposing that API key to the person using that notebook?

Follow-up question... Do you have any tutorials that you really like on how to create a proxy server on glitch.com? I've learned almost all of my programming ability in Observable's environment, so I am very green on stuff like this.

I ask because I have a (currently private) notebook that uses user inputs to query a videogame's API and make some visualizations of the data, and I'd like to make that notebook publicly accessible, but can't find a way to do it while keeping the API key "secret".

We chose not to do this because we want the behavior of the notebook to be the same for authors and for readers, so that authors can anticipate what readers will see.

If you want something only you can see from your own notebooks, you can use localStorage as Tom demonstrates here:
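The linked demonstration is not reproduced here. As a rough sketch of the general idea (not Tom's notebook), a value can be read from the browser's own storage with a graceful fallback. The helper name `readLocalSecret` and the key `MY_API_KEY` are hypothetical.

```javascript
// Hypothetical helper: read a value that exists only in this browser's storage.
// `storage` is anything with a getItem() method, e.g. window.localStorage.
function readLocalSecret(storage, key, fallback = null) {
  try {
    const value = storage.getItem(key);
    // getItem() returns null when the key has never been set.
    return value === null ? fallback : value;
  } catch (err) {
    // Storage access can be blocked (e.g. by privacy settings); fail gracefully.
    return fallback;
  }
}

// In a notebook cell this might be used as:
//   apiKey = readLocalSecret(localStorage, "MY_API_KEY")
// after running localStorage.setItem("MY_API_KEY", "...") once in your own browser.
```

Readers who never stored the key in their own browser simply get the fallback, so the notebook has to handle that case.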

Would it work to save the API response as a file attachment on your notebook? Then your readers won't need to access the API directly.

1 Like

@foundflavor @mbostock I have a workaround that stores the HTML output of a notebook into Firebase, then reads it back on load (into the same notebook or a blank one).

It's pretty hacky, but super useful for sharing notebooks publicly without letting the user see API keys or triggering tons of API calls. I also use it on internal team notebooks that just take a long time to load.

Published here: https://observablehq.com/@msb/cache
(someone would have to stand up a Firebase instance to make this public version work)

1 Like

I think a way to do something like this within the UI of Observable (to "freeze" a cell by creating an attachment serializing its current value, provided it is JSONifiable) would be incredibly useful, not just for easily serializing the return value of an API but also for use cases where you want to precompute and cache something expensive, or freeze the result of some interactive computation (e.g. a manually-adjusted force layout) before publication.
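To make the "JSONifiable" condition concrete, here is a minimal sketch of what freezing and thawing a value could look like in plain JavaScript. The names `freezeValue` and `thawValue` are hypothetical, and this is not an existing Observable feature.

```javascript
// Hypothetical helper: turn a cell value into a JSON string that could be
// saved as an attachment. Throws if the value cannot be represented at all.
function freezeValue(value) {
  // JSON.stringify throws a TypeError on circular structures and returns
  // undefined for functions, symbols and undefined itself.
  const json = JSON.stringify(value);
  if (json === undefined) {
    throw new TypeError("Value is not JSON-serializable");
  }
  return json;
}

// Counterpart: restore the value from the stored string.
// Note: Dates come back as strings, and Maps/Sets lose their contents.
function thawValue(json) {
  return JSON.parse(json);
}
```

Plain objects, arrays, numbers and strings survive the round trip; anything richer would need its own encoding.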

2 Likes

In this specific instance, it won't work, because part of the functionality of the notebook is that it takes inputs from the reader (player name, platform name, game mode) and spits out data about that player from the API. But, that might be a good solve for publishing less interactive visualizations from the game's data.

Thank you, @msb! I've given this method a go, but have run into a snag. Maybe you can help...

It looks like your "cache" function is referencing another "cacheWriteHelper" function that doesn't seem to exist anywhere else in the notebook. Is that function missing?

Sorry about that, I forked from a private notebook and didn't test before publishing. I'll fix in the next day or so, and also add a cell level option.

1 Like

Yes, that's the idea. :slight_smile:

It's not exactly a tutorial, but this Glitch project implements http-proxy to rewrite the Origin header and allow CORS requests to api.observablehq.com (see here for a usage example). I'm sure that if you search through Glitch you'll come across more examples.

Before we get into specifics about approaches, please take the following advice to heart: Assume that everyone queries your API with malicious intent. What this means is:

  • Don't pass through any user input that isn't required.
  • Always sanitize values (make sure they're in the proper format).
  • Never use blacklists (exclusion lists). You're bound to miss something. Define whitelists of routes or parameter names that are acceptable. You can use regular expressions to make the lists less verbose (if you're careful).
  • Some APIs are heavily rate limited, and exceeding a limit can severely impede API usage or even create costs. To prevent a bad actor from hammering the remote API on your behalf, consider adding a rate limiter to your Glitch project. You're already somewhat protected here though, as Glitch (softly) imposes its own rate limits.
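The points above can be sketched in a few lines of plain JavaScript. This is only an illustration under assumptions: the parameter names and patterns are hypothetical (borrowed from the player/platform/mode use case in this thread), and the rate limiter is a simple in-memory fixed-window counter rather than any particular library.

```javascript
// Whitelist: parameter name -> pattern its value must match in full.
// Names and patterns are hypothetical examples.
const ALLOWED_PARAMS = {
  player: /^[A-Za-z0-9_-]{1,32}$/,
  platform: /^(pc|xbox|psn)$/,
  mode: /^[a-z]{1,16}$/,
};

// Returns a new object holding only whitelisted, well-formed parameters.
// Unknown parameters are dropped; malformed whitelisted ones throw.
function sanitizeParams(input) {
  const clean = {};
  for (const [name, pattern] of Object.entries(ALLOWED_PARAMS)) {
    if (!Object.prototype.hasOwnProperty.call(input, name)) continue;
    const value = String(input[name]);
    if (!pattern.test(value)) {
      throw new Error(`Invalid value for "${name}"`);
    }
    clean[name] = value;
  }
  return clean;
}

// Fixed-window rate limiter: at most `limit` calls per `windowMs`
// for each key (e.g. a client IP). `now` is injectable for testing.
function createRateLimiter(limit, windowMs, now = Date.now) {
  const windows = new Map();
  return function allow(key) {
    const t = now();
    const entry = windows.get(key);
    if (!entry || t - entry.start >= windowMs) {
      windows.set(key, { start: t, count: 1 });
      return true;
    }
    entry.count += 1;
    return entry.count <= limit;
  };
}
```

A request handler would call `allow(clientKey)` first and `sanitizeParams(query)` second, rejecting the request if either fails. The limiter never evicts old keys, so a long-running app would want to clean up the map periodically.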

With that out of the way, let's talk strategies.

Rerouting (API in front of an API)

The safest way to proxy is to not proxy at all. :slight_smile:

Instead of granting direct access to a remote API, you can hide that API behind your own. What you end up with is a small router that takes a request, performs an action and optionally returns the result:

  1. Client requests data from your app/endpoint,
  2. your app/endpoint fetches the data from the remote API, using your credentials,
  3. your app/endpoint returns the data.
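The three steps above might look roughly like this. It is a sketch under assumptions, not a reference implementation: the upstream URL, the `player` parameter and the Bearer-token header are hypothetical, and the fetch function is passed in so the logic can be tried without a network.

```javascript
// Hypothetical upstream endpoint; replace with the API you are hiding.
const REMOTE_URL = "https://api.example.com/v1/stats";

// Builds a handler for the three steps. `fetchImpl` defaults to the global
// fetch (available in browsers and in Node 18+).
function createStatsHandler(apiKey, fetchImpl = globalThis.fetch) {
  return async function handle(query) {
    // Step 1: the client only chooses a player name; nothing else passes through.
    const player = String(query.player || "");
    if (!/^[A-Za-z0-9_-]{1,32}$/.test(player)) {
      return { status: 400, body: { error: "invalid player" } };
    }
    // Step 2: fetch from the remote API using your credentials (server-side only).
    const url = new URL(REMOTE_URL);
    url.searchParams.set("player", player);
    const response = await fetchImpl(url, {
      headers: { Authorization: `Bearer ${apiKey}` }, // assumed auth scheme
    });
    if (!response.ok) {
      return { status: 502, body: { error: "upstream request failed" } };
    }
    // Step 3: return only the data, never the credentials.
    return { status: 200, body: await response.json() };
  };
}

// With an Express app this could be wired up roughly as:
//   const handle = createStatsHandler(process.env.API_KEY);
//   app.get("/stats", async (req, res) => {
//     const { status, body } = await handle(req.query);
//     res.status(status).json(body);
//   });
```

Keeping the handler separate from the Express route makes it easy to see that the key only ever appears in the outgoing request.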

The hello-express template on Glitch is a good starting point for this approach. The internal structure of the app very much depends on your requirements, e.g.:

  • If you only have one route you don't even need to look at the requested path and can have your callback respond to any requests.
  • You might want to allow some user defined parameters (e.g. result count, date range, sort order).

Proxying

With an HTTP proxy I'd define the following phases:

  1. Whitelist the request:
    • method: it's fairly safe to say that you'll only want to allow GET requests.
    • path: you probably should limit the accessible routes.
    • query: in some cases the query parameters may need to be restricted as well.
    • origin: you can ensure that your Glitch project can only be accessed from your own notebooks by restricting the origin to USERNAME.static.observableusercontent.com.
      (Fun fact: api.observablehq.com does this for observablehq.com.)
  2. Sanitize the request:
    • If you allowed user input, remove any query parameters that aren't in your whitelist.
    • If necessary clean header values.
  3. Modify the request:
    • Add your API credentials as HTTP header or query parameter
    • Add any parameters that you want/need to enforce
  4. Let http-proxy (or whatever library you chose) do its magic
  5. Sanitize the response:
    • Remove your credentials from any headers or response data
    • Remove additional data (e.g. a user object)
  6. Return the response.
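Phases 1, 2, 3 and 5 can be expressed as small pure functions that are independent of the proxy library doing the forwarding. The sketch below assumes hypothetical values throughout: the path pattern, the allowed query names, the `api_key` parameter name and the list of stripped headers are all placeholders.

```javascript
// All names and values below are hypothetical placeholders.
const ALLOWED_ORIGIN = "https://USERNAME.static.observableusercontent.com";
const ALLOWED_PATHS = [/^\/v1\/players\/[A-Za-z0-9_-]+$/];
const ALLOWED_QUERY = new Set(["mode", "platform"]);

// Phase 1: whitelist method, origin and path.
// `req` is a plain object: { method, pathname, headers }.
function isAllowed(req) {
  return (
    req.method === "GET" &&
    req.headers.origin === ALLOWED_ORIGIN &&
    ALLOWED_PATHS.some((pattern) => pattern.test(req.pathname))
  );
}

// Phases 2 and 3: drop query parameters that are not whitelisted,
// then add the credentials (assumed here to be a query parameter).
function buildUpstreamQuery(search, apiKey) {
  const incoming = new URLSearchParams(search);
  const outgoing = new URLSearchParams();
  for (const [name, value] of incoming) {
    if (ALLOWED_QUERY.has(name)) outgoing.append(name, value);
  }
  outgoing.set("api_key", apiKey);
  return outgoing.toString();
}

// Phase 5: strip response headers that could echo credentials or sessions.
function sanitizeResponseHeaders(headers) {
  const blocked = new Set(["authorization", "set-cookie", "x-api-key"]);
  return Object.fromEntries(
    Object.entries(headers).filter(([name]) => !blocked.has(name.toLowerCase()))
  );
}
```

Keep in mind that the Origin header is set by browsers and can be forged by other clients, so the origin check limits which web pages can call the proxy but is not authentication.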

Conclusion

Ultimately you need to decide for yourself which of the above points you want or need to apply. E.g. if an API token isn't really sensitive but requires an account to be created, you'll want to pick the proxy strategy and only sanitize the response (if even necessary), and perhaps impose a rate limit.

2 Likes

@yurivish Love this way of looking at the problem. I took a stab at it here: https://observablehq.com/@msb/freezer

@foundflavor I also updated the cache notebook, which should work now. Not sure it will solve your use case though.

1 Like

There are various methods. One thing you can do is add a #secret=xyx to the link you give someone and have the notebook load it from the fragment. You might not like this because it exposes the secret (or an encoded version of it) to the user you share the link with. That's true, but keep in mind any method which involves letting the notebook itself have access will be vulnerable to the user of that notebook retrieving its value.
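Reading a value from the fragment could look like the following minimal sketch. The helper name `secretFromHash` is hypothetical, and how the fragment reaches the notebook's code (for example via `location.hash`) is assumed rather than shown.

```javascript
// Hypothetical helper: extract "secret" from a fragment such as "#secret=xyz".
// Returns null when the fragment has no such entry.
function secretFromHash(hash) {
  // Drop the leading "#" and parse the rest like a query string.
  const params = new URLSearchParams(hash.replace(/^#/, ""));
  return params.get("secret");
}

// Example with a full link (the URL is a placeholder):
//   secretFromHash(new URL("https://example.com/notebook#secret=xyz").hash)
```

The fragment is not sent to the server in HTTP requests, but anyone holding the link holds the secret, as noted above.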

1 Like

Thanks, @mootari. This makes a ton of sense. I've given the "rerouting" approach a go since it seems the most straightforward, and have run into a programming snag. Disclaimer: I've basically been self-teaching programming / JavaScript through Observable projects, so I apologize in advance for the rookie questions.

I've created a remix of the hello-express template that you referenced, seen below:

I'm then trying to call that glitch API from an Observable notebook, just to make sure I'm getting the code right. See below:

I'm running into a network error anytime I try to reference this (or any other) Glitch network. Have any advice on where I'm going wrong?

Looks like it's missing the CORS header. If you look at your browser's dev tools console while making the request you'll likely see the precise error.
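For reference, a minimal sketch of adding that header in an Express-style app follows. It is an assumption about the fix, not the actual Glitch project from this thread: the middleware name is hypothetical and USERNAME stands in for the notebook author's Observable username.

```javascript
// Hypothetical allowed origin; replace USERNAME with your Observable username.
const NOTEBOOK_ORIGIN = "https://USERNAME.static.observableusercontent.com";

// Express-style middleware with the usual (req, res, next) signature.
// It tells the browser that pages from NOTEBOOK_ORIGIN may read the response.
function allowNotebookOrigin(req, res, next) {
  res.setHeader("Access-Control-Allow-Origin", NOTEBOOK_ORIGIN);
  next();
}

// In the app this would be registered before the routes:
//   app.use(allowNotebookOrigin);
```

Without this header the browser blocks the notebook from reading the response, which surfaces as a generic network error in the notebook.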