Secrets for published notebooks?

Yes, that’s the idea.

It’s not exactly a tutorial, but this Glitch project uses http-proxy to rewrite the Origin header and allow CORS requests to api.observablehq.com (see here for a usage example). Searching Glitch should turn up more examples.

Before we get into specifics about approaches, please take the following advice to heart: Assume that everyone queries your API with malicious intent. What this means is:

  • Don’t pass through any user input that isn’t required.
  • Always sanitize values (make sure they’re in the proper format).
  • Never use blacklists (exclusion lists). You’re bound to miss something. Define whitelists of routes or parameter names that are acceptable. You can use regular expressions to make the lists less verbose (if you’re careful).
  • Some APIs are heavily rate limited, and exceeding a limit can severely impede API usage or even incur costs. To prevent a bad actor from hammering the remote API on your behalf, consider adding a rate limiter to your Glitch project. You’re already somewhat protected here though, as Glitch (softly) imposes its own rate limits.
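
To make the allow-list advice a bit more concrete, here is a minimal sketch in TypeScript. The route patterns, the parameter names (`limit`, `sort`) and the helpers `isAllowedPath` and `pickAllowedParams` are all made up for illustration; adapt them to whatever your remote API actually expects. Note that every regular expression is anchored with `^` and `$`, which is the "if you’re careful" part.

```typescript
// Hypothetical allow-list for a proxy in front of some remote API.
// Routes and parameter names are placeholders, not a real API.
const ALLOWED_ROUTES: RegExp[] = [
  /^\/v1\/items$/,              // one exact route
  /^\/v1\/items\/[0-9]{1,10}$/, // numeric id only, anchored at both ends
];

// Each permitted parameter comes with a format check for its value.
const ALLOWED_PARAMS: { name: string; format: RegExp }[] = [
  { name: "limit", format: /^[0-9]{1,3}$/ },
  { name: "sort", format: /^(asc|desc)$/ },
];

// True only if the path matches one of the allow-listed patterns.
function isAllowedPath(path: string): boolean {
  return ALLOWED_ROUTES.some((pattern) => pattern.test(path));
}

// Keeps only parameters that are on the allow-list AND well-formed.
// Anything else is dropped silently instead of being passed through.
function pickAllowedParams(input: Record<string, string>): Record<string, string> {
  const clean: Record<string, string> = {};
  for (const param of ALLOWED_PARAMS) {
    const value = input[param.name];
    if (typeof value === "string" && param.format.test(value)) {
      clean[param.name] = value;
    }
  }
  return clean;
}
```

Because the loop runs over the allow-list rather than over the user input, unknown parameters never get a chance to reach the remote API.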

With that out of the way, let’s talk strategies.

Rerouting (API in front of an API)

The safest way to proxy is to not proxy at all.

Instead of granting direct access to a remote API, you can hide that API behind your own. What you end up with is a small router that takes a request, performs an action and optionally returns the result:

  1. The client requests data from your app/endpoint.
  2. Your app/endpoint fetches the data from the remote API, using your credentials.
  3. Your app/endpoint returns the data to the client.
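
As a rough sketch of those three steps (not tied to any particular framework), the handler below takes the HTTP client as a parameter so the flow stays visible. The `Fetcher` type, `REMOTE_URL`, `makeHandler` and the bearer-token header are assumptions for illustration; your remote API may expect its credentials in a different form.

```typescript
// Stand-in for a real HTTP client; returns the response body as text.
type Fetcher = (url: string, headers: Record<string, string>) => Promise<string>;

// Placeholder URL. The client never gets to choose or see it.
const REMOTE_URL = "https://api.example.com/v1/report";

function makeHandler(apiToken: string, fetchRemote: Fetcher) {
  // Step 1: the client calls this handler. No client input is forwarded.
  return async function handle(): Promise<{ status: number; body: string }> {
    try {
      // Step 2: fetch from the remote API with server-side credentials.
      // The bearer scheme is an assumption about the remote API.
      const body = await fetchRemote(REMOTE_URL, {
        Authorization: `Bearer ${apiToken}`,
      });
      // Step 3: return the data and nothing else.
      return { status: 200, body };
    } catch {
      // Do not echo upstream errors; they might contain the credential.
      return { status: 502, body: "Upstream request failed" };
    }
  };
}
```

In an Express app, the route callback would call `handle()` and copy `status` and `body` onto the response.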

The hello-express template on Glitch is a good starting point for this approach. The internal structure of the app very much depends on your requirements, e.g.:

  • If you only have one route, you don’t even need to look at the requested path and can have your callback respond to any request.
  • You might want to allow some user-defined parameters (e.g. result count, date range, sort order).
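
If you do accept user-defined parameters, one option is to parse them into a small, strictly typed object and fall back to defaults for anything unexpected. The sketch below covers result count and sort order only; the names `parseOptions`, `ReportOptions`, `DEFAULTS` and `MAX_COUNT` and the concrete limits are assumptions, and a date range would need its own format check along the same lines.

```typescript
type SortOrder = "asc" | "desc";

interface ReportOptions {
  count: number;
  sort: SortOrder;
}

// Hypothetical defaults and upper bound; pick values that suit your API.
const DEFAULTS: ReportOptions = { count: 10, sort: "desc" };
const MAX_COUNT = 100;

function parseOptions(query: Record<string, string | undefined>): ReportOptions {
  // Accept plain digits only; anything else falls back to the default.
  const rawCount = query["count"] ?? "";
  const count = /^[0-9]{1,4}$/.test(rawCount)
    ? Math.min(Math.max(parseInt(rawCount, 10), 1), MAX_COUNT) // clamp to 1..MAX_COUNT
    : DEFAULTS.count;

  // Only the two known sort orders are accepted.
  const rawSort = query["sort"];
  const sort: SortOrder =
    rawSort === "asc" || rawSort === "desc" ? rawSort : DEFAULTS.sort;

  return { count, sort };
}
```

The values sent to the remote API are then built from this object, never from the raw query string.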

Proxying

With an HTTP proxy I’d define the following phases:

  1. Whitelist the request:
    • method: it’s fairly safe to say that you’ll only want to allow GET requests.
    • path: you probably should limit the accessible routes.
    • query: in some cases the query parameters may need to be restricted as well.
    • origin: you can ensure that your Glitch project can only be accessed from your own notebooks by restricting the origin to USERNAME.static.observableusercontent.com.
      (Fun fact: api.observablehq.com does this for observablehq.com.)
  2. Sanitize the request:
    • If you allowed user input, remove any query parameters that aren’t in your whitelist.
    • If necessary, clean header values.
  3. Modify the request:
    • Add your API credentials as an HTTP header or query parameter.
    • Add any parameters that you want/need to enforce.
  4. Let http-proxy (or whatever library you chose) do its magic.
  5. Sanitize the response:
    • Remove your credentials from any headers or response data.
    • Remove additional data (e.g. a user object).
  6. Return the response.
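
Here is a hedged sketch of phases 1 to 3 and 5 as plain functions. The actual forwarding (phase 4) is left to whichever proxy library you use, so none of its API appears here. All names and values (`ALLOWED_ORIGIN`, `ALLOWED_PATHS`, `ALLOWED_QUERY`, `API_KEY`, the `per_page` parameter, the `user` field) are placeholders, and `sanitizeResponse` assumes the remote API returns a JSON object.

```typescript
interface ProxyRequest {
  method: string;
  path: string;
  origin: string; // value of the Origin request header
  query: Record<string, string>;
}

// Hypothetical configuration; replace USERNAME and the rest with your own.
const ALLOWED_ORIGIN = "https://USERNAME.static.observableusercontent.com";
const ALLOWED_PATHS: RegExp[] = [/^\/v1\/search$/];
const ALLOWED_QUERY: string[] = ["q", "page"];
const API_KEY = "secret-key"; // in practice, read this from the project's environment

// Phase 1: reject anything that is not explicitly allowed.
function isAllowed(req: ProxyRequest): boolean {
  return (
    req.method === "GET" &&
    req.origin === ALLOWED_ORIGIN &&
    ALLOWED_PATHS.some((pattern) => pattern.test(req.path))
  );
}

// Phases 2 and 3: drop unknown parameters, then add credentials
// and any values you want to enforce.
function buildUpstream(req: ProxyRequest): {
  path: string;
  query: Record<string, string>;
  headers: Record<string, string>;
} {
  const query: Record<string, string> = {};
  for (const name of ALLOWED_QUERY) {
    const value = req.query[name];
    if (typeof value === "string") query[name] = value;
  }
  query["per_page"] = "20"; // enforced, regardless of what the client sent
  // Headers are built from scratch, so no client header is passed through.
  return { path: req.path, query, headers: { Authorization: `Bearer ${API_KEY}` } };
}

// Phase 5: remove extra data and any echo of the credential.
// Assumes the body is a JSON object; JSON.parse throws otherwise.
function sanitizeResponse(body: string): string {
  const data = JSON.parse(body) as Record<string, unknown>;
  delete data["user"];
  return JSON.stringify(data).split(API_KEY).join("[redacted]");
}
```

A request that fails `isAllowed` would typically get a 403 (or 405 for the wrong method) without ever touching the remote API.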

Conclusion

Ultimately, you need to decide for yourself which of the above points you want or need to apply. For example, if an API token isn’t really sensitive but requires creating an account, you’ll want to pick the proxy strategy, sanitize only the response (if that is even necessary), and perhaps impose a rate limit.
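
Since rate limiting comes up more than once, here is a minimal fixed-window limiter as a sketch of the idea. The class name `FixedWindowLimiter` and its parameters are my own; in a real project you would probably reach for an existing middleware instead. The clock is passed in as an argument so the behaviour is easy to check.

```typescript
// Allows at most `limit` requests per client within each window of `windowMs`.
// Note: entries are never pruned, so this only suits small, short-lived apps.
class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request may proceed. `now` is injectable for testing.
  allow(clientKey: string, now: number = Date.now()): boolean {
    const current = this.windows.get(clientKey);
    if (current === undefined || now - current.start >= this.windowMs) {
      // First request from this client, or the previous window has expired.
      this.windows.set(clientKey, { start: now, count: 1 });
      return true;
    }
    if (current.count < this.limit) {
      current.count += 1;
      return true;
    }
    return false; // over the limit; respond with 429 and skip the remote call
  }
}
```

The client key could be the request’s IP address or origin, depending on what you want to throttle.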
