backup method?

Hello,

I used to make regular backups of all my notebooks (public, private, and liked) using the (unofficial) API.

It looks like the endpoint I was using (user/public, see My Notebooks / Fil / Observable) is gone, and I don’t know how to replace it.

The backups are really useful for tracking down issues and for searching my own notebooks. Barring an official method, what would you recommend using?

Thanks

It seems that the endpoint is at fault. If you replace it with https://observable-cors.glitch.me/, it works again.

Yes, it repairs “my notebooks”.

Unfortunately not my backups… the user/public service (which was my way of finding the list of notebooks to save) is gone.

Shouldn’t the list of notebooks you fetch via /user/documents contain only public notebooks anyway?

Got it! Both drafts and public notebooks are now under /user/documents; they used to live under /user/public and /user/drafts.
To retrieve my drafts I just use my user cookie, which I have to update by copy/paste from my browser (!!) every week or so.

Is the expiration date of the cookie enforced server-side? Otherwise maybe you could just set it to a later date (via dev tools > Application > Cookies).

it does expire server-side :slight_smile:

The internal endpoint you’re looking for is now /user/documents, which returns your most recently updated notebooks. You can paginate using ?before=…, and you can optionally filter by access level (e.g., ?types=public).
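
As a rough illustration of the query parameters mentioned above, here is a minimal sketch that only builds the URL and does not send any request. The function name `userDocumentsUrl` and the host `api.example.invalid` are made up for this example; the real API host and the format of the `before` value are not stated in this thread, so `before` is simply passed through as an opaque string.

```javascript
// Sketch only: builds a URL for the internal /user/documents endpoint.
// `apiBase` is a placeholder supplied by the caller (assumption: the real
// host is not given in this thread).
function userDocumentsUrl(apiBase, { types, before } = {}) {
  const url = new URL("/user/documents", apiBase);
  // Optional access-level filter, e.g. "public".
  if (types !== undefined) url.searchParams.set("types", types);
  // `before` is treated as an opaque pagination value and passed unchanged.
  if (before !== undefined) url.searchParams.set("before", before);
  return url.toString();
}

// Example with a made-up host:
console.log(userDocumentsUrl("https://api.example.invalid", { types: "public" }));
```

Since the endpoint is internal, an actual request would presumably also need the session cookie discussed earlier in the thread.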

I’ve started mapping the routes, maybe to turn them into a more generic helper (yesyes, internal, I know …):

Cross-Origin routes:

/:user/:slug.js
/:user/:slug.tgz
/d/:id.js // redirects to /d/:id@:version.js
/d/:id.tgz
/d/:id@:version.js
/d/:id@:version.tgz

Limited to origin beta.observablehq.com:

/document/:user/:slug
/document/:user/:slug/head
/document/:user/:slug/forks?type=:type
/document/:id
/document/:id/meta
/documents/public?before=:before
/documents/public/popular
/documents/search?limit=:limit&offset=:offset&query=:query
/documents/:user
/documents/:id@:revision/:id@:revision/ancestry
/user/:user
/user/:user/stats
/collection/:user/:slug
/collections/:user
/collections/:user/pinned

Requires valid user session, limited to origin beta.observablehq.com:

/user
/user/documents?types=:type&before=:before // Valid types: public, shared, private, trash
/user/likes
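
One possible shape for the "generic helper" idea is a small function that fills the `:name` placeholders in the route templates listed above. This is only a sketch: `expandRoute` is a hypothetical name, it returns a path rather than a full URL, and since these routes are internal they may change at any time.

```javascript
// Sketch of a generic route helper: fills ":name" placeholders in a route
// template such as "/d/:id@:version.js". Hypothetical helper, not an
// official API.
function expandRoute(template, params) {
  return template.replace(/:([A-Za-z_]\w*)/g, (match, name) => {
    if (!(name in params)) {
      throw new Error(`Missing value for route parameter "${name}"`);
    }
    // Encode each value so it is safe inside a path segment or query string.
    return encodeURIComponent(String(params[name]));
  });
}

// Example with made-up values:
console.log(expandRoute("/d/:id@:version.js", { id: "abc123", version: 42 }));
console.log(
  expandRoute("/documents/search?limit=:limit&offset=:offset&query=:query", {
    limit: 10,
    offset: 0,
    query: "voronoi map",
  })
);
```

Note that `encodeURIComponent` also escapes characters such as `@`, so values that must keep them verbatim would need special handling.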

Here’s a Node.js class that provides a basic server-side API client, with GitHub authorization:

It works on a test account. However, on my main account it doesn’t, because of 2FA on GitHub. If I use my password, GitHub sends me an authentication code by SMS. I have tried with an auth token (https://github.com/settings/tokens), no success yet.

Ah, that’s a shame. I don’t have 2FA enabled, so I couldn’t test it. Theoretically though it should be possible to capture that step and prompt for the code. Locally I’m using flat-cache now to store both the CSRF and session cookie, so that I don’t have to reauthenticate on every run. Ultimately I’d add a command line prompt for all credentials, and the 2FA code could be queried there as well.

Yeah, that won’t work. Authorization happens through Observable, and there’s a step where they pass both the client ID and secret to GitHub. You can’t work around that, unfortunately.

For what it’s worth, I’ve also gotten basic POST operations (like adding notebooks to or removing them from collections, or creating a new notebook with up to two cells) to work.

Actually, I was able to fully import a notebook via WebSocket, by reindexing the node IDs and then issuing an insert_node event for every node:

Edit: Here’s a gist of the notebook data:

Thanks so much for sharing this! I had never looked at this sort of thing before so I’ve learned a ton. Here’s a very rough modification of your code that prompts for username+password+2FA (if necessary) and can perform simple notebook download / upload:

This is very much beginner code, so feel free to suggest changes or improvements!

It seems that you can upload a full notebook using the /document/new endpoint, so that’s what I went with. Though the edit websocket has been pretty fun to play around with too…

Edit: Here’s a utility script which takes a downloaded notebook JSON file and outputs a new file where the cells are reindexed so that they run from 0, 1, … up to (number of cells - 1).
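
The reindexing step itself can be sketched in a few lines. This is not the utility script linked above, just an illustration that assumes the notebook JSON has a `nodes` array whose entries carry an `id` field, and that nothing else in the file refers to those ids. The name `reindexNodes` and the sample data are made up.

```javascript
// Sketch: renumber node ids to 0, 1, ..., nodes.length - 1, keeping order.
// Assumes a notebook object of the form { nodes: [{ id, ... }, ...], ... }.
function reindexNodes(notebook) {
  const nodes = notebook.nodes.map((node, index) => ({ ...node, id: index }));
  // All other top-level fields (including `version`) are copied unchanged.
  return { ...notebook, nodes };
}

// Example with made-up data:
const sample = {
  version: 57,
  nodes: [
    { id: 3, value: "a = 1" },
    { id: 17, value: "b = a + 1" },
    { id: 40, value: "md`hello`" },
  ],
};
console.log(reindexNodes(sample).nodes.map((node) => node.id));
```

The function returns a new object and leaves its input untouched, which makes it easy to compare the file before and after.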

Fantastic! Works perfectly!

Thanks so much for pointing that out! My brain might have been mud after messing around with too many “400 Bad Request” responses, so I didn’t even think to adjust the version property in the /new import. I probably would never have given it another try because I ended up believing it was some kind of protection mechanism. Turns out it was just my stupidity. :grin:

In my local version I’ve already reduced Client to the bare minimum and moved authorization out into its own helper. I’ll try to clean up the code (and include your suggestions) and publish a repo this weekend.

Not sure yet about the high-level API though. First I was planning to stick closely to the original routes, but since those might change anyway it’s probably better to consider an abstraction on the concept/entity level (i.e. notebook, user, collection).

Great, really looking forward to it!

In case it helps, the /document/new endpoint performs a few more checks that I didn’t put in my gist. For each member of the nodes array:

  • id is required and must be a nonnegative integer
  • all id values must be distinct
  • if present, pinned must be a boolean (defaults to false)
  • if present, value must be a string (defaults to "")
  • no extra keys seem to be allowed

Edit: One more observation: in my gist, the condition that all id values must be less than version isn’t correct. Instead, all id values must be less than or equal to version and furthermore, version must be strictly greater than nodes.length.
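
To make the checks above concrete, here is a minimal client-side sketch that mirrors them. It is based only on the observations in this post, not on any official specification, so the server may well enforce more (or different) rules. The function name `validateNodes` and the error messages are made up.

```javascript
// Sketch of a pre-upload check, mirroring the observed /document/new rules.
// Assumption: the only allowed node keys are id, value, and pinned.
const ALLOWED_KEYS = new Set(["id", "value", "pinned"]);

function validateNodes(nodes, version) {
  const errors = [];
  const seen = new Set();
  nodes.forEach((node, i) => {
    for (const key of Object.keys(node)) {
      if (!ALLOWED_KEYS.has(key)) errors.push(`nodes[${i}]: unexpected key "${key}"`);
    }
    if (!Number.isInteger(node.id) || node.id < 0) {
      errors.push(`nodes[${i}]: id must be a nonnegative integer`);
    } else {
      if (seen.has(node.id)) errors.push(`nodes[${i}]: duplicate id ${node.id}`);
      seen.add(node.id);
      // Observed rule: every id must be less than or equal to version.
      if (node.id > version) errors.push(`nodes[${i}]: id exceeds version`);
    }
    if ("pinned" in node && typeof node.pinned !== "boolean") {
      errors.push(`nodes[${i}]: pinned must be a boolean`);
    }
    if ("value" in node && typeof node.value !== "string") {
      errors.push(`nodes[${i}]: value must be a string`);
    }
  });
  // Observed rule: version must be strictly greater than nodes.length.
  if (!(version > nodes.length)) errors.push("version must exceed nodes.length");
  return errors;
}

// Example with made-up data; an empty array means no problems were found.
console.log(validateNodes([{ id: 0, value: "1 + 1" }, { id: 1, pinned: true }], 3));
```

Returning a list of messages rather than throwing on the first problem makes it easier to fix a downloaded notebook in one pass.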

I attempted to fix the validation logic for nodes in my version of the script:

https://gist.github.com/bryangingechen/9d86f1e5ec01674a32dd9c54bdc13947

Let me know if I missed any cases.

Attention, there’s a serious flaw in both my script and Bryan’s version:

  • The code uses agent.query(...) for the step where the credentials get passed to GitHub via POST.
  • This appends the data to the URL instead of passing it in the body.
  • As a result your password will most likely end up in GitHub’s logs.

If you’ve used either one of those scripts I’d recommend changing your GitHub password, just to be safe. I’ll fix my gist shortly.
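
To illustrate the difference described above without sending anything over the network, here is a small sketch that builds the two request shapes using only Node.js built-ins. The URL and the field names are hypothetical placeholders. In superagent terms, the first shape corresponds to .query(data) and the second roughly to .type("form").send(data).

```javascript
// Hypothetical endpoint and field names, for illustration only.
const LOGIN_URL = "https://login.example.invalid/session";
const credentials = { login: "alice", password: "s3cret" };

// Problematic: the data is appended to the URL as a query string,
// so it can show up in server logs, proxies, and history.
function requestWithQuery(url, data) {
  const target = new URL(url);
  for (const [key, value] of Object.entries(data)) {
    target.searchParams.set(key, value);
  }
  return { method: "POST", url: target.toString(), body: null };
}

// Safer: the data is form-encoded into the request body instead.
function requestWithBody(url, data) {
  return {
    method: "POST",
    url,
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams(data).toString(),
  };
}

console.log(requestWithQuery(LOGIN_URL, credentials).url); // password visible in URL
console.log(requestWithBody(LOGIN_URL, credentials).url); // URL stays clean
```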

Gist has been updated. Here’s the diff: https://gist.github.com/mootari/511b751e325db8316bb3138dcb0a7393/revisions#diff-168726dbe96b3ce427e7fedce31bb0bc
