backup method?

bgchen · February 24, 2019, 5:39pm

Wow, thanks for noticing this. I’ve updated my script just now too.

mootari · February 26, 2019, 11:25pm

PSA: Observable has dropped the “beta.” subdomain. Be sure to update the site path and cookie domain in your scripts.

Fil · March 3, 2019, 5:48pm

I’ve added this block in @bgchen’s version of the script:

      case 'backup-user': {
        // Back up public documents for @user.
        const user = (process.argv[3] || "").replace(/^@/, "");
        const dir = process.argv[4] || "data";
        if (user) {
          let before = "";
          const dirName = `${dir}/${user}`;
          // Create the target directory (and any missing parents).
          fs.mkdirSync(dirName, {recursive: true});
          do {
            // Fetch one page of the listing; `before` pages back by update time.
            const nbdat = await api.get(`/documents/@${user}${before}`);
            for (const nb of nbdat) {
              const fileName = `${dirName}/${nb.slug.replace("/", ".v")}.json`;
              let savedContent;
              try {
                savedContent = JSON.parse(fs.readFileSync(fileName, 'utf8'));
              } catch(e) {
                // No readable previous backup; fall through to the download.
              }
              if (savedContent && savedContent.version >= nb.version) {
                console.log(`Skipping ${nb.title}`);
              } else {
                console.log(`Downloading ${nb.title}`);
                const doc = await api.get(`/document/${nb.id}`);
                fs.writeFileSync(fileName, JSON.stringify(doc), {flag:'w'});
              }
            }
            before = nbdat.length ? `?before=${nbdat[nbdat.length - 1].update_time}` : "";
          } while (before);
        }
        // Break even when no user was given, so we never fall through to the next case.
        break;
      }

Usage:

> node index.js backup-user @fil

will create a data/fil/ directory containing all my published notebooks. Note that, if the notebook is published but has been modified since last publication, what I receive and save is the most current version.
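
To make the file naming and the skip check easier to see in isolation, here is a minimal sketch. The helper names `backupFileName` and `needsDownload` are hypothetical (they are not in the script), and the example slug `my-notebook/2` is only an assumption about what a slug containing a "/" might look like.

```javascript
// Hypothetical helpers; these names are illustrative and not part of the script.

// Map a notebook slug to a backup file name, mirroring the replace() call in
// the script: the first "/" in the slug becomes ".v".
function backupFileName(dirName, slug) {
  return `${dirName}/${slug.replace("/", ".v")}.json`;
}

// Decide whether a notebook has to be downloaded again. `saved` is the parsed
// JSON from an earlier run, or undefined when no backup exists yet.
function needsDownload(saved, listed) {
  return !(saved && saved.version >= listed.version);
}

console.log(backupFileName("data/fil", "my-notebook/2")); // data/fil/my-notebook.v2.json
console.log(needsDownload({version: 10}, {version: 12})); // true
console.log(needsDownload({version: 12}, {version: 12})); // false
console.log(needsDownload(undefined, {version: 1}));      // true
```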

(Should we move this thread to a GitHub project?)

mootari · March 3, 2019, 6:27pm

I’m almost done setting up a repo, just wanted to clean up a few things beforehand.

Thanks for sharing, I was wondering about your backup requirements when I tried to plan the high-level API and helpers.

Fil · March 19, 2019, 7:54am

This has been failing with:

(node:35847) UnhandledPromiseRejectionWarning: TypeError: Cannot read property 'value' of undefined
    at ObservableAPI.authorizeWithGithub (/Users/fil/Source/observable/backup/api.js:100:29)
    at processTicksAndRejections (internal/process/next_tick.js:81:5)

(last time it worked was about a week ago)

bgchen · March 19, 2019, 2:32pm

I noticed this too. I think the issue is that the Observable page now generates the “T” token client-side and posts that to the server, whereas the scripts have been getting this token from the cookie (?).

(I’ve been waiting for @mootari to share his repo so that we’ll have something nicer to build from than my crude edit.)

mootari · March 19, 2019, 7:05pm

Yep, the relevant code is:

      onClick: ()=>{
        n && n(),
        window.location.assign(function(e) {
          return `https://github.com/login/oauth/authorize?scope=user:email&client_id=1a8619df27715d9d2c97&state=${pu()}&redirect_uri=${`https://api.observablehq.com/github/oauth?path=/loggedin${e}`}`
        }(r))
      }

and

  function pu() {
    const e = document.cookie.match(/(?:^|;)\s*T\s*=\s*([0-9a-f]{32})(?:$|;)/);
    if (e)
      return e[1];
    const t = (n = 16,
    Array.from(crypto.getRandomValues(new Uint8Array(n)), e=>e.toString(16).padStart(2, "0")).join(""));
    var n;
    const r = new Date(Date.now() + 1728e5);
    return document.cookie = `T=${t}; Domain=.observablehq.com; Path=/; Secure; Expires=${r.toUTCString()}`,
    t
  }

So, so sorry about the delay. These last days I’ve been either too swamped or too tired to finish setting up the repo. I’ll set it up this weekend, promise.

mootari · March 24, 2019, 12:05am

I’ve updated the gist to have ensureToken() generate the token by itself instead of fetching it from the server.
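
For anyone following along, a rough Node sketch of generating such a token locally, modelled on the site's `pu()` function quoted above rather than on the gist itself. The function names `generateToken` and `tokenCookie` are hypothetical.

```javascript
const crypto = require('crypto');

// 16 random bytes, hex-encoded: 32 lowercase hex characters, which is the
// shape pu() accepts with /[0-9a-f]{32}/.
function generateToken() {
  return crypto.randomBytes(16).toString('hex');
}

// Build the cookie string the way pu() does. 1728e5 ms is 48 hours.
// (Hypothetical helper; the gist may store the cookie differently.)
function tokenCookie(token, now = Date.now()) {
  const expires = new Date(now + 1728e5);
  return `T=${token}; Domain=.observablehq.com; Path=/; Secure; Expires=${expires.toUTCString()}`;
}

const token = generateToken();
console.log(token.length);       // 32
console.log(tokenCookie(token)); // T=<token>; Domain=.observablehq.com; ...
```

The same value would then presumably be sent both as the `T` cookie and as the `state` parameter of the GitHub authorize URL, as in the `onClick` handler above.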

gist.github.com

https://gist.github.com/mootari/511b751e325db8316bb3138dcb0a7393/revisions?diff=split

index.js

const request = require('superagent');

class ObservableAPI {
  constructor() {
    this.SITE_URL = 'https://observablehq.com';
    this.API_URL = 'https://api.observablehq.com';
    this.GITHUB_CLIENT_ID = '1a8619df27715d9d2c97';
    // Cookie jar access info.
    this.accessInfo = {
      domain: '.observablehq.com',

(This file has been truncated; see the gist for the full source.)

bgchen · March 24, 2019, 1:53am

Thanks! I used your work to update my version just now:

gist.github.com

https://gist.github.com/bryangingechen/9d86f1e5ec01674a32dd9c54bdc13947/revisions?diff=split

index.js

// https://gist.github.com/mootari/511b751e325db8316bb3138dcb0a7393
const request = require('superagent');
const cache = require('flat-cache').load('cacheId');

// https://stackoverflow.com/a/33500118/
const readline = require('readline');
const Writable = require('stream').Writable;

const fs = require('fs');
const crypto = require('crypto');

(This file has been truncated; see the gist for the full source.)

I made one change to your new ensureToken. Instead of:

ensureToken(regenerate = false) {
  if (!regenerate && this.getToken()) return;

I use:

ensureToken(regenerate = false) {
  if (!regenerate && this.getToken() && this.getToken().value !== '') return;

The third clause is necessary since after authentication, the response from https://observablehq.com/loggedin has a set-cookie header which reads set-cookie: T=; Max-Age=0; Domain=.observablehq.com; HttpOnly; Path=/; Secure.

Edit: updated per @mootari’s comment below.

mootari · March 24, 2019, 7:09pm

I noticed that too, but wrongly dismissed it as irrelevant. I’d add the check against value to getToken() though, otherwise code might fetch a token cookie with an empty value. I’ve updated my gist accordingly.
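
A minimal sketch of that check, assuming a simplified cookie store (a plain array of `{name, value}` objects) instead of the real cookie jar. The store shape and the `needsNewToken` helper are hypothetical and only there to illustrate the idea.

```javascript
// Hypothetical stand-in for the cookie jar: an array of {name, value} objects.
function getToken(cookies) {
  const cookie = cookies.find(c => c.name === 'T');
  // A cleared cookie ("T=; Max-Age=0") still exists but has an empty value;
  // treat it as if no token were present.
  return cookie && cookie.value !== '' ? cookie : undefined;
}

// With the empty-value check inside getToken(), ensureToken() can keep its
// original two-clause condition.
function needsNewToken(cookies, regenerate = false) {
  return regenerate || !getToken(cookies);
}

console.log(needsNewToken([{name: 'T', value: ''}]));             // true
console.log(needsNewToken([{name: 'T', value: 'abc123'}]));       // false
console.log(needsNewToken([{name: 'T', value: 'abc123'}], true)); // true
```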

mootari · March 26, 2019, 10:00pm

The repo is now available:

Fil · August 13, 2020, 1:58pm

Is this still working for you? I stopped using it quite a while ago, and now it doesn’t want to connect anymore.

mootari · August 13, 2020, 4:34pm

Never actually got around to using it. Does the authentication fail, or can you at least retrieve a token?

Edit: The login now has an extra step, and the CSRF token only gets set after one inputs the name.

Fil · August 13, 2020, 4:51pm

Don’t worry, I just wanted to check whether it was only broken for me; I don’t really need it at the moment. I would still like to be able to bulk download easily for backup and grep.

mootari · August 13, 2020, 4:54pm

No one else has complained either. I may have to assume that no one is using it.

Anyway, can I ask you to open an issue in mootari/observable-client on GitHub?

tomlarkworthy · May 6, 2021, 1:08pm

I dunno if I should necro these threads but it makes sense to have only one backup thread IMHO. My backup solution exports to storage, ordered by update timestamp and checks version ids so it can stop early if nothing has changed.

I’ll probably automate it with cron once I get some confidence with it.

I am not really planning on making this a service, but it’s pretty easy to copy if people want.
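
The early-stop idea (walk the notebooks newest first and stop at the first one whose version is already backed up) could be sketched roughly like this. It is not the code from the backup notebook; `selectForBackup` and the `id`/`version` field names are assumptions made for illustration.

```javascript
// Hypothetical sketch, not the actual backup notebook code.
// notebooks: listing sorted by update time, newest first.
// known: Map from notebook id to the version already stored in the backup.
function selectForBackup(notebooks, known) {
  const pending = [];
  for (const nb of notebooks) {
    // Assumption: if this notebook is unchanged, everything older than it was
    // already handled by a previous run, so we can stop scanning here.
    if (known.get(nb.id) === nb.version) break;
    pending.push(nb);
  }
  return pending;
}

const known = new Map([["a", 3], ["b", 7]]);
const listing = [
  {id: "c", version: 1}, // new notebook
  {id: "a", version: 4}, // changed since the last backup
  {id: "b", version: 7}, // unchanged: stop here
  {id: "d", version: 2}, // never inspected
];
console.log(selectForBackup(listing, known).map(nb => nb.id)); // [ 'c', 'a' ]
```

The shortcut only holds if every run finishes cleanly; after an interrupted run it would be safer to scan the full listing once.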

tomlarkworthy · February 5, 2022, 7:52pm

I did not like the previous approach in the end: it required too much manual triggering and was hard to set up in the first place, and the end result was a tar archive that was hard to interact with. So, based on what I learned from the previous one, I have taken a fresh approach.

This new backup solution triggers a GitHub Action that unpacks the tar, syncs with GitHub, and runs automatically after every publish. It also works with non-public team notebooks! You can point everything at a common repository, because the notebooks are unpacked into a directory mirroring their URL; you can take a look here. It only took me 270 GitHub Action attempts before I got it!
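
As an illustration of "a directory mirroring their URL", here is a small sketch of how a notebook URL might map to a repository path. The helper `backupDirFor` and the notebook name `my-notebook` are hypothetical, and the layout of files inside each directory is not covered.

```javascript
// Hypothetical helper: derive a repository directory from a notebook URL by
// reusing the URL path (without leading or trailing slashes).
function backupDirFor(notebookUrl) {
  const { pathname } = new URL(notebookUrl);
  return decodeURIComponent(pathname).replace(/^\/+|\/+$/g, "");
}

console.log(backupDirFor("https://observablehq.com/@endpointservices/my-notebook"));
// @endpointservices/my-notebook
```

Because the owner handle is part of the path, notebooks from several users or teams can share one repository without colliding.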

tomlarkworthy · May 11, 2022, 8:01am

The intent of the github backups notebook is that you could set it up once in a personal backup notebook that can then be transitively imported everywhere you need backups and avoid having to configure the github token each time.

This was not working under certain conditions; thanks @jimpick for reporting the issue, which is now solved. As you can see, I personally have quite the collection of backups now in the @endpointservices directory of endpointservices/observable-notebooks on GitHub,

which only requires me to import my footer notebook.

That footer does a few other useful things, like installing an error reporting framework and usage analytics.
