best practices for version control/branching/feature development?

Was wondering if anyone has developed good practices for collaboratively working on a project that is implemented in an Observable Notebook, where the project is large enough to include discrete feature development tasks, bug fixes, and other work items that can be tracked and assigned in issue tracking software (Jira, GitHub issues, etc.).

I’m trying to figure out a system to have a Release notebook (available to client), a Development notebook (most up to date internally), and various branches off of Development for feature implementation that can get pulled into Development upon code review.

I’m running into issues when merging though, in that sometimes all of the changes made in a feature branch don’t seem to make it into Development (I don’t see red and green diffs sometimes when there are definitely differences – okay, granted, this has been when comparing two notebooks where one is not a direct fork of the other).

Just wondering if people had developed their own best practices around this that they were willing to share. I get that it would be better to like, write an actual web application when things get this complex – but writing a web app is slow – and Observable is meeting so many other needs.

Hi Stephanie,

Figuring out good practices for long-running asynchronous collaboration on a complex notebook is such a great topic!

But first, I just wanted to mention that if you ever run into an issue with merging, where a diff isn’t appearing correctly or a change that you selected hasn’t been merged, please report it to support@observablehq.com, including links to the notebooks in question or a reproducible test case if possible. If there are still bugs lingering in the fork/compare/merge system, we would love to get them fixed quickly!

Hi Jeremy –

Thanks for offering help on diff-ing. Is there any documentation of the Merge/Reverse Merge interface?
I think that perhaps part of my issue is that I don’t understand what the expected outcomes of Merge are. For example, I know that you can make changes to the Fork by clicking on Fork for any cell and then editing it (while in Merge), but I am never sure whether those changes will actually stick: if I then toggle back to Diff and then to Fork again, the changes are gone.

But I will be in touch if I see problems – thanks!

The current documentation for forking, sharing and merging is here: https://observablehq.com/@observablehq/fork-share-merge (although it predates even the introduction of suggestions).

And suggestions to clarify the documentation are, of course, always welcome, and frequently merged.

Toggling the Parent/Diff/Fork status for a cell does clear any local edits you might have made to the content of that cell since loading the page. We wanted it to be the case that when you click on “Fork”, you see the contents of the fork, and not something else.

When you click “Merge”, the resulting notebook should contain exactly what you see on the page.

In the future, a nice addition might be the ability to keep your local edits as you toggle back and forth — but we’d probably have to introduce a fourth tab, so that you would have: Parent / Diff / Fork / Edited.

Just to make sure I understand: if “the resulting notebook should contain exactly what you see on the page,” which cell view is that referring to – Diff or Fork? (i.e. we will see exactly what is in green and what is in red will be removed?)

Great question. The default state of a compare view contains the contents of the target notebook (i.e., the fork, not the parent).

So, when you have the “Diff” tab toggled, and you see the red and green, if you merge, you’ll get the fork’s contents — the green.

So, I think the situation that is confusing to me is if I have Notebook A, and make Notebooks B + C as forks of A. Someone does some work in B and it gets merged into A. Then someone also does some work in C, and I want to merge it into A.

In this case, when I either Merge C into A or do Compare reverse, I get something along the lines of the following:

[Screenshot not preserved: compare view showing the “Missing changes” message with its “Apply changes” option.]

I am unclear what exactly happens with “Apply changes”… and I am still unclear how to keep the changes that came into A from Notebook B when I merge in Notebook C. Do I have to hand copy/paste them over into the “Fork” cell?

For example, here the max-width specification came into the Parent notebook (A) since the Fork notebook C was made. But I don’t want it to be removed when I merge C into A. Essentially I guess I’m trying to figure out how to deal with merge conflicts in Observable?

(maybe then we are back to the thing about changes sticking if you toggle back to diff from fork?)

(Apologies for dropping the thread on this for a week.)

The Missing changes message appears whenever work has occurred on both notebooks since they have diverged.

Because the default state of compare view is to show the contents of the target notebook (i.e., the fork), there may be changes that have occurred in the parent notebook that are not applied by default when you load the page. Clicking on “Apply” in that message will apply them by toggling those “Changed in parent”, “Added in parent” and “Removed in parent” cells for you to show the parent’s current state.

But that does still leave true merge conflicts for you to resolve. If a cell has been edited in both notebooks since they have diverged, the header will say “Changed in both”. Then, you’ll need to look at the diff, and manually edit the code to resolve the merge conflict to your satisfaction. (Say, by clicking on the “Fork” tab for the cell, and adding the changes made in the parent by hand.)

In the future, it would be nice if we had more advanced automatic merge conflict resolution built-in — but this is what we have for now.

And to try to be crystal clear about your question above: If the changes from Notebook B affect different cells than the changes in Notebook C, then “Apply” at the top of the page will take both sets of changes for the merge. If the changes in Notebook B edit the same cells as the changes in Notebook C, then you’ll have to resolve those cells by hand. In all cases, the notebook that you see on the page before clicking “Merge” is the notebook that you’ll get after the merge is complete. And if something goes wrong, you can always revert back to before the merge, and try again.
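
To make the rules above concrete, here is a minimal sketch of how the cell states could be modeled. This is not Observable's implementation: the function names `classifyCells` and `previewMerge`, the idea of a notebook as a plain object mapping a cell key to its source text, and the sample cell contents are all assumptions made for illustration.

```javascript
// Hypothetical model: a notebook is an object { cellKey: sourceText }.
// `base` is the notebook as it was when parent and fork diverged.
function classifyCells(base, parent, fork) {
  const keys = new Set([
    ...Object.keys(base),
    ...Object.keys(parent),
    ...Object.keys(fork),
  ]);
  const status = {};
  for (const key of keys) {
    const b = base[key], p = parent[key], f = fork[key];
    const parentChanged = p !== b;
    const forkChanged = f !== b;
    if (!parentChanged && !forkChanged) {
      status[key] = "unchanged";
    } else if (parentChanged && forkChanged) {
      // Edited on both sides: a true conflict unless the edits are identical.
      status[key] = p === f ? "same change in both" : "changed in both";
    } else {
      const side = parentChanged ? "parent" : "fork";
      const now = parentChanged ? p : f;
      const verb = b === undefined ? "added" : now === undefined ? "removed" : "changed";
      status[key] = `${verb} in ${side}`;
    }
  }
  return status;
}

// The default view shows the fork. "Apply" switches parent-only changes to the
// parent's state. Cells changed in both stay on the fork's version and are
// listed so they can be resolved by hand.
function previewMerge(base, parent, fork, applyParentChanges) {
  const status = classifyCells(base, parent, fork);
  const merged = {};
  const conflicts = [];
  for (const [key, s] of Object.entries(status)) {
    const takeParent = applyParentChanges && s.endsWith("in parent");
    const source = takeParent ? parent[key] : fork[key];
    if (source !== undefined) merged[key] = source;
    if (s === "changed in both") conflicts.push(key);
  }
  return { merged, conflicts };
}

// Notebook A after merging B (adds max-width), and fork C (edits the chart).
const base = { style: "width: 100%", chart: "chart v1", title: "title v1" };
const parentA = { style: "width: 100%; max-width: 800px", chart: "chart v1", title: "title v1" };
const forkC = { style: "width: 100%", chart: "chart v2", title: "title v1" };

console.log(previewMerge(base, parentA, forkC, false).merged.style); // fork's style, max-width missing
console.log(previewMerge(base, parentA, forkC, true).merged.style);  // parent's style, max-width kept
```

In this sketch, merging C without “Apply” would drop the max-width change that came from B, while applying the parent's changes keeps both B's and C's work because they touch different cells.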

Cheers,
Jeremy

Now my reply is belated, but thanks so much for clearing this up for me. Obviously integrated git would be amazing, but it is very obvious why that is not your first priority, or even hundredth. But just knowing what the expected functionality is is very useful. I wind up hand copy-pasting changes from one notebook (fork) into another (parent) now, and while that isn’t ideal, without the red/green diffs it would be impossible. So, we’re…making it work.

Please note that you can merge any notebook into any other notebook, regardless of their relationship (or parent author). See this comment for the detailed steps:

Oh yeah, I am aware of that, though thank you. The issue here is that if you have two or more notebooks (B + C) both forked from a parent notebook (A), then after pulling B into A there may be “merge conflicts” between C and A, and in that case you can’t simply “merge” (in the Observable sense) C into A.

Of course, you could implement a policy of not having more than one fork of a notebook at a time, but that makes development with more than one person difficult: if Person 1 forks B from A, Person 2 would need to fork C from B, and then the merge cycle would have to proceed the other way around (C into B, then B into A), which is hard and not very efficient from a timing perspective.

I see, thanks for the explanation! I’ll try to summarize what you described into a bug report. Please let me know if this is a good representation of your issue.

Steps to reproduce:

  1. Create parent notebook
  2. Create fork A
  3. Create fork B
  4. In fork A, create a cell with the content foo = "A"
  5. Merge fork A
  6. In fork B, create two cells with the contents foo = "B" and "bar"
  7. Open compare view for fork B

Expected result:

  • Compare view shows a single named cell “foo”, with parent tab containing “A”, and fork tab containing “B”.
  • Compare view shows single unnamed cell containing "bar" at bottom.

Actual result:

  • Compare view shows named cell “foo” twice, as “added in parent”, and “added in fork”
  • Compare view shows named cell “foo” for parent after unnamed cell containing "bar".
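
The expected result above can be sketched as a comparison that pairs named cells by name. This is only an illustration of the expectation, not a description of how Observable's compare view works; the function `compareByName` and the cell shape `{ name, source }` are hypothetical.

```javascript
// Hypothetical cell shape: { name: string | null, source: string }.
// Named cells are paired by name; unnamed cells are paired by their source.
function compareByName(parentCells, forkCells) {
  const keyOf = (cell) => cell.name ?? `unnamed:${cell.source}`;
  const parentByKey = new Map(parentCells.map((c) => [keyOf(c), c]));
  const forkByKey = new Map(forkCells.map((c) => [keyOf(c), c]));
  const rows = [];
  // One row per parent cell, with the fork's version alongside if it exists.
  for (const [key, p] of parentByKey) {
    const f = forkByKey.get(key);
    rows.push({ name: p.name, parent: p.source, fork: f ? f.source : null });
  }
  // Cells that exist only in the fork go at the bottom.
  for (const [key, f] of forkByKey) {
    if (!parentByKey.has(key)) {
      rows.push({ name: f.name, parent: null, fork: f.source });
    }
  }
  return rows;
}

// State after step 6: the parent has fork A's cell, fork B has its own two cells.
const parentCells = [{ name: "foo", source: '"A"' }];
const forkBCells = [
  { name: "foo", source: '"B"' },
  { name: null, source: '"bar"' },
];

console.log(compareByName(parentCells, forkBCells));
```

Under these assumptions the comparison yields two rows: one for “foo” with both a parent and a fork version, and one for the unnamed "bar" cell that exists only in the fork.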

Some thoughts:

  • If we had a similar situation in git we would end up with a merge conflict.
  • In git, however we usually have a lot more context to assess a conflict, and a clear outline of the chunks that were added in each ref.
  • While in a merge conflict, symbols (i.e. functions and other units of code) are reduced to plain text, syntax is often corrupted and functions may be broken up.
  • Observable’s cells represent symbols, but these will always stay whole during a merge.
  • Moving even large blocks of code around is a lot easier than moving several individual cells.
  • To assess possible strategies it might help to represent the entire notebook as a flat text file and simulate a merge in git.
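
As a rough sketch of that last idea, each notebook could be flattened to text with one cell per paragraph. The helper `notebookToText` and the cell shape `{ name, source }` are assumptions for illustration and not part of any Observable API.

```javascript
// Hypothetical cell shape: { name: string | null, source: string }.
// Writes one cell per paragraph, separated by a blank line, so that a
// line-based merge tool sees each cell as its own chunk.
function notebookToText(cells) {
  const lines = cells.map((cell) =>
    cell.name ? `${cell.name} = ${cell.source}` : cell.source
  );
  return lines.join("\n\n") + "\n";
}

// The three notebooks from the steps to reproduce.
const emptyParent = [];
const forkANotebook = [{ name: "foo", source: '"A"' }];
const forkBNotebook = [
  { name: "foo", source: '"B"' },
  { name: null, source: '"bar"' },
];

console.log(notebookToText(forkANotebook));
console.log(notebookToText(forkBNotebook));
```

Saving the three outputs as files and running `git merge-file` on them (current, base, other) would then show how git reports the competing definitions of `foo`.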

I don’t envy anybody who has to find a UI solution for this problem.

Hello @stephanietuerk, I’m currently facing the same problem in my work, and I’m in charge of proposing a workflow for this kind of development lifecycle in Observable.

Can you share what you have learned since then?

These days I would recommend using the new Observable Framework when you need a more rigorous development workflow. With Framework, pages are Markdown files that you can manage directly in source control, and you can use standard code review processes, pull requests, unit tests, CI/CD, etc.