[Maria-developers] Bzr merge order
I noticed a difference between Bitkeeper and Bzr merges.
In Bitkeeper, merge parents are essentially unordered. The two parents of a merge are completely equal, and there is no way to tell from the Bitkeeper history whether one parent was merged into the other or vice versa. The actual order in which merge parents are listed is based on some deterministic algorithm (most recent commit date, if I recall correctly).
In particular, this means that in Bitkeeper, the result of a pull of A into B is exactly identical to a pull of B into A. This makes it impossible to establish the history of pushes into main trees from just the Bitkeeper history.
In Bzr, merge parents _are_ ordered. Merging B into A gives a different result from merging A into B. When merging B into A (cd A; bzr merge B), A is the primary or left-hand parent. The resulting tree has one extra top-level merge revision following the tip of A, with all merged revisions of B appearing as sub-revisions of this.
This means that if one merges A into B giving C1, and later merges B into A giving C2, then C1 and C2 are considered diverged and need another merge changeset to be consolidated (in Bitkeeper C1 and C2 are identical and need no further merge).
What this means is that in bzr it _is_ actually possible to see the history of what was merged into what, at least in some sense.
Unfortunately, it tends to be backwards with the usual MySQL working style. That style is to make a branch at some point X, make a patch, commit, and get it reviewed. By the time of the push, the main tree has received additional pushes by others and is now at revision Y > X. So one does a bzr merge of the main tree into the local clone, followed by a push to the main tree. Now the local patch becomes the main merge parent, and all other pushes between X and Y become sub-revisions with their revision numbers renamed. This eventually makes primary/secondary merge parent relationships more or less random.
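The ordered-parent behaviour described above can be illustrated with a small model. This is a minimal sketch, assuming a revision is just a name plus an ordered tuple of parents; the `Rev`, `commit` and `mainline` names are hypothetical and are not bzr's own data structures or API.

```python
# Toy model of ordered merge parents (an illustration, not bzr internals).
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Rev:
    name: str
    # parents[0] is the left-hand (primary) parent; order matters.
    parents: Tuple["Rev", ...] = ()


def commit(name, *parents):
    return Rev(name, tuple(parents))


def mainline(tip):
    """Follow left-hand parents from the tip back to the root."""
    line = []
    rev = tip
    while rev is not None:
        line.append(rev.name)
        rev = rev.parents[0] if rev.parents else None
    return list(reversed(line))


base = commit("X")
a = commit("a1", base)      # tip of tree A
b = commit("b1", base)      # tip of tree B

c1 = commit("merge", b, a)  # cd B; bzr merge A  -> B is the left-hand parent
c2 = commit("merge", a, b)  # cd A; bzr merge B  -> A is the left-hand parent

print(mainline(c1))  # ['X', 'b1', 'merge']
print(mainline(c2))  # ['X', 'a1', 'merge']
print(c1 == c2)      # False: same contents merged, different parent order
```

In this model C1 and C2 contain the same two lines of work but are distinct revisions, which matches the observation that they count as diverged in bzr.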
So since I did not take part in the transition to bzr within Sun, I just wanted to ask if this is something that was discussed, and if so, whether there were any conclusions?
As I understand it, the Drizzle people make sure that whenever they merge, they merge into a copy of their trunk, so that all past pushes are visible as the line of primary/left-most merge parents in the main tree. This works better for them, as they follow the model of a single merge captain (or a few) merging in other people's trees upon merge requests.
My personal opinion is that while it would be nice to have the push history available like the Drizzle people do, the bzr support for this is not good enough without very tight control over who can push to main trees, which we do not want. What one can do when pushing is to first switch to an up-to-date fresh clone of the main tree, pull from the local branch with one's own changes, then push that to the main tree. Maybe this is a good way, even though it cannot be enforced?
Just wanted to hear if there are any other opinions on this?
- Kristian.
Hi!
Kristian Nielsen writes: [skip]
Just wanted to hear if there are any other opinions on this?
Since we switched to bazaar I do it this way (to reduce merge commit size):
1) bzr branch <current-main-tree> <merge-tree>
2) cd <merge-tree>
3) bzr merge ../<tree-with-patch>
4) build/test
5) bzr gcommit
6) bzr push (or merge) to (or with) <current-main-tree>
and so on
"Oleksandr \"Sanja\" Byelkin" <sanja@askmonty.org> writes:
Since we switched to bazaar I do it in such way (to reduce merge commit size):
1) bzr branch <current-main-tree> <merge-tree>
2) cd <merge-tree>
3) bzr merge ../<tree-with-patch>
4) build/test
5) bzr gcommit
6) bzr push (or merge) to (or with) <current-main-tree>
and so on
Yes, I see. I like this way, I think I will use it as well from now on. Thanks, - Kristian.
Hi!
"Kristian" == Kristian Nielsen <knielsen@knielsen-hq.org> writes:
Kristian> I noticed a difference between Bitkeeper and Bzr merges.
<cut>
Thanks for the analysis and for thinking about this!
First, in Sun when we switched to bzr, we didn't care about the difference and continued as if nothing had changed. We did get some extra merges but no one cared (as far as I know).
Kristian> My personal opinion is that while it would be nice to have the push history
Kristian> available like the Drizzle people do, the bzr support for this is not good
Kristian> enough, without a very tight control on who can push to main trees that we do
Kristian> not want. What one can do is when pushing, first switch to an up-to-date fresh
Kristian> clone of the main tree, pull from the local branch with own changes, then push
Kristian> that to the main tree. Maybe this is a good way, even though it cannot be
Kristian> enforced?
I think that sounds like a good way to do it, even if there are some problems with it: we would have to lock the main tree between the time we make the fresh local tree and the time we push.
Kristian> Just wanted to hear if there are any other opinions on this?
My main question is how big the 'consolidation change set' really is. We mainly used bzr in the mysql-maria tree and I didn't notice any notable problems with this, except that from time to time you had to use the weave merge algorithm to get things right.
Regards,
Monty
Following up on this, I just learned about the bzr option append_revisions_only that can be set on a branch.
This option can be used to enforce that the main history (sequence of primary/left-hand parents from the tip) correctly reflects the series of pushes into the public tree.
So assume this sequence of events:
1. Joe branches lp:maria (revision 1000), and starts hacking on a patch.
2. Other developers push revisions 1001, 1002, and 1003 to lp:maria.
3. Joe finishes the patch, the review is good. He runs `bzr merge lp:maria`, followed by `bzr push lp:maria`.
As it is now, the resulting history of lp:maria will look like this:
    1002        Joe (merge)
    1000.1.3      | Other
    1000.1.2      | developer's
    1000.1.1      | commits
    1001        Joe's patch
    1000        Starting point
So it is not at all clear that the other commits were at some point pushed individually to lp:maria. And if someone goes to look at revision 1002 expecting to see one of the other developers' commits, it will be missing (or worse, refer to the wrong commit after further pushes).
But if we set the append_revisions_only option on lp:maria, Joe will instead get this error:
    bzr: ERROR: Server sent an unexpected error: ('error', 'Operation denied because it would change the main history, which is not permitted by the append_revisions_only setting on branch "lp-139886317008592:///~knielsen/maria/tmp-buildbot-test".')
In this case, Joe will instead have to do this:
    bzr branch lp:maria   # or bzr pull lp:maria into an existing clean clone
    bzr merge ../branch-with-patch
    bzr commit -m"merged with trunk"
    bzr push lp:maria
Then the resulting history will be this:
    1004        Joe (merge)
    1000.1.1      Joe patch
    1003        | Other
    1002        | developer's
    1001        | commits
    1000        Starting point
Which is much nicer, IMHO.
So basically append_revisions_only enforces the merging style that I, Sanja, and Monty already proposed as a good practise.
So I propose to add this option to the 5.1, 5.2, and 6.0 trees on Launchpad. This will provide a clear record of the push history in the repository, at the cost of enforcing the "good practise" merge style with one extra bzr step.
Any opinions? Reasons not to do this?
If there is general agreement I will add the option (the process is somewhat inconvenient, but I tested a procedure using sftp and that works ok).
(Incidentally, this way also makes it possible to correlate the branch history directly with build history from Buildbot/Pushbuild, something I really missed in the BitKeeper days).
- Kristian.
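The effect of append_revisions_only in Joe's scenario can be sketched with a toy model. This is a simplified illustration under the assumption that the option boils down to "the old tip must remain on the left-hand-parent line of the new tip"; the revision names and the `mainline` and `push_allowed` helpers are hypothetical and do not reproduce bzr's actual implementation.

```python
# Toy model of the append_revisions_only rule (illustration only).

def mainline(tip, parents):
    """Left-hand-parent chain from the root up to the given tip."""
    line = []
    rev = tip
    while rev is not None:
        line.append(rev)
        ps = parents.get(rev, [])
        rev = ps[0] if ps else None  # ps[0] is the primary parent
    return line[::-1]


def push_allowed(old_tip, new_tip, parents):
    """Assumed rule: the old tip must stay on the new mainline."""
    return old_tip in mainline(new_tip, parents)


parents = {
    "r1000": [],
    "other1": ["r1000"],      # pushed to trunk as 1001
    "other2": ["other1"],     # pushed to trunk as 1002
    "other3": ["other2"],     # pushed to trunk as 1003
    "joe_patch": ["r1000"],
    "trunk_into_joe": ["joe_patch", "other3"],  # bzr merge lp:maria in Joe's branch
    "joe_into_trunk": ["other3", "joe_patch"],  # bzr merge ../branch-with-patch in a fresh trunk
}
trunk_tip = "other3"

print(push_allowed(trunk_tip, "trunk_into_joe", parents))  # False
print(push_allowed(trunk_tip, "joe_into_trunk", parents))  # True

# Mainline revision numbers after the accepted push, counting from 1000.
revnos = {1000 + i: r for i, r in enumerate(mainline("joe_into_trunk", parents))}
print(revnos[1004])  # joe_into_trunk
```

In this model the rejected push is the one that would turn Joe's patch into revision 1001, while the accepted one leaves 1001 to 1003 pointing at the commits that were originally pushed under those numbers.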
Kristian Nielsen <knielsen@knielsen-hq.org> writes:
Following up on this, I just learned about the bzr option append_revisions_only that can be set on a branch.
Any opinions on this?
Having thought more on this, I really think we should enable append_revisions_only.
There are many tools (and people as well) that depend on revision numbers and have no knowledge about using revid: revision ids. And they often get confused when a push changes the revision numbers of lots of recent commits.
Just two days ago on Tuesday, the merge of MySQL 5.1 changed the last 6 revision numbers or so, and this seems to have confused the Buildbot Bzr code to the point where it chose not to build the new merge at all. (I have written and installed a new method for Buildbot to detect pushes that hopefully solves this problem. But I fear there are more places in the Buildbot code which assume revision numbers are simple numbers that can be compared meaningfully to determine their ordering).
In any case, having simple revision numbers that correctly reflect the push history is just so much nicer to work with, and this is something I really missed with Bitkeeper and Pushbuild.
So unless there are objections, I want to implement this change early next week. At most, developers will need to do an extra pull into a fresh clone of main trees before pushing, if they forget this when merging up. And if needed, we can probably ask Bzr developers to implement a simple --reverse or --use-other-as-base option to make even this extra step unnecessary.
- Kristian.
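The kind of confusion described above can be reproduced in a toy model. This is a sketch under stated assumptions: the poller logic (`naive_has_new_work`) is a hypothetical stand-in for any tool that compares mainline revision numbers, not Buildbot's actual code, and the revision names are invented for illustration.

```python
# Toy illustration of revision numbers changing meaning after a push.

def mainline_revnos(tip, parents, base_revno):
    """Map mainline revision numbers to revisions, following left-hand parents."""
    line = []
    rev = tip
    while rev is not None:
        line.append(rev)
        ps = parents.get(rev, [])
        rev = ps[0] if ps else None
    line.reverse()
    return {base_revno + i: r for i, r in enumerate(line)}


parents = {
    "r1000": [],
    "other1": ["r1000"],
    "other2": ["other1"],
    "other3": ["other2"],
    "joe_patch": ["r1000"],
    "joe_merge": ["joe_patch", "other3"],  # trunk merged into Joe's branch
}

before = mainline_revnos("other3", parents, 1000)     # trunk before Joe's push
after = mainline_revnos("joe_merge", parents, 1000)   # trunk after Joe's push

print(before[1002], after[1002])  # other2 joe_merge

# A hypothetical poller that remembers the last revno it built:
last_built = max(before)          # 1003
new_tip = max(after)              # 1002
naive_has_new_work = new_tip > last_built
print(naive_has_new_work)         # False, although new work was pushed
```

Tracking the tip by revision id rather than by number avoids this in the model, since "joe_merge" never equals "other3" regardless of numbering.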
Hi!
"Kristian" == Kristian Nielsen <knielsen@knielsen-hq.org> writes:
Kristian> Kristian Nielsen <knielsen@knielsen-hq.org> writes:
Following up on this, I just learned about the bzr option append_revisions_only that can be set on a branch.
Kristian> Any opinions on this?
Kristian> Having thought more on this, I really think we should enable
Kristian> append_revisions_only.
Kristian> There are many tools (and people as well) that depend on revision numbers and
Kristian> have no knowledge about using revid: revision ids. And they often get confused
Kristian> when a push changes the revision numbers of lots of recent commits.
Kristian> Just two days ago on Tuesday, the merge of MySQL 5.1 changed the last 6
Kristian> revision numbers or so, and this seems to have confused the Buildbot Bzr code
Kristian> to the point where it chose not to build the new merge at all. (I have written
Kristian> and installed a new method for Buildbot to detect pushes that hopefully solves
Kristian> this problem. But I fear there are more places in the Buildbot code which
Kristian> assume revision numbers are simple numbers that can be compared meaningfully
Kristian> to determine their ordering).
Kristian> In any case, having simple revision numbers that correctly reflect the push
Kristian> history is just so much nicer to work with, and this is something I really
Kristian> missed with Bitkeeper and Pushbuild.
Kristian> So unless there are objections, I want to implement this change early next
Kristian> week. At most, developers will need to do an extra pull into a fresh clone of
Kristian> main trees before pushing, if they forget this when merging up. And if needed,
Kristian> we can probably ask Bzr developers to implement a simple --reverse or
Kristian> --use-other-as-base option to make even this extra step unnecessary.
Kristian> - Kristian.
Agree with the above.
Kurt, can you talk with the bzr developers about implementing a --reverse merge option in bzr?
Regards,
Monty
Hi!
"Kristian" == Kristian Nielsen <knielsen@knielsen-hq.org> writes:
K> So I propose to add this option to the 5.1, 5.2, and 6.0 trees on
K> Launchpad. This will provide a clear record of the push history in the
K> repository, at the cost of enforcing the "good practise" merge style with one
K> extra bzr step.
K> Any opinions? Reasons not to do this?
I like the end result of this approach, but I have a couple of questions/concerns about it:
- Doing the extra branch is a bit of a pain and slows things down if you push a lot. It would be nice to get this done internally in bzr as part of the merge. Should we talk with Canonical to get this option into bzr?
- What is the sequence to follow if someone pushes between step 3) and step 4):
1) bzr branch lp:maria # or bzr pull lp:maria into an existing clean clone
2) bzr merge ../branch-with-patch
3) bzr commit -m"merged with trunk"
Someone else pushes here.
4) bzr push lp:maria
Doing the 'merge' again would be a real pain. I assume one should in this case start again from step 1) and do the merge with the new tree?
K> If there is general agreement I will add the option (the process is somewhat
K> inconvenient, but I tested a procedure using sftp and that works ok).
K> (Incidentally, this way also makes it possible to correlate the branch history
K> directly with build history from Buildbot/Pushbuild, something I really missed
K> in the BitKeeper days).
Regards,
Monty
Michael Widenius <monty@askmonty.org> writes:
K> In this case, Joe will instead have to do this:
K>     bzr branch lp:maria   # or bzr pull lp:maria into an existing clean clone
K>     bzr merge ../branch-with-patch
K>     bzr commit -m"merged with trunk"
K>     bzr push lp:maria
K> Then the resulting history will be this:
K>     1004        Joe (merge)
K>     1000.1.1      Joe patch
K>     1003        | Other
K>     1002        | developer's
K>     1001        | commits
K>     1000        Starting point
K> Which is much nicer, IMHO.
K> So basically append_revisions_only enforces the merging style that I, Sanja,
K> and Monty already proposed as a good practise.
K> So I propose to add this option to the 5.1, 5.2, and 6.0 trees on
K> Launchpad. This will provide a clear record of the push history in the
K> repository, at the cost of enforcing the "good practise" merge style with one
K> extra bzr step.
K> Any opinions? Reasons not to do this?
I like the end result of this approach, but I have got a couple of questions/concerns about this:
- Doing the extra branch is a bit of a pain and slows things down if you push a lot. It would be nice to get this done internally in bzr as part of the merge. Should we talk with Canonical to get this option into bzr?
Yes, this is the main pain point. There are a couple of things that can help with this:
1. If the bzr people were to implement `bzr pull --reverse`, then there would be no additional step, so the problem would be solved (as long as one remembers to use the --reverse). The merge to be done is exactly the same one way or the other; the only difference is which tree becomes the primary parent and which becomes the secondary.
2. I suppose most people (like me) in any case keep a branch around of the main trees that we pull into regularly. This can be used to do the merge, avoiding the need for an extra `bzr branch`. But then if someone else were to push to the main branch in-between, the clean local branch would have to be re-created, of course.
3. I often use bzr with `bzr branch --no-tree` branches and separate `bzr checkout --lightweight` working trees. This sometimes greatly speeds up operations, since one can use `bzr switch` to avoid having to make bzr check out a complete working tree (which it does really slowly). But it has its own hassles, due to the need to keep track of more directories, so it is no silver bullet.
With --lightweight checkouts, one can do something like this, all the steps of which run in a few seconds (on my machine at least):
    (cd ~/repo && bzr branch --no-tree mariadb-5.1 mariadb-5.1-merge)
    bzr switch ~/repo/mariadb-5.1-merge
    bzr merge ~/repo/branch-with-patch
    # compile, test
    bzr push lp:maria
- What is the sequence to follow if someone pushes between step 3) and step 4):
1) bzr branch lp:maria # or bzr pull lp:maria into an existing clean clone
2) bzr merge ../branch-with-patch
3) bzr commit -m"merged with trunk"
Someone else pushes here.
4) bzr push lp:maria
Doing the 'merge' again would be a real pain.
The issue is really the same no matter which of the two methods you use. If someone else pushes in-between, you need to merge again.
I assume one should in this case start again from step 1) and do the merge with the new tree?
Yes. One would need a new fresh branch of the now updated lp:maria tree, and merge into that. One could merge in either the original branch-with-patch or the branch with the first merge, depending on whatever works best (using the original will save a merge changeset, but the line of primary parent commits will be the same in the resulting push either way).
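The restart-from-step-1 procedure amounts to an optimistic retry loop, which can be sketched as follows. This is a toy model under stated assumptions: `Trunk` only counts pushes and accepts a push when it was merged on top of the current tip, `push_with_retry` is a hypothetical helper, and the merge and commit steps themselves are not modelled.

```python
# Toy model of "branch fresh trunk, merge, push, retry if trunk moved".

class Trunk:
    """Stand-in for lp:maria that only tracks how many pushes have landed."""

    def __init__(self):
        self.tip = 0

    def push(self, based_on):
        # Accept the push only if it was merged on top of the current tip.
        if based_on != self.tip:
            return False
        self.tip += 1
        return True


def push_with_retry(trunk, interfering_pushes, max_attempts=5):
    """Return the number of attempts needed to land one merge."""
    for attempt in range(1, max_attempts + 1):
        base = trunk.tip  # step 1: fresh branch of the main tree
        # steps 2 and 3 (merge the patch, commit) are not modelled here
        if interfering_pushes > 0:
            # Someone else pushes between our commit and our push.
            interfering_pushes -= 1
            trunk.push(trunk.tip)
        if trunk.push(base):  # step 4: push back to the main tree
            return attempt
    raise RuntimeError("trunk kept moving; giving up")


trunk = Trunk()
attempts = push_with_retry(trunk, interfering_pushes=1)
print(attempts, trunk.tip)  # 2 2
```

The model shows the cost Monty asks about: each interfering push costs one extra round of steps 1 to 4, whichever tree ends up as the primary parent.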
K> If there is general agreement I will add the option (the process is somewhat
K> inconvenient, but I tested a procedure using sftp and that works ok).
I discussed this with Monty on IRC. I will implement it later this week in our Launchpad trees (and if there turn out to be unforeseen problems with it, it will be easy enough to remove it again).
Monty will also talk to Jani to get this method into autopush (in the autopush case this should be no extra hassle at all for the person pushing).
- Kristian.
participants (3)
- Kristian Nielsen
- Michael Widenius
- Oleksandr "Sanja" Byelkin