[libvirt] [RFC] Accepting PRs/MRs for libvirt on GitHub/GitLab
As we look to make the libvirt project easier to contribute to, one fact that certainly comes to mind is that we only accept patches via the mailing list. While the core developers are comfortable with the email-based workflow and swear by it, many prospective contributors are used to the PR/MR workflow common to many Open Source projects these days, and similarly swear by it.

If we look at the PRs submitted on GitHub against the libvirt mirror repository[1], there's just three of them per year on average, which is arguably not a lot; however, it should be noted that each repository also carries a fairly loud "PULL REQUESTS ARE IGNORED" message right at the top, and it's not unlikely that a number of prospective contributors simply lost interest after seeing it.

As a way to make contributions easier without forcing core developers to give up their current workflow or making the project dependent on a third-party provider, I suggest we adopt a hybrid approach.

First of all, we'd remove the ominous message from GitHub mirror repositories (interestingly, the same is not present on GitLab).

Then, we'd start accepting PRs/MRs. The way we'd handle them is that, when one comes in, one among the handful of core developers who volunteered to do so would review the patches on the respective platform, working with the submitter to get it into shape just like they would do on the mailing list; once the series is finally ready to be merged, the core developer would update the PR/MR as necessary, for example picking up R-bs or fixing the kind of trivial issues that usually don't warrant a repost, and then push to master as usual.

More in detail: GitHub and GitLab have a feature that allows project members to update PRs/MRs: basically the way it works is that, if the PR/MR was created by the user 'foo' from their branch 'bar', the libvirt developer is allowed to (force-)push to the foo/libvirt/bar branch, and doing so updates the corresponding PR/MR; after that, if the updated branch is locally merged into master and master is pushed to the libvirt.org repo, once it gets mirrored GitHub/GitLab will recognize that the commit IDs match and automatically mark the PR/MR as merged. I have tested this behavior on both platforms (minus the mirroring part) with Martin's help.

One last important bit. In the spirit of not requiring core developers to alter their workflow, the developer who reviewed the patches on GitHub/GitLab will also be responsible for forwarding them to the mailing list after merging them: this way, even those who don't have a GitHub/GitLab account will get a chance to take notice of the code changes and weigh in. Unlike what we're used to, this feedback will come after the fact, but assuming that issues are spotted only at that point we can either push the relevant fixes as follow-up commits or even revert the series outright, so I don't feel like it would be a massive problem overall.

Thoughts?

[1] https://github.com/libvirt/libvirt/pulls?q=is:pr+is:closed

-- Andrea Bolognani / Red Hat / Virtualization
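To make the mechanism above concrete, here is a minimal sketch of the git commands the reviewing developer would run, using the hypothetical names from the example (user 'foo', branch 'bar'; the fork URL and local branch name are likewise illustrative, and the push only works if the submitter has enabled the option that lets project members push to the source branch of the PR/MR):

    # Fetch the contributor's branch from their fork (URL is illustrative)
    git remote add foo https://gitlab.com/foo/libvirt.git
    git fetch foo bar
    git checkout -b review foo/bar

    # ...pick up R-bs, squash trivial fixes, rebase as needed...

    # Update the contributor's PR/MR by (force-)pushing back to their branch
    git push --force foo review:bar

    # Merge locally and push to the canonical libvirt.org repository; once
    # the push is mirrored, GitHub/GitLab see matching commit IDs and mark
    # the PR/MR as merged
    git checkout master
    git merge --ff-only review
    git push origin master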
On Wed, Oct 16, 2019 at 02:22:56PM +0200, Andrea Bolognani wrote:
First of all, we'd remove the ominous message from GitHub mirror repositories (interestingly, the same is not present on GitLab).
Well if you're using GitLab then you're already aware of the fact that not everything is hosted on GitHub.
Then, we'd start accepting PRs/MRs. The way we'd handle them is that, when one comes in, one among the handful of core developers who volunteered to do so would review the patches on the respective platform, working with the submitter to get it into shape just like they would do on the mailing list; once the series is finally ready to be merged, the core developer would update the PR/MR as necessary, for example picking up R-bs or fixing the kind of trivial issues that usually don't warrant a repost, and then push to master as usual.
More in detail: GitHub and GitLab have a feature that allows project members to update PRs/MRs: basically the way it works is that, if the PR/MR was created by the user 'foo' from their branch 'bar', the libvirt developer is allowed to (force-)push to the foo/libvirt/bar branch, and doing so updates the corresponding PR/MR; after that, if the updated branch is locally merged into master and master is pushed to the libvirt.org repo, once it gets mirrored GitHub/GitLab will recognize that the commit IDs match and automatically mark the PR/MR as merged. I have tested this behavior on both platforms (minus the mirroring part) with Martin's help.
One last important bit. In the spirit of not requiring core developers to alter their workflow, the developer who reviewed the patches on GitHub/GitLab will also be responsible for forwarding them to the mailing list after merging them: this way, even those who don't have a GitHub/GitLab account will get a chance to take notice of the code changes and weigh in. Unlike what we're used to, this feedback will come after the fact, but assuming that issues are spotted only at that point we can either push the relevant fixes as follow-up commits or even revert the series outright, so I don't feel like it would be a massive problem overall.
Here it deviates from the usual mailing list workflow where the patch has (in theory) a chance to be seen by all the developers. But given that the requests will probably a) be close to trivial b) seen by a group of developers, not just one, sending it to the mailing list after it's pushed is better treatment than our language bindings get.

Jano
Thoughts?
[1] https://github.com/libvirt/libvirt/pulls?q=is:pr+is:closed -- Andrea Bolognani / Red Hat / Virtualization
On Wed, Oct 16, 2019 at 05:03:31PM +0200, Ján Tomko wrote:
On Wed, Oct 16, 2019 at 02:22:56PM +0200, Andrea Bolognani wrote:
First of all, we'd remove the ominous message from GitHub mirror repositories (interestingly, the same is not present on GitLab).
Well if you're using GitLab then you're already aware of the fact that not everything is hosted on GitHub.
Then, we'd start accepting PRs/MRs. The way we'd handle them is that, when one comes in, one among the handful of core developers who volunteered to do so would review the patches on the respective platform, working with the submitter to get it into shape just like they would do on the mailing list; once the series is finally ready to be merged, the core developer would update the PR/MR as necessary, for example picking up R-bs or fixing the kind of trivial issues that usually don't warrant a repost, and then push to master as usual.
More in detail: GitHub and GitLab have a feature that allows project members to update PRs/MRs: basically the way it works is that, if the PR/MR was created by the user 'foo' from their branch 'bar', the libvirt developer is allowed to (force-)push to the foo/libvirt/bar branch, and doing so updates the corresponding PR/MR; after that, if the updated branch is locally merged into master and master is pushed to the libvirt.org repo, once it gets mirrored GitHub/GitLab will recognize that the commit IDs match and automatically mark the PR/MR as merged. I have tested this behavior on both platforms (minus the mirroring part) with Martin's help.
One last important bit. In the spirit of not requiring core developers to alter their workflow, the developer who reviewed the patches on GitHub/GitLab will also be responsible for forwarding them to the mailing list after merging them: this way, even those who don't have a GitHub/GitLab account will get a chance to take notice of the code changes and weigh in. Unlike what we're used to, this feedback will come after the fact, but assuming that issues are spotted only at that point we can either push the relevant fixes as follow-up commits or even revert the series outright, so I don't feel like it would be a massive problem overall.
Here it deviates from the usual mailing list workflow where the patch has (in theory) a chance to be seen by all the developers.
But given that the requests will probably a) be close to trivial b) seen by a group of developers, not just one
I wouldn't expect the changes to be trivial. Current stuff is trivial largely because we tell people not to open merge requests. If we adopt use of web based review, then expect people to submit non-trivial patches. I would do so myself for example. Thus I think we must make a clean switchover from email to a single web based tool.
sending it to the mailing list after it's pushed is better treatment than our language bindings get.
The situation with our language bindings has never felt very comfortable to me. Except for Python, they rarely get posted for review. Using a web based tool for merge requests for language bindings would be a step forward for transparency in what we're submitting for them. If nothing else, it will ensure we can wire up CI to gate pushes to master to avoid broken builds.

Regards, Daniel

--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Wed, Oct 16, 2019 at 17:08:20 +0100, Daniel Berrange wrote:
On Wed, Oct 16, 2019 at 05:03:31PM +0200, Ján Tomko wrote:
On Wed, Oct 16, 2019 at 02:22:56PM +0200, Andrea Bolognani wrote:
[...]
Here it deviates from the usual mailing list workflow where the patch has (in theory) a chance to be seen by all the developers.
But given that the requests will probably a) be close to trivial b) seen by a group of developers, not just one
I wouldn't expect the changes to be trivial. Current stuff is trivial largely because we tell people not to open merge requests. If we adopt use of web based review, then expect
I'd still want the message we'll put out to encourage them to use e-mail.
people to submit non-trivial patches. I would do so myself for example. Thus I think we must make a clean switchover from email to a single web based tool.
I disagree. There is nothing really appealing to me in any of the web based frontends for git.

The user interface of them is designed to be flashy but that really hurts usability of git. We get cool icons but in return we must pay with always-online connection, loading bars if you click anywhere and the general necessity to interact with the browser which requires a lot of mousing around.

The commenting interface on individual patches is very poor given what email allows you and in many cases it's hard to access older versions after a pull-request is force-pushed.

Also we then get to the discussion whether to use the most popular web based git hosting site, which is the most terrible to use of them all, or one of the less popular and less bad ones.

Since this idea is sold as a way to attract outside contributors, it might lead to sacrificing usability for popularity ...
On Thu, Oct 17, 2019 at 07:57:46AM +0200, Peter Krempa wrote:
On Wed, Oct 16, 2019 at 17:08:20 +0100, Daniel Berrange wrote:
On Wed, Oct 16, 2019 at 05:03:31PM +0200, Ján Tomko wrote:
On Wed, Oct 16, 2019 at 02:22:56PM +0200, Andrea Bolognani wrote:
[...]
Here it deviates from the usual mailing list workflow where the patch has (in theory) a chance to be seen by all the developers.
But given that the requests will probably a) be close to trivial b) seen by a group of developers, not just one
I wouldn't expect the changes to be trivial. Current stuff is trivial largely because we tell people not to open merge requests. If we adopt use of web based review, then expect
I'd still want the message we'll put out to encourage them to use e-mail.
people to submit non-trivial patches. I would do so myself for example. Thus I think we must make a clean switchover from email to a single web based tool.
I disagree. There is nothing really appealing to me in any of the web based frontends for git.
The user interface of them is designed to be flashy but that really hurts usability of git. We get cool icons but in return we must pay with always-online connection, loading bars if you click anywhere and the general necessity to interact with the browser which requires a lot of mousing around.
The commenting interface on individual patches is very poor given what email allows you and in many cases it's hard to access older versions after a pull-request is force-pushed.
I believe that most of us agree on this point and if we have a tool that will bring the review process closer to the email workflow we can actually try using it.

Anyway, there is another email from Dan which summarizes it better and offers a reasonable approach which can answer some of the concerns you've raised in this reply, so I would suggest that we continue the discussion based on that email [1] to not have multiple threads.

Pavel

[1] <https://www.redhat.com/archives/libvir-list/2019-October/msg01034.html>
On Thu, Oct 17, 2019 at 08:55:35AM +0200, Pavel Hrdina wrote:
On Thu, Oct 17, 2019 at 07:57:46AM +0200, Peter Krempa wrote:
On Wed, Oct 16, 2019 at 17:08:20 +0100, Daniel Berrange wrote:
On Wed, Oct 16, 2019 at 05:03:31PM +0200, Ján Tomko wrote:
On Wed, Oct 16, 2019 at 02:22:56PM +0200, Andrea Bolognani wrote:
[...]
Here it deviates from the usual mailing list workflow where the patch has (in theory) a chance to be seen by all the developers.
But given that the requests will probably a) be close to trivial b) seen by a group of developers, not just one
I wouldn't expect the changes to be trivial. Current stuff is trivial largely because we tell people not to open merge requests. If we adopt use of web based review, then expect
I'd still want the message we'll put out to encourage them to use e-mail.
people to submit non-trivial patches. I would do so myself for example. Thus I think we must make a clean switchover from email to a single web based tool.
I disagree. There is nothing really appealing to me in any of the web based frontends for git.
The user interface of them is designed to be flashy but that really hurts usability of git. We get cool icons but in return we must pay with always-online connection, loading bars if you click anywhere and the general necessity to interact with the browser which requires a lot of mousing around.
The commenting interface on individual patches is very poor given what email allows you and in many cases it's hard to access older versions after a pull-request is force-pushed.
I believe that most of us agree on this point and if we have a tool that will bring the review process closer to the email workflow we can actually try using it.
Yes, the need for such a tool is the primary reason that I had not made this explicit suggestion to change to web based review yet for libvirt. I've been slowly trying to build something[1], but wanted to have a tool that actually does something useful before announcing it widely.

Regards, Daniel

[1] https://gitlab.com/bichon-project/bichon
    it doesn't do anything useful beyond displaying a list of PRs though, so don't get too excited.

--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
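For context, a console client like Bichon sits on top of GitLab's REST API. As a purely illustrative example of the kind of call such a tool builds on, the command below lists the open merge requests of a project; the project path is only an assumption, and the token is not needed for public projects:

    # List open merge requests via the GitLab v4 REST API
    curl --silent \
         --header "PRIVATE-TOKEN: $GITLAB_TOKEN" \
         "https://gitlab.com/api/v4/projects/libvirt%2Flibvirt/merge_requests?state=opened"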
On Wed, Oct 16, 2019 at 02:22:56PM +0200, Andrea Bolognani wrote:
As we look to make the libvirt project easier to contribute to, one fact that certainly comes to mind is that we only accept patches via the mailing list. While the core developers are comfortable with the email-based workflow and swear by it, many prospective contributors are used to the PR/MR workflow common to many Open Source projects these days, and similarly swear by it.
I still think email is a more productive workflow in many respects, but there's no denying the fact that the web based workflow has come to dominate the open source development world. Projects that are as large as, or larger than, libvirt are successfully using web based workflows, so it is not credible to claim it won't work for libvirt anymore.

The challenges are all around the human factors. We have 15 years of using an email workflow for libvirt, so that's what our current regular contributors are all used to, and have optimized our daily routine around. Changing people's daily routines is always disruptive and met with resistance.

If we're to reduce the on-ramp / learning curve to make libvirt more accessible for new contributors I think switching libvirt to use a web based tool is inevitable & desirable. Beyond that there are a number of ways it would benefit existing contributors too.

Over the past 5 years there have been many times when mailman has just given up and stopped sending mails to the libvirt list. Some of these outages have lasted as long as an entire week & been quite disruptive to us. Getting anyone to care to fix the outages is challenging as few other people are impacted to the same degree as libvirt is with this redhat.com mailman.

All too often patches from contributors get lost in the torrent. This can happen in web based tools too. The difference is that the web tools have much better facilities for identifying these missed patches & reporting on these problems, or helping organize patches to minimize missed stuff.
If we look at the PRs submitted on GitHub against the libvirt mirror repository[1], there's just three of them per year on average, which is arguably not a lot; however, it should be noted that each repository also carries a fairly loud "PULL REQUESTS ARE IGNORED" message right at the top, and it's not unlikely that a number of prospective contributors simply lost interest after seeing it.
Yep, that's certainly not an encouraging first impression.
As a way to make contributions easier without forcing core developers to give up their current workflow or making the project dependent on a third-party provider, I suggest we adopt a hybrid approach.
First of all, we'd remove the ominous message from GitHub mirror repositories (interestingly, the same is not present on GitLab).
That's an accident in GitLab config.
Then, we'd start accepting PRs/MRs. The way we'd handle them is that, when one comes in, one among the handful of core developers who volunteered to do so would review the patches on the respective platform, working with the submitter to get it into shape just like they would do on the mailing list; once the series is finally ready to be merged, the core developer would update the PR/MR as necessary, for example picking up R-bs or fixing the kind of trivial issues that usually don't warrant a repost, and then push to master as usual.
I really don't like this hybrid approach for several reasons.

Splitting attention between the web based tool and email list, with only a subset of people volunteering to use the web tool, is really not attractive. Anyone who only pays attention to one of the two places is going to miss what's being submitted until it has already been merged. I think it is a fundamental principle that whatever workflow we use for patch submission & review *must* be one that is universally used by everyone.

There's two tools being discussed here - both GitHub and GitLab. Splitting attention between email and a web based tool is bad, but splitting attention between email and two web based tools is even worse.

Finally I have general desire to *NOT* designate GitHub as an official part of the libvirt workflow on the basis that it is a closed source tool. Yes, we are already using it, but it is largely an ancillary service working as a passive backup service for our git repos, not a core part of our workflow. I don't want to elevate it to be a core part.
One last important bit. In the spirit of not requiring core developers to alter their workflow, the developer who reviewed the patches on GitHub/GitLab will also be responsible for forwarding them to the mailing list after merging them: this way, even those who don't have a GitHub/GitLab account will get a chance to take notice of the code changes and weigh in. Unlike what we're used to, this feedback will come after the fact, but assuming that issues are spotted only at that point we can either push the relevant fixes as follow-up commits or even revert the series outright, so I don't feel like it would be a massive problem overall.
If the only patches sent to the web tools are trivial patches, it is minor. For any non-trivial patches though, I think only seeing them after merge is a disaster. If we're going to officially use a web based tool we should expect that non-trivial patches will be sent that way.

I think we have to take a different approach to this problem.

As mentioned at the start, I think for the long term sustainability of the project, switching to a *single* web based tool for our patch submissions & review process is inevitable. Such a switch, however, *must* involve discontinuing use of email workflow so that we always have all contributors using the same tools & seeing the same patch submissions.

To me picking GitLab, either hosted on gitlab.com, or self-hosted, is the only viable option, given that GitHub is closed source. I'm not seriously suggesting we self-host, as I don't think we need quite that level of control, nor is it productive for us to play at being sysadmins. Having the option available though is good.

In adopting a web based tool we need to think about what this means for our development practice more generally though, as it has broader implications.

For example, currently we have a practice of adding Reviewed-by tags on patches. This is possible, but inconvenient, when using the web tools. It is arguably redundant, since the tools themselves record who commented, and who added approvals on the patch and who requested it be merged. Dropping use of R-b assumes that we 100% use the web tools and not email.

One of the things the web tools do well is fully integrating CI testing into the merge process. New submissions would typically get CI jobs run against them and only approved for merge once the CI has succeeded. This largely eliminates the problem of developers accidentally pushing changes to master that break the build. Again though this benefit is only seen if we discontinue use of email workflow.

Much of our existing CI setup should be easy to integrate into GitLab. Our current VMs that are used with Jenkins just need to have the Jenkins agent replaced with the GitLab runner agent. We can then drop Jenkins entirely and do everything through GitLab for CI[1].

Although I'm in favour of using the web based tools, I personally don't want to use a web UI for code review on a daily basis. My personal goal is to create a console based UI for code review with GitLab, so that my workflow would be largely unchanged from how it works with email in terms of user interface.

There may be other things that are important to our frequent contributors that we must consider before making a switch to web based tools, in order to reduce the disruption.

Ultimately though I think it is a mistake to try to find a way to 100% mirror everyone's existing email workflow with the web based tools. Setting such a goal is a way to ensure we would never switch away from email. With such a different paradigm between email & web based tools there is a period of guaranteed disruption. Accept that some things may be worse, but that this will be counterbalanced by other things being better. Some of the things which are worse (web based pointy clicky UI) can be mitigated. People are adaptable & will find ways to get what they need done. Picking an open source solution gives us the option to make contributions to improve it where it is lacking too.

To deal with inevitable disruption, I think we should switch our repositories in two stages. First switch all the add-on repos over to GitLab. This means the language bindings, the wiki/web repos, the CI repos, etc. These are all reasonably low traffic in terms of patch submission. Switching them gives us time to learn to deal with the web based tools, define our process, figure out the CI integration, identify any pain points we might want to mitigate, etc. Then, perhaps 6 months later, switch the main libvirt.git high traffic repo.

To reduce confusion, I might go as far as saying we should delete all our GitHub repositories. Keep the libvirt project namespace, have some message to refer people to GitLab.

There is a slight complication here wrt the Go language binding as the hosting service is part of the code namespace for imports. What we should have done is declare the namespace for the Go imports to be under libvirt.org. We can still in fact do that and submit MRs to projects to get them to switch their codebase, so that we don't end up with gitlab.com in the namespace instead, repeating the same mistake.

As for the git hosting on libvirt.org, we should reverse the mirroring process. e.g. GitLab is the master, and libvirt.org becomes the backup mirror.

Regards, Daniel

[1] We still have the macOS problem which we need Travis for. Sooner or later we need to either get a supportable macOS box to act as a runner, or drop macOS. I'm loath to do the latter though, since we do have people who appear to be using macOS.

--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
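To sketch what the GitLab CI side of this could look like: pipelines are driven by a .gitlab-ci.yml file at the top of the repository, and registered runners (the replacement for the Jenkins agents mentioned above) pick up the jobs. The snippet below is only a hypothetical illustration, not libvirt's actual CI configuration; the container image, package list and build steps are all assumptions:

    # .gitlab-ci.yml (illustrative sketch)
    stages:
      - build

    build-fedora:
      stage: build
      image: fedora:30
      script:
        - dnf install -y gcc make autoconf automake libtool libxml2-devel
        - ./autogen.sh
        - make
        - make check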
On Wed, 2019-10-16 at 17:03 +0100, Daniel P. Berrangé wrote:
There's two tools being discussed here - both GitHub and GitLab. Splitting attention between email and a web based tool is bad, but splitting attention between email and two web based tools is even worse.
Finally I have general desire to *NOT* designate GitHub as an official part of the libvirt workflow on the basis that it is a closed source tool. Yes, we are already using it, but it is largely an ancillary service working as a passive backup service for our git repos, not a core part of our workflow. I don't want to elevate it to be a core part.
I understand why you feel this way and mostly share your opinion on the matter, but from a pragmatic standpoint if our goal is to get more people involved in libvirt's development then cutting off GitHub is almost certainly the wrong way to go about it. Just from the data we already have: GitHub, with the scary "don't send PRs our way" warning at the top of the page, still got 13 PRs; GitLab, which doesn't have the warning, got exactly zero so far.
For example, currently we have a practice of adding Reviewed-by tags on patches. This is possible, but inconvenient, when using the web tools. It is arguably redundant, since the tools themselves record who commented, and who added approvals on the patch and who requested it be merged. Dropping use of R-b assumes that we 100% use the web tools and not email.
We should still record R-bs in git, because at the end of the day the git history is the only thing that we can reasonably expect to be able to migrate from provider to provider, or even from VCS to VCS, in a lossless fashion.
One of the things the web tools do well is fully integrating CI testing into the merge process. New submissions would typically get CI jobs run against them and only approved for merge once the CI has succeeded. This largely eliminates the problem of developers accidentally pushing changes to master that break the build. Again though this benefit is only seen if we discontinue use of email workflow. Much of our existing CI setup should be easy to integrate into GitLab. Our current VMs that are used with Jenkins just need to have the Jenkins agent replaced with the GitLab runner agent. We can then drop Jenkins entirely and do everything through GitLab for CI[1].
Looking at the repository containing the machine-executable description of our CI setup, these days the part that actually touches Jenkins is fairly small compared to the part dealing with bringing up and maintaining runners. Of course we'd be able to reuse pretty much all of it when moving to GitLab CI, but I just wanted to highlight the fact that dropping support for Jenkins will not be that much of a win in terms of maintenance.

Anyway, all of this talk about switching everything to a Web-based workflow is entirely based on the assumption that, if we do that, then we will start getting more contributions; however, we have *zero evidence* that it would actually achieve that result.

That's why I'm suggesting that we open up to contributions from GitHub/GitLab while not giving up our current workflow, and after a reasonable amount of time has passed look at the actual data and base our decisions on what to do going forward on that.

-- Andrea Bolognani / Red Hat / Virtualization
On Thu, Oct 17, 2019 at 09:40:08AM +0200, Andrea Bolognani wrote:
On Wed, 2019-10-16 at 17:03 +0100, Daniel P. Berrangé wrote:
There's two tools being discussed here - both GitHub and GitLab. Splitting attention between email and a web based tool is bad, but splitting attention between email and two web based tools is even worse.
Finally I have general desire to *NOT* designate GitHub as an official part of the libvirt workflow on the basis that it is a closed source tool. Yes, we are already using it, but it is largely an ancillary service working as a passive backup service for our git repos, not a core part of our workflow. I don't want to elevate it to be a core part.
I understand why you feel this way and mostly share your opinion on the matter, but from a pragmatic standpoint if our goal is to get more people involved in libvirt's development then cutting off GitHub is almost certainly the wrong way to go about it.
Just from the data we already have: GitHub, with the scary "don't send PRs our way" warning at the top of the page, still got 13 PRs; GitLab, which doesn't have the warning, got exactly zero so far.
Not that I crave being part of this discussion more deeply, but s/don't send PRs our way/please use gitlab for PRs/ could help.

Another thing we completely missed (which could also be more effective) is to have a PR template [1] which would say "please open MRs on gitlab instead" (or "we don't do GitHub" for the current state).

Another option is to have a bot reply to open PRs with that message designated for the specific user and then closing them right at the same time.

[1] https://help.github.com/en/articles/creating-a-pull-request-template-for-you...
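For what it's worth, GitHub picks up such a template from a .github/PULL_REQUEST_TEMPLATE.md file in the repository; the wording below is only a hypothetical sketch of what it could say under the current policy:

    <!-- .github/PULL_REQUEST_TEMPLATE.md -->
    Thank you for your interest in libvirt!

    This repository is a read-only mirror and pull requests opened here
    are not reviewed. Please send patches to the libvir-list mailing
    list (libvir-list@redhat.com) instead:

      https://www.redhat.com/mailman/listinfo/libvir-list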
On Thu, 2019-10-17 at 10:33 +0200, Martin Kletzander wrote:
Another thing we completely missed (which could also be more effective) is to have a PR template [1] which would say "please open MRs on gitlab instead" (or "we don't do GitHub" for the current state).
Yup, that's definitely a good idea regardless!
Another option is to have a bot reply to open PRs with that message designated for the specific user and then closing them right at the same time.
Implementing or hooking up a bot sounds like it would be a lot more work than just writing a short template, so I'd rather go the lazy way first O:-)

-- Andrea Bolognani / Red Hat / Virtualization
On Thu, Oct 17, 2019 at 09:40:08AM +0200, Andrea Bolognani wrote:
On Wed, 2019-10-16 at 17:03 +0100, Daniel P. Berrangé wrote:
There's two tools being discussed here - both GitHub and GitLab. Splitting attention between email and a web based tool is bad, but splitting attention between email and two web based tools is even worse.
Finally I have general desire to *NOT* designate GitHub as an official part of the libvirt workflow on the basis that it is a closed source tool. Yes, we are already using it, but it is largely an ancillary service working as a passive backup service for our git repos, not a core part of our workflow. I don't want to elevate it to be a core part.
I understand why you feel this way and mostly share your opinion on the matter, but from a pragmatic standpoint if our goal is to get more people involved in libvirt's development then cutting off GitHub is almost certainly the wrong way to go about it.
Attracting more contributors is not an absolute goal in isolation. It is one of many factors that need to be balanced. One of the factors is basing our development workflow on open source tools not closed source tools.
One of the things the web tools do well is fully integrating CI testing into the merge process. New submissions would typically get CI jobs run against them and only approved for merge once the CI has succeeded. This largely eliminates the problem of developers accidentally pushing changes to master that break the build. Again though this benefit is only seen if we discontinue use of email workflow. Much of our existing CI setup should be easy to integrate into GitLab. Our current VMs that are used with Jenkins just need to have the Jenkins agent replaced with the GitLab runner agent. We can then drop Jenkins entirely and do everything through GitLab for CI[1].
Looking at the repository containing the machine-executable description of our CI setup, these days the part that actually touches Jenkins is fairly small compared to the part dealing with bringing up and maintaining runners. Of course we'd be able to reuse pretty much all of it when moving to GitLab CI, but I just wanted to highlight the fact that dropping support for Jenkins will not be that much of a win in terms of maintenance.
I'm not really suggesting it's a win for maint of the CI systems; I doubt it will be better or worse, just different. The point of integrating with GitLab is to get the pre-merge checking of patches, instead of post-merge that we have now. IOW, it is a win for maint of the code being developed.
That's why I'm suggesting that we open up to contributions from GitHub/GitLab while not giving up our current workflow, and after a reasonable amount of time has passed look at the actual data and base our decisions on what to do going forward on that.
I don't think that will offer meaningful data, because it is setting it up as a second class citizen to the email workflow, and not taking advantage of the different features that GitLab offers. So we'll see the downsides of the new tool without experiencing the upsides.

Growing contributors in any meaningful amount is something that will likely take a long time to have an effect too - I can easily imagine 12 to 24 months, not something we can measure after 3-6 months IMHO. Even if it doesn't grow contributors I think it'll still be the right move to make long term, because it normalizes the libvirt development model with what's widely expected for open source projects these days, and because of other factors, such as the unreliability of our mailing list software, which impact us significantly.

Regards, Daniel

--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Thu, 2019-10-17 at 09:48 +0100, Daniel P. Berrangé wrote:
On Thu, Oct 17, 2019 at 09:40:08AM +0200, Andrea Bolognani wrote:
On Wed, 2019-10-16 at 17:03 +0100, Daniel P. Berrangé wrote:
The point of integrating with GitLab is to get the pre-merge checking of patches, instead of post-merge that we have now. IOW, it is a win for maint of the code being developed.
Personally I see pre-merge CI as a mere nice-to-have: as long as we keep enforcing the pre-release freeze period, fixing things before merging them or immediately afterwards with follow-up patches doesn't really make a lot of difference in my eyes. For testing stuff or reproducing CI issues locally we already have our 'make ci-*' targets, which thanks to their interactivity are IMHO much nicer to use than the GitLab CI / Travis CI workflow of push → wait for the build to finish → scroll through the logs and try to figure out what went wrong → change some code → go back to the beginning and repeat multiple times until you finally manage to get it right.
That's why I'm suggesting that we open up to contributions from GitHub/GitLab while not giving up our current workflow, and after a reasonable amount of time has passed look at the actual data and base our decisions on what to do going forward on that.
I don't think that will offer meaningful data, because it is setting it up as a second class citizen to the email workflow, and not taking advantage of the different features that GitLab offers. So we'll see the downsides of the new tool without experiencing the upsides.
Growing contributors in any meaningful amount is something that will likely take a long time to have an effect too - I can easily imagine 12 to 24 months, not something we can measure after 3-6 months IMHO. Even if it doesn't grow contributors I think it'll still be the right move to make long term, because it normalizes the libvirt development model with what's widely expected for open source projects these days, and because of other factors, such as the unreliability of our mailing list software, which impact us significantly.
If mailing list reliability were the problem, we could "simply" start running a mailman instance on libvirt.org and take ownership of our own mailing lists, no need to change workflows.

Anyway, we're clearly seeing different goals in the potential move to Web-based tools, so it's not surprising that we'd approach it in fairly different ways :)

To me the Web-based workflow is inferior to the mail-based one, this opinion being informed by using GitHub regularly for the past two months or so. I'm willing to take the hit if it can be proven that the drawbacks are compensated by an increased number of contributors, but otherwise I'd much rather keep things the way they are.

Once Bichon is usable perhaps I'll change my mind, but until then I don't think there's much of a chance of us agreeing.

-- Andrea Bolognani / Red Hat / Virtualization
On Thu, Oct 17, 2019 at 04:22:49PM +0200, Andrea Bolognani wrote:
To me the Web-based workflow is inferior to the mail-based one, this opinion being informed by using GitHub regularly for the past two months or so. I'm willing to take the hit if it can be proven that the drawbacks are compensated by an increased number of contributors, but otherwise I'd much rather keep things the way they are.
Once Bichon is usable perhaps I'll change my mind, but until then I don't think there's much of a chance of us agreeing.
I think it might be useful if we start a shared doc somewhere where any current contributors can describe what their current typical workflow is for dealing with libvirt code submission/review/etc. This could be valuable information to understand the ways in which any change in tools will impact them, and where likely benefits & downsides will lie.

Regards, Daniel

--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On 10/16/19 9:22 AM, Andrea Bolognani wrote:
As we look to make the libvirt project easier to contribute to, one fact that certainly comes to mind is that we only accept patches via the mailing list. While the core developers are comfortable with the email-based workflow and swear by it, many prospective contributors are used to the PR/MR workflow common to many Open Source projects these days, and similarly swear by it.
If we look at the PRs submitted on GitHub against the libvirt mirror repository[1], there's just three of them per year on average, which is arguably not a lot; however, it should be noted that each repository also carries a fairly loud "PULL REQUESTS ARE IGNORED" message right at the top, and it's not unlikely that a number of prospective contributors simply lost interest after seeing it.
As a way to make contributions easier without forcing core developers to give up their current workflow or making the project dependent on a third-party provider, I suggest we adopt a hybrid approach.
First of all, we'd remove the ominous message from GitHub mirror repositories (interestingly, the same is not present on GitLab).
Then, we'd start accepting PRs/MRs. The way we'd handle them is that, when one comes in, one among the handful of core developers who volunteered to do so would review the patches on the respective platform, working with the submitter to get it into shape just like they would do on the mailing list; once the series is finally ready to be merged, the core developer would update the PR/MR as necessary, for example picking up R-bs or fixing the kind of trivial issues that usually don't warrant a repost, and then push to master as usual.
This would mean that there will be patches that will get accepted without the ML being aware of them. The core developer (or the author themselves) would need to send the revised patches to the ML to make people aware of them before pushing to master, IMO.

I have 2 years of experience maintaining a now defunct project on GitHub (a KVM web management tool called Kimchi) in which we used a hybrid approach like the one I think you're suggesting, with a mailing list, people opening bugs on GitHub + GitHub PRs. It was annoying - people will inevitably discuss issues using the GitHub tracking system, which means that you'll have to subscribe via email to GitHub notifications if you don't want to miss out. It was common for people that used the ML to start discussions that were already happening in the Web, and vice-versa.

All that to make a case against Libvirt having additional ways of communication. I don't mind pull requests from GitHub/GitLab as long as the ML is made aware of them before the patches are accepted. But people bringing up discussions in the ML and GitHub/GitLab scales poorly, in my experience.

DHB
More in detail: GitHub and GitLab have a feature that allows project members to update PRs/MRs: basically the way it works is that, if the PR/MR was created by the user 'foo' from their branch 'bar', the libvirt developer is allowed to (force-)push to the foo/libvirt/bar branch, and doing so updates the corresponding PR/MR; after that, if the updated branch is locally merged into master and master is pushed to the libvirt.org repo, once it gets mirrored GitHub/GitLab will recognize that the commit IDs match and automatically mark the PR/MR as merged. I have tested this behavior on both platforms (minus the mirroring part) with Martin's help.
One last important bit. In the spirit of not requiring core developers to alter their workflow, the developer who reviewed the patches on GitHub/GitLab will also be responsible for forwarding them to the mailing list after merging them: this way, even those who don't have a GitHub/GitLab account will get a chance to take notice of the code changes and weigh in. Unlike what we're used to, this feedback will come after the fact, but assuming that issues are spotted only at that point we can either push the relevant fixes as follow-up commits or even revert the series outright, so I don't feel like it would be a massive problem overall.
Thoughts?
[1] https://github.com/libvirt/libvirt/pulls?q=is:pr+is:closed
On Wed, Oct 16, 2019 at 6:42 PM Daniel Henrique Barboza <danielhb413@gmail.com> wrote:
On 10/16/19 9:22 AM, Andrea Bolognani wrote:
As we look to make the libvirt project easier to contribute to, one fact that certainly comes to mind is that we only accept patches via the mailing list. While the core developers are comfortable with the email-based workflow and swear by it, many prospective contributors are used to the PR/MR workflow common to many Open Source projects these days, and similarly swear by it.
If we look at the PRs submitted on GitHub against the libvirt mirror repository[1], there's just three of them per year on average, which is arguably not a lot; however, it should be noted that each repository also carries a fairly loud "PULL REQUESTS ARE IGNORED" message right at the top, and it's not unlikely that a number of prospective contributors simply lost interest after seeing it.
As a way to make contributions easier without forcing core developers to give up their current workflow or making the project dependent on a third-party provider, I suggest we adopt a hybrid approach.
First of all, we'd remove the ominous message from GitHub mirror repositories (interestingly, the same is not present on GitLab).
Then, we'd start accepting PRs/MRs. The way we'd handle them is that, when one comes in, one among the handful of core developers who volunteered to do so would review the patches on the respective platform, working with the submitter to get it into shape just like they would do on the mailing list; once the series is finally ready to be merged, the core developer would update the PR/MR as necessary, for example picking up R-bs or fixing the kind of trivial issues that usually don't warrant a repost, and then push to master as usual.
This would mean that there will be patches that will get accepted without the ML being aware of them. The core developer (or the author themselves) would need to send the revised patches to the ML to make people aware of them before pushing to master, IMO.
The activemq mailing list has a bot or something installed which posts events like a new PR or a merged PR to their mailing list.

Best Regards, Roman
I have 2 years of experience maintaining a now defunct project on GitHub (a KVM web management tool called Kimchi) in which we used a hybrid approach like the one I think you're suggesting, with a mailing list, people opening bugs on GitHub + GitHub PRs. It was annoying - people will inevitably discuss issues using the GitHub tracking system, which means that you'll have to subscribe via email to GitHub notifications if you don't want to miss out. It was common for people that used the ML to start discussions that were already happening in the Web, and vice-versa.
All that to make a case against Libvirt having additional ways of communication. I don't mind pull requests from GitHub/GitLab as long as the ML is made aware of them before the patches are accepted. But people bringing up discussions in the ML and GitHub/GitLab scales poorly, in my experience.
DHB
More in detail: GitHub and GitLab have a feature that allows project members to update PRs/MRs: basically the way it works is that, if the PR/MR was created by the user 'foo' from their branch 'bar', the libvirt developer is allowed to (force-)push to the foo/libvirt/bar branch, and doing so updates the corresponding PR/MR; after that, if the updated branch is locally merged into master and master is pushed to the libvirt.org repo, once it gets mirrored GitHub/GitLab will recognize that the commit IDs match and automatically mark the PR/MR as merged. I have tested this behavior on both platforms (minus the mirroring part) with Martin's help.
One last important bit. In the spirit of not requiring core developers to alter their workflow, the developer who reviewed the patches on GitHub/GitLab will also be responsible for forwarding them to the mailing list after merging them: this way, even those who don't have a GitHub/GitLab account will get a chance to take notice of the code changes and weigh in. Unlike what we're used to, this feedback will come after the fact, but assuming that issues are spotted only at that point we can either push the relevant fixes as follow-up commits or even revert the series outright, so I don't feel like it would be a massive problem overall.
Thoughts?
[1] https://github.com/libvirt/libvirt/pulls?q=is:pr+is:closed
participants (8)
- Andrea Bolognani
- Daniel Henrique Barboza
- Daniel P. Berrangé
- Ján Tomko
- Martin Kletzander
- Pavel Hrdina
- Peter Krempa
- Roman Mohr