[PATCH 0/3] Fix use-after-free in block job reconnection code
by Peter Krempa
Peter Krempa (3):
qemu: migration: Don't use return value of qemuBlockJobUpdate
qemuBlockJobUpdate: Remove return value
qemuBlockJobRefreshJobs: Warn readers that 'job' may be invalid after
update
src/qemu/qemu_blockjob.c | 9 +++------
src/qemu/qemu_blockjob.h | 7 ++++---
src/qemu/qemu_migration.c | 17 +++++++----------
3 files changed, 14 insertions(+), 19 deletions(-)
--
2.24.1
Re: [PATCH 5/5] cpu: Introduce getHost support for ARM
by Daniel P. Berrangé
Re-adding libvir-list to the CC line.
On Mon, Mar 30, 2020 at 08:20:44PM +0800, Zhenyu Zheng wrote:
> Hi, yes, I think we can do that using inline assembly; I can check it out
> if you think it is a better solution.
> Do you have any suggestions for the features (cpu flags) part? It seems that
> ARM does not have a location/register
> that holds all the flags; it seems we have to query a lot of different
> registers to check for features, which could
> be quite messy.
Perhaps there is a way to record the location/register info in the XML
against each feature name, so that the code itself can stay simple and
just be driven from the metadata?
I'm not familiar enough with Arm to know how feasible this is though,
so will have to leave that to others to give an opinion.
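For illustration only (not code from this patch series): a minimal sketch of
the kind of inline-assembly register read discussed above, assuming AArch64
and a Linux kernel that traps and emulates userspace reads of the ID
registers. MIDR_EL1 encodes the implementer/variant/part/revision fields
shown in the /proc/cpuinfo output quoted below.

    #include <stdint.h>
    #include <stdio.h>

    /* Sketch: read the AArch64 Main ID Register (MIDR_EL1). Userspace
     * access only works where the kernel emulates MRS reads of the ID
     * registers; elsewhere this instruction would fault. */
    static uint64_t read_midr(void)
    {
        uint64_t midr;
        asm volatile("mrs %0, MIDR_EL1" : "=r" (midr));
        return midr;
    }

    int main(void)
    {
        uint64_t midr = read_midr();
        printf("CPU implementer : 0x%02x\n", (unsigned)((midr >> 24) & 0xff));
        printf("CPU variant     : 0x%x\n", (unsigned)((midr >> 20) & 0xf));
        printf("CPU part        : 0x%03x\n", (unsigned)((midr >> 4) & 0xfff));
        printf("CPU revision    : %u\n", (unsigned)(midr & 0xf));
        return 0;
    }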
>
> On Mon, Mar 30, 2020 at 8:01 PM Daniel P. Berrangé <berrange(a)redhat.com>
> wrote:
>
> > On Mon, Mar 30, 2020 at 07:32:36PM +0800, Zhenyu Zheng wrote:
> > > Hi Daniel,
> > >
> > > Thanks for the review and reply, my first implementation was going to
> > > gather data from /proc/cpuinfo, but unlike X86, we can only get this kind
> > > of info:
> > >
> > > processor : 0
> > > BogoMIPS : 200.00
> > > Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
> > > CPU implementer : 0x43
> > > CPU architecture: 8
> > > CPU variant : 0x1
> > > CPU part : 0x0a1
> > > CPU revision : 1
> > >
> > > so we have to perform some translation to produce human-readable
> > > information, and I mentioned that 'lscpu' has done that too. So Andrea
> > > Bolognani suggested that maybe we can use it directly, to avoid
> > > re-implementing the translation. Here is the discussion:
> > > https://www.redhat.com/archives/libvir-list/2020-March/msg00812.html
> >
> > On x86 we get the majority of info straight from calling the CPUID instruction,
> > not /proc/cpuinfo, and use our XML data files in src/cpu_map to translate
> > things into human readable names. I see you're adding XML data files
> > with names in the earlier patches. Is it possible to add the hex values
> > for the CPU implementer/architecture/variant/part to these XML files so
> > we can directly map them in libvirt, in the same way we do for x86?
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
[GSoC][RPC] Project Proposal: "Introducing Job control to the storage driver"
by Prathamesh Chavan
GSoC Proposal - 2020
====================
Project Name: Introducing Job control to the storage driver
===========================================================
About Me:
=========
Name: Prathamesh Chavan
University: Indian Institute of Technology Kharagpur
Major: Computer Science and Engineering (Dual Degree)
Email: pc44800(a)gmail.com
Blog: pratham-pc.github.io
Contact: +91-993-235-8333
Time Zone: IST (UTC +5:30)
Background:
===========
I am a final year dual degree (B.Tech & M.Tech) student from the Department
of Computer Science and Engineering at IIT Kharagpur. During my first year
of college, I got introduced to open source through Kharagpur Open Source
Society and later I became part of it.
As my master's thesis project, I'm working on a tiered file system with
Software Wear Management for NVM Technologies. I have always wanted to get
involved in storage technologies and their development process, and Google
Summer of Code is a great way to achieve that.
I was part of Google Summer of Code in my second year of college under
the Git organization. It was my first experience with a large codebase.
Information related to it is available in the GSoC 2017 archive[1].
Last summer, I also interned at Nutanix. I worked on Logbay, which is
a configuration-based data collection, archiving and streaming tool used by
all services on the Nutanix Controller-VM (CVM). I added multi-platform support
to Logbay by identifying all of its dependencies on the other CVM-based
services and introducing an interface between these dependencies and
Logbay-Core, allowing it to be used on platforms where they are not
available. I also implemented the interface on a Dev-VM and added
multi-port support so that multiple instances of Logbay can run on a
single Dev-VM to simulate a multi-node cluster, which lets developers
test their changes on the Dev-VM itself.
The Project:
============
Introducing job control to the storage driver
Summary: Implement abstract job control and use it to improve the storage driver.
Mentor: Pavel Hrdina
Abstract:
=========
Currently, libvirt supports job cancellation and progress reporting on domains.
That is, if there's a long-running job on a domain, e.g. migration, libvirt
reports how much data has already been transferred to the destination and how
much still needs to be transferred. However, libvirt lacks such progress
reporting in the storage area, which libvirt developers refer to as the storage
driver. The aim is to report progress on several storage tasks, like volume
wiping, file allocation and others.
Job Control in Domain:
======================
In src/qemu/qemu_domain.h, we can find the struct qemuDomainJobObj, which is
the job object for domains. It is used to coordinate between jobs, to identify
which API call owns the job object, and to hold additional information about
the normal job/agent job/async job. This qemuDomainJobObj is part of
another struct, qemuDomainObjPrivate, which is the object the driver's APIs
interact with when running jobs.
Whenever an API call is made, depending upon the type of the job, specific
locks are acquired and then the job is carried out. Exact design details
regarding the implementation of such APIs are present in
`src/qemu/THREADS.txt`.
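As a rough sketch of the pattern THREADS.txt describes (simplified and
condensed; the wrapper function name below is invented for illustration and
the exact signatures vary between libvirt releases):

    /* Simplified sketch of how a QEMU driver API acquires a job on a
     * domain object before touching its state, per src/qemu/THREADS.txt. */
    static int
    qemuDomainDoSomething(virQEMUDriverPtr driver, virDomainObjPtr vm)
    {
        int ret = -1;

        if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0)
            return -1;

        /* ... the actual work, possibly entering the QEMU monitor ... */
        ret = 0;

        qemuDomainObjEndJob(driver, vm);
        return ret;
    }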
Job Control in Storage:
=======================
Whenever an API call is made on a storage volume, the member `in_use` of the
struct `virStorageVolDef` is used as a reference counter. This allows us to
check whether the storage volume is already in use or not, and whether the
current API call can be carried out. Once the API call exits, the reference
counter is decremented.
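Roughly, the current scheme looks like the following simplified sketch (the
function name is invented and details are elided; the real checks live in the
storage driver code):

    /* Simplified sketch of the 'in_use' reference counting described
     * above: a plain counter on the volume definition, checked and
     * bumped while the pool object is locked. */
    static int
    storageVolDoWork(virStorageVolDefPtr vol)
    {
        if (vol->in_use) {
            virReportError(VIR_ERR_OPERATION_INVALID,
                           _("volume '%s' is still in use"), vol->name);
            return -1;
        }

        vol->in_use++;
        /* ... long-running work on the volume, e.g. wiping ... */
        vol->in_use--;
        return 0;
    }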
Reporting of job progress as done in the case of domains:
==========================================================
Additionally, when an async job is running, the job object also contains a
qemuDomainJobInfo, which stores the progress data of the async job, and
another qemuDomainJobInfo which stores the statistics of a recently
completed job.
Functions virDomainGetJobInfo() and virDomainGetJobStats(), present in
libvirt-domain.c, help extract information about the progress of a background
job on a domain.
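For example, a client can poll the progress of a running domain job via the
public API roughly like this (a minimal sketch; the connection URI and the
domain name "demo" are placeholders, and error handling is trimmed):

    #include <stdio.h>
    #include <libvirt/libvirt.h>

    /* Minimal sketch: query progress of a background job (e.g. an ongoing
     * migration) on a domain using the public job info API. */
    int main(void)
    {
        virConnectPtr conn = virConnectOpen("qemu:///system");
        virDomainPtr dom = conn ? virDomainLookupByName(conn, "demo") : NULL;
        virDomainJobInfo info;

        if (dom && virDomainGetJobInfo(dom, &info) == 0 &&
            info.type == VIR_DOMAIN_JOB_UNBOUNDED) {
            printf("processed %llu of %llu bytes\n",
                   info.dataProcessed, info.dataTotal);
        }

        if (dom)
            virDomainFree(dom);
        if (conn)
            virConnectClose(conn);
        return 0;
    }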
Plan to implement something similar in Storage Jobs:
====================================================
Firstly, it's important to bring the notion of jobs to the storage driver.
Right now the API calls are executed directly once the required mutex locks are
acquired. But this gives other API calls little information about what
is currently running or holding the locks. Further, the domain jobs
contain a lot of additional information which would also be useful for
the storage API calls.
The first step is to identify which API calls operate on storage volumes
in the storage driver, and to classify them into something similar to normal jobs
and async jobs (the long-running ones). Also, some of the API calls will not
acquire a job at all (the ones which do not change the reference counter).
After this, a document similar to src/qemu/THREADS.txt needs to be created for
storage job handling, describing the new design of the existing storage
APIs, which acquire jobs and appropriate locks as required.
New APIs need to be implemented for the creation and deletion of
storage jobs. These would be similar to the domain job API present in
qemu/qemu_domain.h, such as qemuDomainObjBeginJob(), etc. This specifically
also includes storage equivalents of virDomainGetJobInfo() and
virDomainGetJobStats(), which would be used by the long-running storage jobs
to report completion progress.
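A purely hypothetical sketch of what such an internal API could look like
(all names and signatures below are invented for illustration, mirroring the
domain job API; they do not exist in libvirt today):

    /* Hypothetical: job types for storage volumes, mirroring qemuDomainJob. */
    typedef enum {
        STORAGE_JOB_NONE = 0,
        STORAGE_JOB_MODIFY,      /* quick state changes */
        STORAGE_JOB_ASYNC,       /* long-running: wipe, allocation, ... */
    } virStorageVolJob;

    /* Hypothetical begin/end pair, analogous to qemuDomainObjBeginJob()
     * and qemuDomainObjEndJob(). */
    int virStorageVolObjBeginJob(virStorageVolObjPtr vol, virStorageVolJob job);
    void virStorageVolObjEndJob(virStorageVolObjPtr vol);

    /* Hypothetical storage counterpart of virDomainGetJobInfo(), letting
     * clients query completion progress of e.g. a volume wipe. */
    int virStorageVolGetJobInfo(virStorageVolPtr vol, virDomainJobInfoPtr info);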
The existing storage APIs need to be reimplemented with this new job notion, and
the reference counter member 'in_use' removed.
Other desired changes:
======================
1. Unification of the notion of jobs throughout: one step in this direction
would be to keep the storage job API implementation as close to the domain
job API as possible, so that later unification would be easier.
Unification itself, IMO, is left as future scope for the project.
2. Prediction of the time remaining for job completion, as progress for
various jobs is reported.
BiteSizedTask attempted:
========================
Clean up variables in tools/libvirt-guests.sh.in
Mentor: Martin Kletzander
A patch was floated on the mailing list and its latest version can be found
here[2].
Rest of the plans for the summers:
==================================
Due to the ongoing COVID-19 pandemic, my internship was canceled. Hence,
I'll be available full-time for the project. In August, I'll be joining
Nutanix as a full-time employee.
PS:
===
1. It was already very late by the time I decided to take part in GSoC'20. Hence, I
wasn't able to give the required amount of time to preparing this project's
proposal. If it is okay, I'd still like to keep updating this proposal
after the deadline and add a few important things, such as solid project
deliverables along with a timeline.
2. As Pavel Hrdina is the mentor of this project, and as this project was
suggested by Michal Privoznik, I've cc'd both of them on this email.
3. Since I still haven't spent enough time understanding the details of the
existing APIs, I might have gone wrong in a few places, and I would be glad
to have them pointed out.
4. One of the mentioned requirements according to libvirt's GSoC FAQ page[3]
is passing an interview. When does this interview typically take place, in the
GSoC timeline?
5. A google-doc of this proposal can be found here[4]. Comments on the
doc are welcome.
[1]: https://summerofcode.withgoogle.com/archive/2017/projects/5434523185577984/
[2]: https://www.redhat.com/archives/libvir-list/2020-March/msg01303.html
[3]: https://wiki.libvirt.org/page/Google_Summer_of_Code_FAQ
[4]: https://docs.google.com/document/d/1Js-yi1oNrl9fRhzvMJwBHyygYCAtr_q7Bby8k...
On the need to move to a merge request workflow
by Daniel P. Berrangé
We've discussed the idea of replacing our mailing list review workflow with
a merge request workflow in various places, over the last 6 months or so,
but never made a concrete decision.
I understand that some people will not like the idea of using merge requests.
It is ok to feel that way, however, we need to consider the bigger picture and
be open to learning to make the most of the opportunities, even if we don't
like certain things about the tools. We need to recognise that there are
millions of open source developers who are successfully delivering code using
the merge request workflow, and we are just as capable as them.
To claim that merge requests won't/can't work for libvirt, means there is
something unique about the libvirt project's needs, and/or all those other
developers are doing a worse job than libvirt developers, and/or libvirt
developers are incapable of using tools that everyone else copes with.
I don't believe any of those points are true.
While it is certainly possible to come up with a list of points why we might
like mailing lists, on aggregate merge requests will be the better solution.
There are examples of high profile projects moving to a git forge based
workflow and new projects use them as a first choice. If mailing lists were
truly the better approach, then there would be a notable flow of projects
moving away from merge requests, back to mailing lists. There is no evidence
of a trend towards mailing list workflows.
Short summary
=============
This has quickly become another very long email to read through, so I'll
attempt a summary
* The need to attract new contributors is critical to the long term
health of the project. Since 2016 the number of new contributors has
dropped 50% from 80-90 per year, down to just 40-45 in 2019.
* The current libvirt mailing lists are suffering from unreliable delivery.
Some contributors are unable to send. Other contributors are unable
to receive.
* The contributors affected by the problems have very limited ability to
resolve the problem they have, aside from opening tickets or changing
email providers entirely.
* A mail workflow relies on all contributors agreeing on the conventions
to use when sending & replying to messages. Many of these conventions are
unwritten and impose a burden on new contributors, discouraging their
participation.
* A mail workflow has many problems even for long term contributors, most
notably with keeping track of what mails are outstanding. Each person
has invented their own solution to deal with the lack of central tracking
and they are all flawed & duplicating time on triage.
* A merge request workflow has a lower barrier to participation for new
contributors to the project, by making libvirt use a commonly found workflow
for OSS projects.
* A merge request workflow is technically capable of supporting projects of
equal size, or greater, than libvirt which deliver features at an
equivalent speed. There is nothing special or unique about libvirts needs.
See KDE, GNOME, SystemD, Kubernetes, OpenShift, to name just a few
* Trying to directly match each aspect of a developer's current mail setup
to a merge request workflow is not practical. They are inherently different
tools with differing capabilities which will never map 1:1 in all cases.
* The largest barrier to a change is not technical, but the relative lack of
experience & familiarity with the tools. This can only be solved by using
them in a real world environment. A period of productivity disruption
is inevitable while learning to optimize use of the new tools.
* A merge request workflow is better for new or infrequent contributors,
and in the long term is certainly usable for frequent contributors.
Switching to merge requests is thus better on aggregate, and with
experience/hindsight may well turn out better for everyone.
With the last few points in mind in particular, the idea is to not do a big
bang switchover of everything at once, but instead split it into phases.
Initially the focus is to be on all the non-core projects, which means the
language bindings, application integrations, and supporting tools. After those
are all switched successfully, highlighting any areas of difficulty and
allowing for some learning/adaptation, the main libvirt.git would be switched
over. The non-core projects can be switched more or less immediately as their
review demands are much lower. It is anticipated that the timeframe for
everything would be on the order of 3-6 months, at the max.
The long story
==============
The measurement question
------------------------
The key motivation behind this proposal is that it will make libvirt a more
attractive project for new contributors. The quotes later illustrate real
world cases where our use of email workflow has hurt contributors. These are
just the cases we have become aware of from people who were motivated enough
to continue their effort despite the barriers faced. It is an unknown how many
more people faced similar barriers and just walked away without us knowing.
It has been suggested in the past that we run both workflows in parallel as a
way to test which is better. I don't believe such an approach will provide
valid data. Splitting attention between two different tools will be detrimental
to the effectiveness of both tools. It would be an unbalanced comparison
because there is no practical way to compare both tools with the same data set
unless we wish to review all patch series twice. In addition the timeframes
needed for a meaningful comparison would be too long, spanning 1+ years to
gather statistically significant data on contribution. If a merge request
workflow were a new, unproven concept, then it would none the less be worth
trying to do a comparison, however difficult. This is not the case though;
merge request workflows have been around many years & proven viable. There is
nothing special about the libvirt project or developers skills that would
invalidate this track record.
Thus this mail will look at various aspects of the tools and consider how they
impact our work, with a particular focus on issues which negatively impact new
contributors, or which have a discriminatory effect on certain subsets of new
or existing contributors.
Contributors decline
--------------------
Since 2016 libvirt has seen a significant decline in the number of new authors
contributing patches to the project. This is illustrated in the 3rd slide from
this KVM Forum presentation:
http://people.redhat.com/berrange/kvm-forum-2019/kvm-forum-2019-libvirt.pdf
For the data behind this graph, the commits to all GIT repos on libvirt.git
were analysed, taking the year recorded for the git changeset author date.
The email addresses were recorded each year, and a "new contributor" was
considered to be someone who had never made a commit in any previous calendar
year.
We can see the project grew quite fast until about 2011 and then levelled out
with a reasonably consistent number of long term vs new contributors. Since
2016, however, there has been a clear declining trend. The raw data shows that
for the last two years the non-core git repos in particular saw a big decline;
however, the same is also seen for libvirt.git itself.
There is likely no single factor responsible for this decline. A combination
of interest in containers instead of VMs, and maturity of the codebase likely
play a fairly large part. None the less, the decline of new contributors is
highly undesirable, whatever the reason, as there is still plenty of work that
needs doing in libvirt. Thus steps need to be taken to make libvirt more
attractive to new contributors. Lowering the barrier to entry for contributors
is one key aspect to this.
New contributor mistakes
------------------------
Some of the mistakes new users are liable to make when dealing with patch
submissions:
* HTML mail instead of plain text
* Mangled patches due to mail client breaking long lines
* Incorrectly threaded patch series
* Not labelling series with version numbers
* Sending plain ‘diff’ output instead of a git patch
* Not basing the patch on recent git master
* Not subscribing to the list before posting
* Corporate legal privacy / copyright signatures
* Unintelligible mail quoting in replies (Outlook)
We've tried to steer people towards git-publish, as that has encoded many of
the conventions to prevent mistakes, but that is by no means a foolproof
solution. It still requires setup and a working mail server, as well as
following git best practice wrt branch usage.
New contributor real experiences
--------------------------------
Some quotes on this theme that I've collected from just the last three months.
I'm removing the names as this isn't about the individuals involved, rather
their first / early impressions & experience of libvirt:
"I have sent the patches to the list. I had seen a while back
that eventually merge requests might be fielded, so I figured
it was worth a shot."
"just not familiar with manual patch work, normally been used
to gitlab, github. I installed the git-publish as per document,
but not sure about the next step from this tool. And, I am not
sure about how to connect my email with this tool either."
"I'm happy to hear you are considering a switch to gitlab.
I've never contributed to projects which are using email based
submission"
"Have you guys received an email from .... with the subject
..... I can't find it in the archives"
"I don't know what has happened, but we seem to be missing
this patch from the list.
...
I have all the patches in my inbox from the cc, so have no idea
what happened either. It was my first time using git-publish and
it took a few tries to get anything done."
There are a variety of reasons behind these problems, but the effects are
clear. Contributing to libvirt via email is complex and error prone.
Mailman lost delivery
---------------------
For Mailman, we are entirely dependent on Red Hat IT support via a private
ticketing system. The reliability of Mailman leaves a lot to be desired. For
the past 5 years or more there have been multiple periods when mail delivery
either stopped entirely, or randomly lost a subset of mails. When this happens
it often isn't obvious at first that we're missing mail, and once identified,
it has usually taken 2-3 days between opening a ticket and having someone
actively investigate and resolve it. In the worst case it was over 7 days with
broken mail delivery. This is not a good situation to be in, but we've
tolerated it as nothing was irretrievably lost, just delayed by days, and
every subscriber was equally affected.
No service is perfectly reliable and we have seen both GitLab and GitHub have
downtime periodically. A key difference though is how the problems are handled.
For the git forges, service downtime is a business critical problem, and so
gets the prioritized attention that it demands. Mailing list delivery for
libvirt is NOT considered a business critical problem for Red Hat, and so the
low prioritization of fixing mailman problems reflects this.
MX server content mangling
--------------------------
More recently we've faced worse problems with the mailing list. The new
service acting as redhat.com border MX gateway is mangling mails in a way that
breaks DKIM signatures. Thus any time someone sends a message with a DKIM
signature, there are a large number of subscribers to the list whose mail
server will bounce the delivery from mailman due to DKIM validation failures.
We tried to fix this by removing the subject prefix modification and body text
footer in mailman, but that merely reduced the frequency of the problem by
a small amount. In theory it can be mitigated by having mailman replace the
"From" field with the list server's address. qemu-devel did this originally
before they switched to removing the subject prefix/body footer. The reason it
worked for QEMU was because only mailman was mangling their messages, whereas
for libvirt, both mailman and the redhat.com MX gateway are mangling. The old
version of mailman on redhat.com doesn't appear to support From header
rewriting.
Similar to DKIM problems, there are the same problems with ARC headers, though
we lack definitive proof on the cause. The effect is that when any SUSE
contributor sends a mail, deliveries to all other SUSE subscribers will be
bounced. We believe it affects some other domains, but SUSE is the most obvious
one.
Both these problems are pretty serious and discriminate against certain
subscribers depending on how their mail service configures their incoming
filters. Even redhat.com was rejecting messages with broken DKIM signatures
for a time, until enough people complained about missing valid emails. Many
other contributors are not lucky enough to have an IT staff who would respond
to such complaints.
MX server anti-spam blackholing
-------------------------------
A further problem has impacted mails sent by one contributor, which have been
silently dropped by redhat.com MX servers, with no bounce at all. After two
months we tracked this down to a bogus hostname in the From field of the
messages which caused redhat.com to consider them a spammer spoofing the
sender address & so blackhole it. This is actually a case where I probably
agree with the redhat.com MX server policy, but it is none the less incredibly
unhelpful for the unlucky contributor(s) affected by it, as it is hard for
them to discover what they did wrong. We were lucky this contributor noticed
they had a problem and was very persistent over months in talking to us about
it rather than walking away, as I expect most people would.
Losing members through attrition
---------------------------------
Even without any of the above problems, the mailing list workflow still has
the problem of being yet another account for people to keep track of. Every
week I have to purge multiple subscribers from libvirt lists due to repeated
bounced delivery attempts. Most commonly this is when their MX server starts
rejecting messages claiming the address no longer exists (due to leaving a
job).
IOW, when someone has changed their job, and thus changed their email address,
they have to re-register with any projects they are using. In some cases this
is ok, as they may no longer care about the project after switching jobs, but
it means we lack any channel to attempt to keep their interest. If another
contributor tries to contact them via their old email address, they'll just
get bounces too. With Git Forges, users have a login identity that is distinct
from their email address, and thus they only need to update one place when
changing jobs, and interactions with other users are via this persistent
identifier.
Mail diagnosis / resolution
---------------------------
The difficulty in all of these cases is that the project contributors who are
affected have very limited means to either diagnose or resolve the problem.
They are dependent on raising tickets with their own IT admin staff, who might
then have to raise tickets with support from a company they outsourced to, and
or talk to libvirt mailman admins. The libvirt mailman admins are similarly
dependent on raising tickets with Red Hat IT and an external company. Depending
on the level of outsourcing, the problem can span across IT support staff from
4/5 different companies. This has a predictably bad effect on the chances of
resolving problems.
Mail lack of recovery
---------------------
Many of the problems result in permanent loss of messages to the subscriber
affected. Even if the problem was eventually solved, there is often no
straightforward way to get back the messages that were lost in the mean time.
Once the mail server has a hard bounce for that person, it will stop retrying
delivery. The only option is to visit the mailman archive page and download
the mbox and re-import messages.
This is a result of email being a push based communication mechanism, with
only a weak delivery assurance. If the push fails and isn't retried, the
information is gone for good.
Contrast with a Git Forge, where the primary communication mechanism is a pull
based model. If there is a transient problem with the contributor's network
connectivity, or the hosted service has an outage, the user will be unable to
pull information for a period of time. Once the problem is resolved, the next
attempt to pull information will include everything, with no loss from the
end user's POV.
Mail list providers
-------------------
Moving the libvirt mailing lists to another provider is not a magic bullet
solution. qemu-devel is hosted on nongnu.org and has suffered bounces due to
DKIM breakage, though at least their more modern mailman let them workaround
it better. nongnu.org though has had similar problems with mail delivery
getting stopped for periods of time, not as bad as libvirt suffered, but still
dependent on waiting until the US wakes up to get anyone to investigate. I
know other projects hosting lists on groups.io, which have also had serious
problems of late with mails getting silently discarded, or mangled such that
'git am' fails. There are other providers.
The impact on contributors
--------------------------
All this conspires to negatively impact the ability of people to participate
in the libvirt project. If you can't reliably send & receive patches to
libvir-list you are effectively excluded from the project, given that email is
the core of our workflow. It hasn't had a huge visible impact on libvirt,
primarily because the Red Hat contributions have escaped the worst effects.
This "luck" doesn't make me at all happy, because it is important that anyone
involved in the project has an equal ability to contribute, regardless of whether
they are Red Hat people or not. This is not the case.
There are many conceptual reasons to like a distributed messaging system based
on open standards. The reality of the implementation of email though is quite
different from the theory. The distributed nature of email, combined with the
plethora of implementations which don't agree on what the minimum acceptable
standards are, is the direct cause of the problems we have with reliable
operation of the libvirt list, and mail in general. The problems with email
don't seem to be getting better, if anything they are worse, as the broader
world increasingly shifts away from email, sadly often to closed protocols,
or closed applications.
Overall though, I struggle with the idea of continuing to recommend email as
the best choice for collaboration since it is clearly failing & excluding or
harming some of our contributors.
Mail usage for new contributors
-------------------------------
For those of us who have been involved in libvirt for any length of time, the
usage of email feels quite straightforward and natural. Our familiarity and
experience can be quite different from that of new contributors to the project.
The usage of email as a development workflow relies on a large number of,
often undocumented, conventions. If a contributor gets the conventions wrong
they'll have poor first experience with the project.
In the early days of libvirt those new contributors with existing open source
experience would already be familiar with email conventions, but we had plenty
of contributors from the corporate world for whom open source was a completely
new experience. Over the years we've sent nasty-gram responses to plenty of
contributors who missed the conventions, though thankfully we're getting more
polite and friendly these days in how we point out these kinds of mistakes. It
would be better to use a system that doesn't need this to the same degree.
These days open source is a well known development model even among corporate
developers from the closed source world, but their OSS knowledge is almost
always based on a GitForge workflow, not an email workflow.
Maintainer patch submission overload
------------------------------------
Any long term contributor to libvirt will have become familiar with the mail
based development workflow; however, long term usage and familiarity do not
imply that the solution used is the optimal one, rather that the chosen
solution has been optimized.
With the volume of patches submitted to libvirt there is always pressure on
reviewer bandwidth. The imbalance between incoming patches and reviewer time
is common across all major open source projects, regardless of development
workflow and tools. What matters is thus how well the project deals with the
overload, to prioritize the right aspects.
The efficiency with which developers can perform reviews is one factor and is
typically the area where email is said to be most beneficial. I personally find
email is efficient for writing comments inline to patches, but that is largely
due to the use of a text based user interface (mutt), rather than the inherent
use of email as a messaging platform. There are other aspects to the use of
email which harm our productivity as reviewers.
Personal patch tracking
-----------------------
To review a patch you first have to realize and/or remember that it exists,
and this is an area where email is quite poor. It is easy for patches to go
unattended for a prolonged period of time, drowned out in the volume of
patches on the list. We encourage users to send a "ping" if they think their
patch has been ignored for too long. This is a poor workaround for our
inability to keep track of what patches are pending. New contributors won't
even know they're expected to send a "ping", or may wonder how long they
should wait.
Essentially the mail folder needs to be thought of as a mixture of signal and
noise, where 'signal' refers to patches which need reviewer attention, and
'noise' refers to patches which are already processed in some way. The most
recent mails in a folder have a high signal component, turning increasingly
to noise as they age. The key task for email processing is thus to distinguish
signal from noise, such that we never miss any valid messages.
The supposed positive of using email is that contributors have the flexibility
to customize their way of working to enable them to keep track of pending work
in whatever way suits them best. People come up with clever mail filtering
practices / rules / scripts to organize their incoming mail stream. Instead of
considering this a benefit, we should be asking why our tools are so ill-suited
to the task that every contributor has to spend (waste) time reinventing the
wheel to distinguish signal from noise when processing email, duplicating
that same work across all contributors?
This has a particularly high burden on contributors who don't engage with the
project on a daily basis, as they get a firehose of emails, many of which are
already obsolete by the time they look at their inbox. IOW, infrequent
contributors experience high noise, low signal, when interacting with the
libvirt mailing list.
At first glance merge requests may feel like they are no better, since there
would be the same number of patches arriving each day. The key difference is
that there is good structure and metadata to the merge requests. Multiple
versions of the same series, appear as a single entity. Merge requests which
have been merged or abandoned no longer appear. Active merge requests can be
tagged with labels & users to organize them. Thus looking at the list of
merge requests gives you information with a high signal content, and little
noise content. This is good for both regular contributors and infrequent or
new contributors.
Patch tracking tools
--------------------
With every contributor organizing their own personal incoming stream, there is
no visibility / interaction between contributors on pending work. There is no
way of knowing whether any given patch series is on the to do list for another
person to deal with. There have been tools created which attempt to fill the
gap by providing a centralized view of incoming patches. To name just a few,
there are patchwork, patches, patchew, and patchecker. We have tried many of
the tools to some degree or another with libvirt, but none have had success.
Some are also in use in QEMU but see periodic breakage in consuming the mail
stream for a variety of reasons.
First and foremost, they suffer from the difficulty of turning free format
mail threads into well structured data. This can only be done reliably if all
contributors follow a set of well defined conventions for sending patches,
which immediately puts a burden on contributors, and is a barrier to new
contributors. These tools typically assume that all patches can apply against
git master and if this isn't possible drop them with an error. Contributors
rarely intentionally send patches against old versions of the code, but with
the rate of commits, git master can still conflict with a newly posted patch
series, requiring human intervention. Some support special magic comments in
the mail to indicate what git version the patch is based on, but this is more
secret sauce for contributors to learn.
The problem of patch conflict resolution also creates challenges for the tools
to determine when a series has been merged & can thus be considered complete.
Then there is the matter of correlating different versions of the same
patch series. This again largely relies on contributors following a convention
of using the same subject line for the leading patch / cover letter each time
it is posted. Some people post URLs pointing to previous versions of the same
series, but this is yet more manual work to work around the limits of email
when used for patch submission. Keeping a fixed subject does not always make
sense. This again makes it hard for the tools to determine whether a patch
series has been merged or obsoleted.
So overall, while it is possible to write tools to turn incoming patch series
emails into structured data, their reliability is poor, which in turn makes
it difficult to consider relying on them. These same problems are why it is
difficult to have a bi-directional email gateway between a mailing list and a
merge request system. Sending mails from a merge request is easy. Parsing
contextual replies and getting them back into the merge request is incredibly
hard, often impossible.
Attaching CI to submissions
---------------------------
Without an ability to reliably process incoming patches in an automated manner,
it is impractical to automate downstream tasks. It is generally considered
best practice to run tests on all code before it is accepted for merge. Libvirt
CI testing is currently all done post-merge. Contributors are generally
expected to make sure code compiles and passes tests before posting it. The
testing done this way rarely has the same breadth of coverage as that done by
post-merge CI and may be done against an outdated git master. Reviewers may
or may not take the time to actually apply the patches and run builds and
tests. Where this is done, the coverage is usually just one or two platforms,
a small subset of post-merge CI.
The result of this is the frequent build & test breakage of git master, which
has to be addressed through "brown paper bag" commits, mostly pushed without
review under a "build breaker" rule. This is a prudent way to fix the breakage,
however, it would be better if the breakage was prevented in the first place.
If automation can catch the build problems before merge, there ceases to be
any need for having special exceptions to allow pushing code without review.
The result will be a higher quality output overall, with less time wasted on
manually pointing out silly mistakes, or fixing them up after the fact.
We know there is insufficient reviewer time to handle all the incoming patches
submitted, and thus we should not expect them to spend time on tasks that can
be automated. Reviewers should never be building code or running tests in
general, or spending time pointing out problems in the code that can be caught
by the tests. The tools should do all testing the moment the code is submitted,
with reviewers only spending time once CI has passed & the submitter has had a
chance to re-upload any obvious fixes. The patchew tool attempts to do this
kind of automation, but as noted above it suffers from the hard problem of
reliably parsing email, determining what git master patches apply to, and needs
secret comment syntax to be used by contributors for cases where it can't do
the right thing.
Barriers to new contributors
----------------------------
The only real solution to contributor overload is to spread the work across
more people. This in turn implies a need to attract new contributors to the
project. There need to be interesting problems to attract their attention.
The task then is to turn them into regular contributors, and this is
where first impressions of a project are important. Simply put, the first
interaction needs to be a positive experience. This might be via a bug tracker
or via a code submission, or something else. Anything which may hurt first
impressions needs to be eliminated or mitigated as much as practical.
A need to learn unusual workflows, or configure new tools, or learn about
project specific (unwritten) conventions, are all things which stand in the
way of successful participation.
The key value of the merge request workflow is the standardization across
projects, such that it has immediate familiarity, enabling people to
contribute without having to read docs. This is a bit of an
oversimplification, since there is always some project specific knowledge required.
Ideally this knowledge will be primarily around the code and not on the
development process.
I've increasingly recognised this in my own interactions with projects. Since
the widespread adoption of Git, I find it a significant turn off to find a
project is still using CVS, Subversion, or any other non-Git SCM. Increasingly
I feel the same way about development workflow and the git forge. If I can
fork a project on GitLab/GitHub, and open a pull request, I'm more likely to
send a patch upstream, than if I have to register for a new account on a
custom bug tracker or mailing list. On the flip side, while one of my own
personal projects still has a mailing list, it is more enjoyable dealing with
contributors if they open a pull request, as I see the CI results and know the
patch can be applied with 1 click if desired. The more I use the tools, the
more effective they become for me.
Keeping new contributors interested
-----------------------------------
The response to a new contributor is also important. The earlier someone
engages with them, the better. With this in mind it is useful to be able to
quickly identify new contributors and prioritize a response to them over other
work. Automation (CI) can help in this respect by picking up & reporting
problems without needing human attention.
None the less, triage of incoming requests will be needed, primarily through
tagging to classify incoming requests, and nominating certain people as
reviewers. The power of the merge request model is that such triage is
centralized & thus visible to participants, as opposed to being repeated by
every individual when using email. People can see this triage and more
effectively direct their attention where it is needed.
The Web UI problem
------------------
The use of a web UI is often seen as the biggest downside to using a Git Forge,
especially amongst people who are used to an email based code review workflow.
This is a position I have great sympathy with, as I do largely prefer text
based tools with strong keyboard navigation, over graphical tools with strong
mouse navigation. The downsides of the web UI for merge requests becomes more
apparent the larger the change is, especially if split across multiple
patches.
While common, this viewpoint on the merits of text UI over graphical UI is not
universally held. The popularity of Git Forges amongst the open source world
demonstrates that many developers have no trouble with the web UI, or at least
are willing to accept it in order to benefit from the other features offered.
My preference for a text based UI has led me to spend time investigating the
API support for GitLab and from there developing a custom application for
interacting with GitLab merge requests.
https://gitlab.com/bichon-project/bichon
The goal of the tool is to provide a user interface for merge requests that
is familiar to users of the mutt email client. It is not especially advanced
in its implementation yet, and certainly has rough edges, but for me it has proved the
general point that we can build useful tools to integrate with GitLab, which
can mitigate the worst aspects of a Web UI.
The need to build such a tool, however, has a cost as it diverts time away
from working on libvirt itself, which negates some of the benefits of adopting
GitLab in the first place. It is hoped, however, that this is a relatively
short term cost that could ultimately be amortized across multiple developers
who use GitLab for their projects, while the benefits are repaid continuously
over the long term.
The review comment model
------------------------
Those familiar with an email based workflow using clients like mutt will often
cherish the deeply threaded discussion model. It is worth bearing in mind
that not all email clients will follow this, with many turning the threaded
discussion into a flat discussion.
The GitLab review comment system certainly doesn't offer the kind of threading
that mutt does, but it is a step above what many email clients do. Against any
single patch, it is possible to one or more standalone comments, and one or
more discussion threads, attached to specific lines of code. The distinction
between the two is whether replies are permitted. The threads are, however,
flat, not deeply nested.
There are two interesting and perhaps useful things to be aware of. First, each
discussion thread on a patch has a notion of whether it has been "resolved"
or not. It is possible to (optionally) have threads auto-resolve when a new
version of the patch series is posted to the merge request, or they can be
resolved manually by a contributor. This can be valuable in keeping track of
whether feedback on v1 was addressed in changes for v2, which is a challenge
with email review workflows.
The second interesting thing is that threads can be "rebased". This is both
useful and annoying at the same time. What appears to happen behind the scenes
is that the (commit hash, file, line num) that is associated with a comment
thread can be auto-updated to point to the latest commit that contains the
matching (file, line num). This is useful when a v2 is posted, as the comments
from v1 are moved to v2, helping identify if the problem was addressed. It is
annoying when dealing with patch series though, as it appears that a comment
created on patch 3 might be reported against patch 15 when later queried, if
patch 15 has context covering this same source file location. This makes some
sense when considering the web UI, because by default it will show the combined
diff of all patches in a series. It is less helpful for my bichon tool which
is trying to present a per-patch view. In theory it should be possible to
follow the diff context backwards to display the comment against every single
patch which has context for the same line of code. This would in fact make it
quite useful, but until that is done it is quite annoying.
More generally, comments have nice integration with the rest of the toolset
making it easy to refer to issue trackers, merge requests via magic markdown
extensions, and to tag individual contributors to get their opinion on a
question.
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
Re: GSoC'20 Interested Student: Adding support to Jailhouse Hypervisor
by Jan Kiszka
Hi Prakhar,
On 23.03.20 07:53, PRAKHAR BANSAL wrote:
> Hello All,
>
> My name is Prakhar Bansal and I am a graduate student in Computer
> Engineering at Iowa State University, US.
> I have experience with analysing the performance of applications running
> inside multiple virtual machines hosted by libvirt/QEMU-KVM through
> virt-manager.
>
> I am interested in working on the project to develop a Libvirt driver
> for the Jailhouse hypervisor. I looked into the initial attempt at a
> Jailhouse driver, which seems to be based on the Jailhouse command-line
> interface. I am currently looking into learning and understanding the
> kernel APIs for the Jailhouse hypervisor.
Thanks for your interest!
> I followed the below articles mentioned by Valentine Sinitsyn to begin
> learning about the Jailhouse hypervisor.
>
> https://lwn.net/Articles/578295/
> https://lwn.net/Articles/578852/
>
> I have a few questions regarding this project, please let me know if
> someone can help me out.
Sure, go ahead. Depending on the scope of the question, libvirt might be
the better community to ask. Therefore, I'm adding its list to this thread.
Jan
>
> Thanks & Regards,
> Prakhar Bansal
>
--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
[libvirt PATCH] travis: delete all Linux jobs
by Daniel P. Berrangé
The Fedora rawhide job started failing with the latest container build.
Since we now have broad CI coverage on GitLab CI, there's not a strong
reason to continue using Travis for Linux jobs. Deleting the redundant
jobs is a better use of time than trying to debug the failure.
Signed-off-by: Daniel P. Berrangé <berrange(a)redhat.com>
---
.travis.yml | 47 -----------------------------------------------
1 file changed, 47 deletions(-)
diff --git a/.travis.yml b/.travis.yml
index f400f76118..0548b28b01 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -18,53 +18,6 @@ addons:
matrix:
include:
- - services:
- - docker
- env:
- - IMAGE="ubuntu-1804"
- - MAKE_ARGS="syntax-check distcheck"
- script:
- - make -C ci/ ci-build@$IMAGE CI_MAKE_ARGS="$MAKE_ARGS"
- - services:
- - docker
- env:
- - IMAGE="centos-7"
- - MAKE_ARGS="syntax-check distcheck"
- script:
- - make -C ci/ ci-build@$IMAGE CI_MAKE_ARGS="$MAKE_ARGS"
- - services:
- - docker
- env:
- - IMAGE="debian-9"
- - MAKE_ARGS="syntax-check distcheck"
- script:
- - make -C ci/ ci-build@$IMAGE CI_MAKE_ARGS="$MAKE_ARGS"
- - services:
- - docker
- env:
- - IMAGE="fedora-31"
- - MAKE_ARGS="syntax-check distcheck"
- script:
- - make -C ci/ ci-build@$IMAGE CI_MAKE_ARGS="$MAKE_ARGS"
- - services:
- - docker
- env:
- - IMAGE="fedora-rawhide"
- - MAKE_ARGS="syntax-check distcheck"
- script:
- - make -C ci/ ci-build@$IMAGE CI_MAKE_ARGS="$MAKE_ARGS"
- - services:
- - docker
- env:
- - IMAGE="fedora-30-cross-mingw32"
- script:
- - make -C ci/ ci-build@$IMAGE
- - services:
- - docker
- env:
- - IMAGE="fedora-30-cross-mingw64"
- script:
- - make -C ci/ ci-build@$IMAGE
- compiler: clang
language: c
os: osx
--
2.24.1
[PATCH] Add blog for ARM Datacenter Project.
by Robert Foley
Our blog covers QEMU topics.
Signed-off-by: Robert Foley <robert.foley(a)linaro.org>
---
updater/virt-tools/config.ini | 3 +++
1 file changed, 3 insertions(+)
diff --git a/updater/virt-tools/config.ini b/updater/virt-tools/config.ini
index 9375c82..4e53029 100644
--- a/updater/virt-tools/config.ini
+++ b/updater/virt-tools/config.ini
@@ -197,3 +197,6 @@ faceheight = 96
[https://fidencio.org/tags/virt/index.xml]
name = Fabiano Fidêncio
+
+[https://futurewei-cloud.github.io/ARM-Datacenter/feed.xml]
+name = ARM Datacenter Project
--
2.17.1
[PATCH v2 00/11] Another round of qemu:///embed fixes
by Michal Privoznik
v2 of:
https://www.redhat.com/archives/libvir-list/2020-March/msg00992.html
Basically, a lot of the patches from v1 were ACKed, but there were some
suggestions which I put into separate commits and am now resending as
v2.
diff to v1:
- Renamed virDomainDriverHashRoot() to virDomainDriverGenerateRootHash()
- Generate hashed path only if the base (prefix) would point outside of
the root dir
- Use "qemu" instead of "QEMU" when generating hashed path
- New patches 2/11, 7/11
Michal Prívozník (11):
tests: Fix virQEMUDriverConfigNew() calling with respect to @root
qemu: Drop two layers of nesting of memoryBackingDir
conf: Move virDomainGenerateMachineName to hypervisor/
virDomainDriverGenerateMachineName: Factor out embed path hashing
qemu: Introduce virQEMUDriverGetEmbedRoot
qemuDomainGetMachineName: Access embeddedRoot from driver rather than
cfg
qemu: use virQEMUDriverGetEmbedRoot() to access embeddedRoot
Revert "qemu_conf: Track embed root dir"
qemu: Make hugepages path generation embed driver aware
qemu: Make memory path generation embed driver aware
qemu: Make auto dump path generation embed driver aware
src/conf/domain_conf.c | 72 ---------------
src/conf/domain_conf.h | 7 --
src/hypervisor/domain_driver.c | 88 +++++++++++++++++++
src/hypervisor/domain_driver.h | 11 +++
src/libvirt_private.syms | 3 +-
src/lxc/lxc_domain.c | 3 +-
src/qemu/qemu_command.c | 19 ++--
src/qemu/qemu_conf.c | 69 +++++++++------
src/qemu/qemu_conf.h | 23 +++--
src/qemu/qemu_domain.c | 9 +-
src/qemu/qemu_driver.c | 26 +++---
src/qemu/qemu_process.c | 9 +-
tests/domaincapstest.c | 2 +-
.../qemuxml2argvdata/cpu-numa-memshared.args | 8 +-
.../fd-memory-no-numa-topology.args | 2 +-
.../fd-memory-numa-topology.args | 4 +-
.../fd-memory-numa-topology2.args | 8 +-
.../fd-memory-numa-topology3.args | 12 +--
.../hugepages-memaccess2.args | 12 +--
.../qemuxml2argvdata/pages-dimm-discard.args | 4 +-
...vhost-user-fs-fd-memory.x86_64-latest.args | 2 +-
tests/testutilsqemu.c | 2 +-
tests/virsystemdtest.c | 5 +-
23 files changed, 217 insertions(+), 183 deletions(-)
--
2.24.1
[PATCH v2 0/4] add support for QEMU 9pfs 'multidevs' option
by Christian Schoenebeck
QEMU 4.2 added a new option 'multidevs' for 9pfs. The following patch adds
support for this new option to libvirt.
In short, what this is about: to distinguish files uniquely from each other,
numeric file IDs are typically used for comparison, which in
practice means the combination of a file's device ID and the file's inode
number. Unfortunately the 9p protocol's QID field used for this purpose is
currently too small to fit both the device ID and inode number, which
is a problem if one 9pfs export contains multiple devices and may
thus lead to misbehaviours on the guest (e.g. with SAMBA file servers) due
to potential file ID collisions.
To mitigate this problem with 9pfs, a 'multidevs' option was introduced in
QEMU 4.2 for defining how to deal with this; e.g. multidevs=remap will cause
QEMU's 9pfs implementation to remap all inodes from the host side to different
inode numbers on the guest side in a way that prevents file ID collisions.
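For reference, the host-side "file ID" the text refers to is simply the
(device ID, inode number) pair reported by stat(2); a minimal standalone
sketch (not part of the series):

    #include <stdio.h>
    #include <sys/stat.h>

    /* A file is uniquely identified on the host by (st_dev, st_ino).
     * 9p's QID path is a single 64-bit value, so it cannot hold both
     * fields without some remapping scheme such as multidevs=remap. */
    int main(int argc, char **argv)
    {
        struct stat st;

        if (argc < 2 || stat(argv[1], &st) < 0) {
            perror("stat");
            return 1;
        }
        printf("dev=%llu ino=%llu\n",
               (unsigned long long)st.st_dev,
               (unsigned long long)st.st_ino);
        return 0;
    }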
NOTE: In the libvirt docs changes of this libvirt patch I simply assumed
"since 6.2.0". So the final libvirt version number would need to be adjusted
in that text if necessary.
See QEMU discussion with following Message-ID for details:
8a2ffe17fda3a86b9a5a437e1245276881f1e235.1567680121.git.qemu_oss(a)crudebyte.com
v1->v2:
* Unrelated docs/formatdomain.html.in changes to separate patch.
[patch 1]
* Added new capability QEMU_CAPS_VIRTFS_MULTIDEVS.
[patch 2]
* XML changes as isolated patch.
[patch 3]
* Code style fix.
[patch 3]
* QEMU 'multidevs' command handling as isolated patch.
[patch 4]
* Error out if not QEMU_CAPS_VIRTFS_MULTIDEVS capability.
[patch 4]
* Error out on virtiofs (since it does not have the 'multidevs' option).
[patch 4]
TODO:
* Capabilities test cases would fail if <flag name='virtfs-multidevs'/>
was added to the other architectures' test case xml files, why?
[patch 2]
* The requested test cases to add: Sorry, the libvirt test case
environment is yet a mystery to me, I would not even know where to
start here.
Message-ID of v1: E1jEFpL-00028n-Qj(a)lizzy.crudebyte.com
Christian Schoenebeck (4):
docs: virtfs: add section separators
qemu: capabilities: add QEMU_CAPS_VIRTFS_MULTIDEVS
conf: add 'multidevs' option
qemu: add support for 'multidevs' option
docs/formatdomain.html.in | 47 ++++++++++++++++++-
docs/schemas/domaincommon.rng | 10 ++++
src/conf/domain_conf.c | 29 ++++++++++++
src/conf/domain_conf.h | 13 +++++
src/qemu/qemu_capabilities.c | 5 ++
src/qemu/qemu_capabilities.h | 1 +
src/qemu/qemu_command.c | 7 +++
src/qemu/qemu_domain.c | 12 +++++
.../caps_4.2.0.x86_64.xml | 1 +
.../caps_5.0.0.aarch64.xml | 1 +
.../caps_5.0.0.x86_64.xml | 1 +
11 files changed, 126 insertions(+), 1 deletion(-)
--
2.20.1
RFC: qemu: use uuid instead of name for misc filenames
by Nikolay Shirokovskiy
Hi, everyone.
I'm working on supporting domain renaming when the domain has snapshots, which
is not supported now. And it strikes me that things will be much simpler to
manage on renaming if we use the uuid in filenames instead of domain names.
1. Renaming will only involve saving the updated config (a minimal sketch of
the underlying pattern follows after this list).
The saving is atomic thanks to the tmp file and rename(2) approach. In contrast,
the current renaming on error paths can leave a config with the old or new name.
Thus on libvirt restart an extra VM will appear.
And we don't need to rename autostart links, snapshot directories etc.
2. Renaming will be possible for running domains with no effort.
We only need to pass the uuid instead of the name in the '-name guest=...' command line.
3. Mgmt can stop using autogenerated names for domains.
I guess openstack for example uses names like instance-000002ff because we
have many limitations on domain renaming. And if these limitations are removed
then openstack can just use user-supplied names for domains.
4. No issues with long domain names and the filename length limit.
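As mentioned under point 1, a minimal standalone sketch of the tmp file +
rename(2) pattern (libvirt's real config-saving helper is more elaborate; the
function name here is just for illustration):

    #include <stdio.h>
    #include <unistd.h>

    /* Write new content to a temporary file in the same directory, flush
     * and fsync it, then rename(2) it over the old file. Readers see
     * either the complete old config or the complete new one. */
    static int
    save_config(const char *path, const char *content)
    {
        char tmp[4096];
        FILE *fp;
        int ok;

        snprintf(tmp, sizeof(tmp), "%s.tmp", path);

        if (!(fp = fopen(tmp, "w")))
            return -1;

        ok = fputs(content, fp) != EOF &&
             fflush(fp) != EOF &&
             fsync(fileno(fp)) == 0;

        if (fclose(fp) == EOF)
            ok = 0;

        if (!ok || rename(tmp, path) < 0) {
            unlink(tmp);
            return -1;
        }
        return 0;
    }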
If the above conversion makes sense, I guess a good time to apply it is
on domain start (and on rename, to support renaming with snapshots).
I guess we can also have a tool (some virsh command) for developers to generate
symlinks so one can access logs, configs etc. by name instead of uuid.
Nikolay