Re: [libvirt] Redesigning Libvirt: Adopting use of a safe language

Thursday, 16 November 2017

On 11/14/2017 12:27 PM, Daniel P. Berrange wrote:
...
 The Problem(s)
 ============== 
First off I'll state I'm not opposed to considering adopting or
integrating a newer language. Still, I have my concerns, fears,
uncertainty, and doubts. In a world where one must "adapt or die", I'm
open to seeing Go contributions, but I also know there's a learning
curve involved in forcing myself to learn a new language, especially
one that's touted as being C-like but, at least at first glance, isn't
what I've been used to for a long time.

...

 When libvirt was created, C was the only viable choice for anything aiming to be
 a core system library component. At that time, 2005, the common choices aside
 from C were Java, Python and Perl. Java was way too heavy for a low level
 system component, Python was becoming popular but not widely used for low level
 system services and Perl was on a downward trend. None of them are accessible to
 arbitrary languages as libraries, without providing an RPC based API service. As
 it turns out libvirt did end up having an RPC based approach for many virt drivers,
 but the original approach was to be a pure library component.

 IOW it is understandable why C was chosen back in 2005, but 12 years on the world
 around us has changed significantly. It has long been accepted that C is a very
 challenging language in which to write "safe" applications. By "safe" I mean
 avoiding the many problems that lead to critical security bugs. In particular
 the lack of a safe memory management framework leads to memory leaks, double
 frees, stack or heap corruption and more. The lack of strict type safety just
 compounds these problems. We've got many tools to help us in this area, and at
 times have tried to design our APIs to avoid problems, but there's no getting
 away from the fact that even the best programmers will continually screw up
 memory management leading to crashes & security flaws. It is just a fact of life
 when using C, particularly if you want to be fast at accepting new feature
 proposals.

 It is no surprise that there have been no new mainstream programming languages in
 years (decades) which provide an inherently unsafe memory management framework.
 Even back in 2005 security was a serious challenge, but in the last 10+ years
 the situation has only got worse, with countless high profile security bugs a
 direct result of the choice to use C. Given the threats faced today, one has to
 seriously consider the wisdom of writing any new system software in C. In another
 10 years time, it would not surprise me if any system software still using C is
 considered an obsolete relic, and ripe for a rewrite in a memory safe language.

Programming languages come and (well) go - it just so happens that C has
been a survivor. There have always been challengers promising something,
but eventually falling by the wayside when either they fail to deliver
on their promise or the next sexy language comes along. In my 30 years
(eeks), even with all its warts, C has been there. It was certainly better
than writing in assembly/macro or BLISS. I recall converting a lot of
BLISS to C when I first started.

There is something to be said about the "devil you know" vs. the one you
don't! Just as much as there is a need to keep yourself "current" with
technology trends. The latter becomes harder to do the longer I do this.

...
 There are long term implications for the potential pool of contributors in the
 future. There has always been a limited pool of programmers able to do a good job
 in C, compared to those who know higher level languages like Python/Java. A
 programmer can write bad code in any language, but in C/C++ that bad code quickly
 turns into a serious problem. Libvirt has done ok despite this, but I feel our
 level of contribution, particularly "drive by" patch submissions, is held back
 by use of C. Move forward another 10 years, and while C will certainly exist, I
 struggle to imagine the talent pool being larger. On the contrary I would expect
 it to shrink, certainly in relative terms, and possibly in absolute terms, as
 other new languages take C's place for low level systems programming. 10 years
 ago, Docker would have been written in C, but they took the sensible decision to
 pick Go instead. This is happening everywhere I look, and if not Go, then Rust.
I'm not convinced that "drive by" patch submissions are those we seek.
As stated, libvirt is a fairly complex project. I would think drive by
submissions lead to more problems regardless of the language chosen
because a reviewer spends so much of his/her valuable time trying to
assist the new contributor only to eventually learn that it is a drive
by. Then those that are committed to the project are left to decide to
drop the drive by submission or support it for years to come. Invariably
there's some integration interaction that was missed.

I would hope our long term goal would be to build up not only contributors,
but more importantly reviewers. Again, it doesn't matter what language is
chosen: since libvirt has review requirements, it needs reviewers.
If Go is a language from which to draw new contributors and, more
importantly, reviewers, then great.

With respect to the limited pool of developers able to do a good job
in C - by flipping a switch to Go, what kind of confidence level do you
have that the new wealth of talent will have the necessary skills/experience
and/or desire to understand the nuances that do exist for a project like
libvirt, and in particular the complicated libvirtd problem to be solved?
Maybe it's a bit of 'bias' and terminology, but I've always thought
there is a difference between a programmer and a software engineer. My FUD
is that we attract too many of the former and not enough of the latter,
who are necessary to solve that complex issue.

There are certain "things" you learn through years of trial and error
that perhaps are "less important" at the application level. It seems
today the theory is if an App crashes - so what, restart it. That's not
an option for library, daemon, or OS development. If a daemon crashes,
oh crap... host crashes, oh double crap. Once you do this long enough
you get involved in many aspects of OS, daemon, and library code such as
timing, threads, inter-process communication, locking, fd/socket mgmt,
backdoor hooks, etc. Does GO make those less relevant or just shift the
onus to learn the language and its limitations and quirks? Yes, I
understand C is callable from it, but if the long term goal is C
independence, then we ought to weigh and understand the risks before
jumping into the ocean.

...

 We push up against the boundaries of what's sane to do in C in other ways too.
 For portability across operating systems, we have to rely on GNULIB to try
 to sanitize the platform inconsistencies where we use POSIX, and assume that
 any 3rd party libraries we use have done likewise.

 Even then, we've tried to avoid using the platform APIs because their designs
 are often too unsafe to risk using directly (strcat, malloc, free), or are not
 thread safe (APIs lacking _r variants). So we build our own custom C platform
 library on top of the base POSIX system, re-inventing the same wheel that every
 other project written in C invents. Every time we have to do work at the core C
 platform level, it is diverting time away from doing work managing higher
 level concepts.

 Our code is following an object oriented design in many areas, but such a notion
 is foreign to C, so we have to bolt a poor man's OO framework on the side. This
 feeds back into the memory safety problem, because our OO invention cannot be
 type checked reliably at compile time, making it easy to do unsafe things with
 objects. It relies on reference counting because there's no automatic memory
 management.

 The other big trend of the past 10 years has been the increase in CPU core
 counts. My first libvirt dev machine had 1 physical CPU with no cores or threads
 or NUMA. My current libvirt dev machine has 2 CPUs, each with 6 cores, for 12
 logical CPUs. Common server machines have 32/64 logical CPUs, and high end has
 100's of CPUs. In 10 years, we'll see high end machines with 1000's of CPUs and
 entry level with a mere 100's. IOW good concurrency is going to be key for any
 scalable application. Libvirt is actually doing reasonably well in this respect
 via our heavily threaded libvirtd daemon. It is not without cost though with
 ever more complex threading & locking models, which still have scalability
 problems. Part of the problem is that, despite Linux having very low overhead
 thread spawning, threads still consume non-trivial resources, so we try to
 constrain how many we use, which forces an M:N relationship between jobs we need
 to process and threads we have available.

So Go's process/thread model is then lightweight? What did they learn
that the rest of us ought to know? Or is this just a continuation of the
libvirtd discussion?
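
On the "lightweight" question: a goroutine is a user-space task with a small,
growable stack that the Go runtime multiplexes onto a limited number of OS
threads, so the M:N mapping of jobs to threads is done by the runtime scheduler
rather than by application code. A minimal sketch of that, where the
handleJob function and the job count are made up purely for illustration:

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// handleJob is a hypothetical stand-in for one logical unit of work.
func handleJob(id int) int {
	return id * 2
}

// runJobs starts one goroutine per job and returns the sum of the results.
func runJobs(n int) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			r := handleJob(id)
			mu.Lock() // total is shared, so updates are serialized
			total += r
			mu.Unlock()
		}(i)
	}
	wg.Wait()
	return total
}

func main() {
	// GOMAXPROCS(0) reports, without changing it, how many OS threads
	// may execute Go code simultaneously.
	fmt.Println("threads running Go code at once:", runtime.GOMAXPROCS(0))
	fmt.Println("sum over 10000 jobs:", runJobs(10000))
}
```

This says nothing about how such a model would behave under libvirtd's real
workload; it only shows that the job count and the thread count are decoupled.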

Still it seems the pendulum has swung back to hardware and software
needs to catch up. It used to be quantum leaps in processor speed as it
related to chip size/density - now it's just leaps in the ability to
partition/thread at the chip level. I'd hate to tell you about the boat
anchor I had on my desktop when I first started!

...

 The Solution(s)
 ===============

 Two fairly recent languages, Go & Rust, have introduced new credible options for
 writing systems applications without sacrificing the performance of C, while
 achieving the kind of ease of use / speed of development seen with languages
 like Python. It goes without saying that both of them are memory safe languages,
 immediately solving the biggest risk of using C / C++.

If memory mgmt and security flaws are the driving force to convert to
GO, then can it be claimed unequivocally that GO will be the panacea to
solve all those problems? Even the best intentions don't always work out
the best. If, as pointed out in someone else's response, there have been
CVEs from/for Go centric apps - how many of those are Go related and
how many are App related? Not that it matters, but the point is we're
shifting some amount of risk for timely fixes elsewhere and shifting the
backwards compatibility story elsewhere, which could be the most
problematic. Not everyone has the same end goal for ABI/API
compatibility. Add to that the complexity of ensuring that a specific
version of some package you've based your product/reputation on.

Curious, is the performance rated vs. libc memory alloc/free or
something else? I don't recall ever being on a project that didn't have
some sort of way to "rewrite" the memory mgmt code. Whether it was shims
to handle project specific needs or usage of caches to avoid the awful
*alloc/free performance. Doing the GC is great, but what is the cost?
Perhaps that's something we won't know until we get further down that path.

...
 The particularly interesting & relevant innovation of Go is the concept of
 Goroutines for concurrent programming, which provide a hybrid kernel/userspace
 threading model. This lowers the overhead of concurrency to the point where you
 can consider spawning a new goroutine for each logical job. For example, instead
 of having a single thread or limited pool of threads servicing all QEMU monitor
 sockets & API clients, you can afford to have a new goroutine dedicated to each
 monitor socket and API client. That has the potential to dramatically simplify
 use of concurrency while at the same time allowing the code to make even better
 use of CPUs with massive core counts.
Sounds promising and complicated, but is the risk of libvirt discovering
some flaw or limitation in goroutines worth it? IOW: Would libvirt be
blazing a new trail, or are there other consumers that have "helped" work
through the initial issues?
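
For reference, the goroutine-per-socket structure described in the quoted text
could look roughly like the sketch below. It uses in-memory net.Pipe
connections in place of real monitor sockets, and the line-based "ack"
protocol, serveConn and askAll are all invented for the example; none of it
is libvirt or QEMU code.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"strings"
	"sync"
)

// serveConn handles exactly one connection: it reads newline-terminated
// requests and answers each one until the peer closes the connection.
func serveConn(conn net.Conn, wg *sync.WaitGroup) {
	defer wg.Done()
	defer conn.Close()
	sc := bufio.NewScanner(conn)
	for sc.Scan() {
		fmt.Fprintf(conn, "ack %s\n", strings.ToUpper(sc.Text()))
	}
}

// askAll opens one in-memory connection per request, dedicates a goroutine
// to each connection, and collects the replies in request order.
func askAll(reqs []string) []string {
	replies := make([]string, len(reqs))
	var servers, clients sync.WaitGroup
	for i, req := range reqs {
		client, server := net.Pipe()
		servers.Add(1)
		go serveConn(server, &servers) // one goroutine per connection

		clients.Add(1)
		go func(i int, req string, c net.Conn) {
			defer clients.Done()
			defer c.Close()
			fmt.Fprintf(c, "%s\n", req)
			line, _ := bufio.NewReader(c).ReadString('\n')
			replies[i] = strings.TrimSpace(line) // each goroutine owns its own slot
		}(i, req, client)
	}
	clients.Wait()
	servers.Wait()
	return replies
}

func main() {
	for _, r := range askAll([]string{"query-status", "query-cpus"}) {
		fmt.Println(r)
	}
}
```

Each handler is plain blocking code with no event loop or shared dispatch
state, which is the simplification being claimed; whether it holds up at
libvirtd scale is exactly the open question.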

...

 It of course provides a cross platform portable core library of features, and has
 a massive ecosystem of developers providing further 3rd party libraries for a
 wide variety of features. This means developers can focus more time on solving
 the interesting problems in their application space. The Go code is still low
 level enough that it can interface with C code easily. FFI calls to C APIs can be
 made inline in the Go code, with no need to switch out to write a low level
 binding in C itself. In many ways, Go can be said to have the ease of use, fast
 learning & safety of Python, combined with the expressiveness of C. IOW it is a
 better C than C.

But still requiring a learning curve to get through the nuances. I think
you may be underestimating the learning curve, but I could be wrong. It
would seem to be far more than a google search (as pointed out in a
different response). It would probably also include gaining an
understanding of how whatever 3rd party library was chosen works (but maybe
that's just the trust factor).

If there are so many Go developers out there - one would hope there would
be a "swarm" willing to help convert existing projects from C to Go. ;-)

Oh, and license wise it would seem we'd have to be careful, true? At
least w/r/t attempting to utilize packages written or listed on the wiki
page link. From just a quick scan there, there seem to be numerous
"packages" available and some list different licenses.

Also, once chosen what happens if/when issues or incompatibilities are
discovered in some package? Do we follow the same principle of GNULIB
and try to fix it ourselves or somehow work around it? As I've learned
through time - "how" someone else fixes a problem may not work out best
and the degree of importance of the problem can result in delays in
getting a resolution. Having some amount of control is nice and we just
have to weigh the risk(s) of giving some of that away.

...
 I don't have direct experience in Rust, but it has the same kind of benefits over
 C as Go does, again without the downsides of languages like Python or Java. There
 are some interesting unique features to Rust that can be important to some apps.
 In particular it does not use garbage collection, instead the user must still do
 manual memory management as you would with C/C++. This allows Rust to be used in
 performance critical cases where it is unacceptable to have a garbage collector
 run. Despite a requirement for manual allocation/deallocation, Rust still
 provides a safe memory model. This approach of avoiding abstractions which will
 introduce performance overhead is a theme of Rust. The cost of such an approach
 is that development has a higher learning curve and ongoing cost in Rust, as
 compared to Go. 

 I don't believe that the unique features of Rust, over Go, are important to the
 needs of libvirt. eg while for QEMU it would be critical to not have a GC
 doing asynchronous memory deallocation, this is not at all important to libvirt.
 In fact precisely the opposite, libvirt would benefit much more from having GC
 take care of deallocation, letting developers focus attention on other areas. In
 general, aside from having a memory safe language, what libvirt would most benefit
 from is productivity gains & ease of contribution. This is the core competency
 of Go, and why it is the right choice for usage in libvirt.
Depends on the GC, right? Is GC context/scope based? or overall APP
based? There are certainly some particularly hairy uses of memory and
arguments in libvirt code.

...

 The obvious question / difficulty is deciding how to adopt usage of a new
 language, without throwing everything away and starting from scratch. It needs
 to be possible for contributors to continue working on every other aspect of the
 project while adoption takes place over the long term. Blocking ongoing feature
 work for prolonged periods of time is not acceptable.

Not an easy task because one way or another you're taking resources from
one pile to put on another pile. Throwing new resources at the problem
isn't necessarily the solution either because they need to "learn the
environment".

...
 There is also a question of scope of the work. A possible target would be to aim
 for 100% elimination of C in N years time (for a value of N that is certainly
 greater than 5, possibly as much as 10). There is a question of just whether that
 is a good use of resources, and even practical. In terms of management of KVM
 guests the bulk of ongoing development work and complexity is in the libvirtd
 daemon. The libvirt.so library merely provides the remote driver client which is
 largely stable & unchanging. So with this in mind the biggest benefits would
 be in tackling the daemon part of the code where all the complexity lives.
N = ∞ (infinity ;-))

...

 As mentioned earlier, Go has a very effective FFI mechanism for calling C code
 from Go, and also allows Go code to be called from C. There are some caveats to
 be aware of with passing data between the languages, however. Generally it is
 necessary to copy data structures as C code is not permitted to dereference
 pointers that are owned by the Go GC system. There are two possible approaches
 to take, which can be crudely described as top down, or bottom up.

 In the top down approach, the C file providing the main() method gets replaced
 by a Go file providing an equivalent main() method, which then simply does an
 FFI call to the existing libvirt C APIs to run the code. For example it would
 just call virNetServer APIs to setup the RPC layer. Effectively have a Go program
 where 90% of the code is an FFI call to existing libvirt C code. Then we would
 gradually iterate downwards converting increasing areas of C code to Go code.

 In the bottom up approach, the program remains a C program, but we build .a files
 containing Go code for core pieces of functionality. The C code can thus call
 into this archive and end up executing Go code for certain pieces. Then we would
 gradually iterate upwards converting increasing areas of C code to Go code, until
 eventually reaching the top main() method.

 Or a hybrid of both approaches can be taken. Whichever way is chosen is going to
 be a long process with many bumps in the road.

 The best way to start, however, is probably to focus on a simple self-contained
 area of libvirt code. Specifically attack the virtlockd, and/or virtlogd daemons,
 converting them to use Go. This still need not be done in a "big bang". A first
 phase would be to develop the server side framework for handling our RPC protocol
 deserialization. This could then just dispatch RPC calls to the existing C impls.
 As a second phase, the RPC method impls would be converted to Go. Both of these
 daemons are small enough that the conversion would be possible across the time
 of a couple of releases. The hardest bit is likely ensuring compatibility for
 the re-exec() upgrade model they support, but this is nonetheless doable.
 The lessons learned in this would go a long way towards informing the best way
 to tackle the bigger task of the monolithic libvirtd (or equivalently the swarm
 of daemons the previous proposal suggests).
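
A first phase along those lines might be sketched as below: read one framed
message, decode a fixed header, and dispatch on the procedure number. The
framing used here (a 4-byte big-endian length word followed by six 32-bit
header fields) is an assumption about the wire format and should be checked
against the protocol definition in the source tree; the size cap, the
procedure number and the echo handler are hypothetical. In a real first
phase the handler body would be a call into the existing C implementation.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

// Header mirrors the assumed fixed-size message header (big-endian on the wire).
type Header struct {
	Program   uint32
	Version   uint32
	Procedure int32
	Type      int32
	Serial    uint32
	Status    int32
}

// Handler processes the payload of one call and returns a reply payload.
type Handler func(payload []byte) ([]byte, error)

const (
	headerLen = 4 + 24   // length word + six 32-bit header fields
	maxLen    = 16 << 20 // hypothetical cap to bound allocations
)

// readMessage reads one length-prefixed message and splits header from payload.
func readMessage(r io.Reader) (Header, []byte, error) {
	var h Header
	var length uint32
	if err := binary.Read(r, binary.BigEndian, &length); err != nil {
		return h, nil, err
	}
	if length < headerLen || length > maxLen {
		return h, nil, errors.New("invalid message length")
	}
	if err := binary.Read(r, binary.BigEndian, &h); err != nil {
		return h, nil, err
	}
	payload := make([]byte, length-headerLen)
	if _, err := io.ReadFull(r, payload); err != nil {
		return h, nil, err
	}
	return h, payload, nil
}

// decodeBytes is a convenience wrapper for decoding from a byte slice.
func decodeBytes(msg []byte) (Header, []byte, error) {
	return readMessage(bytes.NewReader(msg))
}

// dispatch looks up the handler for the procedure number and invokes it.
func dispatch(table map[int32]Handler, h Header, payload []byte) ([]byte, error) {
	fn, ok := table[h.Procedure]
	if !ok {
		return nil, fmt.Errorf("unknown procedure %d", h.Procedure)
	}
	return fn(payload)
}

// encode builds a framed message; used here only to drive the demo.
func encode(h Header, payload []byte) []byte {
	var buf bytes.Buffer
	binary.Write(&buf, binary.BigEndian, uint32(headerLen+len(payload)))
	binary.Write(&buf, binary.BigEndian, h)
	buf.Write(payload)
	return buf.Bytes()
}

func main() {
	table := map[int32]Handler{
		1: func(p []byte) ([]byte, error) { return append([]byte("echo:"), p...), nil },
	}
	msg := encode(Header{Program: 0x1234, Version: 1, Procedure: 1, Serial: 7}, []byte("hi"))
	h, payload, err := decodeBytes(msg)
	if err != nil {
		panic(err)
	}
	reply, err := dispatch(table, h, payload)
	fmt.Println(h.Serial, string(reply), err)
}
```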

It will take "someone", though, who knows Go and libvirt well enough to
start. At this time, I submit that pool of talent is quite limited. Not
necessarily GO contributors, but those that understand the libvirt build
system, how to mash things together, how to write good GO code, and what
types of considerations one has to make when developing at the OS,
daemon, and library level.

In the end I'm not sure I see a 'requirement' to switch to GO. It seems
more a 'strong desire' based primarily on the factors of GC,
availability of language packages (whether inherent or provided) and
some possibility that libvirt would attract more developers. It doesn't
seem like GO will "fix" something that cannot be resolved in C.

Thanks for the thought provoking topic and the new diversion!

John

...
 Regards,
 Daniel
