On 16.09.2019 22:44, Laine Stump wrote:
On 9/16/19 10:44 AM, Nikolay Shirokovskiy wrote:
> Hi, all
>
> I recently discovered that <target dev=''/> input is allowed[*]
So you're saying this previously failed? Catastrophically (SEGV)? Or did it log an
error and refuse to start the domain?
Without [*] domain start fails with internal error, something like "value too
long" caused by empty string behaviour of virStrncpy.
> by qemu driver's defineXML for <interface> element. On start it will
> turn into 'tap<N>'. At the same time empty value is not allowed by
> schema. This name is generated by kernel on linux.
Hmm, yeah I hadn't ever thought about it, but I guess that makes sense, since
that's the way the tunctl utility works by default (opens /dev/net/tun, then does a
TUNSETIFF ioctl with an empty string as the name of the device), and it get devices name
tap<N>.
>
> I wonder how to deal with this.
>
> 1. Fix schema to allow empty value for dev attribute. Additionally fix
> migrating to clear out this autogenerated name. This way we have 2
> options to autogenerate dev name. First as described above. Second
> is by ommiting target element. The naming will be 'vnet<N>' in the
> latter case. Having these 2 options looks redundant.
Yeah, this option is bad/undesirable.
>
> 2. Don't allow to start domains with such configuration. We can not
> fail definition because disappearing VMs on libvirt update looks
> too unpleasant.
1) (Nevermind - you answered below) How long has this worked? If it's *very* new, and
defining a domain with <target dev=''/> used to fail, then it's actually
a *good* thing to fail immediately.
2) If you put the check for present-but-empty target dev in virDomainNetDefValidate(),
then the validation will be done any time a new device is parsed, but specifically
*won't* be done during the domain parse when libvirtd starts up. This avoids the
"disappearing domains" problem you mention (and is specifically why the separate
validation functions were added - there is one hypervisor-agnostic function for each type
of device, and a separate function for the entire domain (all in conf/domain_conf.c), as
well as similar hypervisor-specific validation functions in qemu/qemu_domain.c (look for
the functions below qemuDomainDeviceDefValidate(), e.g. the function for network
interfaces is qemuDomainDeviceDefValidateNetwork(); no, don't ask me why the name uses
a different pattern than the hypervisor-agnostic functions!).
(You'll want to add this validation in the hypervisor-agnostic function, BTW.)
> This way we don't have two different set of
> autogenerated names. This can even go unnoticed given we fail
> such starts anyway after [1] (which is fixed by [*]).
So you're saying that <target dev=''/> has worked since
"forever" (as far as you know), until it stopped working in 4.6.0, and the patch
you posted this morning fixes it, right?
Exactly.
Nikolay
I vote for putting a check in virDomainNetDefValidate() that errors out if it finds
target dev present but empty. There may be some who would say "it's existing
behavior and has been like this for a long time, so we have to preserve it", but
since the schema doesn't allow it and we've never documented it, I think it's
reasonable to disallow it (except for, as you say, existing domain definitions at reboot
time; NB: there will still be an error if a domain containing this now-illegal setting is
ever edited).
> [1] is
> commited on 2018 Jul 23 and released in 4.6.0. However Centos 7
> is based on 4.5.0 and virNetDevTapCreate has logic to copy
> generated tap name from kernel like since the beginning (I managed
> to follow the history to 2011).
Yes, we rely on it to fill in the numbers of the vnet<N> devices. That way we let
the kernel prevent duplicate device names rather than needing to (attempt to) prevent
races between threads, as we have to do with macvtap devices (since macvtap doesn't
have this capability).
>
> Nikolay
>
> [*] at least with recent "virStrncpy: fix to successfully copy empty
string" applied
> [1] 7d70a63b util: Improve virStrncpy() implementation
>