On 4/18/23 1:18 PM, Nick Guenther wrote:
April 18, 2023 3:37 AM, "Peter Krempa"
<pkrempa(a)redhat.com> wrote:
> cases of code style not being aligned from what libvirt does normally ...
I'm very happy to conform my style as needed. I just want my users to be able to use
libvirt (if they can't I'll teach them to use qemu directly I suppose but
that's annoying). I looked for style guidelines in
https://libvirt.org/hacking.html
but I didn't find them. The style robot did nag me at
https://gitlab.com/kousu/libvirt/-/pipelines/840365326 though, so I will try my best to
learn from it!
> This will not work in a multi-threaded process, which can potentially
> call this function multiple times at the same time for starting multiple
> VMs. One of the threads changes the directory of the process, second thread
> changes it again and then one of the threads creates the socket in the
> wrong directory.
I didn't think about multithreading. I'll look around the codebase to learn what
locking API you're using.
> Also any failure here doesn't restore the directory.
Ah true, that's obvious in retrospect. My python is showing. I'll fix that, but
maybe after discussing the better ideas below.
April 18, 2023 4:12 AM, "Daniel P. Berrangé" <berrange(a)redhat.com>
wrote:
> On Tue, Apr 18, 2023 at 02:59:26AM -0400, Nick Guenther wrote:
>
>> The qemu driver creates IPC sockets using absolute paths,
>> but under POSIX socket paths are constrained pretty tightly.
>> On systems with homedirs on an unusual mount point, like
>> network homedirs, or just particularly long usernames, this
>> could make starting VMs under qemu:///session impossible.
>>
>> Resolves
https://gitlab.com/libvirt/libvirt/-/issues/466
>
> Example path from that bug
>
>
/home/WAVES.EXAMPLE.ORG/u123456/.config/libvirt/qemu/channel/target/domain-11-alpinelinux3.14/org.qe
> u.guest_agent.0
>
> IMHO the key problem here is not your $HOME location, but rather
> libvirt being ridiculously verbose in the nested dir structuion
>
> Looking at the pieces
>
> * /home/WAVES.EXAMPLE.ORG/u123456 -> 31 chars
>
> $HOME, we can't change this
>
> * /.config/libvirt/qemu -> 21 chars
>
> Libvirt's config directory scope to the driver, we don't
> want to change this
>
> * /channel/target/domain-11-alpinelinux3.14 => 42 chars
>
> Arbitrarily inventing nesting for the sockets, we can
> change at will
>
> * /org.qemu.guest_agent.0 -> 23 chars
>
> Either user specified, or matches the port name, ideally
> don't change this
>
> So we've got 31 + 21 + 23 == 75 chars we don't want to /can't
> change, but that leaves 33 to play with
>
> I feel like having 'domain-' in the path is redudant as
> everything under /channel is for a domain. Having 'target'
> also feels redundant to me.
>
> We could use '/channel/tgt-11-alpinelinux3.14' == 31 chars
> or just /channel/11-alpinelinux3.14 == 27 chars, which
> gives extra scope for a longer $HOME.
>
I agree that these paths are a bit excessive and reducing them would help. I'm all
for that.
But any patch that fiddles with lengths while still using $HOME is not a long-term
solution,
a sysadmin can and someone always will set a longer home directory. And in those cases
the
end-user is out of luck, because they can't control where their home directory is.
Plus the user could also replace ".config" by setting XDG_CONFIG_DIR,
True, but to be fair XDG_CONFIG_DIR (which is retrieved vit
virGetUserConfigDirectory() --> g_get_user_config_dir()) shouldn't have
been used for the user agent socket in the first place; should have been
XDG_RUNTIME_DIR (virGetRuntimeDirectory() --> g_get_user_runtime_dir())
instead.
Also, if you change XDG_CONFIG_DIR, btw, then any other path based on
virGetUserConfigDirectory() (see src/*/*_conf.c) also will change.
Hopefully there's nothing in that path that's actual config, and needs
to be there and not in /var/run/user/$UID/libvirt/* (I think not, but
I'm just thinking of as many potential pitfalls as possible, since
that's what I'm best at :-))
lengthen the port name
(org.qemu.guest_agent.0) by specifying it, or the user-specified and the name of the
domain-specific subfolder (domain-11-alpinelinux3.14) is user-specified (partly: libvirt
clips it to a maximum length, experimentally, 30?).
These are more nuisances than outright bugs because, being user-controlled, the user
could, in theory, figure out how to shorten them if they wanted to use libvirt. But it
would be nicer if we didn't make them fight libvirt.
Instead of fighting with $HOME, what about using /run? That's where sockets usually
go. ~/.config is a strange place to be sticking live sockets -- they aren't
configuration at all.
If we moved everything to /run/user/$UID/libvirt/qemu/ then as $UID is a 32 bit integer
it's <= 10 decimal digits which makes this path length <= 34
Note that changing the location of any runtime file/socket of libvirt
will likely lead to problems during a libvirt upgrade, unless the path
for every one of those objects is stored in the domain status XML - if
libvirt gets updated to use a different path while there are active
domains (or networks), then the new libvirt binary will attempt to use
the new path assuming the file/socket/whatever is in the new location,
but it will still be in the old location. In order to change the path
for any runtime object without requiring a total shutdown for an upgrade
(which is something that we *never* want to do), we also need to store
the full path for every affected file/socket in the
domain/network(/storage?/other?) status XML.
Experimentally the "/channel/target/domain-11-alpinelinux3.14" part is clipped
to 36 chars (maybe 38? I haven't tested what happens with hundreds of VMs, presumably
they get named "domain-2424-alpinelinux3.14")
And as you said "org.qemu.guest_agent.0" is always 23.
So in total that's <= 93 chars. That's still tight, pretty close to the 107
limit (or 104 on macOS), but feasible.
In fact libvirt is already using this folder for tracking qemu's PIDs:
/run/user/703204575/libvirt/qemu
└── run
├── abcdefg.pid
├── abcdefg.xml
├── abcdef.pid
├── abcdef.xml
├── abcde.pid
├── abcde.xml
├── abcd.pid
├── abcd.xml
├── a.pid
├── autostarted
├── a.xml
├── dbus
├── driver.pid
└── slirp
3 directories, 12 files
We could move the rest in there and fix the bug. I'd also like to incorporate your
suggestion of dropping the "target" subfolder, and maybe picking either just
`domain-$N` or `$name` instead of `domain-$N-$name`.
Once we overcome the upgrade problem, I think there's a *lot* of excess
in many paths. For example, cfg->stateDir is
"/var/run/user/$UID/qemu/run" - the 2nd "run" is completely
superfluous.
Also, cfg->slirpDir is /var/run/user/$UID/qemu/run/slirp, and
passtStateDir and dbusStateDir are similarly repetitive (and the
associated files/sockets for each of those all have slirp/passt/dbus in
the name anyway, so *two* levels of the path are repetitively redundant).
But again, it's not as simple as just changing the path in the code - we
need to make sure that when upgrading a live system from the "old paths"
to the "new paths" that everything works properly without requiring
either the host or the guests to be restarted.