On 04/15/2014 10:31 AM, Laine Stump wrote:
On 04/15/2014 03:27 PM, John Ferlan wrote:
>
> On 04/10/2014 09:19 AM, Laine Stump wrote:
>> This eliminates problems with upgrading from older libvirt that stores
>> network status in /var/lib/libvirt/network to newer libvirt that
>> stores it in /var/run/libvirt/network, by also reading any status
>> files from the old location, saving them to the new location, and
>> removing them from the old location.
>>
>> This will not help those trying to downgrade, but in practice this
>> will only be problematic in two cases
>>
>> 1) If there are networks with network-wide bandwidth limits configured
>> and in use by a guest during a downgrade to "old" libvirt. In this
>> case, the class ID's used for that network's tc rules, as well as
>> the currently in-use bandwidth "floor" will be forgotten
>>
>> 2) If someone does this: 1) upgrade, 2) downgrade, 3) modify running
>> state of network (e.g. add a static dhcp host, etc), 4) upgrade. In
>> this case, the modifications to the running network will be lost
>> (but not any persistent changes to the network's config).
>>
>> I have an idea of how to make these cases work properly as well
>> (involving symlinks in /var/lib pointing to /var/run) but want to
>> avoid keeping such an ugly legacy around (forever) unlesss others
>> think it is absolutely necessary to support flawless downgrades across
>> this line *while remaining in service*.
> I would have to believe/think the downgrade issue has happened before
> with some other change to config or status file locations.
Thinking about it, there was one similar situation - when sessionmode
libvirtd switched from using gnome2 config directories to gnome3. I
don't recall if anything special was done in that case, or if it was
just considered not important because ... session mode.
AFAIK, the run directory for the qemu, lxc, and network drivers hasn't
before been changed, so there's no precendent to follow there.
> Of course, if
> some state/status is meant to "live" beyond the current execution, then
> perhaps /run/ isn't the right place to store that. Wouldn't something
> like that end up in /etc/libvirt... somewhere? That is usage of
> '--live' vs. '--config' when configuring limits?
No, this is exactly what the qemu and lxc drivers do. Their persistent
config is stored in /etc/libvirt, and their runtime status is stored in
/var/run/libvirt. This is specifically so that when the host is
rebooted, stale status is automatically cleared out to avoid confusing
libvirtd when it is started the first time after a reboot. When libvirtd
is subsequently restarted, its pre-restart state for all domains and
networks *must* be maintained to avoid interruption of service, and
/var/run is just the right place to do that.
Right - perhaps I understated my thoughts... I was specifically
thinking/typing out loud that if someone made a change to their
configuration "live only", then downgraded - should they really expect
for that to be persistent when the network is restarted after the
downgrade? I guess I would think perhaps not. If they really wanted a
change to persist, then one would think (or assume - hah!) that they
would ensure their changes were persistent before they downgraded.
(As an aside, in the case of the network driver, much of the state of a
network is reconstructed when libvirtd is restarted rather than relying
on a correct statedir; this was mostly necessary due to the
unreliability of the state data in the past. The only reason I'm going
to all this trouble now is that I have a need for a piece of status that
can't be reconstructed from other host and domain information but can
only come the status file itself - the answer to the basic "is this
network active?" question.
Right - been down this path before with a previous product and yeah it's
ugly/painful. Of course I had a lot more history with that product so I
knew which rabbit holes to not jump into for fear of never coming back
out :-)... Still navigating my way through the libvirt mazes of twisty
little passages...
>
>> ---
>> src/network/bridge_driver.c | 40 ++++++++++++++++++++++++++++++++++++
>> src/network/bridge_driver_platform.h | 3 ++-
>> 2 files changed, 42 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c
>> index a027b47..98d10bd 100644
>> --- a/src/network/bridge_driver.c
>> +++ b/src/network/bridge_driver.c
>> @@ -459,6 +459,8 @@ networkStateInitialize(bool privileged,
>> SYSCONFDIR "/libvirt/qemu/networks/autostart")
< 0 ||
>> VIR_STRDUP(driverState->stateDir,
>> LOCALSTATEDIR "/run/libvirt/network") < 0
||
>> + VIR_STRDUP(driverState->oldStateDir,
>> + LOCALSTATEDIR "/lib/libvirt/network") < 0
||
> You'll need a corresponding VIR_FREE() in networkStateCleanup()
Oops!
>> VIR_STRDUP(driverState->pidDir,
>> LOCALSTATEDIR "/run/libvirt/network") < 0
||
>> VIR_STRDUP(driverState->dnsmasqStateDir,
>> @@ -498,6 +500,44 @@ networkStateInitialize(bool privileged,
>> /* if this fails now, it will be retried later with dnsmasqCapsRefresh() */
>> driverState->dnsmasqCaps = dnsmasqCapsNewFromBinary(DNSMASQ);
>>
>> + /* Due to a change in location of network state xml beginning in
>> + * libvirt 1.2.4, we must check for state files in two
>> + * locations. Anything found in the old location must be written
>> + * to the new location, then erased from the old location.
>> + */
> ahh... so this isn't done for the non privileged case...
Right. Not needed because we haven't changed the location of the
stateDir for non-privileged (and, as I said previously, the network
driver doesn't actually work for non-privileged mode, beyond simply not
crashing :-P)
> So if
> privileged and the network state exists in the old state dir... Also
> could use STRNEQ_NULLABLE() right?
No. STRNEQ_NULLABLE(a, b) will be TRUE in the case that b is NULL, but
we only want the body of the conditional executed if oldStateDir is
NON-NULL and it doesn't equal stateDir.
>
>> + if (driverState->oldStateDir &&
>> + STRNEQ(driverState->stateDir, driverState->oldStateDir)) {
>> + size_t i;
>> +
>> + /* read all */
>> + if (virNetworkLoadAllState(&driverState->networks,
>> + driverState->oldStateDir) < 0)
>> + goto error;
> So any/all failures to read the old entry/style causes failure... Could
> this be problematic if something ends up existing in the "new" format
> that isn't in the "old" format and thus causes exit by the changed
> parsing code?
Well, we haven't changed the grammar at all, only the location we're
reading from. And we know that, by definition, if we change the grammar
in the future, it will be backward compatible with the current grammar,
so there will be no failure caused by incompatible grammar during an
upgrade.
In the case that there is some other problem reading the status files
during upgrade, the reaction would be no different from what would
happen if there was a problem reading the status files when we *aren't*
upgrading - the network driver would fail to load, and the status files
would still exist in the oldStateDir, ready to a retry the next time we
start.
For downgrades, we're not trying to move the state files back to the old
location anyway, so I don't see any extra problem beyond the fact that
the status directory known by the old libvirt will be empty, so some
bandwidth state will be lost, and macvtap/hostdev networks will be
marked inactive (but that's already covered in the comments)
Yep - OK reasonable... again typing/thinking out loud while reading
> This is perhaps the opposite problem to flawless
> downgrades. It also points out the question of how long do we keep this
> old reading code around before it's removed?
In a theoretic sense, we should never get rid of it. But practically
speaking someday somebody may decide to eliminate it (when libvirt has
advanced so many versions that we're sure everyone is running on a
"post-epoch" version, or at least that they can't do a live upgrade from
a pre-epoch version to the newest version anyway).
This is one of those "policy" things - that is - does libvirt have time
policy for code to live... Perhaps you just need to set yourself a
little reminder that in libvirt 1.2.6 or somewhere thereafter you'll
come back and delete the code. It doesn't hurt to have it other than
being able to recall history.
>
>> +
>> + for (i = 0; i < driverState->networks.count; i++) {
>> + virNetworkObjPtr network = driverState->networks.objs[i];
>> + char *oldFile;
>> + int status;
>> +
>> + /* save in new location */
>> + virNetworkObjLock(network);
>> + status = virNetworkSaveStatus(driverState->stateDir, network);
>> + virNetworkObjUnlock(network);
>> + if (status < 0)
>> + goto cleanup;
>> +
>> + /* remove from old location */
>> + oldFile = virNetworkConfigFile(driverState->oldStateDir,
>> + network->def->name);
>> + if (!oldFile)
>> + goto cleanup;
>> + unlink(oldFile);
> Alternatively to removing - rename to some suffix which at least gives a
> chance for someone going through a downgrade process to recover the old
> file. Of course documenting this would be nice too.
In that case, the most recent status file will exist in the new stateDir
anyway, but if I renamed any file I found in oldStateDir and left it
there for potential re-use during a future downgrade, it might be
several *months* old and have survived multiple edits to the network +
reboots of the host. In the meantime, if we *don't* downgrade, we'll
have extra files cluttering up /var/lib/libvirt confusing people
essentially forever.
Much better to get rid of it as soon as possible.
right - but if there ever was a need to downgrade - the last known
version of something is available. I'm fine with deleting - just
providing the obvious alternative since no code other than a downgraded
environment would read it. Far more detrimental to keep cruft around.
In the end it comes down to which is preferred to support - the hassle
of having some shrapnel from a downgrade from someone that lost their
changes or the hassle of having stale state/status data in a downgrade.
>
>> + VIR_FREE(oldFile);
>> + }
>> + /* these will all be re-read from their new location */
>> + virNetworkObjListFree(&driverState->networks);
>> + }
>> +
> This hunk should be it's own local/static function...
Any particular reason for this? It is only used once in one place, and
the function it's in isn't all that long. I suppose I can move it into a
static function declared just above this one though.
Mostly readability/flow/isolation and 24 lines of code in it's own
separate function that could very easily be eliminated. Just seems
other code in the function will make calls and then there's this hunk of
code that doesn't... I'm not against leaving it either.
I'm OK with your other explanations - I don't see anyone squawking
loudly about moving the location and/or saving old state "just in case".
Seems just an addition of the VIR_FREE() then the combination of 2&3 to
keep the bisection from generating a false positive for someone and
we're good to go.
John
>
> ACK pending your thoughts regarding my comments...
>
> John
>
>> if (virNetworkLoadAllState(&driverState->networks,
>> driverState->stateDir) < 0)
>> goto error;
>> diff --git a/src/network/bridge_driver_platform.h
b/src/network/bridge_driver_platform.h
>> index 6a571da..491068a 100644
>> --- a/src/network/bridge_driver_platform.h
>> +++ b/src/network/bridge_driver_platform.h
>> @@ -1,7 +1,7 @@
>> /*
>> * bridge_driver_platform.h: platform specific routines for bridge driver
>> *
>> - * Copyright (C) 2006-2013 Red Hat, Inc.
>> + * Copyright (C) 2006-2014 Red Hat, Inc.
>> * Copyright (C) 2006 Daniel P. Berrange
>> *
>> * This library is free software; you can redistribute it and/or
>> @@ -39,6 +39,7 @@ struct _virNetworkDriverState {
>> char *networkConfigDir;
>> char *networkAutostartDir;
>> char *stateDir;
>> + char *oldStateDir;
>> char *pidDir;
>> char *dnsmasqStateDir;
>> char *radvdStateDir;
>>