On 09/23/2016 08:24 AM, Nehal J Wani wrote:
On Fri, Sep 23, 2016 at 4:51 PM, Peter Krempa
<pkrempa(a)redhat.com> wrote:
> On Fri, Sep 23, 2016 at 15:33:46 +0530, Nehal J Wani wrote:
>> On Fri, Sep 23, 2016 at 1:20 PM, Peter Krempa <pkrempa(a)redhat.com> wrote:
>>> On Fri, Sep 23, 2016 at 12:19:49 +0530, Nehal J Wani wrote:
>>>> The default dhcp lease time set by dnsmasq is only one hour, which can
be
>>>> pretty small for developers relying on ip address(es) to be consistent
>>>> across reboots.
>>> This should be handled through the leases file which is supposed to
>>> remember the IP addresses given to a host despite a short lease time. Is
>>> that broken for you?
>> We can't ask developers to parse the .status file and the DHCP leases
>> API would never show expired leases. Or am I misinterpreting what you
> If a lease is expired then the VM is necessarily not using it for a
> while.
>
>> mean by 'remember'? Even dnsmasq depends on the output for the init
> I meant 'remember' as dnsmasq should assign the same address even if the
> lease was expired for a while.
Yes, but then you are assuming that dnsmasq hasn't been restarted
after the lease has expired. User may switch distros in between or
shutdown is computer, etc.
>> event sent to the leaseshelper program, which will not output lease
>> which have expired.
> So then maybe this is the actual bug. The expired leases still can be
> renewed and AFAIK dnsmasq reloads them even for the expired entries so
> we probably should remove that part from the leaseshelper.
After addition of leaseshelper, because of the --leasefile-ro option
that we specify to dnsmasq, the leases database is entirely controlled
by it. If the developer shuts down his VM, the lease expires and then
he restarts the computer (which will involve restarting the dnsmasq
service), dnsmasq will rely on the leaseshelper program for the
initial input (aka init event). The first parameter from the 'init'
output is the expiry time of the lease. If the lease has already
expired, then we can't print it in virLeasePrintLeases(). Hence I fail
to understand why you would think that this may be a bug.
Why would libvirt/leaseshelper renew a certain lease? Isn't that the
job of the client?
I'm pretty sure that's not what Peter is suggesting. What he's saying is
that dnsmasq normally keeps track of even leases that have expired and
tries not to give them out to others who are simply requesting "give me
an address", so that when a client returns at some later time they have
a chance of getting back the same address they had before even if their
lease had expired (as long as it wasn't needed for some other client due
to all the other available addresses being used).
I at least *thought* this worked with a standard dnsmasq setup.
>>>> This patch adds support for setting a lease time in
the network definition.
>>>> For now, all IP ranges under one <dhcp> will share a common
leasetime.
>>>>
>>>> An example:
>>>> <dhcp>
>>>> <leasetime units='hours'>12</leasetime>
>>>> </dhcp>
>>> This sounds like a reasonable feature, but I don't really want to sell
>>> it as a workaround for the problem you've described above. Setting the
>>> lease time is useful, but guests should get the same IP addresses
>>> thanks to the lease file.
>> Actually, the major reason behind this feature was mentioned at
>>
https://www.redhat.com/archives/libvir-list/2016-April/msg01071.html
I'm a bit curious if you had any contact with the author of that patch
other than CC'ing them to your patch mail. From the BZ
(
https://bugzilla.redhat.com/show_bug.cgi?id=913446 ), it sounds like he
was still intending to update his patch but was bogged down implementing
some extra functionality.
> As I've said, adding ability to specify longer lease time may
be
> desired, but the fact that you don't get the address most probably is
> not related to this and should be addressed separately.
>
>> Let's say that the lease time is 2 hours and I shutdown my VM. Now I
>> reboot it after 2hours 5mins, the lease will be gone, since it will be
> The lease will expire, but dnsmasq still can reuse that address for the
> particular mac address until the pool is depleted.
>
>> expired. The .status file is read and the leases which have expired
>> are removed. So, there is no guarantee that my VM will get the same IP
> Again, this looks like the actual bug.
Agreed. Here's another report of what I think is the same bug:
https://www.redhat.com/archives/libvir-list/2016-September/msg00739.html
and a patch that attempts to fix it in a different manner (by adding
"dhcp-authoritative" to the dnsmasq.conf file):
https://www.redhat.com/archives/libvir-list/2016-September/msg00909.html
I actually don't think that either method is the correct solution to the
problem.
To back up for a minute - I think the source of everyone's problems is
that dnsmasq as configured by current libvirt *completely* forgets
leases as soon as they have expired, so that when a guest asks for the
same address again, it is refused (because 1) although dnsmasq doesn't
have any info that the address is currently assigned to anyone else, it
is concerned that it may not be the only dhcp server on the network, and
some other server may have given it out, and 2) what business does some
lowly client have demanding which address it wants from dnsmasq
anyway?!? </tongueInCheek>).
Martin's patch tries to solve the problem with "dhcp-authoritative"
which, as far as I understand, tells dnsmasq "you are the keeper of
*all* lease information on this network, so if you think the address is
unused, it really is unused"; dnsmasq uses this information to freely
grant a guest any address it asks for, as long as there is no current
lease for it. This sounds troublesome to me - If client A's lease has
expired (because it's turned off for a bit), is it okay for client B to
come barging in and insist on grabbing the address that client A has
just lost even though there are many other addresses still available?
Sure, technically it's legal, but it seems unnecessarily disruptive.
On the other hand, this patch tries to solve the problem by providing a
way to lengthen the lease. As Peter points out a few paragraphs below,
unnecessarily lengthening the lease is a bad idea too, because it will
lead to complete depletion of the address pool in a network that has a
lot of clients who are only transiently online (think of a test setup
where maybe 2000 new guests are created, used, and destroyed in a day).
I think the *real* solution is to fix the lease handling so that dnsmasq
remembers leases after they've expired (assuming that can be done once
leasefile-ro is set). They would be marked as "expired", and probably
not even reported externally, but all their info would still be there
internally for dnsmasq's use when considering what to do with incoming
requests for specific IP addresses.
>
>> address again. Hence the developer would want to increase the lease
> No it's not guaranteed, but very likely.
>
>> time to something big, like 1 week, and go to sleep happily when the
>> VM is turned off at night :)
> That's a rather weak point. Setting too long lease time may on the other
> hand exhaust the pool. Usually the configured DHCP lease times are
> rather short.
Again agreed (see above).
>
>>>> The value for attribute 'units' can be
seconds/hours/days/weeks/s/h/d/w.
>>>> If the leasetime specified is -1, it is considered to be infinite.
> [...]
>
>>>> diff --git a/docs/schemas/basictypes.rng b/docs/schemas/basictypes.rng
>>>> index 1b4f980..023edfb 100644
>>>> --- a/docs/schemas/basictypes.rng
>>>> +++ b/docs/schemas/basictypes.rng
>>>> @@ -13,6 +13,12 @@
>>>> <param name='pattern'>[0-9]+</param>
>>>> </data>
>>>> </define>
>>>> + <!-- Our long doesn"t allow a leading "+" in its
lexical form -->
>>>> + <define name='long'>
>>>> + <data type='long'>
>>> This looks weird. Also why would you need a separate type for this?
>>>
>> I didn't find any available data type for signed integers. Which one
>> would you rather use? (Signed, because we want to support -1 too)
> A quick grep revealled that there's a builtin "long" type used in at
> least one case.
>
>>>> + <param name='pattern'>-?[0-9]+</param>
>>>> + </data>
> [...]
>
>>>> diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c
>>>> index aa39776..d2372f2 100644
>>>> --- a/src/conf/network_conf.c
>>>> +++ b/src/conf/network_conf.c
>>>> @@ -1032,21 +1032,76 @@ virNetworkDHCPHostDefParseXML(const char
*networkName,
> [...]
>
>>>> +
>>>> + save = ctxt->node;
>>>> + ctxt->node = node;
>>>> +
>>>> + leasetimeRV = virXPathLongLong("string(./text())", ctxt,
&leasetime);
>>>> + if (leasetimeRV == -2) {
>>>> + virReportError(VIR_ERR_XML_ERROR,
>>>> + _("Invalid long long value specified for
leasetime "
>>>> + "in definition of network
'%s'"),
>>>> + networkName);
>>>> + goto cleanup;
>>>> + } else {
>>>> + if (leasetime < 0) {
>>>> + leasetime = -1; /* infinite */
>>>> + } else {
>>>> + unit = virXPathString("string(./@unit)", ctxt);
>>>> + if (virScaleTime(&leasetime, unit, scale,
>>>> + VIR_NETWORK_DHCP_LEASE_TIME_MAX) < 0) {
>>>> + // let virScaleTime() report the appropriate error
>>> '//' comments should not be used in libvirt. Also the comment does
not
>>> make sense since it's obvious what is happening.
>> I have a query regarding this. If we rely on virScaleTime to report
>> the appropriate error, it wouldn't be including the network name. Is
> That happens in quite a few cases.
>
>> that bad? Should I be modifying virScaleTime to return different
>> return values for different errors and shape the error message
>> appropriately here? (That would sort be inconsistent with
> No, not really.
>
>> virScaleInteger which doesn't have different error codes for different
>> scenarios)
>>
>> Whoops on the comment. Should have read hacking.html more carefully!
> [...]
>
>>>> @@ -2675,6 +2731,13 @@ virNetworkIPDefFormat(virBufferPtr buf,
>>>> virBufferAddLit(buf, "<dhcp>\n");
>>>> virBufferAdjustIndent(buf, 2);
>>>>
>>>> + if (def->leasetime) {
>>>> + virBufferAddLit(buf, "<leasetime");
>>>> + virBufferAsprintf(buf, "
unit='seconds'>%lld",
>>> Why aren't we remembering the unit the user used originally? Other
>>> places can't do that since the unit was added later, but with this newly
>>> added code you can remember and format back the unit just fine.
>>>
>> Actually, I followed what was originally done for memory units. Even
>> if user specifies MiB, everything is stored in KiB and on dumpxml, the
> This was because libvirt added support for scaled integers later than
> the original code. Thus we needed to keep compatibility. This is not
> required for new elements.
>
>> user is shown the unit in KiB. In case we want to remember the unit
>> specified by the user, we'll have to add another member in the
>> _virNetworkIPDef struct. Since we haven't done this anywhere, I was
>> apprehensive about having inconsistent behaviour for 'unit'.
>>
>>>> + def->leasetime);
>>>> + virBufferAddLit(buf, "</leasetime>\n");
>>>> + }
> [...]
>
>>>> diff --git a/src/util/virutil.c b/src/util/virutil.c
>>>> index b57a195..81a60e6 100644
>>>> --- a/src/util/virutil.c
>>>> +++ b/src/util/virutil.c
>>>> @@ -275,6 +275,76 @@ virHexToBin(unsigned char c)
>>>> }
>>>> }
>>>>
>>>> +/**
>>>> + * virScaleTime:
>>> This change should be done as a separate patch.
>>>
>>>> + * @value: pointer to the integer which is supposed to hold value
>>>> + * @unit: pointer to the string holding the unit
>>>> + * @scale: integer holding the value of scale
>>> ^^ at least this value is not documented enough.
>>>
>>>> + * @limit: upper limit on scaled value
>>>> + *
>>>> + * Scale an integer @value in-place by an optional case-insensitive
@unit,
>>> The statement about case insensitivity doesn't look to be true according
>>> to the code blow.
>> With case insensitivity, 'Seconds' and 'SEconDs' would also be
valid
>> inputs, and it would be difficult to accommodate it in the .rng file,
>> plus I was told that if attributes are words, then we shouldn't be
>> going for case-insensitivity anyway. I'll fix this in the next patch
>> set.
I had already said in IRC when you asked about representing case
insensitivity in the RNG that any attribute values should *not* be made
case-insensitive. Case insensitivity is for the FAT32 filesystem, not
for libvirt XML.
> Either case, the comment and code should not behave differently.
The
> comment looks copied from the memory number scaler, but the code
> certainly doesn't behave that way.
>
>>>> + * defaulting to @scale if @unit is NULL or empty. Recognized units
include
>>>> + * (w)eeks, (d)ays, (h)ours, (m)inutes and (s)econds. Ensure that the
result
We don't want to have "w" and "weeks" represent the same thing. A
single
string should represent a single value. This can be easily handled by
setting up and enum, then adding VIR_ENUM_DECL() and VIR_ENUM_IMPL() for
it. We're not reimplementing /usr/bin/date.
>>>> + * does not exceed @limit. Return 0 on success, -1
with error message raised
>>>> + * on failure.
>>> This does not document the scale of the returned value at all.
>>>
>>>> + */
>>>> +int
>>>> +virScaleTime(long long *value, const char *unit,
>>>> + long long scale, long long limit)
>>> Scale can't really be negative. Also the result being negative does not
>>> make much sense to me.
>> Should this be changed to unsigned long long or unsigned long? Since
>> leasetime has to be a signed integer, what would be the ideal way to
>> pass it to the modified function? Should I typecast it?
From above, it seems you want it to accept a negative value so you can
specify "-1" to be "unlimited"?' What about "0"? For
that matter, do you
*really* want to allow that?
> Negative scale doesn't make sense. For time values, negative
time value
> might make sense for relative time values. You need to check then that
> the overflow check works for negative numbers as well, which it doesn't
> since it was just copied from the memory integer scaler.
>
>