[libvirt] [PATCH] network: Add support for configuring dhcp lease time

The default dhcp lease time set by dnsmasq is only one hour, which can be pretty small for developers relying on ip address(es) to be consistent across reboots. This patch adds support for setting a lease time in the network definition. For now, all IP ranges under one <dhcp> will share a common leasetime. An example: <dhcp> <leasetime units='hours'>12</leasetime> </dhcp> The value for attribute 'units' can be seconds/hours/days/weeks/s/h/d/w. If the leasetime specified is -1, it is considered to be infinite. docs/schema/{basictypes,network}.rng * Add datatype long * Introduce timeUnits, scaledTime * Add schema for tag: leasetime src/util/virutil.{c,h} * Introduce virScaleTime() src/conf/network_conf.h * Extend virNetworkIPDef to accomodate leastime * Define VIR_NETWORK_DHCP_LEASE_TIME_MAX src/conf/network_conf.c * Introduce virNetworkDHCPLeaseTimeParseXML * Modify virNetworkDHCPDefParseXML to have XPath context pointer tests/* * Add test cases for xml parsing and conversion Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=913446 --- docs/schemas/basictypes.rng | 20 +++++++ docs/schemas/network.rng | 10 +++- src/conf/network_conf.c | 67 ++++++++++++++++++++- src/conf/network_conf.h | 11 ++++ src/network/bridge_driver.c | 8 +++ src/util/virutil.c | 70 ++++++++++++++++++++++ src/util/virutil.h | 3 + .../isolated-network-with-lease-time.conf | 17 ++++++ .../isolated-network-with-lease-time.xml | 24 ++++++++ tests/networkxml2conftest.c | 1 + .../isolated-network-with-lease-time.xml | 24 ++++++++ .../isolated-network-with-lease-time.xml | 24 ++++++++ tests/networkxml2xmltest.c | 1 + 13 files changed, 276 insertions(+), 4 deletions(-) create mode 100644 tests/networkxml2confdata/isolated-network-with-lease-time.conf create mode 100644 tests/networkxml2confdata/isolated-network-with-lease-time.xml create mode 100644 tests/networkxml2xmlin/isolated-network-with-lease-time.xml create mode 100644 tests/networkxml2xmlout/isolated-network-with-lease-time.xml diff --git a/docs/schemas/basictypes.rng b/docs/schemas/basictypes.rng index 1b4f980..023edfb 100644 --- a/docs/schemas/basictypes.rng +++ b/docs/schemas/basictypes.rng @@ -13,6 +13,12 @@ <param name='pattern'>[0-9]+</param> </data> </define> + <!-- Our long doesn"t allow a leading "+" in its lexical form --> + <define name='long'> + <data type='long'> + <param name='pattern'>-?[0-9]+</param> + </data> + </define> <define name='hexuint'> <data type='string'> @@ -283,6 +289,20 @@ <ref name='unsignedLong'/> </define> + <define name='timeUnit'> + <data type='string'> + <param name='pattern'>(seconds|minutes|hours|days|weeks|s|m|h|d|w)</param> + </data> + </define> + <define name='scaledTime'> + <optional> + <attribute name='timeUnit'> + <ref name='timeUnit'/> + </attribute> + </optional> + <ref name='long'/> + </define> + <define name="pciDomain"> <ref name="uint16"/> </define> diff --git a/docs/schemas/network.rng b/docs/schemas/network.rng index 1a18e64..66b65bc 100644 --- a/docs/schemas/network.rng +++ b/docs/schemas/network.rng @@ -337,9 +337,15 @@ </element> </optional> <optional> - <!-- Define the range(s) of IP addresses that the DHCP - server should hand out --> <element name="dhcp"> + <optional> + <element name="leasetime"> + <ref name="scaledTime"/> + <attribute name="unit"><ref name="timeUnit"/></attribute> + </element> + </optional> + <!-- Define the range(s) of IP addresses that the DHCP + server should hand out --> <zeroOrMore> <element name="range"> <attribute name="start"><ref name="ipAddr"/></attribute> diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index aa39776..d2372f2 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -1032,21 +1032,76 @@ virNetworkDHCPHostDefParseXML(const char *networkName, } static int +virNetworkDHCPLeaseTimeParseXML(const char *networkName, + virNetworkIPDefPtr def, + xmlXPathContextPtr ctxt, + xmlNodePtr node) +{ + int ret = -1; + int scale = 1; + xmlNodePtr save; + char *unit = NULL; + int leasetimeRV = 0; + long long leasetime; + + save = ctxt->node; + ctxt->node = node; + + leasetimeRV = virXPathLongLong("string(./text())", ctxt, &leasetime); + if (leasetimeRV == -2) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid long long value specified for leasetime " + "in definition of network '%s'"), + networkName); + goto cleanup; + } else { + if (leasetime < 0) { + leasetime = -1; /* infinite */ + } else { + unit = virXPathString("string(./@unit)", ctxt); + if (virScaleTime(&leasetime, unit, scale, + VIR_NETWORK_DHCP_LEASE_TIME_MAX) < 0) { + // let virScaleTime() report the appropriate error + goto cleanup; + } + } + } + + def->leasetime = leasetime; + ret = 0; + + cleanup: + VIR_FREE(unit); + ctxt->node = save; + return ret; +} + +static int virNetworkDHCPDefParseXML(const char *networkName, xmlNodePtr node, + xmlXPathContextPtr ctxt, virNetworkIPDefPtr def) { int ret = -1; - xmlNodePtr cur; + xmlNodePtr cur, save; virSocketAddrRange range; virNetworkDHCPHostDef host; memset(&range, 0, sizeof(range)); memset(&host, 0, sizeof(host)); + save = ctxt->node; + ctxt->node = node; + cur = node->children; while (cur != NULL) { if (cur->type == XML_ELEMENT_NODE && + xmlStrEqual(cur->name, BAD_CAST "leasetime")) { + + if (virNetworkDHCPLeaseTimeParseXML(networkName, def, ctxt, cur) < 0) + goto cleanup; + + } else if (cur->type == XML_ELEMENT_NODE && xmlStrEqual(cur->name, BAD_CAST "range")) { if (virSocketAddrRangeParseXML(networkName, def, cur, &range) < 0) @@ -1095,6 +1150,7 @@ virNetworkDHCPDefParseXML(const char *networkName, ret = 0; cleanup: virNetworkDHCPHostDefClear(&host); + ctxt->node = save; return ret; } @@ -1607,7 +1663,7 @@ virNetworkIPDefParseXML(const char *networkName, while (cur != NULL) { if (cur->type == XML_ELEMENT_NODE && xmlStrEqual(cur->name, BAD_CAST "dhcp")) { - if (virNetworkDHCPDefParseXML(networkName, cur, def) < 0) + if (virNetworkDHCPDefParseXML(networkName, cur, ctxt, def) < 0) goto cleanup; } else if (cur->type == XML_ELEMENT_NODE && xmlStrEqual(cur->name, BAD_CAST "tftp")) { @@ -2675,6 +2731,13 @@ virNetworkIPDefFormat(virBufferPtr buf, virBufferAddLit(buf, "<dhcp>\n"); virBufferAdjustIndent(buf, 2); + if (def->leasetime) { + virBufferAddLit(buf, "<leasetime"); + virBufferAsprintf(buf, " unit='seconds'>%lld", + def->leasetime); + virBufferAddLit(buf, "</leasetime>\n"); + } + for (i = 0; i < def->nranges; i++) { char *saddr = virSocketAddrFormat(&def->ranges[i].start); if (!saddr) diff --git a/src/conf/network_conf.h b/src/conf/network_conf.h index 3b227db..965fd3b 100644 --- a/src/conf/network_conf.h +++ b/src/conf/network_conf.h @@ -26,6 +26,15 @@ # define DNS_RECORD_LENGTH_SRV (512 - 30) /* Limit minus overhead as mentioned in RFC-2782 */ +/** + * VIR_NETWORK_DHCP_LEASE_TIME_MAX: + * + * Macro providing the upper limit on DHCP Lease Time (in seconds). + * Libvirt supports only dnsmasq as of now, and it stores the lease + * time in an unsigned int. + */ +# define VIR_NETWORK_DHCP_LEASE_TIME_MAX UINT_MAX + # include <libxml/parser.h> # include <libxml/tree.h> # include <libxml/xpath.h> @@ -167,6 +176,8 @@ struct _virNetworkIPDef { size_t nhosts; /* Zero or more dhcp hosts */ virNetworkDHCPHostDefPtr hosts; + long long leasetime; /* DHCP lease time for all ranges */ + char *tftproot; char *bootfile; virSocketAddr bootserver; diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 7b99aca..fa628d3 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -1222,6 +1222,14 @@ networkDnsmasqConfContents(virNetworkObjPtr network, saddr, eaddr); if (VIR_SOCKET_ADDR_IS_FAMILY(&ipdef->address, AF_INET6)) virBufferAsprintf(&configbuf, ",%d", prefix); + + if (ipdef->leasetime) { + if (ipdef->leasetime == -1) + virBufferAddLit(&configbuf, ",infinite"); + else + virBufferAsprintf(&configbuf, ",%lld", ipdef->leasetime); + } + virBufferAddLit(&configbuf, "\n"); VIR_FREE(saddr); diff --git a/src/util/virutil.c b/src/util/virutil.c index b57a195..81a60e6 100644 --- a/src/util/virutil.c +++ b/src/util/virutil.c @@ -275,6 +275,76 @@ virHexToBin(unsigned char c) } } +/** + * virScaleTime: + * @value: pointer to the integer which is supposed to hold value + * @unit: pointer to the string holding the unit + * @scale: integer holding the value of scale + * @limit: upper limit on scaled value + * + * Scale an integer @value in-place by an optional case-insensitive @unit, + * defaulting to @scale if @unit is NULL or empty. Recognized units include + * (w)eeks, (d)ays, (h)ours, (m)inutes and (s)econds. Ensure that the result + * does not exceed @limit. Return 0 on success, -1 with error message raised + * on failure. + */ +int +virScaleTime(long long *value, const char *unit, + long long scale, long long limit) +{ + size_t i; + static char const* const allowed_units[] = {"s", "m", "h", "d", "w", + "seconds", "minutes", "hours", "days", "weeks"}; + + size_t n_allowed_units = ARRAY_CARDINALITY(allowed_units); + + if (!unit || !*unit) { + if (!scale) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid scale %lld"), scale); + return -1; + } + unit = ""; + } else { + for (i = 0; i < n_allowed_units; i++) { + if (STREQ(unit, allowed_units[i])) { + switch (*unit) { + case 'w': + scale *= 7; + /* fall through */ + case 'd': + scale *= 24; + /* fall through */ + case 'h': + scale *= 60; + /* fall through */ + case 'm': + scale *= 60; + /* fall through */ + case 's': + break; + } + break; + } + } + if (i == n_allowed_units) { + virReportError(VIR_ERR_INVALID_ARG, + _("Unknown time unit '%s'"), unit); + return -1; + } + } + + if (*value && *value > (limit / scale)) { + virReportError(VIR_ERR_OVERFLOW, _("Value too large: %lld %s"), + *value, unit); + return -1; + } + + *value *= scale; + return 0; +} + + /* Scale an integer VALUE in-place by an optional case-insensitive * SUFFIX, defaulting to SCALE if suffix is NULL or empty (scale is * typically 1 or 1024). Recognized suffixes include 'b' or 'bytes', diff --git a/src/util/virutil.h b/src/util/virutil.h index 703ec53..5c11f77 100644 --- a/src/util/virutil.h +++ b/src/util/virutil.h @@ -51,6 +51,9 @@ int virSetUIDGIDWithCaps(uid_t uid, gid_t gid, gid_t *groups, int ngroups, unsigned long long capBits, bool clearExistingCaps); +int virScaleTime(long long *value, const char *unit, + long long scale, long long limit); + int virScaleInteger(unsigned long long *value, const char *suffix, unsigned long long scale, unsigned long long limit) ATTRIBUTE_NONNULL(1) ATTRIBUTE_RETURN_CHECK; diff --git a/tests/networkxml2confdata/isolated-network-with-lease-time.conf b/tests/networkxml2confdata/isolated-network-with-lease-time.conf new file mode 100644 index 0000000..cd353f5 --- /dev/null +++ b/tests/networkxml2confdata/isolated-network-with-lease-time.conf @@ -0,0 +1,17 @@ +##WARNING: THIS IS AN AUTO-GENERATED FILE. CHANGES TO IT ARE LIKELY TO BE +##OVERWRITTEN AND LOST. Changes to this configuration should be made using: +## virsh net-edit private +## or other application using the libvirt API. +## +## dnsmasq conf file created by libvirt +strict-order +except-interface=lo +bind-dynamic +interface=virbr1 +dhcp-range=192.168.123.2,192.168.123.254,infinite +dhcp-no-override +dhcp-range=2001:db8:ca2:2:1::10,2001:db8:ca2:2:1::ff,64,108000 +dhcp-lease-max=493 +dhcp-hostsfile=/var/lib/libvirt/dnsmasq/private.hostsfile +addn-hosts=/var/lib/libvirt/dnsmasq/private.addnhosts +enable-ra diff --git a/tests/networkxml2confdata/isolated-network-with-lease-time.xml b/tests/networkxml2confdata/isolated-network-with-lease-time.xml new file mode 100644 index 0000000..be8ea56 --- /dev/null +++ b/tests/networkxml2confdata/isolated-network-with-lease-time.xml @@ -0,0 +1,24 @@ +<network> + <name>private</name> + <uuid>81ff0d90-c91e-6742-64da-4a736edb9a9b</uuid> + <forward mode='nat'> + <nat> + <port start='1024' end='65535'/> + </nat> + </forward> + <bridge name='virbr1' stp='on' delay='0'/> + <mac address='52:54:00:10:2f:05'/> + <ip address='192.168.123.1' netmask='255.255.255.0'> + <dhcp> + <leasetime unit="minutes">-1</leasetime> + <range start='192.168.123.2' end='192.168.123.254'/> + </dhcp> + </ip> + <ip family="ipv6" address="2001:db8:ca2:2::1" prefix="64" > + <dhcp> + <leasetime unit="h">30</leasetime> + <range start="2001:db8:ca2:2:1::10" end="2001:db8:ca2:2:1::ff" /> + <host id="0:4:7e:7d:f0:7d:a8:bc:c5:d2:13:32:11:ed:16:ea:84:63" name="nwani" ip="2001:db8:ca2:2:3::4" /> + </dhcp> + </ip> +</network> diff --git a/tests/networkxml2conftest.c b/tests/networkxml2conftest.c index 65a0e32..501954d 100644 --- a/tests/networkxml2conftest.c +++ b/tests/networkxml2conftest.c @@ -112,6 +112,7 @@ mymain(void) } while (0) DO_TEST("isolated-network", restricted); + DO_TEST("isolated-network-with-lease-time", dhcpv6); DO_TEST("netboot-network", restricted); DO_TEST("netboot-proxy-network", restricted); DO_TEST("nat-network-dns-srv-record-minimal", restricted); diff --git a/tests/networkxml2xmlin/isolated-network-with-lease-time.xml b/tests/networkxml2xmlin/isolated-network-with-lease-time.xml new file mode 100644 index 0000000..75c9f22 --- /dev/null +++ b/tests/networkxml2xmlin/isolated-network-with-lease-time.xml @@ -0,0 +1,24 @@ +<network> + <name>private</name> + <uuid>81ff0d90-c91e-6742-64da-4a736edb9a9b</uuid> + <forward mode='nat'> + <nat> + <port start='1024' end='65535'/> + </nat> + </forward> + <bridge name='virbr1' stp='on' delay='0'/> + <mac address='52:54:00:10:2f:05'/> + <ip address='192.168.123.1' netmask='255.255.255.0'> + <dhcp> + <leasetime unit="minutes">-1</leasetime> + <range start='192.168.123.2' end='192.168.123.254'/> + </dhcp> + </ip> + <ip family="ipv6" address="2001:db8:ca2:2::1" prefix="64" > + <dhcp> + <leasetime unit="d">30</leasetime> + <range start="2001:db8:ca2:2:1::10" end="2001:db8:ca2:2:1::ff" /> + <host id="0:4:7e:7d:f0:7d:a8:bc:c5:d2:13:32:11:ed:16:ea:84:63" name="nwani" ip="2001:db8:ca2:2:3::4" /> + </dhcp> + </ip> +</network> diff --git a/tests/networkxml2xmlout/isolated-network-with-lease-time.xml b/tests/networkxml2xmlout/isolated-network-with-lease-time.xml new file mode 100644 index 0000000..d9d6ff4 --- /dev/null +++ b/tests/networkxml2xmlout/isolated-network-with-lease-time.xml @@ -0,0 +1,24 @@ +<network> + <name>private</name> + <uuid>81ff0d90-c91e-6742-64da-4a736edb9a9b</uuid> + <forward mode='nat'> + <nat> + <port start='1024' end='65535'/> + </nat> + </forward> + <bridge name='virbr1' stp='on' delay='0'/> + <mac address='52:54:00:10:2f:05'/> + <ip address='192.168.123.1' netmask='255.255.255.0'> + <dhcp> + <leasetime unit='seconds'>-1</leasetime> + <range start='192.168.123.2' end='192.168.123.254'/> + </dhcp> + </ip> + <ip family='ipv6' address='2001:db8:ca2:2::1' prefix='64'> + <dhcp> + <leasetime unit='seconds'>2592000</leasetime> + <range start='2001:db8:ca2:2:1::10' end='2001:db8:ca2:2:1::ff'/> + <host id='0:4:7e:7d:f0:7d:a8:bc:c5:d2:13:32:11:ed:16:ea:84:63' name='nwani' ip='2001:db8:ca2:2:3::4'/> + </dhcp> + </ip> +</network> diff --git a/tests/networkxml2xmltest.c b/tests/networkxml2xmltest.c index 01cd6f7..665f695 100644 --- a/tests/networkxml2xmltest.c +++ b/tests/networkxml2xmltest.c @@ -126,6 +126,7 @@ mymain(void) DO_TEST("dhcp6host-routed-network"); DO_TEST("empty-allow-ipv6"); DO_TEST("isolated-network"); + DO_TEST("isolated-network-with-lease-time"); DO_TEST("routed-network"); DO_TEST("routed-network-no-dns"); DO_TEST_PARSE_ERROR("routed-network-no-dns-extra-elements"); -- 2.7.4

On Fri, Sep 23, 2016 at 12:19:49 +0530, Nehal J Wani wrote:
The default dhcp lease time set by dnsmasq is only one hour, which can be pretty small for developers relying on ip address(es) to be consistent across reboots.
This should be handled through the leases file which is supposed to remember the IP addresses given to a host despite a short lease time. Is that broken for you?
This patch adds support for setting a lease time in the network definition. For now, all IP ranges under one <dhcp> will share a common leasetime.
An example: <dhcp> <leasetime units='hours'>12</leasetime> </dhcp>
This sounds like a reasonable feature, but I don't really want to sell it as a workaround for the problem you've described above. Setting the lease time is useful, but guests should get the same IP addresses thanks to the lease file.
The value for attribute 'units' can be seconds/hours/days/weeks/s/h/d/w. If the leasetime specified is -1, it is considered to be infinite.
docs/schema/{basictypes,network}.rng * Add datatype long * Introduce timeUnits, scaledTime * Add schema for tag: leasetime
src/util/virutil.{c,h} * Introduce virScaleTime()
src/conf/network_conf.h * Extend virNetworkIPDef to accomodate leastime * Define VIR_NETWORK_DHCP_LEASE_TIME_MAX
src/conf/network_conf.c * Introduce virNetworkDHCPLeaseTimeParseXML * Modify virNetworkDHCPDefParseXML to have XPath context pointer
tests/* * Add test cases for xml parsing and conversion
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=913446 --- docs/schemas/basictypes.rng | 20 +++++++ docs/schemas/network.rng | 10 +++- src/conf/network_conf.c | 67 ++++++++++++++++++++- src/conf/network_conf.h | 11 ++++ src/network/bridge_driver.c | 8 +++ src/util/virutil.c | 70 ++++++++++++++++++++++ src/util/virutil.h | 3 + .../isolated-network-with-lease-time.conf | 17 ++++++ .../isolated-network-with-lease-time.xml | 24 ++++++++ tests/networkxml2conftest.c | 1 + .../isolated-network-with-lease-time.xml | 24 ++++++++ .../isolated-network-with-lease-time.xml | 24 ++++++++ tests/networkxml2xmltest.c | 1 + 13 files changed, 276 insertions(+), 4 deletions(-) create mode 100644 tests/networkxml2confdata/isolated-network-with-lease-time.conf create mode 100644 tests/networkxml2confdata/isolated-network-with-lease-time.xml create mode 100644 tests/networkxml2xmlin/isolated-network-with-lease-time.xml create mode 100644 tests/networkxml2xmlout/isolated-network-with-lease-time.xml
diff --git a/docs/schemas/basictypes.rng b/docs/schemas/basictypes.rng index 1b4f980..023edfb 100644 --- a/docs/schemas/basictypes.rng +++ b/docs/schemas/basictypes.rng @@ -13,6 +13,12 @@ <param name='pattern'>[0-9]+</param> </data> </define> + <!-- Our long doesn"t allow a leading "+" in its lexical form --> + <define name='long'> + <data type='long'>
This looks weird. Also why would you need a separate type for this?
+ <param name='pattern'>-?[0-9]+</param> + </data> + </define>
<define name='hexuint'> <data type='string'> @@ -283,6 +289,20 @@ <ref name='unsignedLong'/> </define>
+ <define name='timeUnit'> + <data type='string'> + <param name='pattern'>(seconds|minutes|hours|days|weeks|s|m|h|d|w)</param> + </data> + </define> + <define name='scaledTime'> + <optional> + <attribute name='timeUnit'> + <ref name='timeUnit'/> + </attribute> + </optional> + <ref name='long'/> + </define> + <define name="pciDomain"> <ref name="uint16"/> </define> diff --git a/docs/schemas/network.rng b/docs/schemas/network.rng index 1a18e64..66b65bc 100644 --- a/docs/schemas/network.rng +++ b/docs/schemas/network.rng @@ -337,9 +337,15 @@ </element> </optional> <optional> - <!-- Define the range(s) of IP addresses that the DHCP - server should hand out --> <element name="dhcp"> + <optional> + <element name="leasetime"> + <ref name="scaledTime"/> + <attribute name="unit"><ref name="timeUnit"/></attribute> + </element> + </optional> + <!-- Define the range(s) of IP addresses that the DHCP + server should hand out --> <zeroOrMore> <element name="range"> <attribute name="start"><ref name="ipAddr"/></attribute> diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index aa39776..d2372f2 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -1032,21 +1032,76 @@ virNetworkDHCPHostDefParseXML(const char *networkName, }
static int +virNetworkDHCPLeaseTimeParseXML(const char *networkName, + virNetworkIPDefPtr def, + xmlXPathContextPtr ctxt, + xmlNodePtr node) +{ + int ret = -1; + int scale = 1;
Scale doesn't need a separate value since it won't change in this case.
+ xmlNodePtr save; + char *unit = NULL; + int leasetimeRV = 0;
This variable is written and read just once, so it's not needed.
+ long long leasetime;
This variable needs to be initialized, since if the XPath evaluation does not find the node it will be undefined. This won't happen here but the compiler might complain.
+ + save = ctxt->node; + ctxt->node = node; + + leasetimeRV = virXPathLongLong("string(./text())", ctxt, &leasetime); + if (leasetimeRV == -2) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid long long value specified for leasetime " + "in definition of network '%s'"), + networkName); + goto cleanup; + } else { + if (leasetime < 0) { + leasetime = -1; /* infinite */ + } else { + unit = virXPathString("string(./@unit)", ctxt); + if (virScaleTime(&leasetime, unit, scale, + VIR_NETWORK_DHCP_LEASE_TIME_MAX) < 0) { + // let virScaleTime() report the appropriate error
'//' comments should not be used in libvirt. Also the comment does not make sense since it's obvious what is happening.
+ goto cleanup; + } + } + } + + def->leasetime = leasetime; + ret = 0; + + cleanup: + VIR_FREE(unit); + ctxt->node = save; + return ret; +} + +static int virNetworkDHCPDefParseXML(const char *networkName, xmlNodePtr node, + xmlXPathContextPtr ctxt, virNetworkIPDefPtr def) { int ret = -1; - xmlNodePtr cur; + xmlNodePtr cur, save;
Please don't declare multiple variables per line.
virSocketAddrRange range; virNetworkDHCPHostDef host;
memset(&range, 0, sizeof(range)); memset(&host, 0, sizeof(host));
+ save = ctxt->node; + ctxt->node = node; + cur = node->children; while (cur != NULL) { if (cur->type == XML_ELEMENT_NODE && + xmlStrEqual(cur->name, BAD_CAST "leasetime")) { + + if (virNetworkDHCPLeaseTimeParseXML(networkName, def, ctxt, cur) < 0) + goto cleanup; + + } else if (cur->type == XML_ELEMENT_NODE && xmlStrEqual(cur->name, BAD_CAST "range")) {
if (virSocketAddrRangeParseXML(networkName, def, cur, &range) < 0)
[...]
@@ -2675,6 +2731,13 @@ virNetworkIPDefFormat(virBufferPtr buf, virBufferAddLit(buf, "<dhcp>\n"); virBufferAdjustIndent(buf, 2);
+ if (def->leasetime) { + virBufferAddLit(buf, "<leasetime"); + virBufferAsprintf(buf, " unit='seconds'>%lld",
Why aren't we remembering the unit the user used originally? Other places can't do that since the unit was added later, but with this newly added code you can remember and format back the unit just fine.
+ def->leasetime); + virBufferAddLit(buf, "</leasetime>\n"); + } + for (i = 0; i < def->nranges; i++) { char *saddr = virSocketAddrFormat(&def->ranges[i].start); if (!saddr) diff --git a/src/conf/network_conf.h b/src/conf/network_conf.h index 3b227db..965fd3b 100644 --- a/src/conf/network_conf.h +++ b/src/conf/network_conf.h
[...]
@@ -167,6 +176,8 @@ struct _virNetworkIPDef { size_t nhosts; /* Zero or more dhcp hosts */ virNetworkDHCPHostDefPtr hosts;
+ long long leasetime; /* DHCP lease time for all ranges */
You should explain the special values of 0 and -1 here, also state the unit.
+ char *tftproot; char *bootfile; virSocketAddr bootserver; diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 7b99aca..fa628d3 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -1222,6 +1222,14 @@ networkDnsmasqConfContents(virNetworkObjPtr network, saddr, eaddr); if (VIR_SOCKET_ADDR_IS_FAMILY(&ipdef->address, AF_INET6)) virBufferAsprintf(&configbuf, ",%d", prefix); + + if (ipdef->leasetime) { + if (ipdef->leasetime == -1) + virBufferAddLit(&configbuf, ",infinite"); + else + virBufferAsprintf(&configbuf, ",%lld", ipdef->leasetime); + } + virBufferAddLit(&configbuf, "\n");
VIR_FREE(saddr); diff --git a/src/util/virutil.c b/src/util/virutil.c index b57a195..81a60e6 100644 --- a/src/util/virutil.c +++ b/src/util/virutil.c @@ -275,6 +275,76 @@ virHexToBin(unsigned char c) } }
+/** + * virScaleTime:
This change should be done as a separate patch.
+ * @value: pointer to the integer which is supposed to hold value + * @unit: pointer to the string holding the unit + * @scale: integer holding the value of scale
^^ at least this value is not documented enough.
+ * @limit: upper limit on scaled value + * + * Scale an integer @value in-place by an optional case-insensitive @unit,
The statement about case insensitivity doesn't look to be true according to the code blow.
+ * defaulting to @scale if @unit is NULL or empty. Recognized units include + * (w)eeks, (d)ays, (h)ours, (m)inutes and (s)econds. Ensure that the result + * does not exceed @limit. Return 0 on success, -1 with error message raised + * on failure.
This does not document the scale of the returned value at all.
+ */ +int +virScaleTime(long long *value, const char *unit, + long long scale, long long limit)
Scale can't really be negative. Also the result being negative does not make much sense to me.
+{ + size_t i; + static char const* const allowed_units[] = {"s", "m", "h", "d", "w", + "seconds", "minutes", "hours", "days", "weeks"}; + + size_t n_allowed_units = ARRAY_CARDINALITY(allowed_units); + + if (!unit || !*unit) { + if (!scale) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid scale %lld"), scale); + return -1; + } + unit = ""; + } else {
Scale is not reset here, which is wrong since it would depend on user input.
+ for (i = 0; i < n_allowed_units; i++) { + if (STREQ(unit, allowed_units[i])) { + switch (*unit) { + case 'w': + scale *= 7; + /* fall through */ + case 'd': + scale *= 24; + /* fall through */ + case 'h': + scale *= 60; + /* fall through */ + case 'm': + scale *= 60; + /* fall through */ + case 's': + break; + } + break; + } + } + if (i == n_allowed_units) { + virReportError(VIR_ERR_INVALID_ARG, + _("Unknown time unit '%s'"), unit); + return -1; + } + } + + if (*value && *value > (limit / scale)) { + virReportError(VIR_ERR_OVERFLOW, _("Value too large: %lld %s"), + *value, unit); + return -1; + } + + *value *= scale; + return 0; +} + + /* Scale an integer VALUE in-place by an optional case-insensitive * SUFFIX, defaulting to SCALE if suffix is NULL or empty (scale is * typically 1 or 1024). Recognized suffixes include 'b' or 'bytes',

On Fri, Sep 23, 2016 at 1:20 PM, Peter Krempa <pkrempa@redhat.com> wrote:
On Fri, Sep 23, 2016 at 12:19:49 +0530, Nehal J Wani wrote:
The default dhcp lease time set by dnsmasq is only one hour, which can be pretty small for developers relying on ip address(es) to be consistent across reboots.
This should be handled through the leases file which is supposed to remember the IP addresses given to a host despite a short lease time. Is that broken for you?
We can't ask developers to parse the .status file and the DHCP leases API would never show expired leases. Or am I misinterpreting what you mean by 'remember'? Even dnsmasq depends on the output for the init event sent to the leaseshelper program, which will not output lease which have expired.
This patch adds support for setting a lease time in the network definition. For now, all IP ranges under one <dhcp> will share a common leasetime.
An example: <dhcp> <leasetime units='hours'>12</leasetime> </dhcp>
This sounds like a reasonable feature, but I don't really want to sell it as a workaround for the problem you've described above. Setting the lease time is useful, but guests should get the same IP addresses thanks to the lease file.
Actually, the major reason behind this feature was mentioned at https://www.redhat.com/archives/libvir-list/2016-April/msg01071.html Let's say that the lease time is 2 hours and I shutdown my VM. Now I reboot it after 2hours 5mins, the lease will be gone, since it will be expired. The .status file is read and the leases which have expired are removed. So, there is no guarantee that my VM will get the same IP address again. Hence the developer would want to increase the lease time to something big, like 1 week, and go to sleep happily when the VM is turned off at night :)
The value for attribute 'units' can be seconds/hours/days/weeks/s/h/d/w. If the leasetime specified is -1, it is considered to be infinite.
docs/schema/{basictypes,network}.rng * Add datatype long * Introduce timeUnits, scaledTime * Add schema for tag: leasetime
src/util/virutil.{c,h} * Introduce virScaleTime()
src/conf/network_conf.h * Extend virNetworkIPDef to accomodate leastime * Define VIR_NETWORK_DHCP_LEASE_TIME_MAX
src/conf/network_conf.c * Introduce virNetworkDHCPLeaseTimeParseXML * Modify virNetworkDHCPDefParseXML to have XPath context pointer
tests/* * Add test cases for xml parsing and conversion
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=913446 --- docs/schemas/basictypes.rng | 20 +++++++ docs/schemas/network.rng | 10 +++- src/conf/network_conf.c | 67 ++++++++++++++++++++- src/conf/network_conf.h | 11 ++++ src/network/bridge_driver.c | 8 +++ src/util/virutil.c | 70 ++++++++++++++++++++++ src/util/virutil.h | 3 + .../isolated-network-with-lease-time.conf | 17 ++++++ .../isolated-network-with-lease-time.xml | 24 ++++++++ tests/networkxml2conftest.c | 1 + .../isolated-network-with-lease-time.xml | 24 ++++++++ .../isolated-network-with-lease-time.xml | 24 ++++++++ tests/networkxml2xmltest.c | 1 + 13 files changed, 276 insertions(+), 4 deletions(-) create mode 100644 tests/networkxml2confdata/isolated-network-with-lease-time.conf create mode 100644 tests/networkxml2confdata/isolated-network-with-lease-time.xml create mode 100644 tests/networkxml2xmlin/isolated-network-with-lease-time.xml create mode 100644 tests/networkxml2xmlout/isolated-network-with-lease-time.xml
diff --git a/docs/schemas/basictypes.rng b/docs/schemas/basictypes.rng index 1b4f980..023edfb 100644 --- a/docs/schemas/basictypes.rng +++ b/docs/schemas/basictypes.rng @@ -13,6 +13,12 @@ <param name='pattern'>[0-9]+</param> </data> </define> + <!-- Our long doesn"t allow a leading "+" in its lexical form --> + <define name='long'> + <data type='long'>
This looks weird. Also why would you need a separate type for this?
I didn't find any available data type for signed integers. Which one would you rather use? (Signed, because we want to support -1 too)
+ <param name='pattern'>-?[0-9]+</param> + </data> + </define>
<define name='hexuint'> <data type='string'> @@ -283,6 +289,20 @@ <ref name='unsignedLong'/> </define>
+ <define name='timeUnit'> + <data type='string'> + <param name='pattern'>(seconds|minutes|hours|days|weeks|s|m|h|d|w)</param> + </data> + </define> + <define name='scaledTime'> + <optional> + <attribute name='timeUnit'> + <ref name='timeUnit'/> + </attribute> + </optional> + <ref name='long'/> + </define> + <define name="pciDomain"> <ref name="uint16"/> </define> diff --git a/docs/schemas/network.rng b/docs/schemas/network.rng index 1a18e64..66b65bc 100644 --- a/docs/schemas/network.rng +++ b/docs/schemas/network.rng @@ -337,9 +337,15 @@ </element> </optional> <optional> - <!-- Define the range(s) of IP addresses that the DHCP - server should hand out --> <element name="dhcp"> + <optional> + <element name="leasetime"> + <ref name="scaledTime"/> + <attribute name="unit"><ref name="timeUnit"/></attribute> + </element> + </optional> + <!-- Define the range(s) of IP addresses that the DHCP + server should hand out --> <zeroOrMore> <element name="range"> <attribute name="start"><ref name="ipAddr"/></attribute> diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index aa39776..d2372f2 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -1032,21 +1032,76 @@ virNetworkDHCPHostDefParseXML(const char *networkName, }
static int +virNetworkDHCPLeaseTimeParseXML(const char *networkName, + virNetworkIPDefPtr def, + xmlXPathContextPtr ctxt, + xmlNodePtr node) +{ + int ret = -1; + int scale = 1;
Scale doesn't need a separate value since it won't change in this case.
+ xmlNodePtr save; + char *unit = NULL; + int leasetimeRV = 0;
This variable is written and read just once, so it's not needed.
+ long long leasetime;
This variable needs to be initialized, since if the XPath evaluation does not find the node it will be undefined.
This won't happen here but the compiler might complain.
+ + save = ctxt->node; + ctxt->node = node; + + leasetimeRV = virXPathLongLong("string(./text())", ctxt, &leasetime); + if (leasetimeRV == -2) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid long long value specified for leasetime " + "in definition of network '%s'"), + networkName); + goto cleanup; + } else { + if (leasetime < 0) { + leasetime = -1; /* infinite */ + } else { + unit = virXPathString("string(./@unit)", ctxt); + if (virScaleTime(&leasetime, unit, scale, + VIR_NETWORK_DHCP_LEASE_TIME_MAX) < 0) { + // let virScaleTime() report the appropriate error
'//' comments should not be used in libvirt. Also the comment does not make sense since it's obvious what is happening.
I have a query regarding this. If we rely on virScaleTime to report the appropriate error, it wouldn't be including the network name. Is that bad? Should I be modifying virScaleTime to return different return values for different errors and shape the error message appropriately here? (That would sort be inconsistent with virScaleInteger which doesn't have different error codes for different scenarios) Whoops on the comment. Should have read hacking.html more carefully!
+ goto cleanup; + } + } + } + + def->leasetime = leasetime; + ret = 0; + + cleanup: + VIR_FREE(unit); + ctxt->node = save; + return ret; +} + +static int virNetworkDHCPDefParseXML(const char *networkName, xmlNodePtr node, + xmlXPathContextPtr ctxt, virNetworkIPDefPtr def) { int ret = -1; - xmlNodePtr cur; + xmlNodePtr cur, save;
Please don't declare multiple variables per line.
virSocketAddrRange range; virNetworkDHCPHostDef host;
memset(&range, 0, sizeof(range)); memset(&host, 0, sizeof(host));
+ save = ctxt->node; + ctxt->node = node; + cur = node->children; while (cur != NULL) { if (cur->type == XML_ELEMENT_NODE && + xmlStrEqual(cur->name, BAD_CAST "leasetime")) { + + if (virNetworkDHCPLeaseTimeParseXML(networkName, def, ctxt, cur) < 0) + goto cleanup; + + } else if (cur->type == XML_ELEMENT_NODE && xmlStrEqual(cur->name, BAD_CAST "range")) {
if (virSocketAddrRangeParseXML(networkName, def, cur, &range) < 0)
[...]
@@ -2675,6 +2731,13 @@ virNetworkIPDefFormat(virBufferPtr buf, virBufferAddLit(buf, "<dhcp>\n"); virBufferAdjustIndent(buf, 2);
+ if (def->leasetime) { + virBufferAddLit(buf, "<leasetime"); + virBufferAsprintf(buf, " unit='seconds'>%lld",
Why aren't we remembering the unit the user used originally? Other places can't do that since the unit was added later, but with this newly added code you can remember and format back the unit just fine.
Actually, I followed what was originally done for memory units. Even if user specifies MiB, everything is stored in KiB and on dumpxml, the user is shown the unit in KiB. In case we want to remember the unit specified by the user, we'll have to add another member in the _virNetworkIPDef struct. Since we haven't done this anywhere, I was apprehensive about having inconsistent behaviour for 'unit'.
+ def->leasetime); + virBufferAddLit(buf, "</leasetime>\n"); + } + for (i = 0; i < def->nranges; i++) { char *saddr = virSocketAddrFormat(&def->ranges[i].start); if (!saddr) diff --git a/src/conf/network_conf.h b/src/conf/network_conf.h index 3b227db..965fd3b 100644 --- a/src/conf/network_conf.h +++ b/src/conf/network_conf.h
[...]
@@ -167,6 +176,8 @@ struct _virNetworkIPDef { size_t nhosts; /* Zero or more dhcp hosts */ virNetworkDHCPHostDefPtr hosts;
+ long long leasetime; /* DHCP lease time for all ranges */
You should explain the special values of 0 and -1 here, also state the unit.
+ char *tftproot; char *bootfile; virSocketAddr bootserver; diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 7b99aca..fa628d3 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -1222,6 +1222,14 @@ networkDnsmasqConfContents(virNetworkObjPtr network, saddr, eaddr); if (VIR_SOCKET_ADDR_IS_FAMILY(&ipdef->address, AF_INET6)) virBufferAsprintf(&configbuf, ",%d", prefix); + + if (ipdef->leasetime) { + if (ipdef->leasetime == -1) + virBufferAddLit(&configbuf, ",infinite"); + else + virBufferAsprintf(&configbuf, ",%lld", ipdef->leasetime); + } + virBufferAddLit(&configbuf, "\n");
VIR_FREE(saddr); diff --git a/src/util/virutil.c b/src/util/virutil.c index b57a195..81a60e6 100644 --- a/src/util/virutil.c +++ b/src/util/virutil.c @@ -275,6 +275,76 @@ virHexToBin(unsigned char c) } }
+/** + * virScaleTime:
This change should be done as a separate patch.
+ * @value: pointer to the integer which is supposed to hold value + * @unit: pointer to the string holding the unit + * @scale: integer holding the value of scale
^^ at least this value is not documented enough.
+ * @limit: upper limit on scaled value + * + * Scale an integer @value in-place by an optional case-insensitive @unit,
The statement about case insensitivity doesn't look to be true according to the code blow.
With case insensitivity, 'Seconds' and 'SEconDs' would also be valid inputs, and it would be difficult to accommodate it in the .rng file, plus I was told that if attributes are words, then we shouldn't be going for case-insensitivity anyway. I'll fix this in the next patch set.
+ * defaulting to @scale if @unit is NULL or empty. Recognized units include + * (w)eeks, (d)ays, (h)ours, (m)inutes and (s)econds. Ensure that the result + * does not exceed @limit. Return 0 on success, -1 with error message raised + * on failure.
This does not document the scale of the returned value at all.
+ */ +int +virScaleTime(long long *value, const char *unit, + long long scale, long long limit)
Scale can't really be negative. Also the result being negative does not make much sense to me.
Should this be changed to unsigned long long or unsigned long? Since leasetime has to be a signed integer, what would be the ideal way to pass it to the modified function? Should I typecast it?
+{ + size_t i; + static char const* const allowed_units[] = {"s", "m", "h", "d", "w", + "seconds", "minutes", "hours", "days", "weeks"}; + + size_t n_allowed_units = ARRAY_CARDINALITY(allowed_units); + + if (!unit || !*unit) { + if (!scale) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid scale %lld"), scale); + return -1; + } + unit = ""; + } else {
Scale is not reset here, which is wrong since it would depend on user input.
+ for (i = 0; i < n_allowed_units; i++) { + if (STREQ(unit, allowed_units[i])) { + switch (*unit) { + case 'w': + scale *= 7; + /* fall through */ + case 'd': + scale *= 24; + /* fall through */ + case 'h': + scale *= 60; + /* fall through */ + case 'm': + scale *= 60; + /* fall through */ + case 's': + break; + } + break; + } + } + if (i == n_allowed_units) { + virReportError(VIR_ERR_INVALID_ARG, + _("Unknown time unit '%s'"), unit); + return -1; + } + } + + if (*value && *value > (limit / scale)) { + virReportError(VIR_ERR_OVERFLOW, _("Value too large: %lld %s"), + *value, unit); + return -1; + } + + *value *= scale; + return 0; +} + + /* Scale an integer VALUE in-place by an optional case-insensitive * SUFFIX, defaulting to SCALE if suffix is NULL or empty (scale is * typically 1 or 1024). Recognized suffixes include 'b' or 'bytes',
-- Nehal J Wani

On Fri, Sep 23, 2016 at 15:33:46 +0530, Nehal J Wani wrote:
On Fri, Sep 23, 2016 at 1:20 PM, Peter Krempa <pkrempa@redhat.com> wrote:
On Fri, Sep 23, 2016 at 12:19:49 +0530, Nehal J Wani wrote:
The default dhcp lease time set by dnsmasq is only one hour, which can be pretty small for developers relying on ip address(es) to be consistent across reboots.
This should be handled through the leases file which is supposed to remember the IP addresses given to a host despite a short lease time. Is that broken for you?
We can't ask developers to parse the .status file and the DHCP leases API would never show expired leases. Or am I misinterpreting what you
If a lease is expired then the VM is necessarily not using it for a while.
mean by 'remember'? Even dnsmasq depends on the output for the init
I meant 'remember' as dnsmasq should assign the same address even if the lease was expired for a while.
event sent to the leaseshelper program, which will not output lease which have expired.
So then maybe this is the actual bug. The expired leases still can be renewed and AFAIK dnsmasq reloads them even for the expired entries so we probably should remove that part from the leaseshelper.
This patch adds support for setting a lease time in the network definition. For now, all IP ranges under one <dhcp> will share a common leasetime.
An example: <dhcp> <leasetime units='hours'>12</leasetime> </dhcp>
This sounds like a reasonable feature, but I don't really want to sell it as a workaround for the problem you've described above. Setting the lease time is useful, but guests should get the same IP addresses thanks to the lease file.
Actually, the major reason behind this feature was mentioned at https://www.redhat.com/archives/libvir-list/2016-April/msg01071.html
As I've said, adding ability to specify longer lease time may be desired, but the fact that you don't get the address most probably is not related to this and should be addressed separately.
Let's say that the lease time is 2 hours and I shutdown my VM. Now I reboot it after 2hours 5mins, the lease will be gone, since it will be
The lease will expire, but dnsmasq still can reuse that address for the particular mac address until the pool is depleted.
expired. The .status file is read and the leases which have expired are removed. So, there is no guarantee that my VM will get the same IP
Again, this looks like the actual bug.
address again. Hence the developer would want to increase the lease
No it's not guaranteed, but very likely.
time to something big, like 1 week, and go to sleep happily when the VM is turned off at night :)
That's a rather weak point. Setting too long lease time may on the other hand exhaust the pool. Usually the configured DHCP lease times are rather short.
The value for attribute 'units' can be seconds/hours/days/weeks/s/h/d/w. If the leasetime specified is -1, it is considered to be infinite.
[...]
diff --git a/docs/schemas/basictypes.rng b/docs/schemas/basictypes.rng index 1b4f980..023edfb 100644 --- a/docs/schemas/basictypes.rng +++ b/docs/schemas/basictypes.rng @@ -13,6 +13,12 @@ <param name='pattern'>[0-9]+</param> </data> </define> + <!-- Our long doesn"t allow a leading "+" in its lexical form --> + <define name='long'> + <data type='long'>
This looks weird. Also why would you need a separate type for this?
I didn't find any available data type for signed integers. Which one would you rather use? (Signed, because we want to support -1 too)
A quick grep revealled that there's a builtin "long" type used in at least one case.
+ <param name='pattern'>-?[0-9]+</param> + </data>
[...]
diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index aa39776..d2372f2 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -1032,21 +1032,76 @@ virNetworkDHCPHostDefParseXML(const char *networkName,
[...]
+ + save = ctxt->node; + ctxt->node = node; + + leasetimeRV = virXPathLongLong("string(./text())", ctxt, &leasetime); + if (leasetimeRV == -2) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid long long value specified for leasetime " + "in definition of network '%s'"), + networkName); + goto cleanup; + } else { + if (leasetime < 0) { + leasetime = -1; /* infinite */ + } else { + unit = virXPathString("string(./@unit)", ctxt); + if (virScaleTime(&leasetime, unit, scale, + VIR_NETWORK_DHCP_LEASE_TIME_MAX) < 0) { + // let virScaleTime() report the appropriate error
'//' comments should not be used in libvirt. Also the comment does not make sense since it's obvious what is happening.
I have a query regarding this. If we rely on virScaleTime to report the appropriate error, it wouldn't be including the network name. Is
That happens in quite a few cases.
that bad? Should I be modifying virScaleTime to return different return values for different errors and shape the error message appropriately here? (That would sort be inconsistent with
No, not really.
virScaleInteger which doesn't have different error codes for different scenarios)
Whoops on the comment. Should have read hacking.html more carefully!
[...]
@@ -2675,6 +2731,13 @@ virNetworkIPDefFormat(virBufferPtr buf, virBufferAddLit(buf, "<dhcp>\n"); virBufferAdjustIndent(buf, 2);
+ if (def->leasetime) { + virBufferAddLit(buf, "<leasetime"); + virBufferAsprintf(buf, " unit='seconds'>%lld",
Why aren't we remembering the unit the user used originally? Other places can't do that since the unit was added later, but with this newly added code you can remember and format back the unit just fine.
Actually, I followed what was originally done for memory units. Even if user specifies MiB, everything is stored in KiB and on dumpxml, the
This was because libvirt added support for scaled integers later than the original code. Thus we needed to keep compatibility. This is not required for new elements.
user is shown the unit in KiB. In case we want to remember the unit specified by the user, we'll have to add another member in the _virNetworkIPDef struct. Since we haven't done this anywhere, I was apprehensive about having inconsistent behaviour for 'unit'.
+ def->leasetime); + virBufferAddLit(buf, "</leasetime>\n"); + }
[...]
diff --git a/src/util/virutil.c b/src/util/virutil.c index b57a195..81a60e6 100644 --- a/src/util/virutil.c +++ b/src/util/virutil.c @@ -275,6 +275,76 @@ virHexToBin(unsigned char c) } }
+/** + * virScaleTime:
This change should be done as a separate patch.
+ * @value: pointer to the integer which is supposed to hold value + * @unit: pointer to the string holding the unit + * @scale: integer holding the value of scale
^^ at least this value is not documented enough.
+ * @limit: upper limit on scaled value + * + * Scale an integer @value in-place by an optional case-insensitive @unit,
The statement about case insensitivity doesn't look to be true according to the code blow.
With case insensitivity, 'Seconds' and 'SEconDs' would also be valid inputs, and it would be difficult to accommodate it in the .rng file, plus I was told that if attributes are words, then we shouldn't be going for case-insensitivity anyway. I'll fix this in the next patch set.
Either case, the comment and code should not behave differently. The comment looks copied from the memory number scaler, but the code certainly doesn't behave that way.
+ * defaulting to @scale if @unit is NULL or empty. Recognized units include + * (w)eeks, (d)ays, (h)ours, (m)inutes and (s)econds. Ensure that the result + * does not exceed @limit. Return 0 on success, -1 with error message raised + * on failure.
This does not document the scale of the returned value at all.
+ */ +int +virScaleTime(long long *value, const char *unit, + long long scale, long long limit)
Scale can't really be negative. Also the result being negative does not make much sense to me.
Should this be changed to unsigned long long or unsigned long? Since leasetime has to be a signed integer, what would be the ideal way to pass it to the modified function? Should I typecast it?
Negative scale doesn't make sense. For time values, negative time value might make sense for relative time values. You need to check then that the overflow check works for negative numbers as well, which it doesn't since it was just copied from the memory integer scaler.

On Fri, Sep 23, 2016 at 4:51 PM, Peter Krempa <pkrempa@redhat.com> wrote:
On Fri, Sep 23, 2016 at 15:33:46 +0530, Nehal J Wani wrote:
On Fri, Sep 23, 2016 at 1:20 PM, Peter Krempa <pkrempa@redhat.com> wrote:
On Fri, Sep 23, 2016 at 12:19:49 +0530, Nehal J Wani wrote:
The default dhcp lease time set by dnsmasq is only one hour, which can be pretty small for developers relying on ip address(es) to be consistent across reboots.
This should be handled through the leases file which is supposed to remember the IP addresses given to a host despite a short lease time. Is that broken for you?
We can't ask developers to parse the .status file and the DHCP leases API would never show expired leases. Or am I misinterpreting what you
If a lease is expired then the VM is necessarily not using it for a while.
mean by 'remember'? Even dnsmasq depends on the output for the init
I meant 'remember' as dnsmasq should assign the same address even if the lease was expired for a while.
Yes, but then you are assuming that dnsmasq hasn't been restarted after the lease has expired. User may switch distros in between or shutdown is computer, etc.
event sent to the leaseshelper program, which will not output lease which have expired.
So then maybe this is the actual bug. The expired leases still can be renewed and AFAIK dnsmasq reloads them even for the expired entries so we probably should remove that part from the leaseshelper.
After addition of leaseshelper, because of the --leasefile-ro option that we specify to dnsmasq, the leases database is entirely controlled by it. If the developer shuts down his VM, the lease expires and then he restarts the computer (which will involve restarting the dnsmasq service), dnsmasq will rely on the leaseshelper program for the initial input (aka init event). The first parameter from the 'init' output is the expiry time of the lease. If the lease has already expired, then we can't print it in virLeasePrintLeases(). Hence I fail to understand why you would think that this may be a bug. Why would libvirt/leaseshelper renew a certain lease? Isn't that the job of the client?
This patch adds support for setting a lease time in the network definition. For now, all IP ranges under one <dhcp> will share a common leasetime.
An example: <dhcp> <leasetime units='hours'>12</leasetime> </dhcp>
This sounds like a reasonable feature, but I don't really want to sell it as a workaround for the problem you've described above. Setting the lease time is useful, but guests should get the same IP addresses thanks to the lease file.
Actually, the major reason behind this feature was mentioned at https://www.redhat.com/archives/libvir-list/2016-April/msg01071.html
As I've said, adding ability to specify longer lease time may be desired, but the fact that you don't get the address most probably is not related to this and should be addressed separately.
Let's say that the lease time is 2 hours and I shutdown my VM. Now I reboot it after 2hours 5mins, the lease will be gone, since it will be
The lease will expire, but dnsmasq still can reuse that address for the particular mac address until the pool is depleted.
expired. The .status file is read and the leases which have expired are removed. So, there is no guarantee that my VM will get the same IP
Again, this looks like the actual bug.
address again. Hence the developer would want to increase the lease
No it's not guaranteed, but very likely.
time to something big, like 1 week, and go to sleep happily when the VM is turned off at night :)
That's a rather weak point. Setting too long lease time may on the other hand exhaust the pool. Usually the configured DHCP lease times are rather short.
The value for attribute 'units' can be seconds/hours/days/weeks/s/h/d/w. If the leasetime specified is -1, it is considered to be infinite.
[...]
diff --git a/docs/schemas/basictypes.rng b/docs/schemas/basictypes.rng index 1b4f980..023edfb 100644 --- a/docs/schemas/basictypes.rng +++ b/docs/schemas/basictypes.rng @@ -13,6 +13,12 @@ <param name='pattern'>[0-9]+</param> </data> </define> + <!-- Our long doesn"t allow a leading "+" in its lexical form --> + <define name='long'> + <data type='long'>
This looks weird. Also why would you need a separate type for this?
I didn't find any available data type for signed integers. Which one would you rather use? (Signed, because we want to support -1 too)
A quick grep revealled that there's a builtin "long" type used in at least one case.
+ <param name='pattern'>-?[0-9]+</param> + </data>
[...]
diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index aa39776..d2372f2 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -1032,21 +1032,76 @@ virNetworkDHCPHostDefParseXML(const char *networkName,
[...]
+ + save = ctxt->node; + ctxt->node = node; + + leasetimeRV = virXPathLongLong("string(./text())", ctxt, &leasetime); + if (leasetimeRV == -2) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid long long value specified for leasetime " + "in definition of network '%s'"), + networkName); + goto cleanup; + } else { + if (leasetime < 0) { + leasetime = -1; /* infinite */ + } else { + unit = virXPathString("string(./@unit)", ctxt); + if (virScaleTime(&leasetime, unit, scale, + VIR_NETWORK_DHCP_LEASE_TIME_MAX) < 0) { + // let virScaleTime() report the appropriate error
'//' comments should not be used in libvirt. Also the comment does not make sense since it's obvious what is happening.
I have a query regarding this. If we rely on virScaleTime to report the appropriate error, it wouldn't be including the network name. Is
That happens in quite a few cases.
that bad? Should I be modifying virScaleTime to return different return values for different errors and shape the error message appropriately here? (That would sort be inconsistent with
No, not really.
virScaleInteger which doesn't have different error codes for different scenarios)
Whoops on the comment. Should have read hacking.html more carefully!
[...]
@@ -2675,6 +2731,13 @@ virNetworkIPDefFormat(virBufferPtr buf, virBufferAddLit(buf, "<dhcp>\n"); virBufferAdjustIndent(buf, 2);
+ if (def->leasetime) { + virBufferAddLit(buf, "<leasetime"); + virBufferAsprintf(buf, " unit='seconds'>%lld",
Why aren't we remembering the unit the user used originally? Other places can't do that since the unit was added later, but with this newly added code you can remember and format back the unit just fine.
Actually, I followed what was originally done for memory units. Even if user specifies MiB, everything is stored in KiB and on dumpxml, the
This was because libvirt added support for scaled integers later than the original code. Thus we needed to keep compatibility. This is not required for new elements.
user is shown the unit in KiB. In case we want to remember the unit specified by the user, we'll have to add another member in the _virNetworkIPDef struct. Since we haven't done this anywhere, I was apprehensive about having inconsistent behaviour for 'unit'.
+ def->leasetime); + virBufferAddLit(buf, "</leasetime>\n"); + }
[...]
diff --git a/src/util/virutil.c b/src/util/virutil.c index b57a195..81a60e6 100644 --- a/src/util/virutil.c +++ b/src/util/virutil.c @@ -275,6 +275,76 @@ virHexToBin(unsigned char c) } }
+/** + * virScaleTime:
This change should be done as a separate patch.
+ * @value: pointer to the integer which is supposed to hold value + * @unit: pointer to the string holding the unit + * @scale: integer holding the value of scale
^^ at least this value is not documented enough.
+ * @limit: upper limit on scaled value + * + * Scale an integer @value in-place by an optional case-insensitive @unit,
The statement about case insensitivity doesn't look to be true according to the code blow.
With case insensitivity, 'Seconds' and 'SEconDs' would also be valid inputs, and it would be difficult to accommodate it in the .rng file, plus I was told that if attributes are words, then we shouldn't be going for case-insensitivity anyway. I'll fix this in the next patch set.
Either case, the comment and code should not behave differently. The comment looks copied from the memory number scaler, but the code certainly doesn't behave that way.
+ * defaulting to @scale if @unit is NULL or empty. Recognized units include + * (w)eeks, (d)ays, (h)ours, (m)inutes and (s)econds. Ensure that the result + * does not exceed @limit. Return 0 on success, -1 with error message raised + * on failure.
This does not document the scale of the returned value at all.
+ */ +int +virScaleTime(long long *value, const char *unit, + long long scale, long long limit)
Scale can't really be negative. Also the result being negative does not make much sense to me.
Should this be changed to unsigned long long or unsigned long? Since leasetime has to be a signed integer, what would be the ideal way to pass it to the modified function? Should I typecast it?
Negative scale doesn't make sense. For time values, negative time value might make sense for relative time values. You need to check then that the overflow check works for negative numbers as well, which it doesn't since it was just copied from the memory integer scaler.
-- Nehal J Wani

On 09/23/2016 08:24 AM, Nehal J Wani wrote:
On Fri, Sep 23, 2016 at 15:33:46 +0530, Nehal J Wani wrote:
On Fri, Sep 23, 2016 at 1:20 PM, Peter Krempa <pkrempa@redhat.com> wrote:
The default dhcp lease time set by dnsmasq is only one hour, which can be pretty small for developers relying on ip address(es) to be consistent across reboots. This should be handled through the leases file which is supposed to remember the IP addresses given to a host despite a short lease time. Is
On Fri, Sep 23, 2016 at 12:19:49 +0530, Nehal J Wani wrote: that broken for you? We can't ask developers to parse the .status file and the DHCP leases API would never show expired leases. Or am I misinterpreting what you If a lease is expired then the VM is necessarily not using it for a while.
mean by 'remember'? Even dnsmasq depends on the output for the init I meant 'remember' as dnsmasq should assign the same address even if the lease was expired for a while. Yes, but then you are assuming that dnsmasq hasn't been restarted after the lease has expired. User may switch distros in between or shutdown is computer, etc. event sent to the leaseshelper program, which will not output lease which have expired. So then maybe this is the actual bug. The expired leases still can be renewed and AFAIK dnsmasq reloads them even for the expired entries so we probably should remove that part from the leaseshelper. After addition of leaseshelper, because of the --leasefile-ro option
On Fri, Sep 23, 2016 at 4:51 PM, Peter Krempa <pkrempa@redhat.com> wrote: that we specify to dnsmasq, the leases database is entirely controlled by it. If the developer shuts down his VM, the lease expires and then he restarts the computer (which will involve restarting the dnsmasq service), dnsmasq will rely on the leaseshelper program for the initial input (aka init event). The first parameter from the 'init' output is the expiry time of the lease. If the lease has already expired, then we can't print it in virLeasePrintLeases(). Hence I fail to understand why you would think that this may be a bug.
Why would libvirt/leaseshelper renew a certain lease? Isn't that the job of the client?
I'm pretty sure that's not what Peter is suggesting. What he's saying is that dnsmasq normally keeps track of even leases that have expired and tries not to give them out to others who are simply requesting "give me an address", so that when a client returns at some later time they have a chance of getting back the same address they had before even if their lease had expired (as long as it wasn't needed for some other client due to all the other available addresses being used). I at least *thought* this worked with a standard dnsmasq setup.
This patch adds support for setting a lease time in the network definition. For now, all IP ranges under one <dhcp> will share a common leasetime.
An example: <dhcp> <leasetime units='hours'>12</leasetime> </dhcp> This sounds like a reasonable feature, but I don't really want to sell it as a workaround for the problem you've described above. Setting the lease time is useful, but guests should get the same IP addresses thanks to the lease file. Actually, the major reason behind this feature was mentioned at https://www.redhat.com/archives/libvir-list/2016-April/msg01071.html
I'm a bit curious if you had any contact with the author of that patch other than CC'ing them to your patch mail. From the BZ (https://bugzilla.redhat.com/show_bug.cgi?id=913446 ), it sounds like he was still intending to update his patch but was bogged down implementing some extra functionality.
As I've said, adding ability to specify longer lease time may be desired, but the fact that you don't get the address most probably is not related to this and should be addressed separately.
Let's say that the lease time is 2 hours and I shutdown my VM. Now I reboot it after 2hours 5mins, the lease will be gone, since it will be The lease will expire, but dnsmasq still can reuse that address for the particular mac address until the pool is depleted.
expired. The .status file is read and the leases which have expired are removed. So, there is no guarantee that my VM will get the same IP Again, this looks like the actual bug.
Agreed. Here's another report of what I think is the same bug: https://www.redhat.com/archives/libvir-list/2016-September/msg00739.html and a patch that attempts to fix it in a different manner (by adding "dhcp-authoritative" to the dnsmasq.conf file): https://www.redhat.com/archives/libvir-list/2016-September/msg00909.html I actually don't think that either method is the correct solution to the problem. To back up for a minute - I think the source of everyone's problems is that dnsmasq as configured by current libvirt *completely* forgets leases as soon as they have expired, so that when a guest asks for the same address again, it is refused (because 1) although dnsmasq doesn't have any info that the address is currently assigned to anyone else, it is concerned that it may not be the only dhcp server on the network, and some other server may have given it out, and 2) what business does some lowly client have demanding which address it wants from dnsmasq anyway?!? </tongueInCheek>). Martin's patch tries to solve the problem with "dhcp-authoritative" which, as far as I understand, tells dnsmasq "you are the keeper of *all* lease information on this network, so if you think the address is unused, it really is unused"; dnsmasq uses this information to freely grant a guest any address it asks for, as long as there is no current lease for it. This sounds troublesome to me - If client A's lease has expired (because it's turned off for a bit), is it okay for client B to come barging in and insist on grabbing the address that client A has just lost even though there are many other addresses still available? Sure, technically it's legal, but it seems unnecessarily disruptive. On the other hand, this patch tries to solve the problem by providing a way to lengthen the lease. As Peter points out a few paragraphs below, unnecessarily lengthening the lease is a bad idea too, because it will lead to complete depletion of the address pool in a network that has a lot of clients who are only transiently online (think of a test setup where maybe 2000 new guests are created, used, and destroyed in a day). I think the *real* solution is to fix the lease handling so that dnsmasq remembers leases after they've expired (assuming that can be done once leasefile-ro is set). They would be marked as "expired", and probably not even reported externally, but all their info would still be there internally for dnsmasq's use when considering what to do with incoming requests for specific IP addresses.
address again. Hence the developer would want to increase the lease No it's not guaranteed, but very likely.
time to something big, like 1 week, and go to sleep happily when the VM is turned off at night :) That's a rather weak point. Setting too long lease time may on the other hand exhaust the pool. Usually the configured DHCP lease times are rather short.
Again agreed (see above).
The value for attribute 'units' can be seconds/hours/days/weeks/s/h/d/w. If the leasetime specified is -1, it is considered to be infinite. [...]
diff --git a/docs/schemas/basictypes.rng b/docs/schemas/basictypes.rng index 1b4f980..023edfb 100644 --- a/docs/schemas/basictypes.rng +++ b/docs/schemas/basictypes.rng @@ -13,6 +13,12 @@ <param name='pattern'>[0-9]+</param> </data> </define> + <!-- Our long doesn"t allow a leading "+" in its lexical form --> + <define name='long'> + <data type='long'> This looks weird. Also why would you need a separate type for this?
I didn't find any available data type for signed integers. Which one would you rather use? (Signed, because we want to support -1 too) A quick grep revealled that there's a builtin "long" type used in at least one case.
+ <param name='pattern'>-?[0-9]+</param> + </data> [...]
diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index aa39776..d2372f2 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -1032,21 +1032,76 @@ virNetworkDHCPHostDefParseXML(const char *networkName, [...]
+ + save = ctxt->node; + ctxt->node = node; + + leasetimeRV = virXPathLongLong("string(./text())", ctxt, &leasetime); + if (leasetimeRV == -2) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid long long value specified for leasetime " + "in definition of network '%s'"), + networkName); + goto cleanup; + } else { + if (leasetime < 0) { + leasetime = -1; /* infinite */ + } else { + unit = virXPathString("string(./@unit)", ctxt); + if (virScaleTime(&leasetime, unit, scale, + VIR_NETWORK_DHCP_LEASE_TIME_MAX) < 0) { + // let virScaleTime() report the appropriate error '//' comments should not be used in libvirt. Also the comment does not make sense since it's obvious what is happening. I have a query regarding this. If we rely on virScaleTime to report the appropriate error, it wouldn't be including the network name. Is That happens in quite a few cases.
that bad? Should I be modifying virScaleTime to return different return values for different errors and shape the error message appropriately here? (That would sort be inconsistent with No, not really.
virScaleInteger which doesn't have different error codes for different scenarios)
Whoops on the comment. Should have read hacking.html more carefully! [...]
@@ -2675,6 +2731,13 @@ virNetworkIPDefFormat(virBufferPtr buf, virBufferAddLit(buf, "<dhcp>\n"); virBufferAdjustIndent(buf, 2);
+ if (def->leasetime) { + virBufferAddLit(buf, "<leasetime"); + virBufferAsprintf(buf, " unit='seconds'>%lld", Why aren't we remembering the unit the user used originally? Other places can't do that since the unit was added later, but with this newly added code you can remember and format back the unit just fine.
Actually, I followed what was originally done for memory units. Even if user specifies MiB, everything is stored in KiB and on dumpxml, the This was because libvirt added support for scaled integers later than the original code. Thus we needed to keep compatibility. This is not required for new elements.
user is shown the unit in KiB. In case we want to remember the unit specified by the user, we'll have to add another member in the _virNetworkIPDef struct. Since we haven't done this anywhere, I was apprehensive about having inconsistent behaviour for 'unit'.
+ def->leasetime); + virBufferAddLit(buf, "</leasetime>\n"); + } [...]
diff --git a/src/util/virutil.c b/src/util/virutil.c index b57a195..81a60e6 100644 --- a/src/util/virutil.c +++ b/src/util/virutil.c @@ -275,6 +275,76 @@ virHexToBin(unsigned char c) } }
+/** + * virScaleTime: This change should be done as a separate patch.
+ * @value: pointer to the integer which is supposed to hold value + * @unit: pointer to the string holding the unit + * @scale: integer holding the value of scale ^^ at least this value is not documented enough.
+ * @limit: upper limit on scaled value + * + * Scale an integer @value in-place by an optional case-insensitive @unit, The statement about case insensitivity doesn't look to be true according to the code blow. With case insensitivity, 'Seconds' and 'SEconDs' would also be valid inputs, and it would be difficult to accommodate it in the .rng file, plus I was told that if attributes are words, then we shouldn't be going for case-insensitivity anyway. I'll fix this in the next patch set.
I had already said in IRC when you asked about representing case insensitivity in the RNG that any attribute values should *not* be made case-insensitive. Case insensitivity is for the FAT32 filesystem, not for libvirt XML.
Either case, the comment and code should not behave differently. The comment looks copied from the memory number scaler, but the code certainly doesn't behave that way.
+ * defaulting to @scale if @unit is NULL or empty. Recognized units include + * (w)eeks, (d)ays, (h)ours, (m)inutes and (s)econds. Ensure that the result
We don't want to have "w" and "weeks" represent the same thing. A single string should represent a single value. This can be easily handled by setting up and enum, then adding VIR_ENUM_DECL() and VIR_ENUM_IMPL() for it. We're not reimplementing /usr/bin/date.
+ * does not exceed @limit. Return 0 on success, -1 with error message raised + * on failure. This does not document the scale of the returned value at all.
+ */ +int +virScaleTime(long long *value, const char *unit, + long long scale, long long limit) Scale can't really be negative. Also the result being negative does not make much sense to me. Should this be changed to unsigned long long or unsigned long? Since leasetime has to be a signed integer, what would be the ideal way to pass it to the modified function? Should I typecast it?
From above, it seems you want it to accept a negative value so you can specify "-1" to be "unlimited"?' What about "0"? For that matter, do you *really* want to allow that?
Negative scale doesn't make sense. For time values, negative time value might make sense for relative time values. You need to check then that the overflow check works for negative numbers as well, which it doesn't since it was just copied from the memory integer scaler.

On Fri, 2016-09-23 at 11:56 -0400, Laine Stump wrote:
Martin's patch tries to solve the problem with "dhcp-authoritative" which, as far as I understand, tells dnsmasq "you are the keeper of *all* lease information on this network, so if you think the address is unused, it really is unused"; dnsmasq uses this information to freely grant a guest any address it asks for, as long as there is no current lease for it. This sounds troublesome to me - If client A's lease has expired (because it's turned off for a bit), is it okay for client B to come barging in and insist on grabbing the address that client A has just lost even though there are many other addresses still available? Sure, technically it's legal, but it seems unnecessarily disruptive.
I am not sure what you mean. What scenario would be "disrupted" by this approach? Without dhcp_authoritative, client A will not get its lease back, whether or not B applied for it in the meantime, and B wouldn't get the lease, either. The user wouldn't be able to ping either one. If dhcp_authoritative is used, at least one of them will get what it needs (and serving both is impossible - it's hard to tell what's the Right Thing in the situation you describe!). Anyway, unless the virtual network is really crowded (in which case it might make sense for the admin to use a class B network instead) the probability of such clashes should be rather minimal in the real world. It's much more likely that A itself tries to reacquire the lease, and that situation is fixed by dhcp_authoritative.
I think the *real* solution is to fix the lease handling so that dnsmasq remembers leases after they've expired (assuming that can be done once leasefile-ro is set). They would be marked as "expired", and probably not even reported externally, but all their info would still be there internally for dnsmasq's use when considering what to do with incoming requests for specific IP addresses.
Is that possible at all with the current leasehelper setup? I have to admit that, being new to this discussion, I haven't understood the purpose of the leasehelper yet. What benefit does it have for the operation of the virtual network, compared to dnsmasq alone? Why is the "custom leases file" needed at all? I tried to find a rationale on the web, but I didn't. Regards, Martin

On 09/23/2016 02:43 PM, Martin Wilck wrote:
On Fri, 2016-09-23 at 11:56 -0400, Laine Stump wrote:
Martin's patch tries to solve the problem with "dhcp-authoritative" which, as far as I understand, tells dnsmasq "you are the keeper of *all* lease information on this network, so if you think the address is unused, it really is unused"; dnsmasq uses this information to freely grant a guest any address it asks for, as long as there is no current lease for it. This sounds troublesome to me - If client A's lease has expired (because it's turned off for a bit), is it okay for client B to come barging in and insist on grabbing the address that client A has just lost even though there are many other addresses still available? Sure, technically it's legal, but it seems unnecessarily disruptive. I am not sure what you mean. What scenario would be "disrupted" by this approach?
Disruptive may not have been the best word. What I mean is that an unfriendly guest with proper information could take advantage of this "give any currently-unused address to anyone who asks for it" policy by purposefully taking over an IP address that it knew had previously been leased to a different guest (and who was expecting on getting the address back the next time it booted). Of course 1) it would take a lot of inside knowledge / good timing / luck to make this happen, and 2) you can argue that any guest who relied on a dhcp address remaining unchanged deserves what it gets (and the same for any other machine trying to contact that guest via the IP address).
Without dhcp_authoritative, client A will not get its lease back,
Is that true even if dnsmasq isn't restarted, and even if leasefile-ro isn't used? The Openstack discussion on this topic implies that the behavior only exists because they aren't saving a leasefile, not just because the lease is expired.
whether or not B applied for it in the meantime, and B wouldn't get the lease, either. The user wouldn't be able to ping either one. If dhcp_authoritative is used, at least one of them will get what it needs (and serving both is impossible - it's hard to tell what's the Right Thing in the situation you describe!).
Well, I guess it's a policy decision, but I would tend toward *not* allowing B to demand a particular IP address that it has never before been given.
Anyway, unless the virtual network is really crowded (in which case it might make sense for the admin to use a class B network instead) the probability of such clashes should be rather minimal in the real world. It's much more likely that A itself tries to reacquire the lease, and that situation is fixed by dhcp_authoritative.
Yeah, as long as the guests are all friendly. My concern is with unfriendly guests.
I think the *real* solution is to fix the lease handling so that dnsmasq remembers leases after they've expired (assuming that can be done once leasefile-ro is set). They would be marked as "expired", and probably not even reported externally, but all their info would still be there internally for dnsmasq's use when considering what to do with incoming requests for specific IP addresses. Is that possible at all with the current leasehelper setup?
"maybe"? See Nehal's recent email - we weren't notifying dnsmasq of expired leases, and that could be the source of the problem (or not, I didn't understand if he's verified whether doing so actually causes dnsmasq to accept the requests for expired addresses). If that works, then the problem is solved. If not, then we may need to consider using dhcp-authoritative.
I have to admit that, being new to this discussion, I haven't understood the purpose of the leasehelper yet. What benefit does it have for the operation of the virtual network, compared to dnsmasq alone? Why is the "custom leases file" needed at all? I tried to find a rationale on the web, but I didn't.
libvirt previously allowed dnsmasq to handle the lease files, but apparently they didn't have enough information for the APIs that report guest IP info, so Nehal replaced dnsmasq's automatic leasefile handling with custom handling that stores the extra data in addition to what dnsmasq needs. He can provide more info if you need it.

On Sat, Sep 24, 2016 at 3:01 AM, Laine Stump <laine@laine.org> wrote:
On 09/23/2016 02:43 PM, Martin Wilck wrote:
On Fri, 2016-09-23 at 11:56 -0400, Laine Stump wrote:
Martin's patch tries to solve the problem with "dhcp-authoritative" which, as far as I understand, tells dnsmasq "you are the keeper of *all* lease information on this network, so if you think the address is unused, it really is unused"; dnsmasq uses this information to freely grant a guest any address it asks for, as long as there is no current lease for it. This sounds troublesome to me - If client A's lease has expired (because it's turned off for a bit), is it okay for client B to come barging in and insist on grabbing the address that client A has just lost even though there are many other addresses still available? Sure, technically it's legal, but it seems unnecessarily disruptive.
I am not sure what you mean. What scenario would be "disrupted" by this approach?
Disruptive may not have been the best word. What I mean is that an unfriendly guest with proper information could take advantage of this "give any currently-unused address to anyone who asks for it" policy by purposefully taking over an IP address that it knew had previously been leased to a different guest (and who was expecting on getting the address back the next time it booted). Of course 1) it would take a lot of inside knowledge / good timing / luck to make this happen, and 2) you can argue that any guest who relied on a dhcp address remaining unchanged deserves what it gets (and the same for any other machine trying to contact that guest via the IP address).
Without dhcp_authoritative, client A will not get its lease back,
Is that true even if dnsmasq isn't restarted, and even if leasefile-ro isn't used? The Openstack discussion on this topic implies that the behavior only exists because they aren't saving a leasefile, not just because the lease is expired.
whether or not B applied for it in the meantime, and B wouldn't get the lease, either. The user wouldn't be able to ping either one. If dhcp_authoritative is used, at least one of them will get what it needs (and serving both is impossible - it's hard to tell what's the Right Thing in the situation you describe!).
Well, I guess it's a policy decision, but I would tend toward *not* allowing B to demand a particular IP address that it has never before been given.
Anyway, unless the virtual network is really crowded (in which case it might make sense for the admin to use a class B network instead) the probability of such clashes should be rather minimal in the real world. It's much more likely that A itself tries to reacquire the lease, and that situation is fixed by dhcp_authoritative.
Yeah, as long as the guests are all friendly. My concern is with unfriendly guests.
I think the *real* solution is to fix the lease handling so that dnsmasq remembers leases after they've expired (assuming that can be done once leasefile-ro is set). They would be marked as "expired", and probably not even reported externally, but all their info would still be there internally for dnsmasq's use when considering what to do with incoming requests for specific IP addresses.
Is that possible at all with the current leasehelper setup?
"maybe"? See Nehal's recent email - we weren't notifying dnsmasq of expired leases, and that could be the source of the problem (or not, I didn't understand if he's verified whether doing so actually causes dnsmasq to accept the requests for expired addresses). If that works, then the problem is solved. If not, then we may need to consider using dhcp-authoritative.
I did a small test. I compiled the SHA b55c064f of libvirt with a small modification. I enforced the lease time to be 2mins. Then I spawned a VM, and did the following: $ sudo journalctl -f | grep dnsmasq-dhcp Sep 24 19:31:13 lenovo dnsmasq-dhcp[9093]: DHCPDISCOVER(virbr1) 52:54:00:12:34:56 Sep 24 19:31:13 lenovo dnsmasq-dhcp[9093]: DHCPOFFER(virbr1) 192.168.123.76 52:54:00:12:34:56 Sep 24 19:31:13 lenovo dnsmasq-dhcp[9093]: DHCPREQUEST(virbr1) 192.168.123.76 52:54:00:12:34:56 Sep 24 19:31:13 lenovo dnsmasq-dhcp[9093]: DHCPACK(virbr1) 192.168.123.76 52:54:00:12:34:56 --- Now I pause the VM by sending 'stop' on the qemu monitor and wait. --- As soon as lease expires, I go back to the qemu monitor and un-pause the VM by sending 'cont' on the qemu monitor. Sep 24 19:34:06 lenovo dnsmasq-dhcp[9093]: DHCPREQUEST(virbr1) 192.168.123.76 52:54:00:12:34:56 Sep 24 19:34:06 lenovo dnsmasq-dhcp[9093]: DHCPNAK(virbr1) 192.168.123.76 52:54:00:12:34:56 lease not found --- I haven't restarted dnsmasq in between, so clearly, it knows about the expired lease. But it still sends a DHCPNAK. Sep 24 19:34:22 lenovo dnsmasq-dhcp[9093]: DHCPDISCOVER(virbr1) 192.168.123.76 52:54:00:12:34:56 Sep 24 19:34:22 lenovo dnsmasq-dhcp[9093]: DHCPOFFER(virbr1) 192.168.123.76 52:54:00:12:34:56 Sep 24 19:34:22 lenovo dnsmasq-dhcp[9093]: DHCPREQUEST(virbr1) 192.168.123.76 52:54:00:12:34:56 Sep 24 19:34:22 lenovo dnsmasq-dhcp[9093]: DHCPACK(virbr1) 192.168.123.76 52:54:00:12:34:56 --- And the VM has to send a DHCPDISCOVER again :'( Inside the VM, the version of dhclient used by the Network Manager was: isc-dhclient-4.3.3-P1 Gentoo-r0 So this begs the question. Does dnsmasq really do anything about the expired lease? Or does it simply forget about it no matter what?
I have to admit that, being new to this discussion, I haven't understood the purpose of the leasehelper yet. What benefit does it have for the operation of the virtual network, compared to dnsmasq alone? Why is the "custom leases file" needed at all? I tried to find a rationale on the web, but I didn't.
libvirt previously allowed dnsmasq to handle the lease files, but apparently they didn't have enough information for the APIs that report guest IP info, so Nehal replaced dnsmasq's automatic leasefile handling with custom handling that stores the extra data in addition to what dnsmasq needs. He can provide more info if you need it.
There was quite an amount of discussion on this dating back to 2013, involving IRC as well as mailing list. All the discussion on the mailing list is available at: https://www.mail-archive.com/search?l=libvir-list%40redhat.com&q=from%3A%22Nehal%22&a=1&o=oldest . If there are any specific queries, I am available on email/IRC. -- Nehal J Wani

On 09/24/2016 10:51 AM, Nehal J Wani wrote:
On Sat, Sep 24, 2016 at 3:01 AM, Laine Stump <laine@laine.org> wrote:
On 09/23/2016 02:43 PM, Martin Wilck wrote:
On Fri, 2016-09-23 at 11:56 -0400, Laine Stump wrote:
Martin's patch tries to solve the problem with "dhcp-authoritative" which, as far as I understand, tells dnsmasq "you are the keeper of *all* lease information on this network, so if you think the address is unused, it really is unused"; dnsmasq uses this information to freely grant a guest any address it asks for, as long as there is no current lease for it. This sounds troublesome to me - If client A's lease has expired (because it's turned off for a bit), is it okay for client B to come barging in and insist on grabbing the address that client A has just lost even though there are many other addresses still available? Sure, technically it's legal, but it seems unnecessarily disruptive. I am not sure what you mean. What scenario would be "disrupted" by this approach?
Disruptive may not have been the best word. What I mean is that an unfriendly guest with proper information could take advantage of this "give any currently-unused address to anyone who asks for it" policy by purposefully taking over an IP address that it knew had previously been leased to a different guest (and who was expecting on getting the address back the next time it booted). Of course 1) it would take a lot of inside knowledge / good timing / luck to make this happen, and 2) you can argue that any guest who relied on a dhcp address remaining unchanged deserves what it gets (and the same for any other machine trying to contact that guest via the IP address).
Without dhcp_authoritative, client A will not get its lease back,
Is that true even if dnsmasq isn't restarted, and even if leasefile-ro isn't used? The Openstack discussion on this topic implies that the behavior only exists because they aren't saving a leasefile, not just because the lease is expired.
whether or not B applied for it in the meantime, and B wouldn't get the lease, either. The user wouldn't be able to ping either one. If dhcp_authoritative is used, at least one of them will get what it needs (and serving both is impossible - it's hard to tell what's the Right Thing in the situation you describe!).
Well, I guess it's a policy decision, but I would tend toward *not* allowing B to demand a particular IP address that it has never before been given.
Anyway, unless the virtual network is really crowded (in which case it might make sense for the admin to use a class B network instead) the probability of such clashes should be rather minimal in the real world. It's much more likely that A itself tries to reacquire the lease, and that situation is fixed by dhcp_authoritative.
Yeah, as long as the guests are all friendly. My concern is with unfriendly guests.
I think the *real* solution is to fix the lease handling so that dnsmasq remembers leases after they've expired (assuming that can be done once leasefile-ro is set). They would be marked as "expired", and probably not even reported externally, but all their info would still be there internally for dnsmasq's use when considering what to do with incoming requests for specific IP addresses. Is that possible at all with the current leasehelper setup?
"maybe"? See Nehal's recent email - we weren't notifying dnsmasq of expired leases, and that could be the source of the problem (or not, I didn't understand if he's verified whether doing so actually causes dnsmasq to accept the requests for expired addresses). If that works, then the problem is solved. If not, then we may need to consider using dhcp-authoritative.
I did a small test. I compiled the SHA b55c064f of libvirt with a small modification. I enforced the lease time to be 2mins. Then I spawned a VM, and did the following:
$ sudo journalctl -f | grep dnsmasq-dhcp Sep 24 19:31:13 lenovo dnsmasq-dhcp[9093]: DHCPDISCOVER(virbr1) 52:54:00:12:34:56 Sep 24 19:31:13 lenovo dnsmasq-dhcp[9093]: DHCPOFFER(virbr1) 192.168.123.76 52:54:00:12:34:56 Sep 24 19:31:13 lenovo dnsmasq-dhcp[9093]: DHCPREQUEST(virbr1) 192.168.123.76 52:54:00:12:34:56 Sep 24 19:31:13 lenovo dnsmasq-dhcp[9093]: DHCPACK(virbr1) 192.168.123.76 52:54:00:12:34:56
--- Now I pause the VM by sending 'stop' on the qemu monitor and wait. --- As soon as lease expires, I go back to the qemu monitor and un-pause the VM by sending 'cont' on the qemu monitor.
Sep 24 19:34:06 lenovo dnsmasq-dhcp[9093]: DHCPREQUEST(virbr1) 192.168.123.76 52:54:00:12:34:56 Sep 24 19:34:06 lenovo dnsmasq-dhcp[9093]: DHCPNAK(virbr1) 192.168.123.76 52:54:00:12:34:56 lease not found
--- I haven't restarted dnsmasq in between, so clearly, it knows about the expired lease. But it still sends a DHCPNAK.
Sep 24 19:34:22 lenovo dnsmasq-dhcp[9093]: DHCPDISCOVER(virbr1) 192.168.123.76 52:54:00:12:34:56 Sep 24 19:34:22 lenovo dnsmasq-dhcp[9093]: DHCPOFFER(virbr1) 192.168.123.76 52:54:00:12:34:56 Sep 24 19:34:22 lenovo dnsmasq-dhcp[9093]: DHCPREQUEST(virbr1) 192.168.123.76 52:54:00:12:34:56 Sep 24 19:34:22 lenovo dnsmasq-dhcp[9093]: DHCPACK(virbr1) 192.168.123.76 52:54:00:12:34:56
--- And the VM has to send a DHCPDISCOVER again :'(
But when it does, it again gets the address it previously had, rather than some different address.
Inside the VM, the version of dhclient used by the Network Manager was: isc-dhclient-4.3.3-P1 Gentoo-r0
So this begs the question. Does dnsmasq really do anything about the expired lease? Or does it simply forget about it no matter what?
Well it least *appears* that dnsmasq ended up giving out the same IP address once the client sent a DHCPDISCOVER. We should find out if that's a happy accident, or it it happens on purpose. (in other words, does dnsmasq try to *not* give out addresses that were previously in use but now expired unless the original owner comes back (or all the other addresses are in use), or was this just coincidental because nobody else had requested an address in the meantime?)

On Sun, Sep 25, 2016 at 12:58 AM, Laine Stump <laine@laine.org> wrote:
On 09/24/2016 10:51 AM, Nehal J Wani wrote:
On Sat, Sep 24, 2016 at 3:01 AM, Laine Stump <laine@laine.org> wrote:
On 09/23/2016 02:43 PM, Martin Wilck wrote:
On Fri, 2016-09-23 at 11:56 -0400, Laine Stump wrote:
Martin's patch tries to solve the problem with "dhcp-authoritative" which, as far as I understand, tells dnsmasq "you are the keeper of *all* lease information on this network, so if you think the address is unused, it really is unused"; dnsmasq uses this information to freely grant a guest any address it asks for, as long as there is no current lease for it. This sounds troublesome to me - If client A's lease has expired (because it's turned off for a bit), is it okay for client B to come barging in and insist on grabbing the address that client A has just lost even though there are many other addresses still available? Sure, technically it's legal, but it seems unnecessarily disruptive.
I am not sure what you mean. What scenario would be "disrupted" by this approach?
Disruptive may not have been the best word. What I mean is that an unfriendly guest with proper information could take advantage of this "give any currently-unused address to anyone who asks for it" policy by purposefully taking over an IP address that it knew had previously been leased to a different guest (and who was expecting on getting the address back the next time it booted). Of course 1) it would take a lot of inside knowledge / good timing / luck to make this happen, and 2) you can argue that any guest who relied on a dhcp address remaining unchanged deserves what it gets (and the same for any other machine trying to contact that guest via the IP address).
Without dhcp_authoritative, client A will not get its lease back,
Is that true even if dnsmasq isn't restarted, and even if leasefile-ro isn't used? The Openstack discussion on this topic implies that the behavior only exists because they aren't saving a leasefile, not just because the lease is expired.
whether or not B applied for it in the meantime, and B wouldn't get the lease, either. The user wouldn't be able to ping either one. If dhcp_authoritative is used, at least one of them will get what it needs (and serving both is impossible - it's hard to tell what's the Right Thing in the situation you describe!).
Well, I guess it's a policy decision, but I would tend toward *not* allowing B to demand a particular IP address that it has never before been given.
Anyway, unless the virtual network is really crowded (in which case it might make sense for the admin to use a class B network instead) the probability of such clashes should be rather minimal in the real world. It's much more likely that A itself tries to reacquire the lease, and that situation is fixed by dhcp_authoritative.
Yeah, as long as the guests are all friendly. My concern is with unfriendly guests.
I think the *real* solution is to fix the lease handling so that dnsmasq remembers leases after they've expired (assuming that can be done once leasefile-ro is set). They would be marked as "expired", and probably not even reported externally, but all their info would still be there internally for dnsmasq's use when considering what to do with incoming requests for specific IP addresses.
Is that possible at all with the current leasehelper setup?
"maybe"? See Nehal's recent email - we weren't notifying dnsmasq of expired leases, and that could be the source of the problem (or not, I didn't understand if he's verified whether doing so actually causes dnsmasq to accept the requests for expired addresses). If that works, then the problem is solved. If not, then we may need to consider using dhcp-authoritative.
I did a small test. I compiled the SHA b55c064f of libvirt with a small modification. I enforced the lease time to be 2mins. Then I spawned a VM, and did the following:
$ sudo journalctl -f | grep dnsmasq-dhcp Sep 24 19:31:13 lenovo dnsmasq-dhcp[9093]: DHCPDISCOVER(virbr1) 52:54:00:12:34:56 Sep 24 19:31:13 lenovo dnsmasq-dhcp[9093]: DHCPOFFER(virbr1) 192.168.123.76 52:54:00:12:34:56 Sep 24 19:31:13 lenovo dnsmasq-dhcp[9093]: DHCPREQUEST(virbr1) 192.168.123.76 52:54:00:12:34:56 Sep 24 19:31:13 lenovo dnsmasq-dhcp[9093]: DHCPACK(virbr1) 192.168.123.76 52:54:00:12:34:56
--- Now I pause the VM by sending 'stop' on the qemu monitor and wait. --- As soon as lease expires, I go back to the qemu monitor and un-pause the VM by sending 'cont' on the qemu monitor.
Sep 24 19:34:06 lenovo dnsmasq-dhcp[9093]: DHCPREQUEST(virbr1) 192.168.123.76 52:54:00:12:34:56 Sep 24 19:34:06 lenovo dnsmasq-dhcp[9093]: DHCPNAK(virbr1) 192.168.123.76 52:54:00:12:34:56 lease not found
--- I haven't restarted dnsmasq in between, so clearly, it knows about the expired lease. But it still sends a DHCPNAK.
Sep 24 19:34:22 lenovo dnsmasq-dhcp[9093]: DHCPDISCOVER(virbr1) 192.168.123.76 52:54:00:12:34:56 Sep 24 19:34:22 lenovo dnsmasq-dhcp[9093]: DHCPOFFER(virbr1) 192.168.123.76 52:54:00:12:34:56 Sep 24 19:34:22 lenovo dnsmasq-dhcp[9093]: DHCPREQUEST(virbr1) 192.168.123.76 52:54:00:12:34:56 Sep 24 19:34:22 lenovo dnsmasq-dhcp[9093]: DHCPACK(virbr1) 192.168.123.76 52:54:00:12:34:56
--- And the VM has to send a DHCPDISCOVER again :'(
But when it does, it again gets the address it previously had, rather than some different address.
In the previous test, there was only one VM in that network at all time.
Inside the VM, the version of dhclient used by the Network Manager was: isc-dhclient-4.3.3-P1 Gentoo-r0
So this begs the question. Does dnsmasq really do anything about the expired lease? Or does it simply forget about it no matter what?
Well it least *appears* that dnsmasq ended up giving out the same IP address once the client sent a DHCPDISCOVER. We should find out if that's a happy accident, or it it happens on purpose. (in other words, does dnsmasq try to *not* give out addresses that were previously in use but now expired unless the original owner comes back (or all the other addresses are in use), or was this just coincidental because nobody else had requested an address in the meantime?)
I did another experiment. My network xml limited the range to: <ip address='192.168.123.1' netmask='255.255.255.0'> <dhcp> <range start='192.168.123.2' end='192.168.123.6'/> </dhcp> </ip> So, we can safely assume that, if, at any given point of time, there are two virtual machines with different mac addresses, connected to this network, they should have been leased different IP addresses. VM1 is spawned with -net nic,macaddr=52:54:00:69:09:b6 -net bridge,br=virbr1 VM1 is assigned 192.168.123.4 with lease time of 2 mins. Sep 27 08:53:32 lenovo dnsmasq-dhcp[12156]: DHCPDISCOVER(virbr1) 52:54:00:69:09:b6 Sep 27 08:53:32 lenovo dnsmasq-dhcp[12156]: DHCPOFFER(virbr1) 192.168.123.4 52:54:00:69:09:b6 Sep 27 08:53:32 lenovo dnsmasq-dhcp[12156]: DHCPREQUEST(virbr1) 192.168.123.4 52:54:00:69:09:b6 Sep 27 08:53:32 lenovo dnsmasq-dhcp[12156]: DHCPACK(virbr1) 192.168.123.4 52:54:00:69:09:b6 VM1 is paused. VM1's lease expires. VM2 is spawned with -net nic,macaddr=52:54:00:69:09:b1 -net bridge,br=virbr1 VM2 is assigned 192.168.123.4 with lease time of 2 mins. :-'( Sep 27 08:56:04 lenovo dnsmasq-dhcp[12156]: DHCPDISCOVER(virbr1) 52:54:00:69:09:b1 Sep 27 08:56:04 lenovo dnsmasq-dhcp[12156]: DHCPOFFER(virbr1) 192.168.123.4 52:54:00:69:09:b1 Sep 27 08:56:04 lenovo dnsmasq-dhcp[12156]: DHCPREQUEST(virbr1) 192.168.123.4 52:54:00:69:09:b1 Sep 27 08:56:04 lenovo dnsmasq-dhcp[12156]: DHCPACK(virbr1) 192.168.123.4 52:54:00:69:09:b1 Seems like the previous experiment involved a happy accident indeed. -- Nehal J Wani

On Wed, Sep 28, 2016 at 04:23:32PM +0200, Martin Wilck wrote:
On Tue, 2016-09-27 at 09:15 +0530, Nehal J Wani wrote:
Seems like the previous experiment involved a happy accident indeed.
So, what are we going to do about libvirt+dnsmasq's handling of expired leases now?
IMHO nothing. Once a lease has expired all bets are off, so any behaviour is valid and you have to expect you'll get a different IP address at some point. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://entangle-photo.org -o- http://search.cpan.org/~danberr/ :|

On Tue, Sep 27, 2016 at 09:15:18AM +0530, Nehal J Wani wrote:
I did another experiment. My network xml limited the range to:
<ip address='192.168.123.1' netmask='255.255.255.0'> <dhcp> <range start='192.168.123.2' end='192.168.123.6'/> </dhcp> </ip>
So, we can safely assume that, if, at any given point of time, there are two virtual machines with different mac addresses, connected to this network, they should have been leased different IP addresses.
VM1 is spawned with -net nic,macaddr=52:54:00:69:09:b6 -net bridge,br=virbr1
VM1 is assigned 192.168.123.4 with lease time of 2 mins.
Sep 27 08:53:32 lenovo dnsmasq-dhcp[12156]: DHCPDISCOVER(virbr1) 52:54:00:69:09:b6 Sep 27 08:53:32 lenovo dnsmasq-dhcp[12156]: DHCPOFFER(virbr1) 192.168.123.4 52:54:00:69:09:b6 Sep 27 08:53:32 lenovo dnsmasq-dhcp[12156]: DHCPREQUEST(virbr1) 192.168.123.4 52:54:00:69:09:b6 Sep 27 08:53:32 lenovo dnsmasq-dhcp[12156]: DHCPACK(virbr1) 192.168.123.4 52:54:00:69:09:b6
VM1 is paused.
VM1's lease expires.
VM2 is spawned with -net nic,macaddr=52:54:00:69:09:b1 -net bridge,br=virbr1
VM2 is assigned 192.168.123.4 with lease time of 2 mins. :-'(
That is totally valid for dnsmasq todo. Once VM1's lease expires there is no rule that it must not be given to VM2. Once VM1 is unpaused it will try to renew its original lease and will get a NACK from dnsmasq, at which point it will have to request a new lease and get a different IP address. There's nothing really special about VMs here either. The same would happen with physical laptop if you close the lid and re-open it some time later after your lease expired - any other laptop could have got your lease in the meantime. Anyone who cares about VMs having the same IP address after resuming from suspend *must* configure persistent DHCP mappings for their guests. Relying on automatic assignment will always fail in the end, may be not immediately, may be not often, but it *will* fail at some point, usually at the worst possible time. Or if you enable the libvirt nss module, then you can avoid needing to care about specific PI addresses at all and just use the VM names as a hostname Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://entangle-photo.org -o- http://search.cpan.org/~danberr/ :|

On Wed, 2016-09-28 at 16:41 +0100, Daniel P. Berrange wrote:
Anyone who cares about VMs having the same IP address after resuming from suspend *must* configure persistent DHCP mappings for their guests. Relying on automatic assignment will always fail in the end, may be not immediately, may be not often, but it *will* fail at some point, usually at the worst possible time.
I believe that setting "dhcp-authoritative" will be a major improvement for many setups. Without it, VMs are *never* able to reacquire their expired lease. With it, reacquiring the lease would work most of the time (as far as I'm concerned, practically always). I reckon the dnsmasq man page recommends it for a reason "when dnsmasq is definitely the only DHCP server on a network". Creating persistent DHCP entries using "virsh net-edit" is not exactly user-friendly today. libvirt-nss is helpful, but other stuff relying on IP address (NFS, ssh sessions, ...) would still benefit from "dhcp- authorative". Regards Martin

On Wed, Sep 28, 2016 at 06:03:21PM +0200, Martin Wilck wrote:
On Wed, 2016-09-28 at 16:41 +0100, Daniel P. Berrange wrote:
Anyone who cares about VMs having the same IP address after resuming from suspend *must* configure persistent DHCP mappings for their guests. Relying on automatic assignment will always fail in the end, may be not immediately, may be not often, but it *will* fail at some point, usually at the worst possible time.
I believe that setting "dhcp-authoritative" will be a major improvement for many setups. Without it, VMs are *never* able to reacquire their expired lease. With it, reacquiring the lease would work most of the time (as far as I'm concerned, practically always). I reckon the dnsmasq man page recommends it for a reason "when dnsmasq is definitely the only DHCP server on a network".
That certainly matches our deployment model for dnsmasq, so unless there's a obvious downside to it, it seems reasonable to add that. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://entangle-photo.org -o- http://search.cpan.org/~danberr/ :|

On Wed, 2016-09-28 at 17:10 +0100, Daniel P. Berrange wrote:
I believe that setting "dhcp-authoritative" will be a major improvement for many setups. Without it, VMs are *never* able to reacquire their expired lease. With it, reacquiring the lease would work most of the time (as far as I'm concerned, practically always). I reckon the dnsmasq man page recommends it for a reason "when dnsmasq is definitely the only DHCP server on a network".
That certainly matches our deployment model for dnsmasq, so unless there's a obvious downside to it, it seems reasonable to add that.
I'm glad to read that. Is anything needed besides http://www.spinics.net/linux/fedora/libvir/msg136813.html and follow-ups? Martin

On Fri, Sep 23, 2016 at 9:26 PM, Laine Stump <laine@laine.org> wrote:
On 09/23/2016 08:24 AM, Nehal J Wani wrote:
On Fri, Sep 23, 2016 at 4:51 PM, Peter Krempa <pkrempa@redhat.com> wrote:
On Fri, Sep 23, 2016 at 15:33:46 +0530, Nehal J Wani wrote:
On Fri, Sep 23, 2016 at 1:20 PM, Peter Krempa <pkrempa@redhat.com> wrote:
On Fri, Sep 23, 2016 at 12:19:49 +0530, Nehal J Wani wrote:
The default dhcp lease time set by dnsmasq is only one hour, which can be pretty small for developers relying on ip address(es) to be consistent across reboots.
This should be handled through the leases file which is supposed to remember the IP addresses given to a host despite a short lease time. Is that broken for you?
We can't ask developers to parse the .status file and the DHCP leases API would never show expired leases. Or am I misinterpreting what you
If a lease is expired then the VM is necessarily not using it for a while.
mean by 'remember'? Even dnsmasq depends on the output for the init
I meant 'remember' as dnsmasq should assign the same address even if the lease was expired for a while.
Yes, but then you are assuming that dnsmasq hasn't been restarted after the lease has expired. User may switch distros in between or shutdown is computer, etc.
event sent to the leaseshelper program, which will not output lease which have expired.
So then maybe this is the actual bug. The expired leases still can be renewed and AFAIK dnsmasq reloads them even for the expired entries so we probably should remove that part from the leaseshelper.
After addition of leaseshelper, because of the --leasefile-ro option that we specify to dnsmasq, the leases database is entirely controlled by it. If the developer shuts down his VM, the lease expires and then he restarts the computer (which will involve restarting the dnsmasq service), dnsmasq will rely on the leaseshelper program for the initial input (aka init event). The first parameter from the 'init' output is the expiry time of the lease. If the lease has already expired, then we can't print it in virLeasePrintLeases(). Hence I fail to understand why you would think that this may be a bug.
Why would libvirt/leaseshelper renew a certain lease? Isn't that the job of the client?
I'm pretty sure that's not what Peter is suggesting. What he's saying is that dnsmasq normally keeps track of even leases that have expired and tries not to give them out to others who are simply requesting "give me an address", so that when a client returns at some later time they have a chance of getting back the same address they had before even if their lease had expired (as long as it wasn't needed for some other client due to all the other available addresses being used).
I at least *thought* this worked with a standard dnsmasq setup.
This patch adds support for setting a lease time in the network definition. For now, all IP ranges under one <dhcp> will share a common leasetime.
An example: <dhcp> <leasetime units='hours'>12</leasetime> </dhcp>
This sounds like a reasonable feature, but I don't really want to sell it as a workaround for the problem you've described above. Setting the lease time is useful, but guests should get the same IP addresses thanks to the lease file.
Actually, the major reason behind this feature was mentioned at https://www.redhat.com/archives/libvir-list/2016-April/msg01071.html
I'm a bit curious if you had any contact with the author of that patch other than CC'ing them to your patch mail. From the BZ (https://bugzilla.redhat.com/show_bug.cgi?id=913446 ), it sounds like he was still intending to update his patch but was bogged down implementing some extra functionality.
As I've said, adding ability to specify longer lease time may be desired, but the fact that you don't get the address most probably is not related to this and should be addressed separately.
Let's say that the lease time is 2 hours and I shutdown my VM. Now I reboot it after 2hours 5mins, the lease will be gone, since it will be
The lease will expire, but dnsmasq still can reuse that address for the particular mac address until the pool is depleted.
expired. The .status file is read and the leases which have expired are removed. So, there is no guarantee that my VM will get the same IP
Again, this looks like the actual bug.
Agreed. Here's another report of what I think is the same bug:
https://www.redhat.com/archives/libvir-list/2016-September/msg00739.html
and a patch that attempts to fix it in a different manner (by adding "dhcp-authoritative" to the dnsmasq.conf file):
https://www.redhat.com/archives/libvir-list/2016-September/msg00909.html
I actually don't think that either method is the correct solution to the problem.
To back up for a minute - I think the source of everyone's problems is that dnsmasq as configured by current libvirt *completely* forgets leases as soon as they have expired, so that when a guest asks for the same address again, it is refused (because 1) although dnsmasq doesn't have any info that the address is currently assigned to anyone else, it is concerned that it may not be the only dhcp server on the network, and some other server may have given it out, and 2) what business does some lowly client have demanding which address it wants from dnsmasq anyway?!? </tongueInCheek>).
I was wrong. Peter and you both are right. There does seem to be a bug in the leases helper program. All it has to be, is a program to maintain a database and follow certain events. It was never supposed to control the expiry of leases on its own. Here is what it does in case of any event: 112 /* Check whether lease has expired or not */ 113 if (expirytime < currtime) { 114 i++; 115 continue; 116 } Which, IMO, is wrong. When init event takes place, the leases helper is supposed to print all leases, irrespective of whether or not they have expired. Since it doesn't do so, dnsmasq doesn't come to know about the expired lease. According to the discussion at http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2014q2/008657.html , the output from the init event has to be exactly as same as that of the leases file which dnsmasq used to maintain previously for libvirt-managed networks. In dnsmasq's code, it almost treats both databases to be same: Without --leasefile-ro: leasestream = daemon->lease_stream = fopen(daemon->lease_file, "a+"); With --leasefile-ro and --dhcp-script: strcpy(daemon->dhcp_buff, daemon->lease_change_command); strcat(daemon->dhcp_buff, " init"); leasestream = popen(daemon->dhcp_buff, "r"); For an expired lease, dnsmasq sends a 'del' event anyway, so leases helper doesn't need to worry about expiry at all. I tried it out with hard coded 2m dhcp lease time and confirmed that we are not sending the expired leases info on the init event. I also confirmed that if leases helper doesn't control the expiry time on its own, and prints the lease info on the init event, dnsmasq properly sends the del event to remove it from our database instantly. With the above change, IMO, we can rely on the fact that dnsmasq will always know about the expired leases too.
Martin's patch tries to solve the problem with "dhcp-authoritative" which, as far as I understand, tells dnsmasq "you are the keeper of *all* lease information on this network, so if you think the address is unused, it really is unused"; dnsmasq uses this information to freely grant a guest any address it asks for, as long as there is no current lease for it. This sounds troublesome to me - If client A's lease has expired (because it's turned off for a bit), is it okay for client B to come barging in and insist on grabbing the address that client A has just lost even though there are many other addresses still available? Sure, technically it's legal, but it seems unnecessarily disruptive.
On the other hand, this patch tries to solve the problem by providing a way to lengthen the lease. As Peter points out a few paragraphs below, unnecessarily lengthening the lease is a bad idea too, because it will lead to complete depletion of the address pool in a network that has a lot of clients who are only transiently online (think of a test setup where maybe 2000 new guests are created, used, and destroyed in a day).
I think the *real* solution is to fix the lease handling so that dnsmasq remembers leases after they've expired (assuming that can be done once leasefile-ro is set). They would be marked as "expired", and probably not even reported externally, but all their info would still be there internally for dnsmasq's use when considering what to do with incoming requests for specific IP addresses.
address again. Hence the developer would want to increase the lease
No it's not guaranteed, but very likely.
time to something big, like 1 week, and go to sleep happily when the VM is turned off at night :)
That's a rather weak point. Setting too long lease time may on the other hand exhaust the pool. Usually the configured DHCP lease times are rather short.
Again agreed (see above).
The value for attribute 'units' can be seconds/hours/days/weeks/s/h/d/w. If the leasetime specified is -1, it is considered to be infinite.
[...]
diff --git a/docs/schemas/basictypes.rng b/docs/schemas/basictypes.rng index 1b4f980..023edfb 100644 --- a/docs/schemas/basictypes.rng +++ b/docs/schemas/basictypes.rng @@ -13,6 +13,12 @@ <param name='pattern'>[0-9]+</param> </data> </define> + <!-- Our long doesn"t allow a leading "+" in its lexical form --> + <define name='long'> + <data type='long'>
This looks weird. Also why would you need a separate type for this?
I didn't find any available data type for signed integers. Which one would you rather use? (Signed, because we want to support -1 too)
A quick grep revealled that there's a builtin "long" type used in at least one case.
+ <param name='pattern'>-?[0-9]+</param> + </data>
[...]
diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index aa39776..d2372f2 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -1032,21 +1032,76 @@ virNetworkDHCPHostDefParseXML(const char *networkName,
[...]
+ + save = ctxt->node; + ctxt->node = node; + + leasetimeRV = virXPathLongLong("string(./text())", ctxt, &leasetime); + if (leasetimeRV == -2) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid long long value specified for leasetime " + "in definition of network '%s'"), + networkName); + goto cleanup; + } else { + if (leasetime < 0) { + leasetime = -1; /* infinite */ + } else { + unit = virXPathString("string(./@unit)", ctxt); + if (virScaleTime(&leasetime, unit, scale, + VIR_NETWORK_DHCP_LEASE_TIME_MAX) < 0) { + // let virScaleTime() report the appropriate error
'//' comments should not be used in libvirt. Also the comment does not make sense since it's obvious what is happening.
I have a query regarding this. If we rely on virScaleTime to report the appropriate error, it wouldn't be including the network name. Is
That happens in quite a few cases.
that bad? Should I be modifying virScaleTime to return different return values for different errors and shape the error message appropriately here? (That would sort be inconsistent with
No, not really.
virScaleInteger which doesn't have different error codes for different scenarios)
Whoops on the comment. Should have read hacking.html more carefully!
[...]
@@ -2675,6 +2731,13 @@ virNetworkIPDefFormat(virBufferPtr buf, virBufferAddLit(buf, "<dhcp>\n"); virBufferAdjustIndent(buf, 2);
+ if (def->leasetime) { + virBufferAddLit(buf, "<leasetime"); + virBufferAsprintf(buf, " unit='seconds'>%lld",
Why aren't we remembering the unit the user used originally? Other places can't do that since the unit was added later, but with this newly added code you can remember and format back the unit just fine.
Actually, I followed what was originally done for memory units. Even if user specifies MiB, everything is stored in KiB and on dumpxml, the
This was because libvirt added support for scaled integers later than the original code. Thus we needed to keep compatibility. This is not required for new elements.
user is shown the unit in KiB. In case we want to remember the unit specified by the user, we'll have to add another member in the _virNetworkIPDef struct. Since we haven't done this anywhere, I was apprehensive about having inconsistent behaviour for 'unit'.
+ def->leasetime); + virBufferAddLit(buf, "</leasetime>\n"); + }
[...]
diff --git a/src/util/virutil.c b/src/util/virutil.c index b57a195..81a60e6 100644 --- a/src/util/virutil.c +++ b/src/util/virutil.c @@ -275,6 +275,76 @@ virHexToBin(unsigned char c) } }
+/** + * virScaleTime:
This change should be done as a separate patch.
+ * @value: pointer to the integer which is supposed to hold value + * @unit: pointer to the string holding the unit + * @scale: integer holding the value of scale
^^ at least this value is not documented enough.
+ * @limit: upper limit on scaled value + * + * Scale an integer @value in-place by an optional case-insensitive @unit,
The statement about case insensitivity doesn't look to be true according to the code blow.
With case insensitivity, 'Seconds' and 'SEconDs' would also be valid inputs, and it would be difficult to accommodate it in the .rng file, plus I was told that if attributes are words, then we shouldn't be going for case-insensitivity anyway. I'll fix this in the next patch set.
I had already said in IRC when you asked about representing case insensitivity in the RNG that any attribute values should *not* be made case-insensitive. Case insensitivity is for the FAT32 filesystem, not for libvirt XML.
Either case, the comment and code should not behave differently. The comment looks copied from the memory number scaler, but the code certainly doesn't behave that way.
+ * defaulting to @scale if @unit is NULL or empty. Recognized units include + * (w)eeks, (d)ays, (h)ours, (m)inutes and (s)econds. Ensure that the result
We don't want to have "w" and "weeks" represent the same thing. A single string should represent a single value. This can be easily handled by setting up and enum, then adding VIR_ENUM_DECL() and VIR_ENUM_IMPL() for it. We're not reimplementing /usr/bin/date.
+ * does not exceed @limit. Return 0 on success, -1 with error message raised + * on failure.
This does not document the scale of the returned value at all.
+ */ +int +virScaleTime(long long *value, const char *unit, + long long scale, long long limit)
Scale can't really be negative. Also the result being negative does not make much sense to me.
Should this be changed to unsigned long long or unsigned long? Since leasetime has to be a signed integer, what would be the ideal way to pass it to the modified function? Should I typecast it?
From above, it seems you want it to accept a negative value so you can specify "-1" to be "unlimited"?' What about "0"? For that matter, do you *really* want to allow that?
Negative scale doesn't make sense. For time values, negative time value might make sense for relative time values. You need to check then that the overflow check works for negative numbers as well, which it doesn't since it was just copied from the memory integer scaler.
-- Nehal J Wani
participants (5)
-
Daniel P. Berrange
-
Laine Stump
-
Martin Wilck
-
Nehal J Wani
-
Peter Krempa