
On 06/15/2015 05:03 PM, John Ferlan wrote:
Something I didn't catch on pass1, but perhaps an important distinction
title:
Resolve issues with SCSI LUN hostdev device addressing
Sounds good! I'll work that into the split patches.
Coincidentally - there's a bz regarding the <address> setting between SCSI <disk> and <hostdev> which is assigned to me and was near the top of my todo list... https://bugzilla.redhat.com/show_bug.cgi?id=1210587 so this has helped put me in the right frame of reference!
...
Need to add some XML in order to "test" the longer values - eg, xml2xml and/or xml2args in order to validate that what you read in is what gets printed out and that what you read in gets sent to the hypervisor correctly. I wrestled with this part, because the xml2args tests pass the SCSI [host:bus:target:unit] address to testSCSIDeviceGetSgName, which ignores all inputs and returns /dev/sg0. The reverse had a similar gotcha. I'll dig at xml2xml, and see what I can come up with.
<me> wonders what qemu dos with a 20 digit unit number...
So speaking of qemu... If you look at qemuBuildDriveStr you will see how the device address is passed down to qemu... That function needs some adjustment too it seems.
This function doesn't appear to be invoked for an <address> tag within a <hostdev> element, either as part of the <source> subelements or as the address being created for the guest. Rather, it seems to turn up as part of a <disk> element, which (unless I'm mistaken) relies on the host "/dev" address instead of [host:bus:target:unit]. Ah... so back to my title comment ;-) - hostdev... That would mean qemuBuildSCSIHostdevDevStr, which does have a printing of bus, target, unit. Still wondering about how qemu would handle it (haven't got that far yet), but that seems to be what you point out in the next paragraph.
Speaking of the address that is created in the guest, though, I left that as-is because virtio defines LUN addresses up to 16,384 and is enforced in KVM/QEMU. Since I can't create a unit address in the guest larger than that, leaving that defined as an int is okay.
And so regardless of what you did for libvirt, kvm/qemu wouldn't be able to handle it? Perhaps then that needs to be checked somehow... Is this something qemu/kvm needs to support?
I'll defer to you, but kvm/qemu enforces things okay and libvirt fails gracefully. Example: # cat disk.xml <hostdev mode='subsystem' type='scsi'> <source> <adapter name='scsi_host0'/> <address bus='0' target='14' unit='1074872354'/> </source> <address type='drive' controller='0' bus='0' target='0' unit='16384'/> </hostdev> # virsh attach-device lmb_guest disk.xml error: Failed to attach device from disk.xml error: internal error: unable to execute QEMU command 'device_add': bad scsi device lun: 16384 # vim disk.xml # cat disk.xml <hostdev mode='subsystem' type='scsi'> <source> <adapter name='scsi_host0'/> <address bus='0' target='14' unit='1074872354'/> </source> <address type='drive' controller='0' bus='0' target='0' unit='16383'/> </hostdev> # virsh attach-device lmb_guest disk.xml Device attached successfully This looks reasonable to me because virtio only permits 256 targets, and 16,384 units per targets. So there's no concern with the corresponding "int" fields being overrun.
Again I haven't looked at those sources yet. Is there a specific hypervisor that isn't working because libvirt isn't passing the correct address value? This is the "what is the bug you're trying to fix" type question...
I'm using qemu/kvm, but I haven't seen anything besides the "virsh edit" and "virt-xml-validate" errors when the hostdev->source->address->unit value is larger than two digits. But this really boils to your later comment about the patch being split, to explain the sequence of events that brought me here. I have it broken into the patches you had suggested, plus two others (one doc, one xml schema which is what started this), but haven't tested them independently yet. Tomorrow, I hope. Eric
I didn't dig into the qemu sources yet, but
Looks like I also lost my train of thought here ;-)
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 1781996..e7a8e1a 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -2827,7 +2827,7 @@ <dd>Drive addresses have the following additional attributes: <code>controller</code> (a 2-digit controller number), <code>bus</code> (a 2-digit bus number), - <code>target</code> (a 2-digit bus number), + <code>target</code> (a 2-digit target number), and <code>unit</code> (a 2-digit unit number on the bus). </dd> <dt><code>type='virtio-serial'</code></dt> Interesting there's no 'scsi' in here (of course you're removing it below, but that leaves the address as unknown Is it? See below, but I thought that got hardcoded somewhere because it's in a hostdev. I'll doublecheck.
It seems virDomainHostdevSubsysSCSIHostDefParseXML ignores the type field and the RNG supports that.
@@ -3136,7 +3136,7 @@ <hostdev mode='subsystem' type='scsi' sgio='filtered' rawio='yes'> <source> <adapter name='scsi_host0'/> - <address type='scsi' bus='0' target='0' unit='0'/> + <address bus='0' target='0' unit='0'/> These first two don't seem relevant to the changes below and should have been their own patch.
Fair enough, my apologies. I'll break them out.
We all fall into the same trap at some point in time.... ;-)
See [1] below for a side comment on this particular changes
...
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 36de844..1f2bfca 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -4954,7 +4954,7 @@ virDomainHostdevSubsysSCSIHostDefParseXML(xmlNodePtr sourcenode, goto cleanup; } - if (virStrToLong_ui(unit, NULL, 0, &scsihostsrc->unit) < 0) { + if (virStrToLong_ull(unit, NULL, 0, &scsihostsrc->unit) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, _("cannot parse unit '%s'"), unit); Existing - read as "ui" (or now "ull"), but could be "uip" and "ullp" (eg, don't allow reading a negative value).
I would be remiss if I didn't point out UINT_MAX is 4294967295U, which is 10 digits, but means 9999999999 which would be 'valid' according to the 10 digit rule in RNG but invalid once you read it in...
goto cleanup; @@ -18844,7 +18844,7 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf, virBufferAsprintf(buf, "<adapter name='%s'/>\n", scsihostsrc->adapter); virBufferAsprintf(buf, - "<address %sbus='%d' target='%d' unit='%d'/>\n", + "<address %sbus='%d' target='%d' unit='%lld'/>\n", includeTypeInAddr ? "type='scsi' " : "",
Printed as "%d" not "%u" (of course that only means something for -1... Without even considering the math, my eyes saw the commit message "unit='1074872354'" and I immediately thought - is this one of those negative value type issues...
BTW: I wonder if this deserves a patch prior to this to use _uip and avoid negative values and a second patch to change the formatting of %d to %u. I like this idea. Will work through it.
[1] Doesn't this provide an example of the 2nd of the two formatdomain.html.in changes you made? That is you removed having 'scsi' in the "<address...", but this shows that it is printed.... I see three places that call virDomainHostdevDefFormatSubsys, but the one where a <hostdev mode='subsystem'> is handled makes the call with includeTypeInAddr set to false. The other two callers handle network definitions and make the call with it set to true, but those won't drop into this section of code.
This is why I pulled the type='scsi' from the doc above, because the existing code doesn't allow a type='scsi' to be specified. The routine virDomainHostdevSubsysSCSIHostDefParseXML only parses bus, target, and unit parameters in its address tag.
yep - you are correct... furthermore for the "<source..." processing for non iSCSI devices. So we're getting more and more specific.
It seems you've covered most of the various areas. It may be easier to review them if they were smaller units. Just making sure that "units" here is in reference to breaking up this patch for the ui->uip and %d->%u changes, and not the "logical unit" in the commit message?
This patch may be tough to do it, but perhaps a general comment about if it's possible to make smaller chunks of changes, in the long run it makes it easier to apply and make progress while perhaps working out the "fine details". Of course changing a data type representation causes a ripple effect throughout the code and probably isn't possible.
I certainly think the uip changes will help... followed perhaps by just changes to print %u instead of %d... then followed by the change from %u to %llu... It's a natural progression of changes that can be built and checked. You may end up changing the same file twice in followup patches, but seeing how you get from point A to point B is helpful as is providing tests.
John