On Mon, May 07, 2007 at 10:13:57AM +0900, Atsushi SAKAI wrote:
> Hi, Jan
> I think you should use 0.2.1 at this moment.
> libvirt cannot handle Xen-hypervisor-domctl correctly on 0.2.2.
> But Xen-hypervisor-sysctl works fine.
> This problem was recognized two weeks ago,
> but I have had no time to investigate this issue.
I've been trying to reproduce / diagnose the problems you reported too
but have not had much luck so far. Every way I look at it, the code looks to
be using the correct hypercall numbers, operation numbers & structs.
Until I just noticed this:
    xenHypervisorDoV2Dom(int handle, xen_op_v2_dom* op)
    {
        ....
        if (mlock(op, sizeof(dom0_op_t)) < 0) {
Notice that it is doing sizeof(dom0_op_t) instead of sizeof(xen_op_v2_dom).
There is the same typo with xenHypervisorDoV2Sys.
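
A minimal sketch of what the corrected call could look like, assuming a
simplified stand-in for the op struct. The names demo_op_v2_dom,
demo_lock_op and demo_unlock_op are hypothetical and are not libvirt
identifiers; the real struct layout comes from the Xen headers. Using
sizeof(*op) ties the locked length to the pointer's own type, so the two
cannot drift apart the way they did here.

```c
#include <stdint.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical stand-in for xen_op_v2_dom: same leading fields as the
 * xen_domctl struct quoted in this mail, with the union reduced to its
 * 128-byte pad member. */
typedef struct {
    uint32_t cmd;
    uint32_t interface_version;
    uint16_t domain;            /* assumption: domid_t is 16 bits wide */
    union {
        uint8_t pad[128];
    } u;
} demo_op_v2_dom;

/* Lock the whole op in memory before handing it to the hypervisor.
 * sizeof(*op) follows the type of the pointer, so the length always
 * matches the object actually being passed. Returns mlock()'s result. */
static int demo_lock_op(demo_op_v2_dom *op)
{
    return mlock(op, sizeof(*op));
}

/* Release the lock taken by demo_lock_op(), using the same length. */
static int demo_unlock_op(demo_op_v2_dom *op)
{
    return munlock(op, sizeof(*op));
}
```

The same pattern would apply to the sysctl variant, again with its own
struct type rather than dom0_op_t.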
Now dom0_op_t is defined as
    struct dom0_op {
        uint32_t cmd;
        uint32_t interface_version; /* DOM0_INTERFACE_VERSION */
        union {
            struct dom0_msr               msr;
            struct dom0_settime           settime;
            struct dom0_add_memtype       add_memtype;
            struct dom0_del_memtype       del_memtype;
            struct dom0_read_memtype      read_memtype;
            struct dom0_microcode         microcode;
            struct dom0_platform_quirk    platform_quirk;
            struct dom0_memory_map_entry  physical_memory_map;
            uint8_t                       pad[128];
        } u;
    };
Which is 4 + 4 + 128 bytes == 136
Next, xen_sysctl is defined as
    struct xen_sysctl {
        uint32_t cmd;
        uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
        union {
            struct xen_sysctl_readconsole        readconsole;
            struct xen_sysctl_tbuf_op            tbuf_op;
            struct xen_sysctl_physinfo           physinfo;
            struct xen_sysctl_sched_id           sched_id;
            struct xen_sysctl_perfc_op           perfc_op;
            struct xen_sysctl_getdomaininfolist  getdomaininfolist;
            uint8_t                              pad[128];
        } u;
    };
Which is also 4 + 4 + 128 bytes == 136
Finally, xen_domctl is defined as
    struct xen_domctl {
        uint32_t cmd;
        uint32_t interface_version; /* XEN_DOMCTL_INTERFACE_VERSION */
        domid_t  domain;
        union {
            struct xen_domctl_createdomain       createdomain;
            struct xen_domctl_getdomaininfo      getdomaininfo;
            struct xen_domctl_getmemlist         getmemlist;
            struct xen_domctl_getpageframeinfo   getpageframeinfo;
            struct xen_domctl_getpageframeinfo2  getpageframeinfo2;
            struct xen_domctl_vcpuaffinity       vcpuaffinity;
            struct xen_domctl_shadow_op          shadow_op;
            struct xen_domctl_max_mem            max_mem;
            struct xen_domctl_vcpucontext        vcpucontext;
            struct xen_domctl_getvcpuinfo        getvcpuinfo;
            struct xen_domctl_max_vcpus          max_vcpus;
            struct xen_domctl_scheduler_op       scheduler_op;
            struct xen_domctl_setdomainhandle    setdomainhandle;
            struct xen_domctl_setdebugging       setdebugging;
            struct xen_domctl_irq_permission     irq_permission;
            struct xen_domctl_iomem_permission   iomem_permission;
            struct xen_domctl_ioport_permission  ioport_permission;
            struct xen_domctl_hypercall_init     hypercall_init;
            struct xen_domctl_arch_setup         arch_setup;
            struct xen_domctl_settimeoffset      settimeoffset;
            uint8_t                              pad[128];
        } u;
    };
Which is crucially different: 4 + 4 + 2 + 128 bytes == 138
So the buffer we're mlock()ing is 2 bytes too small for domctl
hypercalls. This may or may not explain the bugs, but it's a
worthwhile bug fix to try if you have a system where you can
reliably reproduce the vcpu problems.
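
The size arithmetic above can be checked mechanically rather than by
hand. This is a rough sketch using simplified stand-ins: the demo_*
names are hypothetical, each union is reduced to its 128-byte pad
member, and domid_t is assumed to be 16 bits wide. The hand sums above
count member sizes only; the compiler may add alignment padding, so
sizeof() on the domctl-shaped struct can come out larger than 138, and
the real Xen structs may differ again.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t demo_domid_t;  /* assumption: 16-bit domain id */

/* Shape shared by dom0_op and xen_sysctl: two 32-bit fields + union. */
struct demo_dom0_op {
    uint32_t cmd;
    uint32_t interface_version;
    union { uint8_t pad[128]; } u;
};

/* Shape of xen_domctl: an extra domain id sits before the union. */
struct demo_domctl {
    uint32_t     cmd;
    uint32_t     interface_version;
    demo_domid_t domain;
    union { uint8_t pad[128]; } u;
};

/* Sum of member sizes only, matching the hand calculation (no padding). */
static size_t demo_domctl_member_sum(void)
{
    return sizeof(uint32_t) + sizeof(uint32_t)
         + sizeof(demo_domid_t) + sizeof(((struct demo_domctl *)0)->u);
}

/* Bytes at the end of a domctl op that stay unlocked when the length
 * passed to mlock() is taken from the dom0_op-shaped struct instead. */
static size_t demo_unlocked_bytes(void)
{
    return sizeof(struct demo_domctl) - sizeof(struct demo_dom0_op);
}
```

Printing sizeof() for the real types from the Xen headers on the
affected machine would show the actual shortfall there.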
The second thing is that we've just discovered an SMP bug in the Fedora
Xen 2.6.20 kernels which could cause random bad things to happen.
So if you're using a Fedora 2.6.20 kernel it is also worth seeing if it
is still a problem with an older Fedora 2.6.19/18 kernel, or with the
vanilla upstream Xen.
Dan.
--
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|