[libvirt] libvirtd 0.6.3 - Kernel Panic - 2.6.27-openvz-briullov.1
by Evan Borgstrom
Hi,
I have a machine running 4 OpenVZ VE's and am looking at libvirt as a
management tool.
My machine is a Sun Fire X2250 with 1 CPU and 4GB of RAM running Gentoo.
When I start libvirtd the output below appears in the dmesg output and
the machine continues to function for about 30s longer before it
starts dumping piles of continuous backtrace onto the console.
Can I provide any more info that will help diagnose this problem, or
should I open a bug about this?
Thanks!
-Evan
[ 426.613841] WARNING: at fs/sysfs/dir.c:463 sysfs_add_one+0x2a/0x36()
[ 426.613843] sysfs: duplicate filename 'lo' can not be created
[ 426.613845] Modules linked in: ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 xt_state nf_conntrack ipt_REJECT iptable_filter ip_tables simfs vznetdev vzrst vzcpt tun vzdquota vzethdev vzmon vzdev 8021q bridge stp llc
[ 426.613859] Pid: 17966, comm: libvirtd Not tainted 2.6.27-openvz-briullov.1-r1-fb3 #1
[ 426.613863] [<c022291e>] warn_slowpath+0x4b/0x6c
[ 426.613869] [<c0333d00>] ? vsnprintf+0x233/0x531
[ 426.613874] [<c02294d1>] ? __sysctl_head_next+0x14/0x99
[ 426.613878] [<c032f826>] ? idr_get_empty_slot+0x13c/0x1f0
[ 426.613883] [<c032f9aa>] ? ida_get_new_above+0xd0/0x171
[ 426.613887] [<c028f9ca>] ? find_inode+0x1f/0x5b
[ 426.613891] [<c02bbf5c>] ? sysfs_ilookup_test+0x0/0x11
[ 426.613894] [<c028fac5>] ? ifind+0x32/0x71
[ 426.613897] [<c02bc129>] ? sysfs_find_dirent+0x16/0x27
[ 426.613900] [<c02bc2b9>] sysfs_add_one+0x2a/0x36
[ 426.613903] [<c02bc773>] create_dir+0x43/0x72
[ 426.613906] [<c02bc7e9>] sysfs_create_dir+0x47/0x5b
[ 426.613909] [<c033028b>] ? kobject_get+0x12/0x17
[ 426.613912] [<c0330390>] kobject_add_internal+0xb2/0x153
[ 426.613915] [<c03304dc>] kobject_add_varg+0x35/0x41
[ 426.613919] [<c033054d>] kobject_add+0x43/0x49
[ 426.613921] [<c03888e8>] device_add+0x7f/0x470
[ 426.613926] [<c027cd9d>] ? __percpu_alloc_mask+0xb6/0xd5
[ 426.613931] [<c023e5e7>] ? ub_slab_charge+0x4c/0x65
[ 426.613935] [<c03330fe>] ? strlcpy+0x17/0x49
[ 426.613938] [<c045a40c>] netdev_register_kobject+0x59/0x5c
[ 426.613943] [<c0451afe>] register_netdevice+0x270/0x332
[ 426.613947] [<c0451bf2>] register_netdev+0x32/0x3f
[ 426.613950] [<c03ad540>] loopback_net_init+0x34/0x63
[ 426.613955] [<c044d1eb>] setup_net+0x71/0xbb
[ 426.613958] [<c044d534>] copy_net_ns+0x3d/0x99
[ 426.613960] [<c023712b>] create_new_namespaces+0x109/0x196
[ 426.613965] [<c0237370>] copy_namespaces+0x62/0x92
[ 426.613967] [<c0221a50>] copy_process+0x8df/0x105a
[ 426.613971] [<c02222b6>] do_fork_pid+0xeb/0x25a
[ 426.613973] [<c027f8a2>] ? __fput+0x138/0x140
[ 426.613977] [<c0222438>] do_fork+0x13/0x15
[ 426.613980] [<c02023cb>] sys_clone+0x1f/0x21
[ 426.613983] [<c020399e>] syscall_call+0x7/0xb
[ 426.613986] [<c04f0000>] ? init_scattered_cpuid_features+0x4f/0x7d
[ 426.613992] =======================
[ 426.613994] ---[ end trace 3809b0ba52b812c7 ]---
[ 426.613997] kobject_add_internal failed for lo with -EEXIST, don't try to register things with the same name in the same directory.
[ 426.614001] Pid: 17966, comm: libvirtd Tainted: G W 2.6.27-openvz-briullov.1-r1-fb3 #1
[ 426.614004] [<c033041f>] kobject_add_internal+0x141/0x153
[ 426.614008] [<c03304dc>] kobject_add_varg+0x35/0x41
[ 426.614011] [<c033054d>] kobject_add+0x43/0x49
[ 426.614014] [<c03888e8>] device_add+0x7f/0x470
[ 426.614017] [<c027cd9d>] ? __percpu_alloc_mask+0xb6/0xd5
[ 426.614020] [<c023e5e7>] ? ub_slab_charge+0x4c/0x65
[ 426.614024] [<c03330fe>] ? strlcpy+0x17/0x49
[ 426.614027] [<c045a40c>] netdev_register_kobject+0x59/0x5c
[ 426.614030] [<c0451afe>] register_netdevice+0x270/0x332
[ 426.614033] [<c0451bf2>] register_netdev+0x32/0x3f
[ 426.614036] [<c03ad540>] loopback_net_init+0x34/0x63
[ 426.614040] [<c044d1eb>] setup_net+0x71/0xbb
[ 426.614042] [<c044d534>] copy_net_ns+0x3d/0x99
[ 426.614045] [<c023712b>] create_new_namespaces+0x109/0x196
[ 426.614048] [<c0237370>] copy_namespaces+0x62/0x92
[ 426.614051] [<c0221a50>] copy_process+0x8df/0x105a
[ 426.614054] [<c02222b6>] do_fork_pid+0xeb/0x25a
[ 426.614057] [<c027f8a2>] ? __fput+0x138/0x140
[ 426.614060] [<c0222438>] do_fork+0x13/0x15
[ 426.614063] [<c02023cb>] sys_clone+0x1f/0x21
[ 426.614066] [<c020399e>] syscall_call+0x7/0xb
[ 426.614068] [<c04f0000>] ? init_scattered_cpuid_features+0x4f/0x7d
[ 426.614073] =======================
--
Evan Borgstrom <evan(a)fatbox.ca> - FatBox Inc.
Find me on LinkedIn - http://www.linkedin.com/in/evanb
15 years, 7 months
[libvirt] RFC / Braindump: public APIs needing data streams
by Daniel P. Berrange
The patches for secure migration raise an interesting question wrt
the handling of data streams and their effects on the internal driver
API and the public API. Although the migration helper APIs are not
technically public, they do map onto the remote wire protocol and as
such we have the same long term compatibility issues to worry about.
The way the migration APIs fit together is obscuring the picture
a little, so for the sake of clarity, the remainder of this mail is
going to talk about a fictitious public API 'virDomainRestoreStream'
which allows a guest domain to be restored from a generic data
stream, rather than a named file. If we can solve this API problem,
then the design will trivially apply to secure migration.
I'll now outline some possible approaches at public API level:
1. Pass a file handle to the public API
Application usage:
    fd = open(filename, O_RDONLY);
    ret = virDomainRestoreStream(dom, fd);
Driver internal usage:
    int virDomainRestoreStreamImpl(virDomainPtr dom, int fd) {
        char buf[4096];
        int ret;
        int qemuFD;
        qemuFD = runQEMUWithIncomingFD(dom);
        do {
            ret = read(fd, buf, sizeof buf);
            if (ret > 0)
                write(qemuFD, buf, ret);
        } while (ret > 0);
        return ret;
    }
Good: Restore functionality all in one driver method
Good: Public API is very simple
Good: Internal driver can poll() on the FD to avoid blocking
Bad: Application API is blocked
Bad: Data read from FD might need transformation
eg, uncompress, or decrypt TLS/SASL.
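The "driver can poll() on the FD" point could look roughly like the sketch below. pump_fd is a hypothetical helper name, not a libvirt function; it assumes plain POSIX descriptors and simply gives up on timeout, where a real driver would hand the descriptor to its event loop.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper (not part of libvirt): copy everything readable on
 * 'src' to 'dst', waiting in poll() instead of blocking in read().
 * Returns the number of bytes copied at EOF, or -1 on error or timeout. */
static long pump_fd(int src, int dst, int timeout_ms)
{
    char buf[4096];
    long total = 0;

    assert(src >= 0 && dst >= 0);

    for (;;) {
        struct pollfd pfd = { .fd = src, .events = POLLIN };
        int rc = poll(&pfd, 1, timeout_ms);

        if (rc < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal, retry */
            return -1;
        }
        if (rc == 0)
            return -1;              /* nothing arrived within timeout_ms */

        ssize_t got = read(src, buf, sizeof buf);
        if (got < 0) {
            if (errno == EINTR || errno == EAGAIN)
                continue;
            return -1;
        }
        if (got == 0)
            return total;           /* EOF: the writer closed its end */

        /* write() may be short, so loop until the chunk is flushed */
        ssize_t off = 0;
        while (off < got) {
            ssize_t put = write(dst, buf + off, (size_t)(got - off));
            if (put < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += put;
        }
        total += got;
    }
}
```

The inner loop exists only to keep the sketch short; the point is that the driver never sits in read() with nothing to read.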
2. Provide public APIs for starting restore, feeding data,
and completing. This matches proposal for secure migration
patchset.
Application usage:
    fd = open(filename, O_RDONLY);
    ret = virDomainRestorePrepare(dom);
    do {
        char buf[4096];
        ret = read(fd, buf, sizeof buf);
        if (ret > 0)
            virDomainRestoreData(dom, buf, ret);
    } while (ret > 0);
    virDomainRestoreFinish(dom, ret == 0 ? 0 : 1);
Driver internal usage:
    int virDomainRestorePrepareImpl(virDomainPtr dom) {
        qemuFD = runQEMUWithIncomingFD(dom);
    }
    int virDomainRestoreDataImpl(virDomainPtr dom, const char *buf, int buflen) {
        qemuFD = ...find previously opened qemuFD ...
        write(qemuFD, buf, buflen);
    }
    int virDomainRestoreFinishImpl(virDomainPtr dom, int error) {
        if (error) {
            ...kill QEMU ...
        } else {
            qemuFD = ...find previously opened qemuFD ...
            close(qemuFD);
        }
    }
Good: Application can easily decrypt input data
Good: Application can use an event loop to feed in data
as it becomes available, eg poll() on a socket
Good: Application API never blocks execution for long time
Bad: Driver has to maintain state across calls indefinitely
Bad: Cannot guarantee that the same client calls prepare/data,
ie different clients can get mixed up feeding data.
Bad: Public API is fairly complex
Bad: Lots of public API entry points for each method needing
streams.
3. Provide a stream object for feeding data to driver from
application. Similar to option 2, but provides easier state
mgmt for driver. The driver will set callbacks on the data
stream to receive data from client.
Application usage:
    virDataStream stream = virDataStreamNew();
    fd = open(filename, O_RDONLY);
    ret = virDomainRestoreStream(dom, stream);
    do {
        char buf[4096];
        ret = read(fd, buf, sizeof buf);
        if (ret > 0)
            virDataStreamWrite(stream, buf, ret);
    } while (ret > 0);
    virDataStreamFinish(stream, ret == 0 ? 0 : 1);
Driver internal usage:
    int virDomainRestoreStreamImpl(virDomainPtr dom, virDataStream stream) {
        int qemuFD = runQEMUWithIncomingFD(dom);
        virDataStreamSetCallbacks(stream,
                                  virDomainRestoreDataImpl,
                                  virDomainRestoreFinishImpl,
                                  (void *)(intptr_t)qemuFD);
    }
    int virDomainRestoreDataImpl(virDomainPtr dom, const char *buf, int buflen, void *opaque) {
        int qemuFD = (int)(intptr_t)opaque;
        write(qemuFD, buf, buflen);
    }
    int virDomainRestoreFinishImpl(virDomainPtr dom, int error, void *opaque) {
        int qemuFD = (int)(intptr_t)opaque;
        if (error)
            ...kill QEMU ...
        else
            close(qemuFD);
    }
Good: Application can easily decrypt input data
Good: Application can use an event loop to feed in data
as it becomes available, eg poll() on a socket
Good: Application API never blocks execution for long time
Good: New APIs reusable for any public API with data stream
Bad: Public API is fairly complex
Bad: Driver has to maintain state across calls indefinitely
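To make the push model of option 3 concrete, here is a minimal self-contained sketch. Every name in it (data_stream, fake_guest, fake_domain_restore and so on) is hypothetical and stands in for the proposed virDataStream pieces; nothing here is an existing libvirt API, and the "guest" is just a memory buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the proposed stream object. */
typedef int (*stream_data_cb)(const char *buf, size_t len, void *opaque);
typedef int (*stream_finish_cb)(int error, void *opaque);

typedef struct {
    stream_data_cb on_data;      /* set by the driver */
    stream_finish_cb on_finish;  /* set by the driver */
    void *opaque;                /* driver state handed back on each call */
    int finished;
} data_stream;

static void data_stream_init(data_stream *s)
{
    memset(s, 0, sizeof *s);
}

/* Driver side: attach callbacks and per-operation state. */
static void data_stream_set_callbacks(data_stream *s, stream_data_cb d,
                                      stream_finish_cb f, void *opaque)
{
    s->on_data = d;
    s->on_finish = f;
    s->opaque = opaque;
}

/* Application side: push one chunk; fails if no driver is attached. */
static int data_stream_write(data_stream *s, const char *buf, size_t len)
{
    if (s->finished || s->on_data == NULL)
        return -1;
    return s->on_data(buf, len, s->opaque);
}

/* Application side: signal end of data (error != 0 means abort). */
static int data_stream_finish(data_stream *s, int error)
{
    if (s->finished)
        return -1;
    s->finished = 1;
    return s->on_finish ? s->on_finish(error, s->opaque) : 0;
}

/* A fake "driver" that collects the data in memory instead of QEMU. */
typedef struct {
    char data[64];
    size_t used;
    int closed;
    int failed;
} fake_guest;

static int fake_restore_data(const char *buf, size_t len, void *opaque)
{
    fake_guest *g = opaque;
    if (len > sizeof g->data - g->used)
        return -1;
    memcpy(g->data + g->used, buf, len);
    g->used += len;
    return 0;
}

static int fake_restore_finish(int error, void *opaque)
{
    fake_guest *g = opaque;
    g->closed = 1;
    g->failed = (error != 0);
    return 0;
}

static int fake_domain_restore(fake_guest *g, data_stream *s)
{
    assert(g != NULL && s != NULL);
    data_stream_set_callbacks(s, fake_restore_data, fake_restore_finish, g);
    return 0;
}
```

The opaque pointer carries a struct rather than a bare FD here, which is one way the driver state could travel with the stream instead of being looked up on every call.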
4. Provide a callback for driver to fetch data from the client app.
Similar to option 1, but avoids need to expose concept of a 'fd'
in public API directly.
Application usage:
    int appreader(virDomainPtr dom, char *buf, int buflen, void *opaque) {
        int fd = (int)(intptr_t)opaque;
        return read(fd, buf, buflen);
    }
    fd = open(filename, O_RDONLY);
    ret = virDomainRestoreStream(dom, appreader, (void *)(intptr_t)fd);
Driver internal usage:
    int virDomainRestoreStreamImpl(virDomainPtr dom,
                                   int (*reader)(virDomainPtr, char *, int, void *),
                                   void *opaque) {
        char buf[4096];
        int ret;
        int qemuFD;
        qemuFD = runQEMUWithIncomingFD(dom);
        do {
            ret = (*reader)(dom, buf, sizeof buf, opaque);
            if (ret > 0)
                write(qemuFD, buf, ret);
        } while (ret > 0);
        return ret;
    }
Good: Restore functionality all in one driver method
Good: Public API is very simple
Good: Client app callback can decrypt data
Bad: Application API is blocked
Bad: Internal driver code will block on executing callback if
no data is available. Can't integrate with event loop.
5. Provide a generic public stream API to fetch data from the
client app. Similar to option 4, but adding a stream object
to manage the callback - allows more functionality to be added
to public API later without changing restore API contract.
Application usage:
    int appreader(virDomainPtr dom, char *buf, int buflen, void *opaque) {
        int fd = (int)(intptr_t)opaque;
        return read(fd, buf, buflen);
    }
    fd = open(filename, O_RDONLY);
    stream = virDataStreamNewReader(appreader, (void *)(intptr_t)fd);
    ret = virDomainRestoreStream(dom, stream);
Driver internal usage:
    int virDomainRestoreStreamImpl(virDomainPtr dom, virDataStream stream) {
        char buf[4096];
        int ret;
        int qemuFD;
        qemuFD = runQEMUWithIncomingFD(dom);
        do {
            ret = virDataStreamRead(stream, buf, sizeof buf);
            if (ret > 0)
                write(qemuFD, buf, ret);
        } while (ret > 0);
        return ret;
    }
Good: Restore functionality all in one driver method
Good: Public API is fairly simple
Good: Client app callback can decrypt data
Bad: Application API is blocked
Bad: Internal driver code will block on executing callback if
no data is available. Can't integrate with event loop.
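The pull model of option 5 can be sketched in the same spirit. All names below (read_stream, mem_source, drain_stream) are hypothetical, and the application's data source is a memory buffer rather than a file or socket, purely to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reader callback: returns bytes produced, 0 at end of
 * data, or a negative value on error. */
typedef long (*stream_reader)(char *buf, size_t len, void *opaque);

/* Hypothetical stand-in for a reader-backed stream object. */
typedef struct {
    stream_reader reader;
    void *opaque;
} read_stream;

static long read_stream_read(read_stream *s, char *buf, size_t len)
{
    assert(s != NULL && s->reader != NULL);
    return s->reader(buf, len, s->opaque);
}

/* Application side: a reader that serves data from memory. */
typedef struct {
    const char *data;
    size_t len;
    size_t pos;
} mem_source;

static long mem_reader(char *buf, size_t len, void *opaque)
{
    mem_source *src = opaque;
    size_t left = src->len - src->pos;
    size_t n = left < len ? left : len;

    memcpy(buf, src->data + src->pos, n);
    src->pos += n;
    return (long)n;
}

/* Driver side: pull chunks until the reader reports end of data.
 * Returns the number of bytes stored in 'out', or -1 on reader error
 * or if 'out' is too small. The chunk buffer is deliberately tiny so
 * that the loop runs more than once. */
static long drain_stream(read_stream *s, char *out, size_t out_len)
{
    char buf[4];
    size_t used = 0;
    long got;

    while ((got = read_stream_read(s, buf, sizeof buf)) > 0) {
        if ((size_t)got > out_len - used)
            return -1;
        memcpy(out + used, buf, (size_t)got);
        used += (size_t)got;
    }
    return got < 0 ? -1 : (long)used;
}
```

Note that drain_stream cannot return until the reader does, which is exactly the blocking concern listed under "Bad" for this option.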
All the APIs have good and bad points; in particular, there is
a difficult tradeoff between simplicity of the public API application
code and simplicity of the internal API implementation code. Some
important goals, though:
- There must be a way to invoke the public API without blocking
the application code
- The driver must be able to receive data from an encrypted
channel, because in libvirtd the FD might be the SASL/TLS socket
- Internal driver API should not block on callbacks to app code,
since it might need to be polling on another FD concurrently
with reading data.
- The number of new APIs to support streaming should not increase
for each new method needing stream support.
Ultimately I think options 3 or 5 are the most promising, because
the addition of a generic 'virDataStream' public object makes it
easier to manage the processing of the data stream without adding
huge numbers of new APIs. Option 3 is a little more cumbersome to
use from application code, but it avoids blocking either the client
app, or the internal driver. The downside is the driver impl code
is split across several methods. With option 5 it is harder to
avoid blocking the client and internal driver, since it'd require
the driver to integrate with an event loop, but there is no direct
FD for the driver to poll() on.
The choices made also have possible implications on the design of
the remote wire protocol to support these methods. Ignoring the
design of the public API, there are a handful of ways to stream
data between client and server:
1. Invoke primary method eg "restore domain", then feed the data
in a sequence of following RPC calls.
C --------------> S Restore domain call
C <-------------- S Restore domain reply
C --------------> S Restore data call 1
C <-------------- S Restore data reply 1
C --------------> S Restore data call 2
C <-------------- S Restore data reply 2
...............
C --------------> S Restore data call n
C <-------------- S Restore data reply n
C --------------> S Restore data complete
C <-------------- S Restore data reply
If server wants to abort a restore operation, it'll
send an error on one of the replies.
2. Invoke primary method eg "restore domain", then feed the data
in a sequence of following async messages.
C --------------> S Restore domain call
C <-------------- S Restore domain reply
C --------------> S Restore data msg 1
C --------------> S Restore data msg 2
...............
C --------------> S Restore data msg n
C --------------> S Restore data complete
C <-------------- S Restore data reply
If server wants to stop the client without closing the socket, it
needs an async 'stop' message from server to client. This is
pretty much the same as option 1, but it kills off the explicit
replies for each data packet.
3. Invoke primary method eg "restore domain", but require the data
to be streamed to the server before the reply is sent back
C --------------> S Restore domain call
C --------------> S Stream data msg 1
C --------------> S Stream data msg 2
...............
C --------------> S Stream data msg n
C --------------> S Stream finish msg
C <-------------- S Restore domain reply
If server wants to stop the client without closing the socket, it
simply sends back the 'restore domain reply' message as an error
before client finishes sending data, and ignores any further data
messages.
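The server side of wire option 3 amounts to a small state machine: one call, any number of data messages, a finish message, and exactly one reply. The sketch below uses made-up message and state names (they are not taken from the remote protocol), and it fakes a server-side failure with a simple byte limit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message types for the exchange drawn above. */
enum msg_type { MSG_RESTORE_CALL, MSG_STREAM_DATA, MSG_STREAM_FINISH };

enum srv_state { SRV_IDLE, SRV_STREAMING, SRV_DONE, SRV_FAILED };

typedef struct {
    enum srv_state state;
    size_t bytes;       /* payload accepted so far */
    size_t limit;       /* stand-in for "something went wrong" */
    int replies_sent;   /* must never exceed 1 per restore call */
} restore_server;

/* Feed one incoming message to the server. Returns 1 if this message
 * caused the single 'restore domain reply' to be sent, 0 otherwise.
 * Messages that do not fit the current state are silently ignored. */
static int server_handle(restore_server *srv, enum msg_type type, size_t len)
{
    assert(srv != NULL);

    switch (srv->state) {
    case SRV_IDLE:
        if (type == MSG_RESTORE_CALL)
            srv->state = SRV_STREAMING;   /* reply is deferred */
        return 0;

    case SRV_STREAMING:
        if (type == MSG_STREAM_DATA) {
            if (len > srv->limit - srv->bytes) {
                /* Abort: send the reply early, as an error. */
                srv->state = SRV_FAILED;
                srv->replies_sent++;
                return 1;
            }
            srv->bytes += len;
            return 0;
        }
        if (type == MSG_STREAM_FINISH) {
            srv->state = SRV_DONE;        /* success reply */
            srv->replies_sent++;
            return 1;
        }
        return 0;

    case SRV_DONE:
    case SRV_FAILED:
        return 0;   /* drop data still in flight after the reply */
    }
    return 0;
}
```

After an early error reply the server keeps discarding whatever the client had already queued, which is the "ignores any further data messages" behaviour described above.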
Options 2 or 3 have potential benefits on links with noticeable
latency, since they're not blocking the client on synchronous
replies from the server. That said, the remote protocol does
allow for interleaving of calls & replies, so with Option 1 the
client could send multiple data packets without waiting for their
replies, and deal with possible delayed error replies. If doing
that though, the benefit of having a 1-to-1 call-to-reply ratio
is minimal; might as well go for an n-to-1 call-to-reply approach.
It is hard to match public API option 3, with wire protocol
option 3, because of the delayed 'restore domain reply' message.
The options I think are most viable are:
- Public API 3 + RPC 2
- Public API 5 + RPC 3
Both of these are a little more complex to implement in the libvirtd
daemon than Chris' current secure migration patches, but they also
have functional & design benefits.
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
[libvirt] suspend/hibernate guests on host shutdown
by Gerry Reno
There are times when you just have to reboot the host machine to clear a
problem or after some maintenance. The problem right now is that this
forces all the guests to go through an entire reboot as well. What
would be good is if the host shutdown would invoke a suspend/hibernate
of the guests and then restore the guests to their same condition on the
reboot. The guests should notice nothing other than a small loss of
wallclock time.
Regards,
Gerry
[libvirt] HELP: after host upgrade to F11, guest runs extremely slow
by Gerry Reno
I upgraded the host from F10 to F11 (x86_64) with no issues. Now when I
start an F10 (i386) guest it runs extremely slowly. I also see messages on
the guest boot console about "clocksource tsc unstable" and some kernel
oopses. Once it got far enough to start the network I logged in and checked
the clocksource and it currently is 'acpi_pm' even though the kernel
line says clocksource=pit. The available clocksources are acpi_pm,
jiffies, and tsc. I do not see 'pit' in the list. How do I fix this issue?
Regards,
Gerry
[libvirt] VMware ESX driver announcement
by Matthias Bolte
Hello,
I'm participating in a project of the Paderborn Center for Parallel
Computing, an institute of the University of Paderborn:
http://pc2.uni-paderborn.de
The project's goal is to use virtualization in a supercomputer
environment. We've decided to use libvirt to manage different
hypervisors. A subgoal is to extend the driver base of libvirt. We've
started a VMware ESX driver and are investigating Hyper-V support
(see next mail).
The ESX driver isn't complete yet, currently it supports:
- domain lookup by ID, UUID and name
- domain listing
- domain suspend and resume
- domain reboot, if the VMware tools are installed inside the domain
- domain start and shutdown
- domain migration
The driver uses the SOAP based VMware VI API. We've tried to generate
SOAP client code with gSOAP and looked at other SOAP libraries, but
the VI API uses inheritance in a way that gSOAP fails to generate C
code for. Because of this we wrote a minimal SOAP client based on
libcurl and libxml2 that can handle this inheritance problem in C.
The next item on the todo list is domain creation, because currently
the driver can only deal with prior existing domains.
We think this code might be useful for others and would be glad if the
driver could be merged into libvirt in the medium term.
All of our work is licensed under LGPLv2+.
[libvirt] Koan, KVM and PXE
by Fabien Dupont
Hi all.
I use Cobbler to manage my PXE boot. I would like to extend my use to manage
VM deployment with Koan. But I have a problem related to network and my VM
can't boot. I have already posted on Cobbler mailing-list but it's probably
on kvm/libvirt side...
I am on Fedora 10 with Cobbler 1.6.4 and Koan 1.6.3 (latest available from
stable repos), and libvirt 0.5.1-2.fc10. I would like to create VMs using
KVM thus I have imported FC10 DVD, added Everything and Updates repos and
created a host using the profile created during import. The system has the
following properties :
root# cobbler system report vmtest
system : vmtest
profile : fedora-core-10-x86_64
comment :
created : Fri May 8 13:43:40 2009
gateway : 192.168.1.254
hostname : vmtest
image :
kernel options : {}
kernel options post : {}
kickstart : <<inherit>>
ks metadata : {}
mgmt classes : []
modified : Sat May 23 18:42:59 2009
name servers : []
name servers search : []
netboot enabled? : True
owners : ['admin']
redhat mgmt key : <<inherit>>
redhat mgmt server : <<inherit>>
server : <<inherit>>
template files : {}
virt cpus : <<inherit>>
virt file size : <<inherit>>
virt path : <<inherit>>
virt ram : <<inherit>>
virt type : <<inherit>>
power type : ipmitool
power address :
power user :
power password :
power id :
interface : eth0
mac address : ff:00:00:01:00:00
bonding :
bonding_master :
bonding_opts :
is static? : True
ip address : 192.168.1.201
subnet : 255.255.255.0
static routes : []
dns name : vmtest.local
dhcp tag :
virt bridge : br0
As you can see, I use a bridged network with a real bridge created with the
method described on the wiki (eth0 is the real interface and br0 the
bridge).
Then I create the VM using : koan --server=cobbler.evenit.info --virt
--system=vmtest
The VM is correctly created and started : kernel and initrd are correctly
set, virtual hard drive is right and network interface seems right.
root# ps -edf | grep kvm
root 19484 23655 67 21:54 ? 00:00:06 /usr/bin/qemu-kvm -S -M pc
-m 256 -smp 1 -name vmtest -monitor pty -no-reboot -boot c -kernel
/var/lib/libvirt/boot/virtinst-vmlinuz.omv9yr -initrd
/var/lib/libvirt/boot/virtinst-initrd.img.Ufj-qc -append ks=
http://cobbler.evenit.info/cblr/svc/op/ks/system/vmtest ksdevice=link
kssendmac lang= text method=
http://cobbler.evenit.info:80/cblr/links/fedora-core-10-x86_64/ -drive
file=/var/lib/libvirt/images/vmtest-disk0,if=virtio,index=0,boot=on
-net nic,macaddr=ff:00:00:01:00:00,vlan=0,model=virtio -net
tap,fd=14,script=,vlan=0,ifname=vnet0 -serial pty -parallel none -usb -vnc
127.0.0.1:0 -k en-us
But when I open the VM display with virt-manager, I see that the installation
can't start because DHCP is not answering. There's no trace of the DHCP
requests on the DHCP server, and vnet0 seems to have no traffic:
[root@lilith ~]# netstat -i
Kernel Interface table
Iface   MTU Met  RX-OK RX-ERR RX-DRP RX-OVR  TX-OK TX-ERR TX-DRP TX-OVR Flg
br0    1500   0 173524      0      0      0 261005      0      0      0 BMRU
eth0   1500   0 173589      0      0      0 260909      0      0      0 BMPRU
lo    16436   0  29105      0      0      0  29105      0      0      0 LRU
vnet0  1500   0      0      0      0      0      1      0      0      0 BMRU
As far as I can see, vnet0 doesn't forward its traffic to br0, as if
there were no link.
I have also seen that the vnet0 interface as seen from the host (as opposed
to the guest) doesn't have the MAC address requested for vnet0, but this may
be normal.
--
Fabien
[libvirt] PATCH: add support for -snapshot
by Geert Jansen
Hi,
I'm working on some infrastructure to run unit tests in a virtual
environment. For this, I need to be able to pass a "snapshot=on"
argument to a qemu virtual disk so that I know I start with the same
state all the time.
The attached patch implements this. It adds a boolean <snapshot/> to
the <disk> section in the XML.
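For illustration only, here is roughly how a per-disk snapshot flag could end up in a qemu -drive argument. This is not the code from the attached patch; format_drive_arg is a hypothetical helper, and if=virtio is an arbitrary choice for the example. The snapshot=on suboption itself is a real qemu -drive setting.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the patch): build the value for a qemu
 * "-drive" option, appending snapshot=on when requested.
 * Returns 0 on success, -1 if 'out' is too small. */
static int format_drive_arg(char *out, size_t out_len,
                            const char *path, int snapshot)
{
    assert(out != NULL && path != NULL);

    int n = snprintf(out, out_len, "file=%s,if=virtio%s",
                     path, snapshot ? ",snapshot=on" : "");
    if (n < 0 || (size_t)n >= out_len)
        return -1;      /* output error or truncated */
    return 0;
}
```

The sketch does no escaping, so a path containing a comma would need extra handling before being passed to qemu.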
Please consider for inclusion.
Regards,
Geert Jansen
[libvirt] migration
by Zvi Dubitzky
Does anybody know if migration within a host from one domain to another
works today?
It did not work for me with libvirt 0.6.1.
thanks
Zvi Dubitzky
Virtualization and System Architecture Email:dubi@il.ibm.com
IBM Haifa Research Laboratory Phone: +972-4-8296182
Haifa, 31905, ISRAEL