[libvirt] [PATCH 1/13] Adding recursive locks
by Stefan Berger
This patch adds recursive locks necessary due to the processing of
network filter XML that can reference other network filters, including
references that cause loops. Loops in the XML are prevented, but their
detection requires recursive locks.
14 years, 9 months
[libvirt] [RFC][PATCH v2 0/2] Dynamic backend setup for macvtap interfaces
by Ed Swierk
I posted this RFC/patch set a few weeks ago but didn't receive any
response. My implementation works, but I'd like to hear from anyone more
familiar with libvirt concurrency than I (i.e. nearly everyone) about
how it might be improved.
In v2 I rebased against libvirt-0.7.7 and made a bunch of minor changes.
-
Using a bridge to connect a qemu NIC to a host interface offers a fair
amount of flexibility to reconfigure the host without restarting the VM.
For example, if the bridge connects host interface eth0 to the qemu tap0
interface, eth0 can be hot-removed and hot-plugged without affecting the
VM. Similarly, if the bridge connects host VLAN interface vlan0 to the
qemu tap0 interface, the admin can easily replace vlan0 with vlan1
without the VM noticing.
Using the macvtap driver instead of a kernel bridge, the host interface
is much more tightly tied to the VM. Qemu communicates with the macvtap
interface through a file descriptor, and the macvtap interface is bound
permanently to a specific host interface when it is first created.
What's more, if the underlying host interface disappears, the macvtap
interface vanishes along with it, leaving the VM holding a file
descriptor for a deleted file.
To avoid race conditions during system startup, I would like libvirt to
allow starting up the VM with a NIC even if the underlying host
interface doesn't yet exist, deferring creation of the macvtap interface
(analogous to starting up the VM with a tap interface bound to an orphan
bridge). To support adding and removing a host interface without
restarting the VM, I would like libvirt to react to the (re)appearance
of the underlying host interface, creating a new macvtap interface and
passing the new fd to qemu to reconnect to the NIC.
(It would also be nice if libvirt allowed the user to change which
underlying host interface the qemu NIC is connected to. I'm ignoring
this issue for now, except to note that implementing the above features
should make this easier.)
The libvirt API already supports domainAttachDevice and
domainDetachDevice to add or remove an interface while the VM is
running. In the qemu implementation, these commands add or remove the
VM NIC device as well as reconfiguring the host side. This works only
if the OS and application running in the VM can handle PCI hotplug and
dynamically reconfigure its network. I would like to isolate the VM
from changes to the host network setup, whether you use macvtap or a
bridge.
The changes I think are needed to implement this include:
1. Refactor qemudDomainAttachNetDevice/qemudDomainDetachNetDevice, which
currently handle both backend (host) setup and adding/removing the VM
NIC device; move the backend setup code into separate functions that can
be called separately without affecting VM devices.
2. Implement a thread or task that watches for changes to the underlying
host interface for each configured macvtap interface, and reacts by
invoking the appropriate backend setup code.
3. Change qemudBuildCommandLine to defer backend setup if qemu supports
the necessary features for doing it later (e.g. the host_net_add monitor
command).
4. Implement appropriate error handling and reporting, and any necessary
changes to the configuration schema.
The following patches are a partial implementation of the above as a
proof of concept.
Patch 1 implements change (1) above, moving the backend setup code to
new functions
qemudDomainConnectNetBackend/qemudDomainDisconnectNetBackend, and
calling these functions from the existing
qemudDomainAttachNetDevice/qemudDomainDetachNetDevice. I think this
change is useful on its own: it breaks up two monster functions into
more manageable pieces, and eliminates some code duplication (e.g. the
try_remove clause at the end of qemudDomainAttachNetDevice).
Patch 2 is a godawful hack roughly implementing changes (2) and (3)
above (did I mention that this is a proof of concept?). It spawns a
thread that simply tries reconnecting the backend of each macvtap
interface once a second. As long as the interface is already up, the
reconnection fails. If the macvtap interface goes away because the
underlying host interface disappears, the reconnection fails until the
host interface reappears.
I ran into two major issues while implementing (2) and (3):
- Can we use the existing virEvent functions to invoke the reconnection
process, triggered either by a timer or by an event from the host? It
seems like this ought to work, but it appears that communication between
libvirt and the qemu monitor relies on an event, and since all events
run in the same thread, there's no way for an event to call the monitor.
- Should the reconnection process use udev or hal to get notifications,
or leverage the node device code which itself uses udev or hal?
Currently there doesn't appear to be a way to get notifications of
changes to node devices; if there were, we'd still need to address the
threading issue. If we use node devices, what changes to the
configuration schema would be needed to associate a macvtap interface
with the underlying node device?
I'd appreciate input on item (4) as well (e.g. does it always make sense
to ignore the missing host interface on the assumption that it could
show up later?).
--Ed
14 years, 9 months
[libvirt] [PATCH] fix two "make syntax check" failures
by Jim Meyering
Laine mentioned that there were two syntax-check failures.
I've just pushed this fix:
From a31bc6750347a79dbd43b0aacecf86d204e6e160 Mon Sep 17 00:00:00 2001
From: Jim Meyering <meyering(a)redhat.com>
Date: Tue, 16 Mar 2010 19:32:05 +0100
Subject: [PATCH] fix two "make syntax check" failures
* src/xenapi/xenapi_driver.c (xenapiOpen): Remove useless-if-before-free.
* po/POTFILES.in: Add src/xenapi/xenapi_utils.c.
---
po/POTFILES.in | 1 +
src/xenapi/xenapi_driver.c | 2 +-
2 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/po/POTFILES.in b/po/POTFILES.in
index ac169be..1845572 100644
--- a/po/POTFILES.in
+++ b/po/POTFILES.in
@@ -78,5 +78,6 @@ src/xen/xen_hypervisor.c
src/xen/xen_inotify.c
src/xen/xm_internal.c
src/xen/xs_internal.c
+src/xenapi/xenapi_utils.c
tools/console.c
tools/virsh.c
diff --git a/src/xenapi/xenapi_driver.c b/src/xenapi/xenapi_driver.c
index 153582d..ad77068 100644
--- a/src/xenapi/xenapi_driver.c
+++ b/src/xenapi/xenapi_driver.c
@@ -102,7 +102,7 @@ xenapiOpen (virConnectPtr conn, virConnectAuthPtr auth, int flags ATTRIBUTE_UNUS
}
if (!passwd || !conn->uri->user) {
xenapiSessionErrorHandler(conn, VIR_ERR_AUTH_FAILED, "Username/Password not valid");
- if (passwd) VIR_FREE(passwd);
+ VIR_FREE(passwd);
return VIR_DRV_OPEN_ERROR;
}
if (VIR_ALLOC(privP) < 0) {
--
1.7.0.2.445.g0ae494
14 years, 9 months
[libvirt] [PATCH] Use WARN_CFLAGS when compiling virsh.c
by Jiri Denemark
Signed-off-by: Jiri Denemark <jdenemar(a)redhat.com>
---
tools/Makefile.am | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/tools/Makefile.am b/tools/Makefile.am
index 941e93e..46107f6 100644
--- a/tools/Makefile.am
+++ b/tools/Makefile.am
@@ -48,6 +48,7 @@ virsh_CFLAGS = \
-I$(top_srcdir)/src/util \
-DGETTEXT_PACKAGE=\"$(PACKAGE)\" \
-DLOCALEBASEDIR=\""$(datadir)/locale"\" \
+ $(WARN_CFLAGS) \
$(COVERAGE_CFLAGS) \
$(LIBXML_CFLAGS) \
$(READLINE_CFLAGS)
--
1.7.0.2
14 years, 9 months
[libvirt] Fix slow storage file allocation with O_DSYNC
by Jiri Denemark
Hi.
By opening storage file with O_DSYNC before allocating disk blocks for it we
made the whole operation terribly slow when the allocation is done using
posix_fallocate() on a filesystem which does not support fallocate(), i.e.
anything but ext4.
And by terribly slow I mean 45 minutes (RHEL5) to 10+ hours (Fedora Rawhide)
for a 12GB storage.
To fix this issue, we have two options. Either avoid using fallocate()
emulation inside posix_fallocate() or stop using O_DSYNC. I prepared a patch
for each of these options and they will follow as replies to this
introduction.
Some numbers for the same 12GB file:
- libvirt's internal fallocate emulation using
- mmap ~3.5 minutes (will be used whenever mmap() is present)
- write ~5 minutes
- fsync() instead of O_DSYNC
- ~3.5 minutes
So basically the two options are equivalent with respect to time consumption.
The second one seems to be less hacky, at the expense of non-linear behavior,
which makes progress reporting less useful.
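To make the second option concrete, here is a minimal sketch (not the actual
patch) that opens the file without O_DSYNC, reserves the space with
posix_fallocate(), and flushes once with fsync() at the end. The function name
allocate_then_sync() is made up for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Create 'path', reserve 'len' bytes, then flush once. Returns 0 or -1. */
int allocate_then_sync(const char *path, off_t len)
{
    int rc = -1;
    int fd;

    assert(path != NULL);
    /* Note: no O_DSYNC here, so the allocation itself is not synchronous. */
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    /* posix_fallocate() returns an error number directly, not via errno. */
    if (posix_fallocate(fd, 0, len) == 0 && fsync(fd) == 0)
        rc = 0;
    if (close(fd) < 0)
        rc = -1;
    return rc;
}
```

The single fsync() at the end is what makes progress non-linear: the
allocation appears to finish quickly and the flush then takes the bulk of the
time.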
Jirka
14 years, 9 months
[libvirt] 'build' on FS pool now unconditionally formats?
by Cole Robinson
Hi guys,
Looking at the new FS pool build options and talking with Dave, I see that
calling PoolBuild on an FS pool now unconditionally calls mkfs. This is really
bad when mixed with virt-manager: previously, we assumed the FS build command
was always non-destructive (at most it created a directory), so we called it
every time, and didn't even allow users to opt out, since there wasn't a use
case that called for it.
This new formatting behavior really needs to be opt-in; otherwise all
virt-manager versions creating an FS pool can destroy data.
Just FYI, for disk pools (and certain LVM configurations) where this operation
has always been destructive, we default to build=off, and loudly warn the user
if they choose otherwise. We can do that with this new option as well, but the
previous behavior really needs to be reinstated IMO (and before the new release).
I fully accept that this could be a bug in virt-manager's assumptions of the
build command, but even consider a virsh user: previously build just created a
directory, now it formats a partition, without any XML change.
Thanks,
Cole
14 years, 9 months
[libvirt] XenAPI remote storage - target path
by Sharadha Prabhakar (3P)
Hi,
I'm trying to write a Remote Storage driver for XenAPI.
I see that target-path is used for both storage pools and volumes.
In the case of XenAPI remote storage, the storage is not mounted on the
local host where libvirt is running. Storage is maintained in a remote location only.
In this case, how do I specify target-path, and how do I go about creating
a VM with storage using libvirt APIs with virsh and virt-manager?
Virt-manager expects me to give an absolute path for target-path, but in my case
I don't have an absolute path.
Can I have a target path like this for a particular storage volume:
"/storage pool uuid/storage-vol uuid"? This would help me identify which storage pool
libvirt is talking about and which volume in it. Using this information I can fetch data from
the remote location and give it back to libvirt. Is this approach
OK, or does libvirt support my specific remote storage case in some way? Could someone
clarify please?
Regards,
Sharadha
14 years, 9 months
[libvirt] RFC: Add further domain event callbacks
by Daniel P. Berrange
The current domain APIs support getting notified of domain lifecycle
transition events.
This is done using two methods:
int virConnectDomainEventRegister(virConnectPtr conn,
virConnectDomainEventCallback cb,
void *opaque,
virFreeCallback freecb);
int virConnectDomainEventDeregister(virConnectPtr conn,
virConnectDomainEventCallback cb);
This allows an app to register a callback that looks like this
typedef int (*virConnectDomainEventCallback)(virConnectPtr conn,
virDomainPtr dom,
int event,
int detail,
void *opaque);
Where 'event' is the lifecycle transition (suspended, stopped, started, etc)
and 'detail' is the cause of the transition (pause, migration, shutdown, etc)
I have outstanding feature requests to add a lot more event notifications
to the libvirt API. In particular
- IO Errors. Parameter 'alias' the name of the block device with the error
- Reboot. No parameters
You can argue whether this should be part of the lifecycle events. On
the one hand it is not guest visible, because the guest does an internal
machine reset without the host ever seeing a change. This is different
from other cases where QEMU itself is stopping/starting.
- Watchdog. Parameter 'action', saying what is going to happen to the
guest due to this watchdog firing (ignored, shutdown, paused, etc)
- VNC client. In fact three events, connect, authenticated and disconnect.
Parameters, TCP address & port number of client, and also of the server.
Optionally a SASL username and TLS certificate name of the authenticated
user
- Guest user. Logon/logoff events.
This requires co-operation from a guest agent, so it may not actually
be in scope for libvirt.
- Disk 'high watermark'. Emitted when a QCow volume grows beyond a certain
physical allocation. This allows an app to enlarge the underlying storage
holding the qcow volume before an 'out of space' occurs. Parameter is
the disk alias name.
So we can see there are events with a wide variety of parameters and we need
to figure out how to represent this in the API.
Option 1
--------
Follow the existing lifecycle event model. For each new event, add a
virConnectXXXXXEventRegister & virConnectXXXXXEventDeregister method, and
typedef a new callback for them, e.g.
typedef int (*virConnectDomainBlockIOEventCallback)(virConnectPtr conn,
virDomainPtr dom,
const char *diskname,
const char *alias,
void *opaque);
int virConnectDomainBlockIOEventRegister(virConnectPtr conn,
virConnectDomainBlockIOEventCallback cb,
void *opaque,
virFreeCallback freecb);
int virConnectDomainBlockIOEventDeregister(virConnectPtr conn,
virConnectDomainBlockIOEventCallback cb);
So we'll have 2 extra APIs for every event + a new typedef. We'll also need
to add new APIs in src/conf/domain_event.h to cope with dispatch.
Option 2
--------
GLib/GObject take a very loosely typed approach to registering/unregistering
events. They have a single pair of methods that work for any event & a generic
callback signature, requiring application casts.
typedef int (*virConnectEventCallback)(void *opaque);
int virConnectEventRegister(virConnectPtr conn,
const char *eventname,
virConnectEventCallback cb,
void *opaque,
virFreeCallback freecb);
int virConnectEventUnregister(virConnectPtr conn,
int eventID);
In this model, the register method returns a unique integer ID for the
callback which can be used to unregister it. Applications using this
will still need a strongly typed callback for receiving the event, but
when calling virConnectEventRegister(), they would do an explicit 'bad'
cast to 'virConnectEventCallback'.
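To show what that cast looks like in practice, here is a small self-contained
sketch. None of these names are libvirt API: generic_cb, block_io_cb,
event_register(), event_unregister(), emit_block_io() and the "block-io" event
name are all hypothetical stand-ins for the methods proposed above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the proposed generic and typed callbacks. */
typedef int (*generic_cb)(void *opaque);
typedef int (*block_io_cb)(const char *alias, void *opaque);

#define MAX_CBS 8
static struct {
    const char *name;
    generic_cb cb;
    void *opaque;
} table[MAX_CBS];

/* Store the callback and return its slot number as the event ID. */
int event_register(const char *name, generic_cb cb, void *opaque)
{
    assert(name != NULL && cb != NULL);
    for (int id = 0; id < MAX_CBS; id++) {
        if (table[id].cb == NULL) {
            table[id].name = name;
            table[id].cb = cb;
            table[id].opaque = opaque;
            return id;
        }
    }
    return -1;                  /* table full */
}

/* Remove a callback by the ID that event_register() returned. */
int event_unregister(int id)
{
    if (id < 0 || id >= MAX_CBS || table[id].cb == NULL)
        return -1;
    table[id].cb = NULL;
    return 0;
}

/* The dispatcher knows the real signature for "block-io" and casts back. */
int emit_block_io(const char *alias)
{
    int delivered = 0;

    for (int id = 0; id < MAX_CBS; id++) {
        if (table[id].cb != NULL && strcmp(table[id].name, "block-io") == 0) {
            ((block_io_cb)table[id].cb)(alias, table[id].opaque);
            delivered++;
        }
    }
    return delivered;
}

/* Example strongly typed handler: counts events through 'opaque'. */
int count_block_io(const char *alias, void *opaque)
{
    (void)alias;
    ++*(int *)opaque;
    return 0;
}
```

An application would register with
event_register("block-io", (generic_cb)count_block_io, &counter). The compiler
cannot check that the handler's signature matches the event name, which is the
cost of this option.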
Option 3
--------
A hybrid of both approaches. Have a new 'register' method for each type of
event that takes a strongly typed callback, but have a generic 'unregister'
method that just uses the 'int eventID'
int virConnectDomainBlockIOEventRegister(virConnectPtr conn,
virConnectDomainBlockIOEventCallback cb,
void *opaque,
virFreeCallback freecb);
int virConnectEventUnregister(virConnectPtr conn,
int eventID);
Option 4
--------
Have one pair of register/unregister events, but instead of passing different
parameters to each callback, have a generic callback that takes a single
parameter. This parameter would be declared as a union. So depending on
the type of event being received, you'd access different parts of the union
typedef union {
virConnectDomainBlockIOEvent blockio;
virConnectDomainWatchdogEvent watchdog;
...other events...
} virConnectEvent;
Either we could include a dummy member in the union with padding to 1024
bytes in size for future expansion, or we could simply declare that apps
must never allocate this data type themselves, thus allowing us to enlarge
it at will.
typedef int (*virConnectEventCallback)(int eventType, virConnectEvent, void *opaque);
int virConnectEventRegister(virConnectPtr conn,
const char *eventname,
virConnectEventCallback cb,
void *opaque,
virFreeCallback freecb);
int virConnectEventUnregister(virConnectPtr conn,
int eventID);
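For illustration, a callback under this option might look roughly like the
sketch below. The member structs, the enum values and record_event() are
hypothetical, the union is passed by pointer here for simplicity, and the
padding member follows the 1024-byte idea mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event types and payloads, not libvirt API. */
enum event_type { EVENT_BLOCK_IO, EVENT_WATCHDOG };

struct block_io_event { const char *alias; };
struct watchdog_event { int action; };

union event {
    struct block_io_event blockio;
    struct watchdog_event watchdog;
    char padding[1024];         /* reserve room for future members */
};

typedef int (*event_cb)(int type, const union event *ev, void *opaque);

/* What the example handler remembers about the last event it saw. */
struct last_seen {
    int type;
    const char *alias;
    int action;
};

/* One generic handler: the event type selects which union member is valid. */
int record_event(int type, const union event *ev, void *opaque)
{
    struct last_seen *seen = opaque;

    assert(ev != NULL && seen != NULL);
    seen->type = type;
    switch (type) {
    case EVENT_BLOCK_IO:
        seen->alias = ev->blockio.alias;
        return 0;
    case EVENT_WATCHDOG:
        seen->action = ev->watchdog.action;
        return 0;
    default:
        return -1;              /* unknown event type */
    }
}
```

Reading a member that does not match the event type is not caught by the
compiler, so the handler has to trust the type argument.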
There is one final question unrelated to these 4 options. For the lifecycle
events we always registered against the 'virConnectPtr' since that is
needed to capture 'domain created' events where there's no virDomainPtr
to register a callback against yet.
Do we want to always register all events against the virConnectPtr, and
then pass a 'virDomainPtr' as a parameter to the callbacks as needed? Or
should we allow registering events against the virDomainPtr directly?
The latter might make it simpler to map libvirt into GLib/GObjects event
system in the future.
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
14 years, 9 months
[libvirt] emulation failed (pagetable) rip
by Adam Mooz
Good morning list,
I run a few VMs using libvirt on a C2Q 6600. They have been running steadily and happily since November. I woke on Thursday to find that one of the VMs had crashed: I couldn't log in, and it wasn't responding to "virsh shutdown" commands. Now when I try to start it, it comes up as 'running' according to virsh but doesn't even request an IP from my DHCP server. I've enabled info logging, and the log output looks like this:
> Mar 14 09:52:55 Godfather kernel: [24816.021257] __ratelimit: 932597 callbacks suppressed
> Mar 14 09:52:55 Godfather kernel: [24816.021262] emulation failed (pagetable) rip 7f2019b7aafb 66 0f 7f 07
> Mar 14 09:52:55 Godfather kernel: [24816.021297] emulation failed (pagetable) rip 7f2019b7aafb 66 0f 7f 07
> Mar 14 09:52:55 Godfather kernel: [24816.021321] emulation failed (pagetable) rip 7f2019b7aafb 66 0f 7f 07
> Mar 14 09:52:55 Godfather kernel: [24816.021344] emulation failed (pagetable) rip 7f2019b7aafb 66 0f 7f 07
Any ideas on fixing this?
-----------------------------------------------------------------
Adam Mooz
Adam.Mooz(a)gmail.com
http://www.AdamMooz.com
14 years, 9 months
[libvirt] [PATCH] security: Set permissions for kernel/initrd
by Cole Robinson
Fixes URL installs when running virt-install as root on Fedora.
Signed-off-by: Cole Robinson <crobinso(a)redhat.com>
---
src/qemu/qemu_security_dac.c | 21 +++++++++++++++++++++
src/security/security_selinux.c | 16 ++++++++++++++++
2 files changed, 37 insertions(+), 0 deletions(-)
diff --git a/src/qemu/qemu_security_dac.c b/src/qemu/qemu_security_dac.c
index 6911f48..1883fbe 100644
--- a/src/qemu/qemu_security_dac.c
+++ b/src/qemu/qemu_security_dac.c
@@ -332,6 +332,15 @@ qemuSecurityDACRestoreSecurityAllLabel(virDomainObjPtr vm)
vm->def->disks[i]) < 0)
rc = -1;
}
+
+ if (vm->def->os.kernel &&
+ qemuSecurityDACRestoreSecurityFileLabel(vm->def->os.kernel) < 0)
+ rc = -1;
+
+ if (vm->def->os.initrd &&
+ qemuSecurityDACRestoreSecurityFileLabel(vm->def->os.initrd) < 0)
+ rc = -1;
+
return rc;
}
@@ -356,6 +365,18 @@ qemuSecurityDACSetSecurityAllLabel(virDomainObjPtr vm)
return -1;
}
+ if (vm->def->os.kernel &&
+ qemuSecurityDACSetOwnership(vm->def->os.kernel,
+ driver->user,
+ driver->group) < 0)
+ return -1;
+
+ if (vm->def->os.initrd &&
+ qemuSecurityDACSetOwnership(vm->def->os.initrd,
+ driver->user,
+ driver->group) < 0)
+ return -1;
+
return 0;
}
diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c
index b2c8581..975b315 100644
--- a/src/security/security_selinux.c
+++ b/src/security/security_selinux.c
@@ -616,6 +616,14 @@ SELinuxRestoreSecurityAllLabel(virDomainObjPtr vm)
rc = -1;
}
+ if (vm->def->os.kernel &&
+ SELinuxRestoreSecurityFileLabel(vm->def->os.kernel) < 0)
+ rc = -1;
+
+ if (vm->def->os.initrd &&
+ SELinuxRestoreSecurityFileLabel(vm->def->os.initrd) < 0)
+ rc = -1;
+
return rc;
}
@@ -736,6 +744,14 @@ SELinuxSetSecurityAllLabel(virDomainObjPtr vm)
return -1;
}
+ if (vm->def->os.kernel &&
+ SELinuxSetFilecon(vm->def->os.kernel, default_content_context) < 0)
+ return -1;
+
+ if (vm->def->os.initrd &&
+ SELinuxSetFilecon(vm->def->os.initrd, default_content_context) < 0)
+ return -1;
+
return 0;
}
--
1.6.6.1
14 years, 9 months