[Libvir] RFC: Supporting serial & parallel ports for QEMU (and improving Xen)
by Daniel P. Berrange
One user's feature request for our QEMU driver is to support serial ports.
Easy you might think... you'd be wrong :-)
On Xen we have a very simple approach. When creating a guest, simply add
<console/>
and it'll cause a serial port to be set up with an autoallocated pty, so
you get back
<console pty='/dev/pts/5'/>
In retrospect calling it 'console' was dumb, but hey, we're stuck with that
now and it's only a tiny naming issue, so I don't mind really.
We can do the same thing with QEMU quite easily. QEMU, however, supports
many, many more ways of hooking up the serial port to Dom0. Indeed, so
does Xen fullvirt, but we don't expose this ability. Parallel ports have
identical config syntax to serial ports, so we might as well deal with
both at once.
So, here's a stripped down version of the QEMU docs:
[quote http://fabrice.bellard.free.fr/qemu/qemu-doc.html#SEC10]
`-serial dev'
Redirect the virtual serial port to host character device dev. The
default device is vc in graphical mode and stdio in non graphical
mode. This option can be used several times to simulate up to 4
serials ports. Use -serial none to disable all serial ports.
Available character devices are:
vc
Virtual console
pty
[Linux only] Pseudo TTY (a new PTY is automatically allocated)
none
No device is allocated.
null
void device
/dev/XXX
[Linux only] Use host tty, e.g. `/dev/ttyS0'. The host serial port
parameters are set according to the emulated ones.
/dev/parportN
[Linux only, parallel port only] Use host parallel port N. Currently
only SPP parallel port features can be used.
file:filename
Write output to filename. No character can be read.
stdio
[Unix only] standard input/output
pipe:filename
name pipe filename
COMn
[Windows only] Use host serial port n
udp:[remote_host]:remote_port[@[src_ip]:src_port]
This implements UDP Net Console.
tcp:[host]:port[,server][,nowait][,nodelay]
The TCP Net Console has two modes of operation.
telnet:host:port[,server][,nowait][,nodelay]
The telnet protocol is used instead of raw tcp sockets.
unix:path[,server][,nowait]
A unix domain socket is used instead of a tcp socket.
[/quote]
I don't see any reason not to support all/most of these options. The thing
I don't like here is the /dev/XXX vs /dev/parportN vs COMn differences
for guest <-> host passthrough of the devices. I figure it could
be simpler if the port was just represented as 'n' and we'd translate that
to /dev/ttyS[n] or /dev/parport[n] or COM[n] as needed.
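To make the 'n' translation concrete, here is a rough sketch in C. The enum
names and host_port_name() are made up for illustration and are not libvirt
code. It applies no offset to 'n', which is an assumption: whether port 0
should map to COM0 or COM1 on Windows would still need deciding.

```c
#include <stdio.h>

/* Hypothetical types for illustration only; not part of libvirt. */
enum chr_kind { CHR_SERIAL, CHR_PARALLEL };
enum host_os  { HOST_LINUX, HOST_WINDOWS };

/*
 * Write the host device name for port number n into buf.
 * Returns 0 on success, -1 if the combination is not covered by the
 * QEMU docs quoted above or if buf is too small.
 */
static int host_port_name(enum chr_kind kind, enum host_os os,
                          unsigned int n, char *buf, size_t len)
{
    int rc;

    if (os == HOST_LINUX && kind == CHR_SERIAL)
        rc = snprintf(buf, len, "/dev/ttyS%u", n);
    else if (os == HOST_LINUX && kind == CHR_PARALLEL)
        rc = snprintf(buf, len, "/dev/parport%u", n);
    else if (os == HOST_WINDOWS && kind == CHR_SERIAL)
        rc = snprintf(buf, len, "COM%u", n);
    else
        return -1;

    /* snprintf reports the length it wanted; treat truncation as failure */
    return (rc < 0 || (size_t)rc >= len) ? -1 : 0;
}
```

With this, <console type='host' port='3'/> on a Linux host would end up as
/dev/ttyS3 on the QEMU command line.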
The question, as ever, is how to represent this in XML. For serial ports we'll
stick with '<console>', while for parallel ports we might as well use the
better named '<parallel>'. Next up, I think we should use a 'type' attribute
on this element to determine the main way of connecting the device, and then
more type-specific attributes or sub-elements as needed. If 'type' is not
specified, then use a default of 'pty', since that gives compatibility with
existing practice.
As an illustrative example
/*
* Parse the XML definition for a character device
*
* Top level node will be <console> or <parallel>, but all attributes
* and sub-elements are identical.
*
* type=vc|pty|null|host|file|pipe|udp|tcp|telnet, default is pty
*
* <console type='vc'/>
*
* <console type='pty' pty='/dev/pts/3'/>
*
* <console type='null'/>
*
* <console type='host' port='3'/>
*
* <console type='file' path='/some/file'/>
*
* <console type='pipe' path='/some/file'/>
*
* <console type='udp'>
* <sendto port='12356'/>
* </console>
*
* <console type='udp'>
* <sendto addr='127.0.0.1' port='12356'/>
* </console>
*
* <console type='udp'>
* <sendto addr='127.0.0.1' port='12356'/>
* <bind port='12356'/>
* </console>
*
* <console type='udp'>
* <sendto addr='127.0.0.1' port='12356'/>
* <bind addr='127.0.0.1' port='12356'/>
* </console>
*
* <console type='tcp'>
* <listen port='12356'/>
* </console>
*
* <console type='tcp'>
* <listen addr='127.0.0.1' port='12356'>
* <nowait/>
* <nodelay/>
* </listen>
* </console>
*
* <console type='tcp'>
* <connect addr='127.0.0.1' port='12356'>
* <nodelay/>
* </connect>
* </console>
*
* <console type='telnet'>
* <listen addr='127.0.0.1' port='12356'/>
* </console>
*
* <console type='telnet'>
* <connect addr='127.0.0.1' port='12356'/>
* </console>
*
*/
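As a rough illustration of how the tcp form above could be turned into the
QEMU argument syntax quoted earlier (tcp:[host]:port[,server][,nowait][,nodelay]),
here is a small C sketch. The struct and function names are hypothetical and
only model the <listen>/<connect> examples; this is not the driver code.

```c
#include <stdio.h>

/* Hypothetical in-memory form of <console type='tcp'>; not a libvirt struct. */
struct tcp_chr {
    const char *addr;     /* hostname or IP address; NULL if omitted */
    unsigned int port;
    int listen;           /* 1 for <listen>, 0 for <connect> */
    int nowait;           /* <nowait/> present (only used when listening) */
    int nodelay;          /* <nodelay/> present */
};

/* Format the value for QEMU's -serial option. Returns 0, or -1 on truncation. */
static int tcp_chr_to_arg(const struct tcp_chr *c, char *buf, size_t len)
{
    int rc = snprintf(buf, len, "tcp:%s:%u%s%s%s",
                      c->addr ? c->addr : "",
                      c->port,
                      c->listen ? ",server" : "",
                      (c->listen && c->nowait) ? ",nowait" : "",
                      c->nodelay ? ",nodelay" : "");

    return (rc < 0 || (size_t)rc >= len) ? -1 : 0;
}
```

The telnet variant would presumably differ only in the "telnet:" prefix.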
BTW, the udp, tcp and telnet options are only available on QEMU >= 0.9.0. We
already have the ability to detect / validate that for both the Xen & QEMU
drivers.
NB, wherever there are IP addresses, hostnames can be used too, hence I
call the attribute 'addr' instead of 'ip'.
Regards,
Dan.
--
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|
[Libvir] [PATCH] Enhanced stats for fullvirt domains
by Richard W.M. Jones
This patch does a couple of primary things:
Firstly it allows you to use "hda", etc. as a path for getting block
device stats from fullvirt domains.
Secondly it separates out the stats code into a new file called
'stats_linux.c'. The reasoning behind the name is that this code can be
shared between Xen & QEMU, and that the code is Linux-specific (it never
worked on Solaris, but now this is explicit). I anticipate a
'stats_solaris.c' file once I can get Solaris + Xen going on a test machine.
Also we try to detect the case where the block dev stats of a fullvirt
domain are stuck at 0 -- caused by there being no frontend driver
connected. We detect the condition by a query to xenstore.
XENVBD_MAJOR is no longer hard-coded if we can get it instead from Linux
header files.
This patch adds bytes written/read to block devices for Xen PV domains
if available.
This also corrects a bug where stats from xvdb, xvdc, .. could not be
read out because the device number was being miscalculated.
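For readers unfamiliar with the xvd numbering, the following C sketch shows
one way the device number can be computed. It assumes the conventional Linux
layout (major 202 for xvd devices, 16 minors per disk, device number =
major << 8 | minor). The function name and the fallback macro are
illustrative and are not taken from the patch.

```c
#include <string.h>

/* Assumed fallback when the Linux headers do not provide XENVBD_MAJOR. */
#define SKETCH_XENVBD_MAJOR 202

/*
 * Map "xvdX" or "xvdXN" (X = a..p, N = 1..15) to a Xen device number.
 * Returns -1 if the name does not follow that pattern.
 */
static int xvd_device_number(const char *name)
{
    const char *p;
    int disk, part = 0;

    if (strncmp(name, "xvd", 3) != 0)
        return -1;
    p = name + 3;
    if (*p < 'a' || *p > 'p')          /* 16 disks x 16 minors = 256 minors */
        return -1;
    disk = *p - 'a';
    p++;

    while (*p >= '0' && *p <= '9') {   /* optional partition number */
        part = part * 10 + (*p - '0');
        if (part > 15)
            return -1;
        p++;
    }
    if (*p != '\0')
        return -1;

    /* each disk owns a block of 16 minors, so xvdb starts at minor 16 */
    return (SKETCH_XENVBD_MAJOR << 8) | (disk * 16 + part);
}
```

Under these assumptions xvda maps to 51712 and xvdb to 51728; forgetting the
"* 16" step is the kind of mistake that would break everything after xvda.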
Rich.
--
Emerging Technologies, Red Hat - http://et.redhat.com/~rjones/
Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod
Street, Windsor, Berkshire, SL4 1TE, United Kingdom. Registered in
England and Wales under Company Registration No. 03798903
[Libvir] problems using libvirt to manage "defined" domains.
by Evan Bigall
I'm having a problem using libvirt to manage "defined" domains (ie:
domains for which it has XML, but are not running.)
I have tried several debugging scenarios, but the simplest one is this.
In a libvirt C program I define the following XML:
char domxml[]="<domain type='xen' id='99'>"
"<name>rhel5vm</name>"
"<uuid>f948d7f4f36c4bfae8053bca535183b3</uuid>"
"<os>"
"<type>linux</type>"
"<kernel>/var/lib/xen/boot/rhel5vm/vmlinuz-2.6.18-8.el5xen</kernel>"
"<initrd>/var/lib/xen/boot/rhel5vm/initrd-2.6.18-8.el5xen.img</initrd>"
"<root>/dev/xvda1</root>"
"</os>"
"<memory>512000</memory>"
"<vcpu>1</vcpu>"
"<on_poweroff>destroy</on_poweroff>"
"<on_reboot>restart</on_reboot>"
"<on_crash>restart</on_crash>"
"<devices>"
"<interface type='bridge'>"
"<source bridge='xenbr0'/>"
"<mac address='00:16:3e:09:ef:b0'/>"
"<script path='vif-bridge'/>"
"</interface>"
"<disk type='block' device='disk'>"
"<driver name='phy'/>"
"<source dev='/dev/sda3'/>"
"<target dev='xvda'/>"
"</disk>"
"<console tty='/dev/pts/2'/>"
"</devices>"
"</domain>";
(I've played with several values for the domain id).
What I find is that when I use virDomainDefineXML() to define the domain,
for some reason it drops the "root" element from the "os" specification.
Because of this the domain then crashes when I try to start it.
I've also tried using virsh to create the domain from an XML file,
capturing the XML with virDomainGetXMLDesc(), pausing the C program, and
then trying to redefine the domain with the exact XML
virDomainGetXMLDesc() gave me (after destroying the original of course).
I get the same behavior: it strips the root element from the OS
specification.
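A quick way to confirm this from a test program is to check the XML string
that comes back for a <root> element inside <os>. The helper below is only a
string-level sketch (the name os_has_root() is made up, and it does no real
XML parsing), but it is enough to compare the description before and after
the define step.

```c
#include <string.h>

/*
 * Return 1 if a <root> element appears between <os> and </os> in the
 * given domain XML, 0 otherwise. Plain substring search, so it assumes
 * the tags are written without attributes, as in the XML above.
 */
static int os_has_root(const char *xml)
{
    const char *os = strstr(xml, "<os>");
    const char *end;
    const char *root;

    if (os == NULL)
        return 0;
    end = strstr(os, "</os>");
    if (end == NULL)
        return 0;
    root = strstr(os, "<root>");
    return root != NULL && root < end;
}
```

Calling it on domxml and then on the string returned by virDomainGetXMLDesc()
for the defined domain should show where the element goes missing.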
Here is the version information:
[root@rhel5-xen tmp]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5 (Tikanga)
[root@rhel5-xen tmp]# uname -a
Linux rhel5-xen 2.6.18-8.1.8.el5xen #1 SMP Mon Jun 25 17:19:38 EDT 2007
x86_64 x86_64 x86_64 GNU/Linux
[root@rhel5-xen tmp]# virsh version
Compiled against library: libvir 0.3.1
Using library: libvir 0.3.1
Using API: Xen 3.0.1
Running hypervisor: Xen 3.0.0
I am using rhel5 "out of the box" with libvirt upgraded to:
libvirt-0.3.1-1.x86_64.rpm
libvirt-devel-0.3.1-1.x86_64.rpm
libvirt-python-0.3.1-1.x86_64.rpm
I tried using libvirt-0.3.2-1, but sample code segv'd in
virConnectClose().
All I'm really trying to do is build a primitive interface to list domains
(both running, and non-running), and start and stop them. I'm open to
suggestions as to whether I have this misconfigured or if there would be a
better version with which to do this.
Thanks,
Evan
[Libvir] Thoughts on remote storage support
by Tóth István
Hello!
I had a hobby project where I needed to manipulate Xen disk images on
remote systems, and which used a model similar to libvirt's remote support.
Based on what I learnt from it, I came up with a possible model for
libvirt's remote storage support. I present it here for discussion.
We typically store the images in volumes on LVM, or in dedicated file
system folders.
The folders and volume groups usable by libvirt can be limited in a
config file.
It is probably not necessary to differentiate between defined and
created files, as you cannot stop and start a file like a domain; you
either have it on disk or not.
Libvirt should not store information on these files; everything should
be checked/listed on the fly, so that if you just copy an image to a
directory, libvirt can deduce all information (well, all it can) on it,
and handle it just as if the file was created by it.
The handle for the file is its path, plus its virConnect object (i.e.
the host it is on). For consistency, it may be possible to create an
object for it, but as disk images have no persistent properties apart
from what is on the disk, and it can always be checked from there, it
provides no extra functionality.
I think there is no need to support remote files explicitly, as the
domains mount local files/volumes. The file/volume may actually be
mounted from a NAS or SAN, of course, but it does not matter because we
use the local path names, and AFAIK all virtualization tools use local
files or local devices as blockdevs.
I have added compression to the mix because it is immensely useful.
I have used lzop in my project, and a full backup and restore was much
faster when using a compressed backup file than with an uncompressed
one. It conserves disk space, as well as CPU/bus capacity.
Zeroing out newly allocated files helps with compressed backups, as
well as security. It also means that no holey (sparse) files can be used.
The objects we are dealing with are disk images.
They have the following properties:
-Path: The unix path of the file ( /mnt/images/fc7.img or /dev/VG/fc7)
-Compression: Mountable/compressed
-Type: Plain file/LVM volume/ What else?
-Size
-Filesystem: swap/ext3/xfs/....
-Is it mounted?
We can do the following operations on the images:
Create
-connection
-filepath
-size
Allocates a new image of the given type, size, and name.
Libvirt should parse the filepath, and determine the base path, check if
it's a directory or a VG, check if libvirt is allowed to operate on the
path/VG, then create the file/volume.
For security reasons zeroing out the allocated space should be a
non-optional step of the allocation.
DirectoryList
-connection
-directorypath
Plain ls functionality, that returns the list of files, and any
subdirectories. If called on a VG, it returns the volumes in it.
Info
-connection
-filepath
Returns information on the given file/volume, including size, type,
filesystem (if available), whether it is a snapshot (if a volume), and
whether it is mounted or not.
Size can be determined by ls or lvinfo, filesystem by the 'file' command.
Delete
-connection
-filepath
Delete the file/volume. Find out if it's a file or volume, and rm or
lvremove it.
Grow
-connection
-path
-filename
-newsize
Grows the specified image to the given size. The newly added space is zeroed
out.
Shrink?
-connection
-path
-filename
-newsize
Shrinks the specified image to the given size. It's very tricky,
because to avoid data loss we need to analyze the file system size. Of
course, we can just say that it's the responsibility of the user; after
all, we allow him to outright delete the file as well.
We may combine it with Grow, and call it Resize.
Growfs
-connection
-path
-filename
-newsize?
Grows the filesystem on the image to fill the size of the image, or the
given newsize.
Can only be used on unmounted images.
It is necessary because some filesystems may not be grown while mounted,
so the guest cannot do it on its own.
Shrinkfs
-connection
-path
-filename
-newsize
Shrinks the filesystem to the given size. The same considerations apply as
with Growfs; may be combined into Resizefs?
Snapshot
-connection
-filepath
-filesize
Creates a snapshot of the given image. It is only possible with images
on LVM. Should return the snapshot image name. The snapshot can later be
deleted with Delete.
CopyTo
-connection
-source filepath
-target connection
-target filepath
-snapshot flag
-archive flag
-overwrite flag
Copy the source image to the target image.
If connection is on another machine, then it's a network copy.
If the snapshot flag is true, then first create a snapshot of the source
image, copy that, then delete the snapshot.
If the archive flag is active, then the target file will be an archive file
(compressed).
If the overwrite flag is active, then the target file is overwritten if
it exists. Otherwise existing files are not changed.
Even if the source file is compressed, the target file is uncompressed,
unless the archive flag is set.
CopyContents
-connection
-source filepath
-target connection
-target filepath
-snapshot flag
Copies the contents of the source file to the target file. The target
file must already exist, and be no smaller than the source file. The
contents of the target file are overwritten, and any extra space is
zeroed out.
Archive
-connection
-filepath
Compresses the given file. Makes sense only on files, not volumes.
Unarchive
-connection
-filepath
Uncompresses the given file. Makes sense only on archived (compressed) files.
StorageInfo
- connection
Returns information on the node's storage configuration. What kind of
filesystems it can handle, What are the accessible file / VG paths,
what's the free space on them, etc.
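The "check if libvirt is allowed to operate on the path" step mentioned for
Create (and implied for the other operations) could look roughly like the C
sketch below. The function names are hypothetical, the allowed base would
come from the config file, and the check is purely textual: it does not
resolve symlinks, which a real implementation would have to consider.

```c
#include <string.h>

/* Return 1 if path contains a ".." component, e.g. "/a/../b". */
static int has_dotdot(const char *path)
{
    const char *p = path;

    while ((p = strstr(p, "..")) != NULL) {
        int at_start = (p == path) || (p[-1] == '/');
        int at_end = (p[2] == '/') || (p[2] == '\0');

        if (at_start && at_end)
            return 1;
        p += 2;
    }
    return 0;
}

/*
 * Return 1 if path names something strictly below the allowed base
 * directory (or VG path), 0 otherwise.
 */
static int path_allowed(const char *path, const char *base)
{
    size_t n = strlen(base);

    while (n > 1 && base[n - 1] == '/')
        n--;                           /* ignore trailing slashes on base */
    if (n == 0 || has_dotdot(path))
        return 0;
    if (strncmp(path, base, n) != 0)
        return 0;
    /* the match must end on a component boundary and leave a name */
    return path[n] == '/' && path[n + 1] != '\0';
}
```

The component-boundary test is what keeps /mnt/images2/x from being accepted
when only /mnt/images is allowed.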
A typical usage scenario could be something like this:
Aconn = getVirconn("ssh:Ahost"); // Open the connection to host A
Bconn = getVirconn("ssh:Bhost"); // Host B will hold our backup image
Ainfo = StorageInfo(Aconn); // Get the storage configuration of host A
AVGPath = <get the first usable VG path from Ainfo>
// A 100 MB volume is created and zeroed out
NewImage = Create(Aconn, concat(AVGPath, "newimage"), 100000);
// Copy our pre-created FC7 image into the new image
CopyContents(Aconn, "/images/ghost/FC7default.img", Aconn, NewImage,
             snapshot=no);
// Grow the copied filesystem to fill the whole volume
Growfs(Aconn, NewImage, 0);
<Here we define a new domain, and use NewImage as the name of the
backing image for the guest's block device>
<Start the new domain>
// Make an LVM snapshot of NewImage and copy it to host B under the given
// filename, compressing it on the fly, then remove the snapshot
CopyTo(Aconn, NewImage, Bconn, "/mnt/backups/backup23image", snapshot=yes,
       archive=yes, overwrite=no);
<Stop the domain>
// Restore the backed-up image to NewImage, decompressing it on the fly
CopyContents(Bconn, "/mnt/backups/backup23image", Aconn, NewImage,
             snapshot=no);
<Start the domain>
<Stop the domain>
Grow(Aconn, NewImage, 200000); // Grow the volume to 200 MB
Growfs(Aconn, NewImage); // Grow the fs on the volume to fill the volume
<Start the domain>
.....
Best regards
István
[Libvir] PATCH: Updated patches for PolicyKit support
by Daniel P. Berrange
A few weeks back I posted some prototype patches for PolicyKit support to
allow the main libvirt daemon socket to be made world-accessible. PolicyKit
can then do ACLs on incoming connections, allowing definition of rules which
could, for example, allow only the user who owns the active X login session
to connect:
http://www.redhat.com/archives/libvir-list/2007-August/msg00027.html
This is an updated patch which takes account of a change in the PolicyKit
XML file syntax between 0.4 and 0.5 releases.
The configure.in script has been tweaked to automatically disable PolicyKit
if pkg-config is not available instead of aborting.
The code for getting UNIX socket credentials has been factored out into its
own method. There is still only a Linux implementation. I was going to take
the code for other OSes from DBus, but DBus is currently under GPL/Academic
license options, which are not compatible with the LGPL. Fortunately DBus is
in the middle of re-licensing to X11 style, which is LGPL compatible, so in a
week or so's time we'll be able to safely take their OS portability code for
socket credentials.
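For reference, the Linux-only way of getting the peer's credentials on a
UNIX socket is the SO_PEERCRED socket option. The following is a minimal
sketch of that technique, not the code from the patch; the function name
peer_credentials() is made up here.

```c
#define _GNU_SOURCE             /* needed for struct ucred on glibc */
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Fetch the PID and UID of the process at the other end of a connected
 * UNIX domain socket. Linux-specific. Returns 0 on success, -1 on error.
 */
static int peer_credentials(int fd, pid_t *pid, uid_t *uid)
{
    struct ucred cr;
    socklen_t len = sizeof(cr);

    if (getsockopt(fd, SOL_SOCKET, SO_PEERCRED, &cr, &len) < 0)
        return -1;
    if (len != sizeof(cr))
        return -1;

    *pid = cr.pid;
    *uid = cr.uid;
    return 0;
}
```

The daemon can then short-circuit on uid == 0 or hand the PID to PolicyKit.
Other platforms need a different mechanism, which is where the DBus
portability code would come in.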
I short-circuit the logic to always allow root. This allows existing people
running libvirt tools as root to continue without any regressions. There
is one small issue still: the default policy I provide only allows the
use of read-only connections if the user is logged in to the desktop. This
is a partial regression - the admin can edit /etc/PolicyKit/PolicyKit.conf
and add a site-local rule allowing all users access, regardless of whether
they're in a session. I've spoken with David Zeuthen and he's going to add
the ability to specify rules for non-session clients in the default policy
config files, which will fix this minor regression. Once this is done the
libvirt default policy will be identical to the current file permission based
policy (root == full access, non-root == read only).
As I mentioned previously, with this change it is now possible to open a full
read-write connection from virt-manager running as non-root. Depending on
site policy it will optionally prompt for root password (su style equiv) or
the user's password (sudo style equiv) without needing virt-manager itself
to gain any elevated privileges.
When compiling with PolicyKit support, the default file permissions for both
the main & readonly UNIX sockets in the daemon switch to 0777, instead of
the previous 0700 & 0777. It is possible to turn off PolicyKit auth in the
daemon config file, even if it is compiled in - in which case the default
permissions get set back to 0700 & 0777.
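Spelled out as code, the default-permission rule reads something like the
sketch below. The function and parameter names are hypothetical; only the
mode values come from the description above.

```c
#include <sys/types.h>

/*
 * Pick default modes for the read-write and read-only daemon sockets.
 * polkit_built:   PolicyKit support was compiled in
 * polkit_enabled: PolicyKit auth is switched on in the daemon config
 */
static void default_socket_modes(int polkit_built, int polkit_enabled,
                                 mode_t *rw_mode, mode_t *ro_mode)
{
    /* the read-only socket is world-accessible either way */
    *ro_mode = 0777;

    /* the read-write socket only opens up when PolicyKit guards it */
    *rw_mode = (polkit_built && polkit_enabled) ? 0777 : 0700;
}
```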
Although in previous feedback Daniel suggested I leave the LIBVIRTD_AUTH_POLKIT
constant compiled in all the time, I feel it is better to remove it when the
PolicyKit support is disabled in configure. This removes the need to have
extra switch/case statements to explicitly reject LIBVIRTD_AUTH_POLKIT auth,
since it will be handled by the 'default:' statement, which already has code
to reject connections.
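The shape of that argument, as a small C sketch: the enum, the
WITH_POLKIT_SKETCH macro and auth_check() are stand-ins invented for this
example, not the daemon's real names.

```c
/* Stand-in for the daemon's auth constants; names are illustrative. */
enum auth_type {
    AUTH_NONE = 0
#ifdef WITH_POLKIT_SKETCH
    , AUTH_POLKIT = 1     /* only exists when PolicyKit is compiled in */
#endif
};

/* Returns 0 if the requested auth type may proceed, -1 to reject. */
static int auth_check(int auth)
{
    switch (auth) {
    case AUTH_NONE:
        return 0;
#ifdef WITH_POLKIT_SKETCH
    case AUTH_POLKIT:
        return 0;         /* the real PolicyKit check would go here */
#endif
    default:
        /* unknown or compiled-out auth types all land here */
        return -1;
    }
}
```

Without the macro defined, a client asking for value 1 is rejected by the
default branch with no extra case needed.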
I've done more extensive testing with virt-manager since my previous patch,
and it's working very nicely with the new UI which allows multiple hypervisor
connections. Instead of asking for the root password up-front at app start
time, we now only need to ask for it if the user connects to a local HV. If
they only ever manage remote connections we don't need to do anything with
the local root password.
Dan.
--
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|
[Libvir] Release of libvirt-0.3.3
by Daniel Veillard
More than a month without a release ... it was clearly time to push the
bits out. Available as usual from
ftp://libvirt.org/libvirt/
A lot of new things in the last month:
* New features:
- Avahi mDNS daemon export (Daniel Berrange)
- NUMA support (Beth Kon)
* Documentation:
- cleanups (Toth Istvan)
- typos (Eduardo Pereira)
* Bug fixes:
- memory corruption on large dumps (Masayuki Sunou)
- fix virsh vncdisplay command exit (Masayuki Sunou)
- Fix network stats TX/RX result (Richard Jones)
- warning on Xen 3.0.3 (Richard Jones)
- missing buffer check in virDomainXMLDevID (Hugh Brock)
- avoid zombies when using remote (Daniel Berrange)
- xend connection error message (Richard Jones)
- avoid ssh tty prompt (Daniel Berrange)
- username handling for remote URIs (Fabian Deutsch)
- fix potential crash on multiple input XML tags (Daniel Berrange)
- Solaris Xen hypercalls fixup (Mark Johnson)
* Improvements:
- OpenVZ support (Shuveb Hussain and Anoop Cyriac)
- CD-Rom reload on Xen (Hugh Brock)
- PXE boot for QEmu/KVM (Daniel Berrange)
- QEmu socket permissions customization (Daniel Berrange)
- more QEmu support (Richard Jones)
- better path detection for qemu and dnsmasq (Richard Jones)
- QEmu flags are per-Domain (Daniel Berrange)
- virsh freecell command
- Solaris portability fixes (Mark Johnson)
- default bootloader support (Daniel Berrange)
- new virNodeGetFreeMemory API
- vncpasswd extraction in configuration files if secure (Mark Johnson and Daniel Berrange)
- Python bindings for block and interface statistics
* Code cleanups:
- virDrvOpenRemoteFlags definition (Richard Jones)
- configure tests and output (Daniel Berrange)
Thanks to everybody who helped with suggestions, bug reports, patches and more!
Daniel
--
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard | virtualization library http://libvirt.org/
veillard(a)redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
[Libvir] PATCH: vncpasswd
by Mark Johnson
This adds support for handling vncpasswd..
MRJ
vncpasswd support
diff --git a/src/xend_internal.c b/src/xend_internal.c
--- a/src/xend_internal.c
+++ b/src/xend_internal.c
@@ -1657,11 +1657,14 @@ xend_parse_sexp_desc(virConnectPtr conn,
} else if (tmp && !strcmp(tmp, "vnc")) {
int port = xenStoreDomainGetVNCPort(conn, domid);
const char *listenAddr = sexpr_node(node, "device/vfb/vnclisten");
+ const char *vncPasswd = sexpr_node(node, "device/vfb/vncpasswd");
const char *keymap = sexpr_node(node, "device/vfb/keymap");
virBufferVSprintf(&buf, " <input type='mouse' bus='%s'/>\n", hvm ? "ps2": "xen");
virBufferVSprintf(&buf, " <graphics type='vnc' port='%d'", port);
if (listenAddr)
virBufferVSprintf(&buf, " listen='%s'", listenAddr);
+ if (vncPasswd)
+ virBufferVSprintf(&buf, " passwd='%s'", vncPasswd);
if (keymap)
virBufferVSprintf(&buf, " keymap='%s'", keymap);
virBufferAdd(&buf, "/>\n", 3);
@@ -1723,6 +1726,7 @@ xend_parse_sexp_desc(virConnectPtr conn,
if (tmp[0] == '1') {
int port = xenStoreDomainGetVNCPort(conn, domid);
const char *listenAddr = sexpr_fmt_node(root, "domain/image/%s/vnclisten", hvm ? "hvm" : "linux");
+ const char *vncPasswd = sexpr_fmt_node(root, "domain/image/%s/vncpasswd", hvm ? "hvm" : "linux");
const char *keymap = sexpr_fmt_node(root, "domain/image/%s/keymap", hvm ? "hvm" : "linux");
/* For Xen >= 3.0.3, don't generate a fixed port mapping
* because it will almost certainly be wrong ! Just leave
@@ -1736,6 +1740,8 @@ xend_parse_sexp_desc(virConnectPtr conn,
virBufferVSprintf(&buf, " <graphics type='vnc' port='%d'", port);
if (listenAddr)
virBufferVSprintf(&buf, " listen='%s'", listenAddr);
+ if (vncPasswd)
+ virBufferVSprintf(&buf, " passwd='%s'", vncPasswd);
if (keymap)
virBufferVSprintf(&buf, " keymap='%s'", keymap);
virBufferAdd(&buf, "/>\n", 3);
[Libvir] PATCH: Allow empty bootloader
by Daniel P. Berrange
The latest XenD allows a default bootloader to be used if one is not defined
for a paravirt OS. libvirt doesn't currently cope with this (see the Solaris
thread), since it requires either a bootloader or a kernel to always be
present.
The attached patch allows an empty bootloader element to indicate that
the domain should use the default.
<domain type='xen' id='-1'>
<name>rhel5pv</name>
<uuid>614d7eb0-c6b1-5235-ac39-361b20f6cd49</uuid>
<bootloader/>
<os>
<type>linux</type>
</os>
...snip...
</domain>
The current API for looking up nodes in the SEXPR did not allow for the
scenario where you might want to look up a node with no content, e.g.
(bootloader )
This distinction is needed here, so I added an sexpr_has() method to check
for this.
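To illustrate the distinction (node present but empty vs node absent), here
is a toy C version that works on the raw SEXPR text. It is not the
sexpr_has() from the patch, which operates on the parsed tree; the toy_*
names are invented for this sketch.

```c
#include <string.h>

/* Find "(key " or "(key)" in the text; return a pointer just past key. */
static const char *toy_find(const char *sexpr, const char *key)
{
    size_t klen = strlen(key);
    const char *p = sexpr;

    while ((p = strchr(p, '(')) != NULL) {
        p++;
        if (strncmp(p, key, klen) == 0 &&
            (p[klen] == ' ' || p[klen] == ')'))
            return p + klen;
    }
    return NULL;
}

/* 1 if the node exists at all, even with no content. */
static int toy_has(const char *sexpr, const char *key)
{
    return toy_find(sexpr, key) != NULL;
}

/* 1 only if the node exists and carries some content. */
static int toy_has_value(const char *sexpr, const char *key)
{
    const char *p = toy_find(sexpr, key);

    if (p == NULL)
        return 0;
    while (*p == ' ')
        p++;
    return *p != ')' && *p != '\0';
}
```

A lookup that returns only the value cannot tell "(bootloader )" apart from
no bootloader node at all, which is exactly the case the new check covers.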
Second, it fixes a long-standing bug where we'd record an error about the
missing kernel or bootloader, but then continue generating (malformed) XML
anyway. With this patch, it will correctly fail if the empty bootloader
field is missing from the SEXPR.
Dan.
--
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|
[Libvir] [RFC][PATCH 0/2] Tested NUMA patches for available memory and topology
by beth kon
I have tested the patches on a NUMA and a non-NUMA configuration, and
they fundamentally appear to work. The first patch is for accessing
available memory on a per-node basis.
The second patch is for accessing NUMA node topology. I've gotten some
helpful suggestions about my string parsing code, introducing me to
sscanf :-). I've become convinced that there are more elegant ways to do
this which would have about the same level of error checking. However, I
am out of time if Daniel wants to check this in this week. So I offer
what I do have, which functions, but is not elegant. I have been playing
with other ways to do this and am not far from finished. So, Daniel, you
need to tell me if you want to take this code, and possibly upgrade to
something more compact later, or if you'd like to wait for the next
revision.
One point to comment on for posterity is that the string returned from
xend is not what might be expected. An example:
printf("string is %s\n", tempstr);
would print
string is node0:0\n node1:1
So, this means that the '\n' sent by the xend python code is somehow
translated to the two-character sequence "\n" (a backslash followed by 'n').
And the trailing '\n' that should be there isn't. Instead there is a '\0'.
Looking at the xend code, it would appear this string should look like:
"node0:0\n node1:1\n"
where these are real '\n' newline characters.
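A parser that tolerates both forms could be sketched with sscanf as below.
This is not the code from the patch. It assumes each entry is "node<N>:<cpu>"
with a single integer CPU, as in the example above; real xend output may
carry CPU lists or ranges, which this sketch does not handle. The function
name parse_topology() is made up.

```c
#include <stdio.h>

/*
 * Parse a string such as "node0:0\n node1:1" into cpus[node] = cpu.
 * Entries may be separated by spaces, real newlines, or the literal
 * two-character sequence backslash + 'n'.
 * Returns the number of entries parsed, or -1 on malformed input.
 */
static int parse_topology(const char *s, int *cpus, int max_nodes)
{
    int count = 0;

    while (*s != '\0') {
        int node, cpu, used = 0;

        if (*s == ' ' || *s == '\n') {       /* ordinary separators */
            s++;
            continue;
        }
        if (s[0] == '\\' && s[1] == 'n') {   /* literal "\n" pair */
            s += 2;
            continue;
        }

        /* %n records how many characters this entry consumed */
        if (sscanf(s, "node%d:%d%n", &node, &cpu, &used) != 2)
            return -1;
        if (node < 0 || node >= max_nodes)
            return -1;

        cpus[node] = cpu;
        count++;
        s += used;
    }
    return count;
}
```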
--
Elizabeth Kon (Beth)
IBM Linux Technology Center
Open Hypervisor Team
email: eak(a)us.ibm.com
[Libvir] Updated Solaris dom0 patch (part 1)
by Mark Johnson
I've broken this patch up into a few pieces to make
it more reviewable and tried to address the comments
from the previous patch (Jun 15th'ish if you're looking).
Here is the first part...
Tested on today's CVS bits on an FC7 dom0
(LD_PRELOAD=src/.libs/libvirt.so src/.libs/virsh)
One note: it looks like you still need a xend change on
FC7 before the no kernel/bootloader option works. It
looks like it's close though.. works fine with them
of course.
More details...
[root@fedora solaris]# cat guest.py
name = "solaris"
vcpus = 1
memory = "512"
#bootloader = "/usr/bin/pygrub"
#kernel = "/platform/i86xpv/kernel/unix"
#ramdisk = "/platform/i86pc/boot_archive"
extra = "-k"
root = "/dev/dsk/c0d0s0"
disk = ['file:/export/guests/solaris/disk.img,0,w']
vif = ['bridge=xenbr0']
on_shutdown = "destroy"
on_reboot = "restart"
on_crash = "destroy"
[root@fedora solaris]# xm list -l solaris
(domain
(on_crash destroy)
(uuid 72f9b45e-4be9-bf1f-a500-3707b9c3922c)
(bootloader_args )
(vcpus 1)
(name solaris)
(on_poweroff destroy)
(on_reboot restart)
(bootloader )
(maxmem 512)
(memory 512)
(shadow_memory 0)
(cpu_weight 256)
(cpu_cap 0)
(features )
(on_xend_start ignore)
(on_xend_stop ignore)
(image (linux (kernel ) (args 'root=/dev/dsk/c0d0s0 -k')))
(status 0)
(device (vif (bridge xenbr0) (uuid 9a8d5a2f-0a70-3a1f-fed0-fd2459e63733)))
(device
(vbd
(uuid a6ba2241-1019-151e-6dd7-28638d3e17b7)
(bootable 1)
(driver paravirtualised)
(dev 0)
(uname file:/export/guests/solaris/disk.img)
(mode w)
)
)
)
[root@fedora solaris]#
virsh # start solaris
Domain solaris started
virsh # dominfo solaris
Id: 5
Name: solaris
UUID: 72f9b45e-4be9-bf1f-a500-3707b9c3922c
OS Type: linux
State: no state
CPU(s): 1
CPU time: 2.1s
Max memory: 524288 kB
Used memory: 524288 kB
virsh # console solaris
libvir: Xen Daemon error : internal error domain information incomplete, missing kernel & bootloader
domain.xml:25: parser error : Opening and ending tag mismatch: os line 4 and domain
</domain>
^
domain.xml:26: parser error : Premature end of data in tag domain line 1
^
virsh # quit
[root@fedora solaris]# xm console solaris
Loading kmdb...
SunOS Release 5.11 Version matrix-devel-build 32-bit
Copyright 1983-2007 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
WARNING: Found xen v3.1.0-rc7-2934.fc7 but need xen v3.0.4-1-xvm
WARNING: The kernel may not function correctly
Hostname: unknown
Reading ZFS config: done.
Sep 28 18:48:53 unknown ntpdate[419]: no server suitable for synchronization found
unknown console login: root
Password:
Last login: Fri Sep 28 18:45:46 on console
Sep 28 18:50:30 unknown login: ROOT LOGIN /dev/console
Sun Microsystems Inc. SunOS 5.11 matrix-devel-build October 2007
bfu'ed from /net/girltalk/export/xvm/matrix-devel/archives/i386/nightly-nd on 2007-09-28
Sun Microsystems Inc. SunOS 5.11 xen-nv66-2007-06-24 October 2007
# poweroff
Sep 28 18:50:33 unknown poweroff: initiated by root on /dev/console
[root@fedora solaris]#
Thanks,
MRJ
---
Solaris dom0 support
diff --git a/src/xen_internal.c b/src/xen_internal.c
--- a/src/xen_internal.c
+++ b/src/xen_internal.c
@@ -56,9 +56,10 @@ typedef struct v0_hypercall_struct {
unsigned long op;
unsigned long arg[5];
} v0_hypercall_t;
+
+#ifdef __linux__
#define XEN_V0_IOCTL_HYPERCALL_CMD \
_IOC(_IOC_NONE, 'P', 0, sizeof(v0_hypercall_t))
-
/* the new one */
typedef struct v1_hypercall_struct
{
@@ -67,8 +68,12 @@ typedef struct v1_hypercall_struct
} v1_hypercall_t;
#define XEN_V1_IOCTL_HYPERCALL_CMD \
_IOC(_IOC_NONE, 'P', 0, sizeof(v1_hypercall_t))
-
typedef v1_hypercall_t hypercall_t;
+#elif defined(__sun__)
+typedef privcmd_hypercall_t hypercall_t;
+#else
+#error "unsupported platform"
+#endif
#ifndef __HYPERVISOR_sysctl
#define __HYPERVISOR_sysctl 35
@@ -314,6 +319,26 @@ typedef union xen_getschedulerid xen_get
dominfo.v2.handle : \
dominfo.v2d5.handle))
+
+static int
+lock_pages(void *addr, size_t len)
+{
+#ifdef __linux__
+ return (mlock(addr, len));
+#elif defined(__sun)
+ return (0);
+#endif
+}
+
+static int
+unlock_pages(void *addr, size_t len)
+{
+#ifdef __linux__
+ return (munlock(addr, len));
+#elif defined(__sun)
+ return (0);
+#endif
+}
struct xen_v0_getdomaininfolistop {
@@ -616,7 +641,17 @@ typedef struct xen_op_v2_dom xen_op_v2_d
#include "xen_unified.h"
#include "xen_internal.h"
-#define XEN_HYPERVISOR_SOCKET "/proc/xen/privcmd"
+#ifdef __linux__
+#define XEN_HYPERVISOR_SOCKET "/proc/xen/privcmd"
+#define HYPERVISOR_CAPABILITIES "/sys/hypervisor/properties/capabilities"
+#define CPUINFO "/proc/cpuinfo"
+#elif defined(__sun__)
+#define XEN_HYPERVISOR_SOCKET "/dev/xen/privcmd"
+#define HYPERVISOR_CAPABILITIES ""
+#define CPUINFO "/dev/cpu/self/cpuid"
+#else
+#error "unsupported platform"
+#endif
#ifndef PROXY
static const char * xenHypervisorGetType(virConnectPtr conn);
@@ -773,7 +808,7 @@ xenHypervisorDoV0Op(int handle, xen_op_v
hc.op = __HYPERVISOR_dom0_op;
hc.arg[0] = (unsigned long) op;
- if (mlock(op, sizeof(dom0_op_t)) < 0) {
+ if (lock_pages(op, sizeof(dom0_op_t)) < 0) {
virXenError(VIR_ERR_XEN_CALL, " locking", sizeof(*op));
return (-1);
}
@@ -783,7 +818,7 @@ xenHypervisorDoV0Op(int handle, xen_op_v
virXenError(VIR_ERR_XEN_CALL, " ioctl ", xen_ioctl_hypercall_cmd);
}
- if (munlock(op, sizeof(dom0_op_t)) < 0) {
+ if (unlock_pages(op, sizeof(dom0_op_t)) < 0) {
virXenError(VIR_ERR_XEN_CALL, " releasing", sizeof(*op));
ret = -1;
}
@@ -814,7 +849,7 @@ xenHypervisorDoV1Op(int handle, xen_op_v
hc.op = __HYPERVISOR_dom0_op;
hc.arg[0] = (unsigned long) op;
- if (mlock(op, sizeof(dom0_op_t)) < 0) {
+ if (lock_pages(op, sizeof(dom0_op_t)) < 0) {
virXenError(VIR_ERR_XEN_CALL, " locking", sizeof(*op));
return (-1);
}
@@ -824,7 +859,7 @@ xenHypervisorDoV1Op(int handle, xen_op_v
virXenError(VIR_ERR_XEN_CALL, " ioctl ", xen_ioctl_hypercall_cmd);
}
- if (munlock(op, sizeof(dom0_op_t)) < 0) {
+ if (unlock_pages(op, sizeof(dom0_op_t)) < 0) {
virXenError(VIR_ERR_XEN_CALL, " releasing", sizeof(*op));
ret = -1;
}
@@ -856,7 +891,7 @@ xenHypervisorDoV2Sys(int handle, xen_op_
hc.op = __HYPERVISOR_sysctl;
hc.arg[0] = (unsigned long) op;
- if (mlock(op, sizeof(dom0_op_t)) < 0) {
+ if (lock_pages(op, sizeof(dom0_op_t)) < 0) {
virXenError(VIR_ERR_XEN_CALL, " locking", sizeof(*op));
return (-1);
}
@@ -866,7 +901,7 @@ xenHypervisorDoV2Sys(int handle, xen_op_
virXenError(VIR_ERR_XEN_CALL, " sys ioctl ", xen_ioctl_hypercall_cmd);
}
- if (munlock(op, sizeof(dom0_op_t)) < 0) {
+ if (unlock_pages(op, sizeof(dom0_op_t)) < 0) {
virXenError(VIR_ERR_XEN_CALL, " releasing", sizeof(*op));
ret = -1;
}
@@ -898,7 +933,7 @@ xenHypervisorDoV2Dom(int handle, xen_op_
hc.op = __HYPERVISOR_domctl;
hc.arg[0] = (unsigned long) op;
- if (mlock(op, sizeof(dom0_op_t)) < 0) {
+ if (lock_pages(op, sizeof(dom0_op_t)) < 0) {
virXenError(VIR_ERR_XEN_CALL, " locking", sizeof(*op));
return (-1);
}
@@ -908,7 +943,7 @@ xenHypervisorDoV2Dom(int handle, xen_op_
virXenError(VIR_ERR_XEN_CALL, " ioctl ", xen_ioctl_hypercall_cmd);
}
- if (munlock(op, sizeof(dom0_op_t)) < 0) {
+ if (unlock_pages(op, sizeof(dom0_op_t)) < 0) {
virXenError(VIR_ERR_XEN_CALL, " releasing", sizeof(*op));
ret = -1;
}
@@ -936,7 +971,7 @@ virXen_getdomaininfolist(int handle, int
{
int ret = -1;
- if (mlock(XEN_GETDOMAININFOLIST_DATA(dominfos),
+ if (lock_pages(XEN_GETDOMAININFOLIST_DATA(dominfos),
XEN_GETDOMAININFO_SIZE * maxids) < 0) {
virXenError(VIR_ERR_XEN_CALL, " locking",
XEN_GETDOMAININFO_SIZE * maxids);
@@ -992,7 +1027,7 @@ virXen_getdomaininfolist(int handle, int
if (ret == 0)
ret = op.u.getdomaininfolist.num_domains;
}
- if (munlock(XEN_GETDOMAININFOLIST_DATA(dominfos),
+ if (unlock_pages(XEN_GETDOMAININFOLIST_DATA(dominfos),
XEN_GETDOMAININFO_SIZE * maxids) < 0) {
virXenError(VIR_ERR_XEN_CALL, " release",
XEN_GETDOMAININFO_SIZE * maxids);
@@ -1679,7 +1714,7 @@ virXen_setvcpumap(int handle, int id, un
if (hypervisor_version > 1) {
xen_op_v2_dom op;
- if (mlock(cpumap, maplen) < 0) {
+ if (lock_pages(cpumap, maplen) < 0) {
virXenError(VIR_ERR_XEN_CALL, " locking", maplen);
return (-1);
}
@@ -1697,7 +1732,7 @@ virXen_setvcpumap(int handle, int id, un
}
ret = xenHypervisorDoV2Dom(handle, &op);
- if (munlock(cpumap, maplen) < 0) {
+ if (unlock_pages(cpumap, maplen) < 0) {
virXenError(VIR_ERR_XEN_CALL, " release", maplen);
ret = -1;
}
@@ -1794,7 +1829,7 @@ virXen_getvcpusinfo(int handle, int id,
ipt->cpu = op.u.getvcpuinfod5.online ? (int)op.u.getvcpuinfod5.cpu : -1;
}
if ((cpumap != NULL) && (maplen > 0)) {
- if (mlock(cpumap, maplen) < 0) {
+ if (lock_pages(cpumap, maplen) < 0) {
virXenError(VIR_ERR_XEN_CALL, " locking", maplen);
return (-1);
}
@@ -1812,7 +1847,7 @@ virXen_getvcpusinfo(int handle, int id,
op.u.getvcpumapd5.cpumap.nr_cpus = maplen * 8;
}
ret = xenHypervisorDoV2Dom(handle, &op);
- if (munlock(cpumap, maplen) < 0) {
+ if (unlock_pages(cpumap, maplen) < 0) {
virXenError(VIR_ERR_XEN_CALL, " release", maplen);
ret = -1;
}
@@ -1963,6 +1998,7 @@ xenHypervisorInit(void)
goto detect_v2;
}
+#ifndef __sun__
/*
* check if the old hypercall are actually working
*/
@@ -1980,6 +2016,7 @@ xenHypervisorInit(void)
hypervisor_version = 0;
goto done;
}
+#endif
/*
* we faild to make any hypercall
diff --git a/src/xend_internal.c b/src/xend_internal.c
--- a/src/xend_internal.c
+++ b/src/xend_internal.c
@@ -1280,9 +1280,7 @@ xend_parse_sexp_desc_os(virConnectPtr xe
virBufferVSprintf(buf, " <type>hvm</type>\n");
tmp = sexpr_node(node, "domain/image/hvm/kernel");
if (tmp == NULL && !bootloader) {
- virXendError(xend, VIR_ERR_INTERNAL_ERROR,
- _("domain information incomplete, missing kernel & bootloader"));
- return(-1);
+ virBufferVSprintf(buf, " <!-- use the default bootloader -->\n");
}
if (tmp)
virBufferVSprintf(buf, " <loader>%s</loader>\n", tmp);
@@ -1311,9 +1309,7 @@ xend_parse_sexp_desc_os(virConnectPtr xe
virBufferVSprintf(buf, " <type>linux</type>\n");
tmp = sexpr_node(node, "domain/image/linux/kernel");
if (tmp == NULL && !bootloader) {
- virXendError(xend, VIR_ERR_INTERNAL_ERROR,
- _("domain information incomplete, missing kernel & bootloader"));
- return(-1);
+ virBufferVSprintf(buf, " <!-- use the default bootloader -->\n");
}
if (tmp)
virBufferVSprintf(buf, " <kernel>%s</kernel>\n", tmp);
diff --git a/src/xml.c b/src/xml.c
--- a/src/xml.c
+++ b/src/xml.c
@@ -703,10 +703,7 @@ virDomainParseXMLOSDescPV(virConnectPtr
return (-1);
}
virBufferAdd(buf, "(image (linux ", 14);
- if (kernel == NULL) {
- virXMLError(conn, VIR_ERR_NO_KERNEL, NULL, 0);
- return (-1);
- } else {
+ if (kernel != NULL) {
virBufferVSprintf(buf, "(kernel '%s')", (const char *) kernel);
}
if (initrd != NULL)
diff --git a/src/xs_internal.c b/src/xs_internal.c
--- a/src/xs_internal.c
+++ b/src/xs_internal.c
@@ -31,7 +31,13 @@
#include "xs_internal.h"
#include "xen_internal.h" /* for xenHypervisorCheckID */
+#ifdef __linux__
#define XEN_HYPERVISOR_SOCKET "/proc/xen/privcmd"
+#elif defined(__sun__)
+#define XEN_HYPERVISOR_SOCKET "/dev/xen/privcmd"
+#else
+#error "unsupported platform"
+#endif
#ifndef PROXY
static char *xenStoreDomainGetOSType(virDomainPtr domain);
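
For reference, the lock_pages()/unlock_pages() wrappers from the xen_internal.c hunk can be read as a standalone unit. What follows is only a sketch, not the code as committed: it uses the preprocessor operator defined() and, as an assumption that goes beyond the patch, rejects unknown platforms up front so that neither function can fall off its end without returning a value. The no-op on Solaris mirrors what the patch does.

```c
#include <stddef.h>

#if defined(__linux__)
#include <sys/mman.h>   /* mlock(), munlock() */
#elif !defined(__sun)
/* Assumption: fail the build early, as the patch does for the privcmd path. */
#error "unsupported platform"
#endif

/* Pin a hypercall buffer in memory; returns 0 on success, -1 on failure. */
static int
lock_pages(void *addr, size_t len)
{
#if defined(__linux__)
    return mlock(addr, len);
#else
    /* Solaris: the patch treats locking as a no-op. */
    (void) addr;
    (void) len;
    return 0;
#endif
}

/* Release a buffer previously passed to lock_pages(). */
static int
unlock_pages(void *addr, size_t len)
{
#if defined(__linux__)
    return munlock(addr, len);
#else
    (void) addr;
    (void) len;
    return 0;
#endif
}
```

Callers keep the same shape as in the hunks above: test the return value for < 0 and report the error through virXenError().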