[PATCH 00/29] docs: Convert some pages to rST and clean up (part 2)

Peter Krempa (29):
  docs: Remove empty unreferenced 'drvremote' page
  docs: Convert 'cgroups' page to rST
  docs: Convert 'drvbhyve' page to rST
  docs: Convert 'drvesx' page to rST
  docs: Convert 'drvhyperv' page to rST
  docs: Convert 'drvlxc' page to rST
  docs: Convert 'drvnodedev' page to rST
  docs: Convert 'drvopenvz' page to rST
  docs: Convert 'drvsecret' page to rST
  docs: Convert 'drvtest' page to rST
  docs: Convert 'drvvbox' page to rST
  docs: Convert 'drvvirtuozzo' page to rST
  docs: Convert 'drvvmware' page to rST
  docs: Convert 'drvxen' page to rST
  docs: Convert 'firewall' page to rST
  docs: Convert 'format' page to rST
  docs: Convert 'formatcaps' page to rST
  docs: Convert 'formatdomaincaps' to rST
  docs: Convert 'formatnetworkport' to rST
  docs: Fix heading of 'formatnetworkport' page
  docs: Convert 'formatstoragecaps' page to rST
  docs: Convert 'formatstorageencryption' page to rST
  docs: formatstorageencryption: Drop empty 'default' paragraph
  docs: formatstorageencryption: Re-style encryption type headers
  docs: Convert 'hooks' page to rST
  docs: Convert 'java' page to rST
  docs: Convert 'logging' page to rST
  docs: logging: Replace example by link to kbase/debuglogs.html
  docs: Convert 'php' page to rST

 docs/cgroups.html.in | 424 --------------
 docs/cgroups.rst | 364 ++++++++++++
 docs/drvbhyve.html.in | 583 -------------------
 docs/drvbhyve.rst | 582 +++++++++++++++++++
 docs/drvesx.html.in | 838 ---------------------------
 docs/drvesx.rst | 681 ++++++++++++++++++++++
 docs/drvhyperv.html.in | 150 -----
 docs/drvhyperv.rst | 121 ++++
 docs/drvlxc.html.in | 822 --------------------------
 docs/drvlxc.rst | 670 +++++++++++++++++++++
 docs/drvnodedev.html.in | 383 ------------
 docs/drvnodedev.rst | 348 +++++++++++
 docs/drvopenvz.html.in | 123 ----
 docs/drvopenvz.rst | 97 ++++
 docs/drvremote.html.in | 7 -
 docs/drvsecret.html.in | 82 ---
 docs/drvsecret.rst | 65 +++
 docs/drvtest.html.in | 27 -
 docs/drvtest.rst | 21 +
 docs/drvvbox.html.in | 172 ------
 docs/drvvbox.rst | 161 +++++
 docs/drvvirtuozzo.html.in | 70 ---
 docs/drvvirtuozzo.rst | 60 ++
 docs/drvvmware.html.in | 89 ---
 docs/drvvmware.rst | 72 +++
 docs/drvxen.html.in | 358 ------------
 docs/drvxen.rst | 338 +++++++++++
 docs/firewall.html.in | 523 -----------------
 docs/firewall.rst | 506 ++++++++++++++++
 docs/format.html.in | 48 --
 docs/format.rst | 35 ++
 docs/formatcaps.html.in | 219 -------
 docs/formatcaps.rst | 196 +++++++
 docs/formatdomain.rst | 25 +-
 docs/formatdomaincaps.html.in | 693 ----------------------
 docs/formatdomaincaps.rst | 602 +++++++++++++++++++
 docs/formatnetworkport.html.in | 223 -------
 docs/formatnetworkport.rst | 175 ++++++
 docs/formatstoragecaps.html.in | 95 ---
 docs/formatstoragecaps.rst | 81 +++
 docs/formatstorageencryption.html.in | 181 ------
 docs/formatstorageencryption.rst | 139 +++++
 docs/hooks.html.in | 406 -------------
 docs/hooks.rst | 518 +++++++++++++++++
 docs/java.html.in | 121 ----
 docs/java.rst | 128 ++++
 docs/kbase/backing_chains.rst | 2 +-
 docs/logging.html.in | 243 --------
 docs/logging.rst | 215 +++++++
 docs/meson.build | 49 +-
 docs/php.html.in | 28 -
 docs/php.rst | 23 +
 52 files changed, 6236 insertions(+), 6946 deletions(-)
 delete mode 100644 docs/cgroups.html.in
 create mode 100644 docs/cgroups.rst
 delete mode 100644 docs/drvbhyve.html.in
 create mode 100644 docs/drvbhyve.rst
 delete mode 100644 docs/drvesx.html.in
 create mode 100644 docs/drvesx.rst
 delete mode 100644 docs/drvhyperv.html.in
 create mode 100644 docs/drvhyperv.rst
 delete mode 100644 docs/drvlxc.html.in
 create mode 100644 docs/drvlxc.rst
 delete mode 100644 docs/drvnodedev.html.in
 create mode 100644 docs/drvnodedev.rst
 delete mode 100644 docs/drvopenvz.html.in
 create mode 100644 docs/drvopenvz.rst
 delete mode 100644 docs/drvremote.html.in
 delete mode 100644 docs/drvsecret.html.in
 create mode 100644 docs/drvsecret.rst
 delete mode 100644 docs/drvtest.html.in
 create mode 100644 docs/drvtest.rst
 delete mode 100644 docs/drvvbox.html.in
 create mode 100644 docs/drvvbox.rst
 delete mode 100644 docs/drvvirtuozzo.html.in
 create mode 100644 docs/drvvirtuozzo.rst
 delete mode 100644 docs/drvvmware.html.in
 create mode 100644 docs/drvvmware.rst
 delete mode 100644 docs/drvxen.html.in
 create mode 100644 docs/drvxen.rst
 delete mode 100644 docs/firewall.html.in
 create mode 100644 docs/firewall.rst
 delete mode 100644 docs/format.html.in
 create mode 100644 docs/format.rst
 delete mode 100644 docs/formatcaps.html.in
 create mode 100644 docs/formatcaps.rst
 delete mode 100644 docs/formatdomaincaps.html.in
 create mode 100644 docs/formatdomaincaps.rst
 delete mode 100644 docs/formatnetworkport.html.in
 create mode 100644 docs/formatnetworkport.rst
 delete mode 100644 docs/formatstoragecaps.html.in
 create mode 100644 docs/formatstoragecaps.rst
 delete mode 100644 docs/formatstorageencryption.html.in
 create mode 100644 docs/formatstorageencryption.rst
 delete mode 100644 docs/hooks.html.in
 create mode 100644 docs/hooks.rst
 delete mode 100644 docs/java.html.in
 create mode 100644 docs/java.rst
 delete mode 100644 docs/logging.html.in
 create mode 100644 docs/logging.rst
 delete mode 100644 docs/php.html.in
 create mode 100644 docs/php.rst

--
2.35.1

Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/drvremote.html.in | 7 ------- docs/meson.build | 1 - 2 files changed, 8 deletions(-) delete mode 100644 docs/drvremote.html.in diff --git a/docs/drvremote.html.in b/docs/drvremote.html.in deleted file mode 100644 index 224f1bfb17..0000000000 --- a/docs/drvremote.html.in +++ /dev/null @@ -1,7 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html> -<html xmlns="http://www.w3.org/1999/xhtml"> - <body> - <h1>Remote management driver</h1> - </body> -</html> diff --git a/docs/meson.build b/docs/meson.build index b097866208..5f26d40082 100644 --- a/docs/meson.build +++ b/docs/meson.build @@ -29,7 +29,6 @@ docs_html_in_files = [ 'drvlxc', 'drvnodedev', 'drvopenvz', - 'drvremote', 'drvsecret', 'drvtest', 'drvvbox', -- 2.35.1

Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/cgroups.html.in | 424 ------------------------------------------- docs/cgroups.rst | 364 +++++++++++++++++++++++++++++++++++++ docs/meson.build | 2 +- 3 files changed, 365 insertions(+), 425 deletions(-) delete mode 100644 docs/cgroups.html.in create mode 100644 docs/cgroups.rst diff --git a/docs/cgroups.html.in b/docs/cgroups.html.in deleted file mode 100644 index 412a9360ff..0000000000 --- a/docs/cgroups.html.in +++ /dev/null @@ -1,424 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html> -<html xmlns="http://www.w3.org/1999/xhtml"> - <body> - <h1>Control Groups Resource Management</h1> - - <ul id="toc"></ul> - - <p> - The QEMU and LXC drivers make use of the Linux "Control Groups" facility - for applying resource management to their virtual machines and containers. - </p> - - <h2><a id="requiredControllers">Required controllers</a></h2> - - <p> - The control groups filesystem supports multiple "controllers". By default - the init system (such as systemd) should mount all controllers compiled - into the kernel at <code>/sys/fs/cgroup/$CONTROLLER-NAME</code>. Libvirt - will never attempt to mount any controllers itself, merely detect where - they are mounted. - </p> - - <p> - The QEMU driver is capable of using the <code>cpuset</code>, - <code>cpu</code>, <code>cpuacct</code>, <code>memory</code>, - <code>blkio</code> and <code>devices</code> controllers. - None of them are compulsory. If any controller is not mounted, - the resource management APIs which use it will cease to operate. - It is possible to explicitly turn off use of a controller, - even when mounted, via the <code>/etc/libvirt/qemu.conf</code> - configuration file. - </p> - - <p> - The LXC driver is capable of using the <code>cpuset</code>, - <code>cpu</code>, <code>cpuacct</code>, <code>freezer</code>, - <code>memory</code>, <code>blkio</code> and <code>devices</code> - controllers. 
The <code>cpuacct</code>, <code>devices</code> - and <code>memory</code> controllers are compulsory. Without - them mounted, no containers can be started. If any of the - other controllers are not mounted, the resource management APIs - which use them will cease to operate. - </p> - - <h2><a id="currentLayout">Current cgroups layout</a></h2> - - <p> - As of libvirt 1.0.5 or later, the cgroups layout created by libvirt has been - simplified, in order to facilitate the setup of resource control policies by - administrators / management applications. The new layout is based on the concepts - of "partitions" and "consumers". A "consumer" is a cgroup which holds the - processes for a single virtual machine or container. A "partition" is a cgroup - which does not contain any processes, but can have resource controls applied. - A "partition" will have zero or more child directories which may be either - "consumer" or "partition". - </p> - - <p> - As of libvirt 1.1.1 or later, the cgroups layout will have some slight - differences when running on a host with systemd 205 or later. The overall - tree structure is the same, but there are some differences in the naming - conventions for the cgroup directories. Thus the following docs split - in two, one describing systemd hosts and the other non-systemd hosts. - </p> - - <h3><a id="currentLayoutSystemd">Systemd cgroups integration</a></h3> - - <p> - On hosts which use systemd, each consumer maps to a systemd scope unit, - while partitions map to a system slice unit. - </p> - - <h4><a id="systemdScope">Systemd scope naming</a></h4> - - <p> - The systemd convention is for the scope name of virtual machines / containers - to be of the general format <code>machine-$NAME.scope</code>. Libvirt forms the - <code>$NAME</code> part of this by concatenating the driver type with the id - and truncated name of the guest, and then escaping any systemd reserved - characters. 
- So for a guest <code>demo</code> running under the <code>lxc</code> driver, - we get a <code>$NAME</code> of <code>lxc-12345-demo</code> which when escaped - is <code>lxc\x2d12345\x2ddemo</code>. So the complete scope name is - <code>machine-lxc\x2d12345\x2ddemo.scope</code>. - The scope names map directly to the cgroup directory names. - </p> - - <h4><a id="systemdSlice">Systemd slice naming</a></h4> - - <p> - The systemd convention for slice naming is that a slice should include the - name of all of its parents prepended on its own name. So for a libvirt - partition <code>/machine/engineering/testing</code>, the slice name will - be <code>machine-engineering-testing.slice</code>. Again the slice names - map directly to the cgroup directory names. Systemd creates three top level - slices by default, <code>system.slice</code> <code>user.slice</code> and - <code>machine.slice</code>. All virtual machines or containers created - by libvirt will be associated with <code>machine.slice</code> by default. 
- </p> - - <h4><a id="systemdLayout">Systemd cgroup layout</a></h4> - - <p> - Given this, a possible systemd cgroups layout involving 3 qemu guests, - 3 lxc containers and 3 custom child slices, would be: - </p> - - <pre> -$ROOT - | - +- system.slice - | | - | +- libvirtd.service - | - +- machine.slice - | - +- machine-qemu\x2d1\x2dvm1.scope - | | - | +- libvirt - | | - | +- emulator - | +- vcpu0 - | +- vcpu1 - | - +- machine-qemu\x2d2\x2dvm2.scope - | | - | +- libvirt - | | - | +- emulator - | +- vcpu0 - | +- vcpu1 - | - +- machine-qemu\x2d3\x2dvm3.scope - | | - | +- libvirt - | | - | +- emulator - | +- vcpu0 - | +- vcpu1 - | - +- machine-engineering.slice - | | - | +- machine-engineering-testing.slice - | | | - | | +- machine-lxc\x2d11111\x2dcontainer1.scope - | | - | +- machine-engineering-production.slice - | | - | +- machine-lxc\x2d22222\x2dcontainer2.scope - | - +- machine-marketing.slice - | - +- machine-lxc\x2d33333\x2dcontainer3.scope - </pre> - - <p> - Prior libvirt 7.1.0 the topology doesn't have extra - <code>libvirt</code> directory. - </p> - - <h3><a id="currentLayoutGeneric">Non-systemd cgroups layout</a></h3> - - <p> - On hosts which do not use systemd, each consumer has a corresponding cgroup - named <code>$VMNAME.libvirt-{qemu,lxc}</code>. Each consumer is associated - with exactly one partition, which also have a corresponding cgroup usually - named <code>$PARTNAME.partition</code>. The exceptions to this naming rule - is the top level default partition for virtual machines and containers - <code>/machine</code>. 
- </p> - - <p> - Given this, a possible non-systemd cgroups layout involving 3 qemu guests, - 3 lxc containers and 2 custom child slices, would be: - </p> - - <pre> -$ROOT - | - +- machine - | - +- qemu-1-vm1.libvirt-qemu - | | - | +- emulator - | +- vcpu0 - | +- vcpu1 - | - +- qeme-2-vm2.libvirt-qemu - | | - | +- emulator - | +- vcpu0 - | +- vcpu1 - | - +- qemu-3-vm3.libvirt-qemu - | | - | +- emulator - | +- vcpu0 - | +- vcpu1 - | - +- engineering.partition - | | - | +- testing.partition - | | | - | | +- lxc-11111-container1.libvirt-lxc - | | - | +- production.partition - | | - | +- lxc-22222-container2.libvirt-lxc - | - +- marketing.partition - | - +- lxc-33333-container3.libvirt-lxc - </pre> - - <h2><a id="customPartiton">Using custom partitions</a></h2> - - <p> - If there is a need to apply resource constraints to groups of - virtual machines or containers, then the single default - partition <code>/machine</code> may not be sufficiently - flexible. The administrator may wish to sub-divide the - default partition, for example into "testing" and "production" - partitions, and then assign each guest to a specific - sub-partition. This is achieved via a small element addition - to the guest domain XML config, just below the main <code>domain</code> - element - </p> - - <pre> -... -<resource> - <partition>/machine/production</partition> -</resource> -... - </pre> - - <p> - Note that the partition names in the guest XML are using a - generic naming format, not the low level naming convention - required by the underlying host OS. That is, you should not include - any of the <code>.partition</code> or <code>.slice</code> - suffixes in the XML config. 
Given a partition name - <code>/machine/production</code>, libvirt will automatically - apply the platform specific translation required to get - <code>/machine/production.partition</code> (non-systemd) - or <code>/machine.slice/machine-production.slice</code> - (systemd) as the underlying cgroup name - </p> - - <p> - Libvirt will not auto-create the cgroups directory to back - this partition. In the future, libvirt / virsh will provide - APIs / commands to create custom partitions, but currently - this is left as an exercise for the administrator. - </p> - - <p> - <strong>Note:</strong> the ability to place guests in custom - partitions is only available with libvirt >= 1.0.5, using - the new cgroup layout. The legacy cgroups layout described - later in this document did not support customization per guest. - </p> - - <h3><a id="createSystemd">Creating custom partitions (systemd)</a></h3> - - <p> - Given the XML config above, the admin on a systemd based host would - need to create a unit file <code>/etc/systemd/system/machine-production.slice</code> - </p> - - <pre> -# cat > /etc/systemd/system/machine-testing.slice <<EOF -[Unit] -Description=VM testing slice -Before=slices.target -Wants=machine.slice -EOF -# systemctl start machine-testing.slice - </pre> - - <h3><a id="createNonSystemd">Creating custom partitions (non-systemd)</a></h3> - - <p> - Given the XML config above, the admin on a non-systemd based host - would need to create a cgroup named '/machine/production.partition' - </p> - - <pre> -# cd /sys/fs/cgroup -# for i in blkio cpu,cpuacct cpuset devices freezer memory net_cls perf_event - do - mkdir $i/machine/production.partition - done -# for i in cpuset.cpus cpuset.mems - do - cat cpuset/machine/$i > cpuset/machine/production.partition/$i - done -</pre> - - <h2><a id="resourceAPIs">Resource management APIs/commands</a></h2> - - <p> - Since libvirt aims to provide an API which is portable across - hypervisors, the concept of cgroups is not exposed 
directly - in the API or XML configuration. It is considered to be an - internal implementation detail. Instead libvirt provides a - set of APIs for applying resource controls, which are then - mapped to corresponding cgroup tunables - </p> - - <h3>Scheduler tuning</h3> - - <p> - Parameters from the "cpu" controller are exposed via the - <code>schedinfo</code> command in virsh. - </p> - - <pre> -# virsh schedinfo demo -Scheduler : posix -cpu_shares : 1024 -vcpu_period : 100000 -vcpu_quota : -1 -emulator_period: 100000 -emulator_quota : -1</pre> - - - <h3>Block I/O tuning</h3> - - <p> - Parameters from the "blkio" controller are exposed via the - <code>bkliotune</code> command in virsh. - </p> - - - <pre> -# virsh blkiotune demo -weight : 500 -device_weight : </pre> - - <h3>Memory tuning</h3> - - <p> - Parameters from the "memory" controller are exposed via the - <code>memtune</code> command in virsh. - </p> - - <pre> -# virsh memtune demo -hard_limit : 580192 -soft_limit : unlimited -swap_hard_limit: unlimited - </pre> - - <h3>Network tuning</h3> - - <p> - The <code>net_cls</code> is not currently used. Instead traffic - filter policies are set directly against individual virtual - network interfaces. - </p> - - <h2><a id="legacyLayout">Legacy cgroups layout</a></h2> - - <p> - Prior to libvirt 1.0.5, the cgroups layout created by libvirt was different - from that described above, and did not allow for administrator customization. - Libvirt used a fixed, 3-level hierarchy <code>libvirt/{qemu,lxc}/$VMNAME</code> - which was rooted at the point in the hierarchy where libvirtd itself was - located. So if libvirtd was placed at <code>/system/libvirtd.service</code> - by systemd, the groups for each virtual machine / container would be located - at <code>/system/libvirtd.service/libvirt/{qemu,lxc}/$VMNAME</code>. In addition - to this, the QEMU drivers further child groups for each vCPU thread and the - emulator thread(s). 
This leads to a hierarchy that looked like - </p> - - - <pre> -$ROOT - | - +- system - | - +- libvirtd.service - | - +- libvirt - | - +- qemu - | | - | +- vm1 - | | | - | | +- emulator - | | +- vcpu0 - | | +- vcpu1 - | | - | +- vm2 - | | | - | | +- emulator - | | +- vcpu0 - | | +- vcpu1 - | | - | +- vm3 - | | - | +- emulator - | +- vcpu0 - | +- vcpu1 - | - +- lxc - | - +- container1 - | - +- container2 - | - +- container3 - </pre> - - <p> - Although current releases are much improved, historically the use of deep - hierarchies has had a significant negative impact on the kernel scalability. - The legacy libvirt cgroups layout highlighted these problems, to the detriment - of the performance of virtual machines and containers. - </p> - </body> -</html> diff --git a/docs/cgroups.rst b/docs/cgroups.rst new file mode 100644 index 0000000000..eb66a14f0d --- /dev/null +++ b/docs/cgroups.rst @@ -0,0 +1,364 @@ +================================== +Control Groups Resource Management +================================== + +.. contents:: + +The QEMU and LXC drivers make use of the Linux "Control Groups" facility for +applying resource management to their virtual machines and containers. + +Required controllers +-------------------- + +The control groups filesystem supports multiple "controllers". By default the +init system (such as systemd) should mount all controllers compiled into the +kernel at ``/sys/fs/cgroup/$CONTROLLER-NAME``. Libvirt will never attempt to +mount any controllers itself, merely detect where they are mounted. + +The QEMU driver is capable of using the ``cpuset``, ``cpu``, ``cpuacct``, +``memory``, ``blkio`` and ``devices`` controllers. None of them are compulsory. +If any controller is not mounted, the resource management APIs which use it will +cease to operate. It is possible to explicitly turn off use of a controller, +even when mounted, via the ``/etc/libvirt/qemu.conf`` configuration file. 
+ +The LXC driver is capable of using the ``cpuset``, ``cpu``, ``cpuacct``, +``freezer``, ``memory``, ``blkio`` and ``devices`` controllers. The ``cpuacct``, +``devices`` and ``memory`` controllers are compulsory. Without them mounted, no +containers can be started. If any of the other controllers are not mounted, the +resource management APIs which use them will cease to operate. + +Current cgroups layout +---------------------- + +As of libvirt 1.0.5 or later, the cgroups layout created by libvirt has been +simplified, in order to facilitate the setup of resource control policies by +administrators / management applications. The new layout is based on the +concepts of "partitions" and "consumers". A "consumer" is a cgroup which holds +the processes for a single virtual machine or container. A "partition" is a +cgroup which does not contain any processes, but can have resource controls +applied. A "partition" will have zero or more child directories which may be +either "consumer" or "partition". + +As of libvirt 1.1.1 or later, the cgroups layout will have some slight +differences when running on a host with systemd 205 or later. The overall tree +structure is the same, but there are some differences in the naming conventions +for the cgroup directories. Thus the following docs split in two, one describing +systemd hosts and the other non-systemd hosts. + +Systemd cgroups integration +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +On hosts which use systemd, each consumer maps to a systemd scope unit, while +partitions map to a system slice unit. + +Systemd scope naming +^^^^^^^^^^^^^^^^^^^^ + +The systemd convention is for the scope name of virtual machines / containers to +be of the general format ``machine-$NAME.scope``. Libvirt forms the ``$NAME`` +part of this by concatenating the driver type with the id and truncated name of +the guest, and then escaping any systemd reserved characters. 
So for a guest +``demo`` running under the ``lxc`` driver, we get a ``$NAME`` of +``lxc-12345-demo`` which when escaped is ``lxc\x2d12345\x2ddemo``. So the +complete scope name is ``machine-lxc\x2d12345\x2ddemo.scope``. The scope names +map directly to the cgroup directory names. + +Systemd slice naming +^^^^^^^^^^^^^^^^^^^^ + +The systemd convention for slice naming is that a slice should include the name +of all of its parents prepended on its own name. So for a libvirt partition +``/machine/engineering/testing``, the slice name will be +``machine-engineering-testing.slice``. Again the slice names map directly to the +cgroup directory names. Systemd creates three top level slices by default, +``system.slice`` ``user.slice`` and ``machine.slice``. All virtual machines or +containers created by libvirt will be associated with ``machine.slice`` by +default. + +Systemd cgroup layout +^^^^^^^^^^^^^^^^^^^^^ + +Given this, a possible systemd cgroups layout involving 3 qemu guests, 3 lxc +containers and 3 custom child slices, would be: + +:: + + $ROOT + | + +- system.slice + | | + | +- libvirtd.service + | + +- machine.slice + | + +- machine-qemu\x2d1\x2dvm1.scope + | | + | +- libvirt + | | + | +- emulator + | +- vcpu0 + | +- vcpu1 + | + +- machine-qemu\x2d2\x2dvm2.scope + | | + | +- libvirt + | | + | +- emulator + | +- vcpu0 + | +- vcpu1 + | + +- machine-qemu\x2d3\x2dvm3.scope + | | + | +- libvirt + | | + | +- emulator + | +- vcpu0 + | +- vcpu1 + | + +- machine-engineering.slice + | | + | +- machine-engineering-testing.slice + | | | + | | +- machine-lxc\x2d11111\x2dcontainer1.scope + | | + | +- machine-engineering-production.slice + | | + | +- machine-lxc\x2d22222\x2dcontainer2.scope + | + +- machine-marketing.slice + | + +- machine-lxc\x2d33333\x2dcontainer3.scope + +Prior libvirt 7.1.0 the topology doesn't have extra ``libvirt`` directory. 
+ +Non-systemd cgroups layout +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +On hosts which do not use systemd, each consumer has a corresponding cgroup +named ``$VMNAME.libvirt-{qemu,lxc}``. Each consumer is associated with exactly +one partition, which also have a corresponding cgroup usually named +``$PARTNAME.partition``. The exceptions to this naming rule is the top level +default partition for virtual machines and containers ``/machine``. + +Given this, a possible non-systemd cgroups layout involving 3 qemu guests, 3 lxc +containers and 2 custom child slices, would be: + +:: + + $ROOT + | + +- machine + | + +- qemu-1-vm1.libvirt-qemu + | | + | +- emulator + | +- vcpu0 + | +- vcpu1 + | + +- qeme-2-vm2.libvirt-qemu + | | + | +- emulator + | +- vcpu0 + | +- vcpu1 + | + +- qemu-3-vm3.libvirt-qemu + | | + | +- emulator + | +- vcpu0 + | +- vcpu1 + | + +- engineering.partition + | | + | +- testing.partition + | | | + | | +- lxc-11111-container1.libvirt-lxc + | | + | +- production.partition + | | + | +- lxc-22222-container2.libvirt-lxc + | + +- marketing.partition + | + +- lxc-33333-container3.libvirt-lxc + +Using custom partitions +----------------------- + +If there is a need to apply resource constraints to groups of virtual machines +or containers, then the single default partition ``/machine`` may not be +sufficiently flexible. The administrator may wish to sub-divide the default +partition, for example into "testing" and "production" partitions, and then +assign each guest to a specific sub-partition. This is achieved via a small +element addition to the guest domain XML config, just below the main ``domain`` +element + +:: + + ... + <resource> + <partition>/machine/production</partition> + </resource> + ... + +Note that the partition names in the guest XML are using a generic naming +format, not the low level naming convention required by the underlying host OS. +That is, you should not include any of the ``.partition`` or ``.slice`` suffixes +in the XML config. 
Given a partition name ``/machine/production``, libvirt will +automatically apply the platform specific translation required to get +``/machine/production.partition`` (non-systemd) or +``/machine.slice/machine-production.slice`` (systemd) as the underlying cgroup +name + +Libvirt will not auto-create the cgroups directory to back this partition. In +the future, libvirt / virsh will provide APIs / commands to create custom +partitions, but currently this is left as an exercise for the administrator. + +**Note:** the ability to place guests in custom partitions is only available +with libvirt >= 1.0.5, using the new cgroup layout. The legacy cgroups layout +described later in this document did not support customization per guest. + +Creating custom partitions (systemd) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Given the XML config above, the admin on a systemd based host would need to +create a unit file ``/etc/systemd/system/machine-production.slice`` + +:: + + # cat > /etc/systemd/system/machine-testing.slice <<EOF + [Unit] + Description=VM testing slice + Before=slices.target + Wants=machine.slice + EOF + # systemctl start machine-testing.slice + +Creating custom partitions (non-systemd) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Given the XML config above, the admin on a non-systemd based host would need to +create a cgroup named '/machine/production.partition' + +:: + + # cd /sys/fs/cgroup + # for i in blkio cpu,cpuacct cpuset devices freezer memory net_cls perf_event + do + mkdir $i/machine/production.partition + done + # for i in cpuset.cpus cpuset.mems + do + cat cpuset/machine/$i > cpuset/machine/production.partition/$i + done + +Resource management APIs/commands +--------------------------------- + +Since libvirt aims to provide an API which is portable across hypervisors, the +concept of cgroups is not exposed directly in the API or XML configuration. It +is considered to be an internal implementation detail. 
Instead libvirt provides +a set of APIs for applying resource controls, which are then mapped to +corresponding cgroup tunables + +Scheduler tuning +~~~~~~~~~~~~~~~~ + +Parameters from the "cpu" controller are exposed via the ``schedinfo`` command +in virsh. + +:: + + # virsh schedinfo demo + Scheduler : posix + cpu_shares : 1024 + vcpu_period : 100000 + vcpu_quota : -1 + emulator_period: 100000 + emulator_quota : -1 + +Block I/O tuning +~~~~~~~~~~~~~~~~ + +Parameters from the "blkio" controller are exposed via the ``bkliotune`` command +in virsh. + +:: + + # virsh blkiotune demo + weight : 500 + device_weight : + +Memory tuning +~~~~~~~~~~~~~ + +Parameters from the "memory" controller are exposed via the ``memtune`` command +in virsh. + +:: + + # virsh memtune demo + hard_limit : 580192 + soft_limit : unlimited + swap_hard_limit: unlimited + +Network tuning +~~~~~~~~~~~~~~ + +The ``net_cls`` is not currently used. Instead traffic filter policies are set +directly against individual virtual network interfaces. + +Legacy cgroups layout +--------------------- + +Prior to libvirt 1.0.5, the cgroups layout created by libvirt was different from +that described above, and did not allow for administrator customization. Libvirt +used a fixed, 3-level hierarchy ``libvirt/{qemu,lxc}/$VMNAME`` which was rooted +at the point in the hierarchy where libvirtd itself was located. So if libvirtd +was placed at ``/system/libvirtd.service`` by systemd, the groups for each +virtual machine / container would be located at +``/system/libvirtd.service/libvirt/{qemu,lxc}/$VMNAME``. In addition to this, +the QEMU drivers further child groups for each vCPU thread and the emulator +thread(s). 
This leads to a hierarchy that looked like + +:: + + $ROOT + | + +- system + | + +- libvirtd.service + | + +- libvirt + | + +- qemu + | | + | +- vm1 + | | | + | | +- emulator + | | +- vcpu0 + | | +- vcpu1 + | | + | +- vm2 + | | | + | | +- emulator + | | +- vcpu0 + | | +- vcpu1 + | | + | +- vm3 + | | + | +- emulator + | +- vcpu0 + | +- vcpu1 + | + +- lxc + | + +- container1 + | + +- container2 + | + +- container3 + +Although current releases are much improved, historically the use of deep +hierarchies has had a significant negative impact on the kernel scalability. The +legacy libvirt cgroups layout highlighted these problems, to the detriment of +the performance of virtual machines and containers. diff --git a/docs/meson.build b/docs/meson.build index 5f26d40082..bb7e27e031 100644 --- a/docs/meson.build +++ b/docs/meson.build @@ -19,7 +19,6 @@ docs_assets = [ docs_html_in_files = [ '404', - 'cgroups', 'csharp', 'dbus', 'docs', @@ -70,6 +69,7 @@ docs_rst_files = [ 'best-practices', 'bindings', 'bugs', + 'cgroups', 'ci', 'coding-style', 'committer-guidelines', -- 2.35.1
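
For readers checking the converted cgroups page, the naming rules it describes (scope names, slice names, and the ``.partition`` suffix on non-systemd hosts) can be sketched in a few lines of Python. This is only an illustration of the text, not libvirt's implementation. All function names are hypothetical, only the ``-`` character is escaped (systemd's real unit-name escaping covers more characters), and the guest name truncation mentioned in the page is not modelled.

```python
def escape_dashes(name):
    # Assumption: escape only '-', as in the page's examples (\x2d);
    # real systemd escaping handles further reserved characters.
    return name.replace("-", "\\x2d")


def scope_name(driver, guest_id, guest):
    # machine-$NAME.scope, with $NAME built from driver type, id and guest name.
    return "machine-" + escape_dashes(f"{driver}-{guest_id}-{guest}") + ".scope"


def _components(partition):
    # Split a generic partition path such as /machine/production into its parts.
    return [part for part in partition.split("/") if part]


def systemd_slice_path(partition):
    # Each slice name carries the names of all of its parents, joined by '-'.
    parts = _components(partition)
    slices = ["-".join(parts[: i + 1]) + ".slice" for i in range(len(parts))]
    return "/" + "/".join(slices)


def generic_partition_path(partition):
    # Non-systemd hosts: every level gets a .partition suffix, except the
    # top-level default partition /machine.
    parts = _components(partition)
    named = [
        part if (i == 0 and part == "machine") else part + ".partition"
        for i, part in enumerate(parts)
    ]
    return "/" + "/".join(named)


if __name__ == "__main__":
    print(scope_name("lxc", 12345, "demo"))
    print(systemd_slice_path("/machine/production"))
    print(generic_partition_path("/machine/production"))
```

Run as a script, this prints the same three names the page quotes for the ``demo`` guest and the ``/machine/production`` partition, which makes it a quick cross-check that the rST text survived the conversion unchanged.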

Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/drvbhyve.html.in | 583 ------------------------------------------ docs/drvbhyve.rst | 582 +++++++++++++++++++++++++++++++++++++++++ docs/meson.build | 2 +- 3 files changed, 583 insertions(+), 584 deletions(-) delete mode 100644 docs/drvbhyve.html.in create mode 100644 docs/drvbhyve.rst diff --git a/docs/drvbhyve.html.in b/docs/drvbhyve.html.in deleted file mode 100644 index 228e8b2bd5..0000000000 --- a/docs/drvbhyve.html.in +++ /dev/null @@ -1,583 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html> -<html xmlns="http://www.w3.org/1999/xhtml"> - <body> - <h1>Bhyve driver</h1> - - <ul id="toc"></ul> - -<p> -Bhyve is a FreeBSD hypervisor. It first appeared in FreeBSD 10.0. However, it's -recommended to keep tracking FreeBSD 10-STABLE to make sure all new features -of bhyve are supported. - -In order to enable bhyve on your FreeBSD host, you'll need to load the <code>vmm</code> -kernel module. Additionally, <code>if_tap</code> and <code>if_bridge</code> modules -should be loaded for networking support. Also, <span class="since">since 3.2.0</span> the -<code>virt-host-validate(1)</code> supports the bhyve host validation and could be -used like this: -</p> - -<pre> -$ virt-host-validate bhyve - BHYVE: Checking for vmm module : PASS - BHYVE: Checking for if_tap module : PASS - BHYVE: Checking for if_bridge module : PASS - BHYVE: Checking for nmdm module : PASS -$ -</pre> - -<p> -Additional information on bhyve could be obtained on <a href="https://bhyve.org/">bhyve.org</a>. -</p> - -<h2><a id="uri">Connections to the Bhyve driver</a></h2> -<p> -The libvirt bhyve driver is a single-instance privileged driver. 
Some sample -connection URIs are: -</p> - -<pre> -bhyve:///system (local access) -bhyve+unix:///system (local access) -bhyve+ssh://root@example.com/system (remote access, SSH tunnelled) -</pre> - -<h2><a id="exconfig">Example guest domain XML configurations</a></h2> - -<h3>Example config</h3> -<p> -The bhyve driver in libvirt is in its early stage and under active development. So it supports -only limited number of features bhyve provides. -</p> - -<p> -Note: in older libvirt versions, only a single network device and a single -disk device were supported per-domain. However, -<span class="since">since 1.2.6</span> the libvirt bhyve driver supports -up to 31 PCI devices. -</p> - -<p> -Note: the Bhyve driver in libvirt will boot whichever device is first. If you -want to install from CD, put the CD device first. If not, put the root HDD -first. -</p> - -<p> -Note: Only the SATA bus is supported. Only <code>cdrom</code>- and -<code>disk</code>-type disks are supported. -</p> - -<pre> -<domain type='bhyve'> - <name>bhyve</name> - <uuid>df3be7e7-a104-11e3-aeb0-50e5492bd3dc</uuid> - <memory>219136</memory> - <currentMemory>219136</currentMemory> - <vcpu>1</vcpu> - <os> - <type>hvm</type> - </os> - <features> - <apic/> - <acpi/> - </features> - <clock offset='utc'/> - <on_poweroff>destroy</on_poweroff> - <on_reboot>restart</on_reboot> - <on_crash>destroy</on_crash> - <devices> - <disk type='file'> - <driver name='file' type='raw'/> - <source file='/path/to/bhyve_freebsd.img'/> - <target dev='hda' bus='sata'/> - </disk> - <disk type='file' device='cdrom'> - <driver name='file' type='raw'/> - <source file='/path/to/cdrom.iso'/> - <target dev='hdc' bus='sata'/> - <readonly/> - </disk> - <interface type='bridge'> - <model type='virtio'/> - <source bridge="virbr0"/> - </interface> - </devices> -</domain> -</pre> - -<p>(The <disk> sections may be swapped in order to install from -<em>cdrom.iso</em>.)</p> - -<h3>Example config (Linux guest)</h3> - -<p> -Note the addition of 
<bootloader>. -</p> - -<pre> -<domain type='bhyve'> - <name>linux_guest</name> - <uuid>df3be7e7-a104-11e3-aeb0-50e5492bd3dc</uuid> - <memory>131072</memory> - <currentMemory>131072</currentMemory> - <vcpu>1</vcpu> - <bootloader>/usr/local/sbin/grub-bhyve</bootloader> - <os> - <type>hvm</type> - </os> - <features> - <apic/> - <acpi/> - </features> - <clock offset='utc'/> - <on_poweroff>destroy</on_poweroff> - <on_reboot>restart</on_reboot> - <on_crash>destroy</on_crash> - <devices> - <disk type='file' device='disk'> - <driver name='file' type='raw'/> - <source file='/path/to/guest_hdd.img'/> - <target dev='hda' bus='sata'/> - </disk> - <disk type='file' device='cdrom'> - <driver name='file' type='raw'/> - <source file='/path/to/cdrom.iso'/> - <target dev='hdc' bus='sata'/> - <readonly/> - </disk> - <interface type='bridge'> - <model type='virtio'/> - <source bridge="virbr0"/> - </interface> - </devices> -</domain> -</pre> - -<h3>Example config (Linux UEFI guest, VNC, tablet)</h3> - -<p>This is an example to boot into Fedora 25 installation:</p> - -<pre> -<domain type='bhyve'> - <name>fedora_uefi_vnc_tablet</name> - <memory unit='G'>4</memory> - <vcpu>2</vcpu> - <os> - <type>hvm</type> - <b><loader readonly="yes" type="pflash">/usr/local/share/uefi-firmware/BHYVE_UEFI.fd</loader></b> - </os> - <features> - <apic/> - <acpi/> - </features> - <clock offset='utc'/> - <on_poweroff>destroy</on_poweroff> - <on_reboot>restart</on_reboot> - <on_crash>destroy</on_crash> - <devices> - <disk type='file' device='cdrom'> - <driver name='file' type='raw'/> - <source file='/path/to/Fedora-Workstation-Live-x86_64-25-1.3.iso'/> - <target dev='hdc' bus='sata'/> - <readonly/> - </disk> - <disk type='file' device='disk'> - <driver name='file' type='raw'/> - <source file='/path/to/linux_uefi.img'/> - <target dev='hda' bus='sata'/> - </disk> - <interface type='bridge'> - <model type='virtio'/> - <source bridge="virbr0"/> - </interface> - <serial type="nmdm"> - <source master="/dev/nmdm0A" 
slave="/dev/nmdm0B"/> - </serial> - <b><graphics type='vnc' port='5904'> - <listen type='address' address='127.0.0.1'/> - </graphics> - <controller type='usb' model='nec-xhci'/> - <input type='tablet' bus='usb'/></b> - </devices> -</domain> -</pre> - -<p>Please refer to the <a href="#uefi">UEFI</a> section for a more detailed explanation.</p> - -<h2><a id="usage">Guest usage / management</a></h2> - -<h3><a id="console">Connecting to a guest console</a></h3> - -<p> -Guest console connection is supported through the <code>nmdm</code> device. It could be enabled by adding -the following to the domain XML (<span class="since">Since 1.2.4</span>): -</p> - -<pre> -... -<devices> - <serial type="nmdm"> - <source master="/dev/nmdm0A" slave="/dev/nmdm0B"/> - </serial> -</devices> -...</pre> - - -<p>Make sure to load the <code>nmdm</code> kernel module if you plan to use that.</p> - -<p> -Then <code>virsh console</code> command can be used to connect to the text console -of a guest.</p> - -<p><b>NB:</b> Some versions of bhyve have a bug that prevents guests from booting -until the console is opened by a client. This bug was fixed in -<a href="https://svnweb.freebsd.org/changeset/base/262884">FreeBSD changeset r262884</a>. If -an older version is used, one either has to open a console manually with <code>virsh console</code> -to let a guest boot or start a guest using:</p> - -<pre>start --console domname</pre> - -<p><b>NB:</b> A bootloader configured to require user interaction will prevent -the domain from starting (and thus <code>virsh console</code> or <code>start ---console</code> from functioning) until the user interacts with it manually on -the VM host. Because users typically do not have access to the VM host, -interactive bootloaders are unsupported by libvirt. 
<em>However,</em> if you happen to -run into this scenario and also happen to have access to the Bhyve host -machine, you may select a boot option and allow the domain to finish starting -by using an alternative terminal client on the VM host to connect to the -domain-configured null modem device. One example (assuming -<code>/dev/nmdm0B</code> is configured as the slave end of the domain serial -device) is:</p> - -<pre>cu -l /dev/nmdm0B</pre> - -<h3><a id="xmltonative">Converting from domain XML to Bhyve args</a></h3> - -<p> -The <code>virsh domxml-to-native</code> command can preview the actual -<code>bhyve</code> commands that will be executed for a given domain. -It outputs two lines, the first line is a <code>bhyveload</code> command and -the second is a <code>bhyve</code> command. -</p> - -<p>Please note that the <code>virsh domxml-to-native</code> doesn't do any -real actions other than printing the command, for example, it doesn't try to -find a proper TAP interface and create it, like what is done when starting -a domain; and always returns <code>tap0</code> for the network interface. So -if you're going to run these commands manually, most likely you might want to -tweak them.</p> - -<pre> -# virsh -c "bhyve:///system" domxml-to-native --format bhyve-argv --xml /path/to/bhyve.xml -/usr/sbin/bhyveload -m 214 -d /home/user/vm1.img vm1 -/usr/sbin/bhyve -c 2 -m 214 -A -I -H -P -s 0:0,hostbridge \ - -s 3:0,virtio-net,tap0,mac=52:54:00:5d:74:e3 -s 2:0,virtio-blk,/home/user/vm1.img \ - -s 1,lpc -l com1,/dev/nmdm0A vm1 -</pre> - -<h3><a id="zfsvolume">Using ZFS volumes</a></h3> - -<p>It's possible to use ZFS volumes as disk devices <span class="since">since 1.2.8</span>. -An example of domain XML device entry for that will look like:</p> - -<pre> -... 
-<disk type='volume' device='disk'> - <source pool='zfspool' volume='vol1'/> - <target dev='vdb' bus='virtio'/> -</disk> -...</pre> - -<p>Please refer to the <a href="storage.html">Storage documentation</a> for more details on storage -management.</p> - -<h3><a id="grubbhyve">Using grub2-bhyve or Alternative Bootloaders</a></h3> - -<p>It's possible to boot non-FreeBSD guests by specifying an explicit -bootloader, e.g. <code>grub-bhyve(1)</code>. Arguments to the bootloader may be -specified as well. If the bootloader is <code>grub-bhyve</code> and arguments -are omitted, libvirt will try and infer boot ordering from user-supplied -<boot order='N'> configuration in the domain. Failing that, it will boot -the first disk in the domain (either <code>cdrom</code>- or -<code>disk</code>-type devices). If the disk type is <code>disk</code>, it will -attempt to boot from the first partition in the disk image.</p> - -<pre> -... -<bootloader>/usr/local/sbin/grub-bhyve</bootloader> -<bootloader_args>...</bootloader_args> -... -</pre> - -<p>Caveat: <code>bootloader_args</code> does not support any quoting. -Filenames, etc, must not have spaces or they will be tokenized incorrectly.</p> - -<h3><a id="uefi">Using UEFI bootrom, VNC, and USB tablet</a></h3> - -<p><span class="since">Since 3.2.0</span>, in addition to <a href="#grubbhyve">grub-bhyve</a>, -non-FreeBSD guests could be also booted using an UEFI boot ROM, provided both guest OS and -installed <code>bhyve(1)</code> version support UEFI. To use that, <code>loader</code> -should be specified in the <code>os</code> section:</p> - -<pre> -<domain type='bhyve'> - ... - <os> - <type>hvm</type> - <loader readonly="yes" type="pflash">/usr/local/share/uefi-firmware/BHYVE_UEFI.fd</loader> - </os> - ... 
-</pre> - -<p>This uses the UEFI firmware provided by -the <a href="https://www.freshports.org/sysutils/bhyve-firmware/">sysutils/bhyve-firmware</a> -FreeBSD port.</p> - -<p>VNC and the tablet input device could be configured this way:</p> - -<pre> -<domain type='bhyve'> - <devices> - ... - <graphics type='vnc' port='5904'> - <listen type='address' address='127.0.0.1'/> - </graphics> - <controller type='usb' model='nec-xhci'/> - <input type='tablet' bus='usb'/> - </devices> - ... -</domain> -</pre> - -<p>This way, VNC will be accessible on <code>127.0.0.1:5904</code>.</p> - -<p>Please note that the tablet device requires to have a USB controller -of the <code>nec-xhci</code> model. Currently, only a single controller of this -type and a single tablet are supported per domain.</p> - -<p><span class="since">Since 3.5.0</span>, it's possible to configure how the video device is exposed -to the guest using the <code>vgaconf</code> attribute:</p> - -<pre> -<domain type='bhyve'> - <devices> - ... - <graphics type='vnc' port='5904'> - <listen type='address' address='127.0.0.1'/> - </graphics> - <video> - <driver vgaconf='on'/> - <model type='gop' heads='1' primary='yes'/> - </video> - ... - </devices> - ... -</domain> -</pre> - -<p>If not specified, bhyve's default mode for <code>vgaconf</code> -will be used. 
Please refer to the -<a href="https://www.freebsd.org/cgi/man.cgi?query=bhyve&sektion=8&manpath=FreeBSD+12-current">bhyve(8)</a> -manual page and the <a href="https://wiki.freebsd.org/bhyve">bhyve wiki</a> for more details on using -the <code>vgaconf</code> option.</p> - -<p><span class="since">Since 3.7.0</span>, it's possible to use <code>autoport</code> -to let libvirt allocate VNC port automatically (instead of explicitly specifying -it with the <code>port</code> attribute):</p> - -<pre> - <graphics type='vnc' autoport='yes'> -</pre> - -<p><span class="since">Since 6.8.0</span>, it's possible to set framebuffer resolution -using the <code>resolution</code> sub-element:</p> - -<pre> - <video> - <model type='gop' heads='1' primary='yes'> - <resolution x='800' y='600'/> - </model> - </video> -</pre> - -<p><span class="since">Since 6.8.0</span>, VNC server can be configured to use -password based authentication:</p> - -<pre> - <graphics type='vnc' port='5904' passwd='foobar'> - <listen type='address' address='127.0.0.1'/> - </graphics> -</pre> - -<p>Note: VNC password authentication is known to be cryptographically weak. -Additionally, the password is passed as a command line argument in clear text. -Make sure you understand the risks associated with this feature before using it.</p> - -<h3><a id="clockconfig">Clock configuration</a></h3> - -<p>Originally bhyve supported only localtime for RTC. Support for UTC time was introduced in -<a href="https://svnweb.freebsd.org/changeset/base/284894">FreeBSD changeset r284894</a> -for <i>10-STABLE</i> and -in <a href="https://svnweb.freebsd.org/changeset/base/279225">changeset r279225</a> -for <i>-CURRENT</i>. It's possible to use this in libvirt <span class="since">since 1.2.18</span>, -just place the following to domain XML:</p> - -<pre> -<domain type="bhyve"> - ... - <clock offset='utc'/> - ... 
-</domain> -</pre> - -<p>Please note that if you run the older bhyve version that doesn't support UTC time, you'll -fail to start a domain. As UTC is used as a default when you do not specify clock settings, -you'll need to explicitly specify 'localtime' in this case:</p> - -<pre> -<domain type="bhyve"> - ... - <clock offset='localtime'/> - ... -</domain> -</pre> - -<h3><a id="e1000">e1000 NIC</a></h3> - -<p>As of <a href="https://svnweb.freebsd.org/changeset/base/302504">FreeBSD changeset r302504</a> -bhyve supports Intel e1000 network adapter emulation. It's supported in libvirt -<span class="since">since 3.1.0</span> and could be used as follows:</p> - -<pre> -... - <interface type='bridge'> - <source bridge='virbr0'/> - <model type='<b>e1000</b>'/> - </interface> -... -</pre> - -<h3><a id="sound">Sound device</a></h3> - -<p>As of <a href="https://svnweb.freebsd.org/changeset/base/349355">FreeBSD changeset r349355</a> -bhyve supports sound device emulation. It's supported in libvirt -<span class="since">since 6.7.0</span>.</p> - -<pre> -... - <sound model='ich7'> - <audio id='1'/> - </sound> - <audio id='1' type='oss'> - <input dev='/dev/dsp0'/> - <output dev='/dev/dsp0'/> - </audio> -... -</pre> - -<p>Here, the <code>sound</code> element specifies the sound device as it's exposed -to the guest, with <code>ich7</code> being the only supported model now, -and the <code>audio</code> element specifies how the guest device is mapped -to the host sound device.</p> - -<h3><a id="fs-9p">Virtio-9p filesystem</a></h3> - -<p>As of <a href="https://svnweb.freebsd.org/changeset/base/366413">FreeBSD changeset r366413</a> -bhyve supports sharing arbitrary directory tree between the guest and the host. -It's supported in libvirt <span class="since">since 6.9.0</span>.</p> - -<pre> -... - <filesystem> - <source dir='/shared/dir'/> - <target dir='shared_dir'/> - </filesystem> -... 
-</pre> - -<p>This share could be made read only by adding the <code><readonly/></code> sub-element.</p> - -<p>In the Linux guest, this could be mounted using:</p> - -<pre>mount -t 9p shared_dir /mnt/shared_dir</pre> - -<h3><a id="wired">Wiring guest memory</a></h3> - -<p><span class="since">Since 4.4.0</span>, it's possible to specify that guest memory should -be wired and cannot be swapped out as follows:</p> -<pre> -<domain type="bhyve"> - ... - <memoryBacking> - <locked/> - </memoryBacking> - ... -</domain> -</pre> - -<h3><a id="cputopology">CPU topology</a></h3> - -<p><span class="since">Since 4.5.0</span>, it's possible to specify guest CPU topology, if bhyve -supports that. Support for specifying guest CPU topology was added to bhyve in -<a href="https://svnweb.freebsd.org/changeset/base/332298">FreeBSD changeset r332298</a> -for <i>-CURRENT</i>. -Example:</p> -<pre> -<domain type="bhyve"> - ... - <cpu> - <topology sockets='1' cores='2' threads='1'/> - </cpu> - ... -</domain> -</pre> - -<h3><a id="msrs">Ignoring unknown MSRs reads and writes</a></h3> - -<p><span class="since">Since 5.1.0</span>, it's possible to make bhyve -ignore accesses to unimplemented Model Specific Registers (MSRs). -Example:</p> - -<pre> -<domain type="bhyve"> - ... - <features> - ... - <msrs unknown='ignore'/> - ... - </features> - ... -</domain> -</pre> - -<h3><a id="bhyvecommand">Pass-through of arbitrary bhyve commands</a></h3> - -<p><span class="since">Since 5.1.0</span>, it's possible to pass additional command-line -arguments to the bhyve process when starting the domain using the -<code><bhyve:commandline></code> element under <code>domain</code>. -To supply an argument, use the element <code><bhyve:arg></code> with -the attribute <code>value</code> set to additional argument to be added. -The arg element may be repeated multiple times. 
To use this XML addition, it is necessary -to issue an XML namespace request (the special <code>xmlns:<i>name</i></code> attribute) -that pulls in <code>http://libvirt.org/schemas/domain/bhyve/1.0</code>; -typically, the namespace is given the name of <code>bhyve</code>. -</p> -<p>Example:</p> -<pre> -<domain type="bhyve" xmlns:bhyve="http://libvirt.org/schemas/domain/bhyve/1.0"> - ... - <bhyve:commandline> - <bhyve:arg value='-somebhyvearg'/> - </bhyve:commandline> -</domain> -</pre> - -<p>Note that these extensions are for testing and development purposes only. -They are <b>unsupported</b>, using them may result in inconsistent state, -and upgrading either bhyve or libvirtd maybe break behavior of a domain that -was relying on a specific commands pass-through.</p> - - </body> -</html> diff --git a/docs/drvbhyve.rst b/docs/drvbhyve.rst new file mode 100644 index 0000000000..95ef4e9b49 --- /dev/null +++ b/docs/drvbhyve.rst @@ -0,0 +1,582 @@ +.. role:: since + +============ +Bhyve driver +============ + +.. contents:: + +Bhyve is a FreeBSD hypervisor. It first appeared in FreeBSD 10.0. However, it's +recommended to keep tracking FreeBSD 10-STABLE to make sure all new features of +bhyve are supported. In order to enable bhyve on your FreeBSD host, you'll need +to load the ``vmm`` kernel module. Additionally, ``if_tap`` and ``if_bridge`` +modules should be loaded for networking support. Also, :since:`since 3.2.0` the +``virt-host-validate(1)`` supports the bhyve host validation and could be used +like this: + +:: + + $ virt-host-validate bhyve + BHYVE: Checking for vmm module : PASS + BHYVE: Checking for if_tap module : PASS + BHYVE: Checking for if_bridge module : PASS + BHYVE: Checking for nmdm module : PASS + $ + +Additional information on bhyve could be obtained on +`bhyve.org <https://bhyve.org/>`__. + +Connections to the Bhyve driver +------------------------------- + +The libvirt bhyve driver is a single-instance privileged driver. 
Some sample +connection URIs are: + +:: + + bhyve:///system (local access) + bhyve+unix:///system (local access) + bhyve+ssh://root@example.com/system (remote access, SSH tunnelled) + +Example guest domain XML configurations +--------------------------------------- + +Example config +~~~~~~~~~~~~~~ + +The bhyve driver in libvirt is in its early stage and under active development. +So it supports only limited number of features bhyve provides. + +Note: in older libvirt versions, only a single network device and a single disk +device were supported per-domain. However, :since:`since 1.2.6` the libvirt +bhyve driver supports up to 31 PCI devices. + +Note: the Bhyve driver in libvirt will boot whichever device is first. If you +want to install from CD, put the CD device first. If not, put the root HDD +first. + +Note: Only the SATA bus is supported. Only ``cdrom``- and ``disk``-type disks +are supported. + +:: + + <domain type='bhyve'> + <name>bhyve</name> + <uuid>df3be7e7-a104-11e3-aeb0-50e5492bd3dc</uuid> + <memory>219136</memory> + <currentMemory>219136</currentMemory> + <vcpu>1</vcpu> + <os> + <type>hvm</type> + </os> + <features> + <apic/> + <acpi/> + </features> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <disk type='file'> + <driver name='file' type='raw'/> + <source file='/path/to/bhyve_freebsd.img'/> + <target dev='hda' bus='sata'/> + </disk> + <disk type='file' device='cdrom'> + <driver name='file' type='raw'/> + <source file='/path/to/cdrom.iso'/> + <target dev='hdc' bus='sata'/> + <readonly/> + </disk> + <interface type='bridge'> + <model type='virtio'/> + <source bridge="virbr0"/> + </interface> + </devices> + </domain> + +(The <disk> sections may be swapped in order to install from *cdrom.iso*.) + +Example config (Linux guest) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Note the addition of <bootloader>. 
+ +:: + + <domain type='bhyve'> + <name>linux_guest</name> + <uuid>df3be7e7-a104-11e3-aeb0-50e5492bd3dc</uuid> + <memory>131072</memory> + <currentMemory>131072</currentMemory> + <vcpu>1</vcpu> + <bootloader>/usr/local/sbin/grub-bhyve</bootloader> + <os> + <type>hvm</type> + </os> + <features> + <apic/> + <acpi/> + </features> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <disk type='file' device='disk'> + <driver name='file' type='raw'/> + <source file='/path/to/guest_hdd.img'/> + <target dev='hda' bus='sata'/> + </disk> + <disk type='file' device='cdrom'> + <driver name='file' type='raw'/> + <source file='/path/to/cdrom.iso'/> + <target dev='hdc' bus='sata'/> + <readonly/> + </disk> + <interface type='bridge'> + <model type='virtio'/> + <source bridge="virbr0"/> + </interface> + </devices> + </domain> + +Example config (Linux UEFI guest, VNC, tablet) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +This is an example to boot into Fedora 25 installation: + +:: + + <domain type='bhyve'> + <name>fedora_uefi_vnc_tablet</name> + <memory unit='G'>4</memory> + <vcpu>2</vcpu> + <os> + <type>hvm</type> + <loader readonly="yes" type="pflash">/usr/local/share/uefi-firmware/BHYVE_UEFI.fd</loader> + </os> + <features> + <apic/> + <acpi/> + </features> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <disk type='file' device='cdrom'> + <driver name='file' type='raw'/> + <source file='/path/to/Fedora-Workstation-Live-x86_64-25-1.3.iso'/> + <target dev='hdc' bus='sata'/> + <readonly/> + </disk> + <disk type='file' device='disk'> + <driver name='file' type='raw'/> + <source file='/path/to/linux_uefi.img'/> + <target dev='hda' bus='sata'/> + </disk> + <interface type='bridge'> + <model type='virtio'/> + <source bridge="virbr0"/> + </interface> + <serial type="nmdm"> + <source master="/dev/nmdm0A" 
slave="/dev/nmdm0B"/> + </serial> + <graphics type='vnc' port='5904'> + <listen type='address' address='127.0.0.1'/> + </graphics> + <controller type='usb' model='nec-xhci'/> + <input type='tablet' bus='usb'/> + </devices> + </domain> + +Please refer to the `UEFI <#uefi>`__ section for a more detailed explanation. + +Guest usage / management +------------------------ + +Connecting to a guest console +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Guest console connection is supported through the ``nmdm`` device. It could be +enabled by adding the following to the domain XML ( :since:`Since 1.2.4` ): + +:: + + ... + <devices> + <serial type="nmdm"> + <source master="/dev/nmdm0A" slave="/dev/nmdm0B"/> + </serial> + </devices> + ... + +Make sure to load the ``nmdm`` kernel module if you plan to use that. + +Then ``virsh console`` command can be used to connect to the text console of a +guest. + +**NB:** Some versions of bhyve have a bug that prevents guests from booting +until the console is opened by a client. This bug was fixed in `FreeBSD +changeset r262884 <https://svnweb.freebsd.org/changeset/base/262884>`__. If an +older version is used, one either has to open a console manually with +``virsh console`` to let a guest boot or start a guest using: + +:: + + start --console domname + +**NB:** A bootloader configured to require user interaction will prevent the +domain from starting (and thus ``virsh console`` or ``start --console`` from +functioning) until the user interacts with it manually on the VM host. Because +users typically do not have access to the VM host, interactive bootloaders are +unsupported by libvirt. *However,* if you happen to run into this scenario and +also happen to have access to the Bhyve host machine, you may select a boot +option and allow the domain to finish starting by using an alternative terminal +client on the VM host to connect to the domain-configured null modem device. 
One +example (assuming ``/dev/nmdm0B`` is configured as the slave end of the domain +serial device) is: + +:: + + cu -l /dev/nmdm0B + +Converting from domain XML to Bhyve args +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``virsh domxml-to-native`` command can preview the actual ``bhyve`` commands +that will be executed for a given domain. It outputs two lines, the first line +is a ``bhyveload`` command and the second is a ``bhyve`` command. + +Please note that the ``virsh domxml-to-native`` doesn't do any real actions +other than printing the command, for example, it doesn't try to find a proper +TAP interface and create it, like what is done when starting a domain; and +always returns ``tap0`` for the network interface. So if you're going to run +these commands manually, most likely you might want to tweak them. + +:: + + # virsh -c "bhyve:///system" domxml-to-native --format bhyve-argv --xml /path/to/bhyve.xml + /usr/sbin/bhyveload -m 214 -d /home/user/vm1.img vm1 + /usr/sbin/bhyve -c 2 -m 214 -A -I -H -P -s 0:0,hostbridge \ + -s 3:0,virtio-net,tap0,mac=52:54:00:5d:74:e3 -s 2:0,virtio-blk,/home/user/vm1.img \ + -s 1,lpc -l com1,/dev/nmdm0A vm1 + +Using ZFS volumes +~~~~~~~~~~~~~~~~~ + +It's possible to use ZFS volumes as disk devices :since:`since 1.2.8` . An +example of domain XML device entry for that will look like: + +:: + + ... + <disk type='volume' device='disk'> + <source pool='zfspool' volume='vol1'/> + <target dev='vdb' bus='virtio'/> + </disk> + ... + +Please refer to the `Storage documentation <storage.html>`__ for more details on +storage management. + +Using grub2-bhyve or Alternative Bootloaders +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +It's possible to boot non-FreeBSD guests by specifying an explicit bootloader, +e.g. ``grub-bhyve(1)``. Arguments to the bootloader may be specified as well. 
If +the bootloader is ``grub-bhyve`` and arguments are omitted, libvirt will try and +infer boot ordering from user-supplied <boot order='N'> configuration in the +domain. Failing that, it will boot the first disk in the domain (either +``cdrom``- or ``disk``-type devices). If the disk type is ``disk``, it will +attempt to boot from the first partition in the disk image. + +:: + + ... + <bootloader>/usr/local/sbin/grub-bhyve</bootloader> + <bootloader_args>...</bootloader_args> + ... + +Caveat: ``bootloader_args`` does not support any quoting. Filenames, etc, must +not have spaces or they will be tokenized incorrectly. + +Using UEFI bootrom, VNC, and USB tablet +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +:since:`Since 3.2.0` , in addition to `grub-bhyve <#grubbhyve>`__, non-FreeBSD +guests could be also booted using an UEFI boot ROM, provided both guest OS and +installed ``bhyve(1)`` version support UEFI. To use that, ``loader`` should be +specified in the ``os`` section: + +:: + + <domain type='bhyve'> + ... + <os> + <type>hvm</type> + <loader readonly="yes" type="pflash">/usr/local/share/uefi-firmware/BHYVE_UEFI.fd</loader> + </os> + ... + +This uses the UEFI firmware provided by the +`sysutils/bhyve-firmware <https://www.freshports.org/sysutils/bhyve-firmware/>`__ +FreeBSD port. + +VNC and the tablet input device could be configured this way: + +:: + + <domain type='bhyve'> + <devices> + ... + <graphics type='vnc' port='5904'> + <listen type='address' address='127.0.0.1'/> + </graphics> + <controller type='usb' model='nec-xhci'/> + <input type='tablet' bus='usb'/> + </devices> + ... + </domain> + +This way, VNC will be accessible on ``127.0.0.1:5904``. + +Please note that the tablet device requires to have a USB controller of the +``nec-xhci`` model. Currently, only a single controller of this type and a +single tablet are supported per domain. 
+ +:since:`Since 3.5.0` , it's possible to configure how the video device is +exposed to the guest using the ``vgaconf`` attribute: + +:: + + <domain type='bhyve'> + <devices> + ... + <graphics type='vnc' port='5904'> + <listen type='address' address='127.0.0.1'/> + </graphics> + <video> + <driver vgaconf='on'/> + <model type='gop' heads='1' primary='yes'/> + </video> + ... + </devices> + ... + </domain> + +If not specified, bhyve's default mode for ``vgaconf`` will be used. Please +refer to the +`bhyve(8) <https://www.freebsd.org/cgi/man.cgi?query=bhyve&sektion=8&manpath=FreeBSD+12-current>`__ +manual page and the `bhyve wiki <https://wiki.freebsd.org/bhyve>`__ for more +details on using the ``vgaconf`` option. + +:since:`Since 3.7.0` , it's possible to use ``autoport`` to let libvirt allocate +VNC port automatically (instead of explicitly specifying it with the ``port`` +attribute): + +:: + + <graphics type='vnc' autoport='yes'> + +:since:`Since 6.8.0` , it's possible to set framebuffer resolution using the +``resolution`` sub-element: + +:: + + <video> + <model type='gop' heads='1' primary='yes'> + <resolution x='800' y='600'/> + </model> + </video> + +:since:`Since 6.8.0` , VNC server can be configured to use password based +authentication: + +:: + + <graphics type='vnc' port='5904' passwd='foobar'> + <listen type='address' address='127.0.0.1'/> + </graphics> + +Note: VNC password authentication is known to be cryptographically weak. +Additionally, the password is passed as a command line argument in clear text. +Make sure you understand the risks associated with this feature before using it. + +Clock configuration +~~~~~~~~~~~~~~~~~~~ + +Originally bhyve supported only localtime for RTC. Support for UTC time was +introduced in `FreeBSD changeset +r284894 <https://svnweb.freebsd.org/changeset/base/284894>`__ for *10-STABLE* +and in `changeset r279225 <https://svnweb.freebsd.org/changeset/base/279225>`__ +for *-CURRENT*. 
It's possible to use this in libvirt :since:`since 1.2.18` , +just place the following to domain XML: + +:: + + <domain type="bhyve"> + ... + <clock offset='utc'/> + ... + </domain> + +Please note that if you run the older bhyve version that doesn't support UTC +time, you'll fail to start a domain. As UTC is used as a default when you do not +specify clock settings, you'll need to explicitly specify 'localtime' in this +case: + +:: + + <domain type="bhyve"> + ... + <clock offset='localtime'/> + ... + </domain> + +e1000 NIC +~~~~~~~~~ + +As of `FreeBSD changeset +r302504 <https://svnweb.freebsd.org/changeset/base/302504>`__ bhyve supports +Intel e1000 network adapter emulation. It's supported in libvirt :since:`since +3.1.0` and could be used as follows: + +:: + + ... + <interface type='bridge'> + <source bridge='virbr0'/> + <model type='e1000'/> + </interface> + ... + +Sound device +~~~~~~~~~~~~ + +As of `FreeBSD changeset +r349355 <https://svnweb.freebsd.org/changeset/base/349355>`__ bhyve supports +sound device emulation. It's supported in libvirt :since:`since 6.7.0` . + +:: + + ... + <sound model='ich7'> + <audio id='1'/> + </sound> + <audio id='1' type='oss'> + <input dev='/dev/dsp0'/> + <output dev='/dev/dsp0'/> + </audio> + ... + +Here, the ``sound`` element specifies the sound device as it's exposed to the +guest, with ``ich7`` being the only supported model now, and the ``audio`` +element specifies how the guest device is mapped to the host sound device. + +Virtio-9p filesystem +~~~~~~~~~~~~~~~~~~~~ + +As of `FreeBSD changeset +r366413 <https://svnweb.freebsd.org/changeset/base/366413>`__ bhyve supports +sharing arbitrary directory tree between the guest and the host. It's supported +in libvirt :since:`since 6.9.0` . + +:: + + ... + <filesystem> + <source dir='/shared/dir'/> + <target dir='shared_dir'/> + </filesystem> + ... + +This share could be made read only by adding the ``<readonly/>`` sub-element. 
+ +In the Linux guest, this could be mounted using: + +:: + + mount -t 9p shared_dir /mnt/shared_dir + +Wiring guest memory +~~~~~~~~~~~~~~~~~~~ + +:since:`Since 4.4.0` , it's possible to specify that guest memory should be +wired and cannot be swapped out as follows: + +:: + + <domain type="bhyve"> + ... + <memoryBacking> + <locked/> + </memoryBacking> + ... + </domain> + +CPU topology +~~~~~~~~~~~~ + +:since:`Since 4.5.0` , it's possible to specify guest CPU topology, if bhyve +supports that. Support for specifying guest CPU topology was added to bhyve in +`FreeBSD changeset r332298 <https://svnweb.freebsd.org/changeset/base/332298>`__ +for *-CURRENT*. Example: + +:: + + <domain type="bhyve"> + ... + <cpu> + <topology sockets='1' cores='2' threads='1'/> + </cpu> + ... + </domain> + +Ignoring unknown MSRs reads and writes +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +:since:`Since 5.1.0` , it's possible to make bhyve ignore accesses to +unimplemented Model Specific Registers (MSRs). Example: + +:: + + <domain type="bhyve"> + ... + <features> + ... + <msrs unknown='ignore'/> + ... + </features> + ... + </domain> + +Pass-through of arbitrary bhyve commands +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +:since:`Since 5.1.0` , it's possible to pass additional command-line arguments +to the bhyve process when starting the domain using the ``<bhyve:commandline>`` +element under ``domain``. To supply an argument, use the element ``<bhyve:arg>`` +with the attribute ``value`` set to additional argument to be added. The arg +element may be repeated multiple times. To use this XML addition, it is +necessary to issue an XML namespace request (the special ``xmlns:name`` +attribute) that pulls in ``http://libvirt.org/schemas/domain/bhyve/1.0``; +typically, the namespace is given the name of ``bhyve``. + +Example: + +:: + + <domain type="bhyve" xmlns:bhyve="http://libvirt.org/schemas/domain/bhyve/1.0"> + ... 
+ <bhyve:commandline> + <bhyve:arg value='-somebhyvearg'/> + </bhyve:commandline> + </domain> + +Note that these extensions are for testing and development purposes only. They +are **unsupported**; using them may result in inconsistent state, and upgrading +either bhyve or libvirtd may break the behavior of a domain that was relying on a +specific command pass-through. diff --git a/docs/meson.build b/docs/meson.build index bb7e27e031..a6c3077f25 100644 --- a/docs/meson.build +++ b/docs/meson.build @@ -22,7 +22,6 @@ docs_html_in_files = [ 'csharp', 'dbus', 'docs', - 'drvbhyve', 'drvesx', 'drvhyperv', 'drvlxc', @@ -79,6 +78,7 @@ docs_rst_files = [ 'daemons', 'downloads', 'drivers', + 'drvbhyve', 'drvch', 'drvqemu', 'errors', -- 2.35.1

Reviewed-by: Erik Skultety <eskultet@redhat.com>

Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/drvesx.html.in | 838 -------------------------------------------- docs/drvesx.rst | 681 +++++++++++++++++++++++++++++++++++ docs/meson.build | 2 +- 3 files changed, 682 insertions(+), 839 deletions(-) delete mode 100644 docs/drvesx.html.in create mode 100644 docs/drvesx.rst diff --git a/docs/drvesx.html.in b/docs/drvesx.html.in deleted file mode 100644 index c56da16f57..0000000000 --- a/docs/drvesx.html.in +++ /dev/null @@ -1,838 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html> -<html xmlns="http://www.w3.org/1999/xhtml"> - <body> - <h1>VMware ESX hypervisor driver</h1> - <ul id="toc"></ul> - <p> - The libvirt VMware ESX driver can manage VMware ESX/ESXi 3.5/4.x/5.x and - VMware GSX 2.0, also called VMware Server 2.0, and possibly later - versions. <span class="since">Since 0.8.3</span> the driver can also - connect to a VMware vCenter 2.5/4.x/5.x (VPX). - </p> - - <h2><a id="project">Project Links</a></h2> - - <ul> - <li> - The <a href="https://www.vmware.com/">VMware ESX and GSX</a> - hypervisors - </li> - </ul> - - <h2><a id="prereq">Deployment pre-requisites</a></h2> - <p> - None. Any out-of-the-box installation of VPX/ESX(i)/GSX should work. No - preparations are required on the server side, no libvirtd must be - installed on the ESX server. The driver uses version 2.5 of the remote, - SOAP based - <a href="https://www.vmware.com/support/developer/vc-sdk/visdk25pubs/ReferenceGuide/"> - VMware Virtual Infrastructure API</a> (VI API) to communicate with the - ESX server, like the VMware Virtual Infrastructure Client (VI client) - does. Since version 4.0 this API is called - <a href="https://www.vmware.com/support/developer/vc-sdk/visdk400pubs/ReferenceGuide/"> - VMware vSphere API</a>. 
- </p> - - <h2><a id="uri">Connections to the VMware ESX driver</a></h2> - <p> - Some example remote connection URIs for the driver are: - </p> -<pre> -vpx://example-vcenter.com/dc1/srv1 (VPX over HTTPS, select ESX server 'srv1' in datacenter 'dc1') -esx://example-esx.com (ESX over HTTPS) -gsx://example-gsx.com (GSX over HTTPS) -esx://example-esx.com/?transport=http (ESX over HTTP) -esx://example-esx.com/?no_verify=1 (ESX over HTTPS, but doesn't verify the server's SSL certificate) -</pre> - <p> - <strong>Note</strong>: In contrast to other drivers, the ESX driver is - a client-side-only driver. It connects to the ESX server using HTTP(S). - Therefore, the <a href="remote.html">remote transport mechanism</a> - provided by the remote driver and libvirtd will not work, and you - cannot use URIs like <code>esx+ssh://example.com</code>. - </p> - - - <h3><a id="uriformat">URI Format</a></h3> - <p> - URIs have this general form (<code>[...]</code> marks an optional part). - </p> -<pre> -type://[username@]hostname[:port]/[[folder/...]datacenter/[folder/...][cluster/]server][?extraparameters] -</pre> - <p> - The <code>type://</code> is either <code>esx://</code> or - <code>gsx://</code> or <code>vpx://</code> <span class="since">since 0.8.3</span>. - The driver selects the default port depending on the <code>type://</code>. - For <code>esx://</code> and <code>vpx://</code> the default HTTPS port - is 443, for <code>gsx://</code> it is 8333. - If the port parameter is given, it overrides the default port. - </p> - <p> - A <code>vpx://</code> connection is currently restricted to a single - ESX server. This might be relaxed in the future. The path part of the - URI is used to specify the datacenter and the ESX server in it. If the - ESX server is part of a cluster then the cluster has to be specified too. - </p> - <p> - An example: ESX server <code>example-esx.com</code> is managed by - vCenter <code>example-vcenter.com</code> and part of cluster - <code>cluster1</code>. 
This cluster is part of datacenter <code>dc1</code>. - </p> -<pre> -vpx://example-vcenter.com/dc1/cluster1/example-esx.com -</pre> - <p> - Datacenters and clusters can be organized in folders, those have to be - specified as well. The driver can handle folders - <span class="since">since 0.9.7</span>. - </p> -<pre> -vpx://example-vcenter.com/folder1/dc1/folder2/example-esx.com -</pre> - - - <h4><a id="extraparams">Extra parameters</a></h4> - <p> - Extra parameters can be added to a URI as part of the query string - (the part following <code>?</code>). A single parameter is formed by a - <code>name=value</code> pair. Multiple parameters are separated by - <code>&</code>. - </p> -<pre> -?<span style="color: #E50000">no_verify=1</span>&<span style="color: #00B200">auto_answer=1</span>&<span style="color: #0000E5">proxy=socks://example-proxy.com:23456</span> -</pre> - <p> - The driver understands the extra parameters shown below. - </p> - <table class="top_table"> - <tr> - <th>Name</th> - <th>Values</th> - <th>Meaning</th> - </tr> - <tr> - <td> - <code>transport</code> - </td> - <td> - <code>http</code> or <code>https</code> - </td> - <td> - Overrides the default HTTPS transport. For <code>esx://</code> - and <code>vpx://</code> the default HTTP port is 80, for - <code>gsx://</code> it is 8222. - </td> - </tr> - <tr> - <td> - <code>vcenter</code> - </td> - <td> - Hostname of a VMware vCenter or <code>*</code> - </td> - <td> - In order to perform a migration the driver needs to know the - VMware vCenter for the ESX server. If set to <code>*</code>, - the driver connects to the vCenter known to the ESX server. - This parameter in useful when connecting to an ESX server only. - </td> - </tr> - <tr> - <td> - <code>no_verify</code> - </td> - <td> - <code>0</code> or <code>1</code> - </td> - <td> - If set to 1, this disables libcurl client checks of the server's - SSL certificate. The default value is 0. 
See the - <a href="#certificates">Certificates for HTTPS</a> section for - details. - </td> - </tr> - <tr> - <td> - <code>auto_answer</code> - </td> - <td> - <code>0</code> or <code>1</code> - </td> - <td> - If set to 1, the driver answers all - <a href="#questions">questions</a> with the default answer. - If set to 0, questions are reported as errors. The default - value is 0. <span class="since">Since 0.7.5</span>. - </td> - </tr> - <tr> - <td> - <code>proxy</code> - </td> - <td> - <code>[type://]hostname[:port]</code> - </td> - <td> - Allows to specify a proxy for HTTP and HTTPS communication. - <span class="since">Since 0.8.2</span>. - The optional <code>type</code> part may be one of: - <code>http</code>, <code>socks</code>, <code>socks4</code>, - <code>socks4a</code> or <code>socks5</code>. The default is - <code>http</code> and <code>socks</code> is synonymous for - <code>socks5</code>. The optional <code>port</code> allows to - override the default port 1080. - </td> - </tr> - </table> - - - <h3><a id="auth">Authentication</a></h3> - <p> - In order to perform any useful operation the driver needs to log into - the ESX server. Therefore, only <code>virConnectOpenAuth</code> can be - used to connect to an ESX server, <code>virConnectOpen</code> and - <code>virConnectOpenReadOnly</code> don't work. - To log into an ESX server or vCenter the driver will request - credentials using the callback passed to the - <code>virConnectOpenAuth</code> function. The driver passes the - hostname as challenge parameter to the callback. This enables the - callback to distinguish between requests for ESX server and vCenter. - </p> - <p> - <strong>Note</strong>: During the ongoing driver development, testing - is done using an unrestricted <code>root</code> account. Problems may - occur if you use a restricted account. Detailed testing with restricted - accounts has not been done yet. 
- </p> - - - <h3><a id="certificates">Certificates for HTTPS</a></h3> - <p> - By default the ESX driver uses HTTPS to communicate with an ESX server. - Proper HTTPS communication requires correctly configured SSL - certificates. This certificates are different from the ones libvirt - uses for <a href="remote.html">secure communication over TLS</a> to a - libvirtd one a remote server. - </p> - <p> - By default the driver tries to verify the server's SSL certificate - using the CA certificate pool installed on your client computer. With - an out-of-the-box installed ESX server this won't work, because a newly - installed ESX server uses auto-generated self-signed certificates. - Those are signed by a CA certificate that is typically not known to your - client computer and libvirt will report an error like this one: - </p> -<pre> -error: internal error curl_easy_perform() returned an error: Peer certificate cannot be authenticated with known CA certificates (60) -</pre> - <p> - Where are two ways to solve this problem: - </p> - <ul> - <li> - Use the <code>no_verify=1</code> <a href="#extraparams">extra parameter</a> - to disable server certificate verification. - </li> - <li> - Generate new SSL certificates signed by a CA known to your client - computer and replace the original ones on your ESX server. See the - section <i>Replace a Default Certificate with a CA-Signed Certificate</i> - in the <a href="https://www.vmware.com/pdf/vsphere4/r40/vsp_40_esx_server_config.pdf">ESX Configuration Guide</a> - </li> - </ul> - - - <h3><a id="connproblems">Connection problems</a></h3> - <p> - There are also other causes for connection problems than the - <a href="#certificates">HTTPS certificate</a> related ones. - </p> - <ul> - <li> - As stated before the ESX driver doesn't need the - <a href="remote.html">remote transport mechanism</a> - provided by the remote driver and libvirtd, nor does the ESX driver - support it. 
Therefore, using an URI including a transport in the - scheme won't work. Only <a href="#uriformat">URIs as described</a> - are supported by the ESX driver. Here's a collection of possible - error messages: -<pre> -$ virsh -c esx+tcp://example.com/ -error: unable to connect to libvirtd at 'example.com': Connection refused -</pre> -<pre> -$ virsh -c esx+tls://example.com/ -error: Cannot access CA certificate '/etc/pki/CA/cacert.pem': No such file or directory -</pre> -<pre> -$ virsh -c esx+ssh://example.com/ -error: cannot recv data: ssh: connect to host example.com port 22: Connection refused -</pre> -<pre> -$ virsh -c esx+ssh://example.com/ -error: cannot recv data: Resource temporarily unavailable -</pre> - </li> - <li> - <span class="since">Since 0.7.0</span> libvirt contains the ESX - driver. Earlier versions of libvirt will report a misleading error - about missing certificates when you try to connect to an ESX server. -<pre> -$ virsh -c esx://example.com/ -error: Cannot access CA certificate '/etc/pki/CA/cacert.pem': No such file or directory -</pre> - <p> - Don't let this error message confuse you. Setting up certificates - as described on the <a href="remote.html#Remote_certificates">remote transport mechanism</a> page - does not help, as this is not a certificate related problem. - </p> - <p> - To fix this problem you need to update your libvirt to 0.7.0 or newer. - You may also see this error when you use a libvirt version that - contains the ESX driver but you or your distro disabled the ESX - driver during compilation. <span class="since">Since 0.8.3</span> - the error message has been improved in this case: - </p> -<pre> -$ virsh -c esx://example.com/ -error: invalid argument in libvirt was built without the 'esx' driver -</pre> - </li> - </ul> - - - <h2><a id="questions">Questions blocking tasks</a></h2> - <p> - Some methods of the VI API start tasks, for example - <code>PowerOnVM_Task()</code>. 
Such tasks may be blocked by questions - if the ESX server detects an issue with the domain that requires user - interaction. The ESX driver cannot prompt the user to answer a - question, libvirt doesn't have an API for something like this. - </p> - <p> - The VI API provides the <code>AnswerVM()</code> method to - programmatically answer a questions. So the driver has two options - how to handle such a situation: either answer the questions with the - default answer or report the question as an error and cancel the - blocked task if possible. The - <a href="#uriformat"><code>auto_answer</code></a> query parameter - controls the answering behavior. - </p> - - - <h2><a id="xmlspecial">Specialities in the domain XML config</a></h2> - <p> - There are several specialities in the domain XML config for ESX domains. - </p> - - <h3><a id="restrictions">Restrictions</a></h3> - <p> - There are some restrictions for some values of the domain XML config. - The driver will complain if this restrictions are violated. - </p> - <ul> - <li> - Memory size has to be a multiple of 4096 - </li> - <li> - Number of virtual CPU has to be 1 or a multiple of 2. - <span class="since">Since 4.10.0</span> any number of vCPUs is - supported. - </li> - <li> - Valid MAC address prefixes are <code>00:0c:29</code> and - <code>00:50:56</code>. <span class="since">Since 0.7.6</span> - arbitrary <a href="#macaddresses">MAC addresses</a> are supported. - </li> - </ul> - - - <h3><a id="datastore">Datastore references</a></h3> - <p> - Storage is managed in datastores. VMware uses a special path format to - reference files in a datastore. Basically, the datastore name is put - into squared braces in front of the path. - </p> -<pre> -[datastore] directory/filename -</pre> - <p> - To define a new domain the driver converts the domain XML into a - VMware VMX file and uploads it to a datastore known to the ESX server. 
- Because multiple datastores may be known to an ESX server the driver - needs to decide to which datastore the VMX file should be uploaded. - The driver deduces this information from the path of the source of the - first file-based harddisk listed in the domain XML. - </p> - - - <h3><a id="macaddresses">MAC addresses</a></h3> - <p> - VMware has registered two MAC address prefixes for domains: - <code>00:0c:29</code> and <code>00:50:56</code>. These prefixes are - split into ranges for different purposes. - </p> - <table class="top_table"> - <tr> - <th>Range</th> - <th>Purpose</th> - </tr> - <tr> - <td> - <code>00:0c:29:00:00:00</code> - <code>00:0c:29:ff:ff:ff</code> - </td> - <td> - An ESX server autogenerates MAC addresses from this range if - the VMX file doesn't contain a MAC address when trying to start - a domain. - </td> - </tr> - <tr> - <td> - <code>00:50:56:00:00:00</code> - <code>00:50:56:3f:ff:ff</code> - </td> - <td> - MAC addresses from this range can by manually assigned by the - user in the VI client. - </td> - </tr> - <tr> - <td> - <code>00:50:56:80:00:00</code> - <code>00:50:56:bf:ff:ff</code> - </td> - <td> - A VI client autogenerates MAC addresses from this range for - newly defined domains. - </td> - </tr> - </table> - <p> - The VMX files generated by the ESX driver always contain a MAC address, - because libvirt generates a random one if an interface element in the - domain XML file lacks a MAC address. - <span class="since">Since 0.7.6</span> the ESX driver sets the prefix - for generated MAC addresses to <code>00:0c:29</code>. Before 0.7.6 - the <code>00:50:56</code> prefix was used. Sometimes this resulted in - the generation of out-of-range MAC address that were rejected by the - ESX server. - </p> - <p> - Also <span class="since">since 0.7.6</span> every MAC address outside - this ranges can be used. 
For such MAC addresses the ESX server-side - check is disabled in the VMX file to stop the ESX server from rejecting - out-of-predefined-range MAC addresses. - </p> -<pre> -ethernet0.checkMACAddress = "false" -</pre> - <p> - <span class="since">Since 6.6.0</span>, one can force libvirt to keep the - provided MAC address when it's in the reserved VMware range by adding a - <code>type="static"</code> attribute to the <code><mac/></code> element. - Note that this attribute is useless if the provided MAC address is outside of - the reserved VMWare ranges. - </p> - - - <h3><a id="hardware">Available hardware</a></h3> - <p> - VMware ESX supports different models of SCSI controllers and network - cards. - </p> - - <h4>SCSI controller models</h4> - <dl> - <dt><code>auto</code></dt> - <dd> - This isn't an actual controller model. If specified the ESX driver - tries to detect the SCSI controller model referenced in the - <code>.vmdk</code> file and use it. Autodetection fails when a - SCSI controller has multiple disks attached and the SCSI controller - models referenced in the <code>.vmdk</code> files are inconsistent. - <span class="since">Since 0.8.3</span> - </dd> - <dt><code>buslogic</code></dt> - <dd> - BusLogic SCSI controller for older guests. - </dd> - <dt><code>lsilogic</code></dt> - <dd> - LSI Logic SCSI controller for recent guests. - </dd> - <dt><code>lsisas1068</code></dt> - <dd> - LSI Logic SAS 1068 controller. <span class="since">Since 0.8.0</span> - </dd> - <dt><code>vmpvscsi</code></dt> - <dd> - Special VMware Paravirtual SCSI controller, requires VMware tools inside - the guest. See <a href="https://kb.vmware.com/kb/1010398">VMware KB1010398</a> - for details. <span class="since">Since 0.8.3</span> - </dd> - </dl> - <p> - Here a domain XML snippet: - </p> -<pre> -... 
-<disk type='file' device='disk'> - <source file='[local-storage] Fedora11/Fedora11.vmdk'/> - <target dev='sda' bus='scsi'/> - <address type='drive' controller='0' bus='0' unit='0'/> -</disk> -<controller type='scsi' index='0' model='<strong>lsilogic</strong>'/> -... -</pre> - <p> - The controller element is supported <span class="since">since 0.8.2</span>. - Prior to this <code><driver name='lsilogic'/></code> was abused to - specify the SCSI controller model. This attribute usage is deprecated now. - </p> -<pre> -... -<disk type='file' device='disk'> - <driver name='<strong>lsilogic</strong>'/> - <source file='[local-storage] Fedora11/Fedora11.vmdk'/> - <target dev='sda' bus='scsi'/> -</disk> -... -</pre> - - - <h4>Network card models</h4> - <dl> - <dt><code>vlance</code></dt> - <dd> - AMD PCnet32 network card for older guests. - </dd> - <dt><code>vmxnet</code>, <code>vmxnet2</code>, <code>vmxnet3</code></dt> - <dd> - Special VMware VMXnet network card, requires VMware tools inside - the guest. See <a href="https://kb.vmware.com/kb/1001805">VMware KB1001805</a> - for details. - </dd> - <dt><code>e1000</code></dt> - <dd> - Intel E1000 network card for recent guests. - </dd> - </dl> - <p> - Here a domain XML snippet: - </p> -<pre> -... -<interface type='bridge'> - <mac address='00:50:56:25:48:c7'/> - <source bridge='VM Network'/> - <model type='<strong>e1000</strong>'/> -</interface> -... -</pre> - - - <h2><a id="importexport">Import and export of domain XML configs</a></h2> - <p> - The ESX driver currently supports a native config format known as - <code>vmware-vmx</code> to handle VMware VMX configs. - </p> - - - <h3><a id="xmlimport">Converting from VMware VMX config to domain XML config</a></h3> - <p> - The <code>virsh domxml-from-native</code> provides a way to convert an - existing VMware VMX config into a domain XML config that can then be - used by libvirt. 
- </p> -<pre> -$ cat > demo.vmx << EOF -#!/usr/bin/vmware -config.version = "8" -virtualHW.version = "4" -floppy0.present = "false" -nvram = "Fedora11.nvram" -deploymentPlatform = "windows" -virtualHW.productCompatibility = "hosted" -tools.upgrade.policy = "useGlobal" -powerType.powerOff = "default" -powerType.powerOn = "default" -powerType.suspend = "default" -powerType.reset = "default" -displayName = "Fedora11" -extendedConfigFile = "Fedora11.vmxf" -scsi0.present = "true" -scsi0.sharedBus = "none" -scsi0.virtualDev = "lsilogic" -memsize = "1024" -scsi0:0.present = "true" -scsi0:0.fileName = "/vmfs/volumes/498076b2-02796c1a-ef5b-000ae484a6a3/Fedora11/Fedora11.vmdk" -scsi0:0.deviceType = "scsi-hardDisk" -ide0:0.present = "true" -ide0:0.clientDevice = "true" -ide0:0.deviceType = "cdrom-raw" -ide0:0.startConnected = "false" -ethernet0.present = "true" -ethernet0.networkName = "VM Network" -ethernet0.addressType = "vpx" -ethernet0.generatedAddress = "00:50:56:91:48:c7" -chipset.onlineStandby = "false" -guestOSAltName = "Red Hat Enterprise Linux 5 (32-Bit)" -guestOS = "rhel5" -uuid.bios = "50 11 5e 16 9b dc 49 d7-f1 71 53 c4 d7 f9 17 10" -snapshot.action = "keep" -sched.cpu.min = "0" -sched.cpu.units = "mhz" -sched.cpu.shares = "normal" -sched.mem.minsize = "0" -sched.mem.shares = "normal" -toolScripts.afterPowerOn = "true" -toolScripts.afterResume = "true" -toolScripts.beforeSuspend = "true" -toolScripts.beforePowerOff = "true" -scsi0:0.redo = "" -tools.syncTime = "false" -uuid.location = "56 4d b5 06 a2 bd fb eb-ae 86 f7 d8 49 27 d0 c4" -sched.cpu.max = "unlimited" -sched.swap.derivedName = "/vmfs/volumes/498076b2-02796c1a-ef5b-000ae484a6a3/Fedora11/Fedora11-7de040d8.vswp" -tools.remindInstall = "TRUE" -EOF - -$ virsh -c esx://example.com domxml-from-native vmware-vmx demo.vmx -Enter username for example.com [root]: -Enter root password for example.com: -<domain type='vmware'> - <name>Fedora11</name> - <uuid>50115e16-9bdc-49d7-f171-53c4d7f91710</uuid> - 
<memory>1048576</memory> - <currentMemory>1048576</currentMemory> - <vcpu>1</vcpu> - <os> - <type arch='i686'>hvm</type> - </os> - <clock offset='utc'/> - <on_poweroff>destroy</on_poweroff> - <on_reboot>restart</on_reboot> - <on_crash>destroy</on_crash> - <devices> - <disk type='file' device='disk'> - <source file='[local-storage] Fedora11/Fedora11.vmdk'/> - <target dev='sda' bus='scsi'/> - <address type='drive' controller='0' bus='0' unit='0'/> - </disk> - <controller type='scsi' index='0' model='lsilogic'/> - <interface type='bridge'> - <mac address='00:50:56:91:48:c7'/> - <source bridge='VM Network'/> - </interface> - </devices> -</domain> -</pre> - - - <h3><a id="xmlexport">Converting from domain XML config to VMware VMX config</a></h3> - <p> - The <code>virsh domxml-to-native</code> provides a way to convert a - domain XML config into a VMware VMX config. - </p> -<pre> -$ cat > demo.xml << EOF -<domain type='vmware'> - <name>Fedora11</name> - <uuid>50115e16-9bdc-49d7-f171-53c4d7f91710</uuid> - <memory>1048576</memory> - <currentMemory>1048576</currentMemory> - <vcpu>1</vcpu> - <os> - <type arch='x86_64'>hvm</type> - </os> - <devices> - <disk type='file' device='disk'> - <source file='[local-storage] Fedora11/Fedora11.vmdk'/> - <target dev='sda' bus='scsi'/> - <address type='drive' controller='0' bus='0' unit='0'/> - </disk> - <controller type='scsi' index='0' model='lsilogic'/> - <interface type='bridge'> - <mac address='00:50:56:25:48:c7'/> - <source bridge='VM Network'/> - </interface> - </devices> -</domain> -EOF - -$ virsh -c esx://example.com domxml-to-native vmware-vmx demo.xml -Enter username for example.com [root]: -Enter root password for example.com: -config.version = "8" -virtualHW.version = "4" -guestOS = "other-64" -uuid.bios = "50 11 5e 16 9b dc 49 d7-f1 71 53 c4 d7 f9 17 10" -displayName = "Fedora11" -memsize = "1024" -numvcpus = "1" -scsi0.present = "true" -scsi0.virtualDev = "lsilogic" -scsi0:0.present = "true" -scsi0:0.deviceType = 
"scsi-hardDisk" -scsi0:0.fileName = "/vmfs/volumes/local-storage/Fedora11/Fedora11.vmdk" -ethernet0.present = "true" -ethernet0.networkName = "VM Network" -ethernet0.connectionType = "bridged" -ethernet0.addressType = "static" -ethernet0.address = "00:50:56:25:48:C7" -</pre> - - - <h2><a id="xmlconfig">Example domain XML configs</a></h2> - - <h3>Fedora11 on x86_64</h3> -<pre> -<domain type='vmware'> - <name>Fedora11</name> - <uuid>50115e16-9bdc-49d7-f171-53c4d7f91710</uuid> - <memory>1048576</memory> - <currentMemory>1048576</currentMemory> - <vcpu>1</vcpu> - <os> - <type arch='x86_64'>hvm</type> - </os> - <devices> - <disk type='file' device='disk'> - <source file='[local-storage] Fedora11/Fedora11.vmdk'/> - <target dev='sda' bus='scsi'/> - <address type='drive' controller='0' bus='0' unit='0'/> - </disk> - <controller type='scsi' index='0'/> - <interface type='bridge'> - <mac address='00:50:56:25:48:c7'/> - <source bridge='VM Network'/> - </interface> - </devices> -</domain> -</pre> - - - <h2><a id="migration">Migration</a></h2> - <p> - A migration cannot be initiated on an ESX server directly, a VMware - vCenter is necessary for this. The <code>vcenter</code> query - parameter must be set either to the hostname or IP address of the - vCenter managing the ESX server or to <code>*</code>. Setting it - to <code>*</code> causes the driver to connect to the vCenter known to - the ESX server. If the ESX server is not managed by a vCenter an error - is reported. - </p> -<pre> -esx://example.com/?vcenter=example-vcenter.com -</pre> - <p> - Here's an example how to migrate the domain <code>Fedora11</code> from - ESX server <code>example-src.com</code> to ESX server - <code>example-dst.com</code> implicitly involving vCenter - <code>example-vcenter.com</code> using <code>virsh</code>. 
- </p> -<pre> -$ virsh -c esx://example-src.com/?vcenter=* migrate Fedora11 esx://example-dst.com/?vcenter=* -Enter username for example-src.com [root]: -Enter root password for example-src.com: -Enter username for example-vcenter.com [administrator]: -Enter administrator password for example-vcenter.com: -Enter username for example-dst.com [root]: -Enter root password for example-dst.com: -Enter username for example-vcenter.com [administrator]: -Enter administrator password for example-vcenter.com: -</pre> - <p> - <span class="since">Since 0.8.3</span> you can directly connect to a vCenter. - This simplifies migration a bit. Here's the same migration as above but - using <code>vpx://</code> connections and assuming both ESX server are in - datacenter <code>dc1</code> and aren't part of a cluster. - </p> -<pre> -$ virsh -c vpx://example-vcenter.com/dc1/example-src.com migrate Fedora11 vpx://example-vcenter.com/dc1/example-dst.com -Enter username for example-vcenter.com [administrator]: -Enter administrator password for example-vcenter.com: -Enter username for example-vcenter.com [administrator]: -Enter administrator password for example-vcenter.com: -</pre> - - - <h2><a id="scheduler">Scheduler configuration</a></h2> - <p> - The driver exposes the ESX CPU scheduler. The parameters listed below - are available to control the scheduler. - </p> - <dl> - <dt><code>reservation</code></dt> - <dd> - The amount of CPU resource in MHz that is guaranteed to be - available to the domain. Valid values are 0 and greater. - </dd> - <dt><code>limit</code></dt> - <dd> - The CPU utilization of the domain will be - limited to this value in MHz, even if more CPU resources are - available. If the limit is set to -1, the CPU utilization of the - domain is unlimited. If the limit is not set to -1, it must be - greater than or equal to the reservation. - </dd> - <dt><code>shares</code></dt> - <dd> - Shares are used to determine relative CPU - allocation between domains. 
In general, a domain with more shares - gets proportionally more of the CPU resource. Valid values are 0 - and greater. The special values -1, -2 and -3 represent the - predefined shares level <code>low</code>, <code>normal</code> and - <code>high</code>. - </dd> - </dl> - - - <h2><a id="tools">VMware tools</a></h2> - <p> - Some actions require installed VMware tools. If the VMware tools are - not installed in the guest and one of the actions below is to be - performed the ESX server raises an error and the driver reports it. - </p> - <ul> - <li> - <code>virDomainGetHostname</code> - </li> - <li> - <code>virDomainInterfaceAddresses</code> (only for the - <code>VIR_DOMAIN_INTERFACE_ADDRESSES_SRC_AGENT</code> source) - </li> - <li> - <code>virDomainReboot</code> - </li> - <li> - <code>virDomainShutdown</code> - </li> - </ul> - - - <h2><a id="links">Links</a></h2> - <ul> - <li> - <a href="https://www.vmware.com/support/developer/vc-sdk/"> - VMware vSphere Web Services SDK Documentation - </a> - </li> - <li> - <a href="https://www.vmware.com/pdf/esx3_memory.pdf"> - The Role of Memory in VMware ESX Server 3 - </a> - </li> - <li> - <a href="https://www.sanbarrow.com/vmx.html"> - VMware VMX config parameters - </a> - </li> - <li> - <a href="https://www.vmware.com/pdf/vsp_4_pvscsi_perf.pdf"> - VMware ESX 4.0 PVSCSI Storage Performance - </a> - </li> - </ul> -</body></html> diff --git a/docs/drvesx.rst b/docs/drvesx.rst new file mode 100644 index 0000000000..7b35492d21 --- /dev/null +++ b/docs/drvesx.rst @@ -0,0 +1,681 @@ +.. role:: since + +============================ +VMware ESX hypervisor driver +============================ + +.. contents:: + +The libvirt VMware ESX driver can manage VMware ESX/ESXi 3.5/4.x/5.x and VMware +GSX 2.0, also called VMware Server 2.0, and possibly later versions. +:since:`Since 0.8.3` the driver can also connect to a VMware vCenter 2.5/4.x/5.x +(VPX). 
+ +Project Links +------------- + +- The `VMware ESX and GSX <https://www.vmware.com/>`__ hypervisors + +Deployment pre-requisites +------------------------- + +None. Any out-of-the-box installation of VPX/ESX(i)/GSX should work. No +preparations are required on the server side, no libvirtd must be installed on +the ESX server. The driver uses version 2.5 of the remote, SOAP based `VMware +Virtual Infrastructure +API <https://www.vmware.com/support/developer/vc-sdk/visdk25pubs/ReferenceGuide/>`__ +(VI API) to communicate with the ESX server, like the VMware Virtual +Infrastructure Client (VI client) does. Since version 4.0 this API is called +`VMware vSphere +API <https://www.vmware.com/support/developer/vc-sdk/visdk400pubs/ReferenceGuide/>`__. + +Connections to the VMware ESX driver +------------------------------------ + +Some example remote connection URIs for the driver are: + +:: + + vpx://example-vcenter.com/dc1/srv1 (VPX over HTTPS, select ESX server 'srv1' in datacenter 'dc1') + esx://example-esx.com (ESX over HTTPS) + gsx://example-gsx.com (GSX over HTTPS) + esx://example-esx.com/?transport=http (ESX over HTTP) + esx://example-esx.com/?no_verify=1 (ESX over HTTPS, but doesn't verify the server's SSL certificate) + +**Note**: In contrast to other drivers, the ESX driver is a client-side-only +driver. It connects to the ESX server using HTTP(S). Therefore, the `remote +transport mechanism <remote.html>`__ provided by the remote driver and libvirtd +will not work, and you cannot use URIs like ``esx+ssh://example.com``. + +URI Format +~~~~~~~~~~ + +URIs have this general form (``[...]`` marks an optional part). + +:: + + type://[username@]hostname[:port]/[[folder/...]datacenter/[folder/...][cluster/]server][?extraparameters] + +The ``type://`` is either ``esx://`` or ``gsx://`` or ``vpx://`` :since:`since +0.8.3` . The driver selects the default port depending on the ``type://``. 
For +``esx://`` and ``vpx://`` the default HTTPS port is 443, for ``gsx://`` it is +8333. If the port parameter is given, it overrides the default port. + +A ``vpx://`` connection is currently restricted to a single ESX server. This +might be relaxed in the future. The path part of the URI is used to specify the +datacenter and the ESX server in it. If the ESX server is part of a cluster then +the cluster has to be specified too. + +An example: ESX server ``example-esx.com`` is managed by vCenter +``example-vcenter.com`` and part of cluster ``cluster1``. This cluster is part +of datacenter ``dc1``. + +:: + + vpx://example-vcenter.com/dc1/cluster1/example-esx.com + +Datacenters and clusters can be organized in folders, those have to be specified +as well. The driver can handle folders :since:`since 0.9.7` . + +:: + + vpx://example-vcenter.com/folder1/dc1/folder2/example-esx.com + +Extra parameters +^^^^^^^^^^^^^^^^ + +Extra parameters can be added to a URI as part of the query string (the part +following ``?``). A single parameter is formed by a ``name=value`` pair. +Multiple parameters are separated by ``&``. + +:: + + ?no_verify=1&auto_answer=1&proxy=socks://example-proxy.com:23456 + +The driver understands the extra parameters shown below. + ++-----------------+-----------------------------+-----------------------------+ +| Name | Values | Meaning | ++=================+=============================+=============================+ +| ``transport`` | ``http`` or ``https`` | Overrides the default HTTPS | +| | | transport. For ``esx://`` | +| | | and ``vpx://`` the default | +| | | HTTP port is 80, for | +| | | ``gsx://`` it is 8222. | ++-----------------+-----------------------------+-----------------------------+ +| ``vcenter`` | Hostname of a VMware | In order to perform a | +| | vCenter or ``*`` | migration the driver needs | +| | | to know the VMware vCenter | +| | | for the ESX server. 
If set | +| | | to ``*``, the driver | +| | | connects to the vCenter | +| | | known to the ESX server. | +| | | This parameter in useful | +| | | when connecting to an ESX | +| | | server only. | ++-----------------+-----------------------------+-----------------------------+ +| ``no_verify`` | ``0`` or ``1`` | If set to 1, this disables | +| | | libcurl client checks of | +| | | the server's SSL | +| | | certificate. The default | +| | | value is 0. See the | +| | | `Certificates for | +| | | HTTPS <#certificates>`__ | +| | | section for details. | ++-----------------+-----------------------------+-----------------------------+ +| ``auto_answer`` | ``0`` or ``1`` | If set to 1, the driver | +| | | answers all | +| | | `questions <#questions>`__ | +| | | with the default answer. If | +| | | set to 0, questions are | +| | | reported as errors. The | +| | | default value is 0. | +| | | :since:`Since 0.7.5` . | ++-----------------+-----------------------------+-----------------------------+ +| ``proxy`` | [type://]hostname[:port] | Allows to specify a proxy | +| | | for HTTP and HTTPS | +| | | communication. | +| | | :since:`Since 0.8.2` . The | +| | | optional ``type`` part may | +| | | be one of: ``http``, | +| | | ``socks``, ``socks4``, | +| | | ``socks4a`` or ``socks5``. | +| | | The default is ``http`` and | +| | | ``socks`` is synonymous for | +| | | ``socks5``. The optional | +| | | ``port`` allows to override | +| | | the default port 1080. | ++-----------------+-----------------------------+-----------------------------+ + +Authentication +~~~~~~~~~~~~~~ + +In order to perform any useful operation the driver needs to log into the ESX +server. Therefore, only ``virConnectOpenAuth`` can be used to connect to an ESX +server, ``virConnectOpen`` and ``virConnectOpenReadOnly`` don't work. To log +into an ESX server or vCenter the driver will request credentials using the +callback passed to the ``virConnectOpenAuth`` function. 
The driver passes the +hostname as challenge parameter to the callback. This enables the callback to +distinguish between requests for ESX server and vCenter. + +**Note**: During the ongoing driver development, testing is done using an +unrestricted ``root`` account. Problems may occur if you use a restricted +account. Detailed testing with restricted accounts has not been done yet. + +Certificates for HTTPS +~~~~~~~~~~~~~~~~~~~~~~ + +By default the ESX driver uses HTTPS to communicate with an ESX server. Proper +HTTPS communication requires correctly configured SSL certificates. This +certificates are different from the ones libvirt uses for `secure communication +over TLS <remote.html>`__ to a libvirtd one a remote server. + +By default the driver tries to verify the server's SSL certificate using the CA +certificate pool installed on your client computer. With an out-of-the-box +installed ESX server this won't work, because a newly installed ESX server uses +auto-generated self-signed certificates. Those are signed by a CA certificate +that is typically not known to your client computer and libvirt will report an +error like this one: + +:: + + error: internal error curl_easy_perform() returned an error: Peer certificate cannot be authenticated with known CA certificates (60) + +Where are two ways to solve this problem: + +- Use the ``no_verify=1`` `extra parameter <#extraparams>`__ to disable server + certificate verification. +- Generate new SSL certificates signed by a CA known to your client computer + and replace the original ones on your ESX server. See the section *Replace a + Default Certificate with a CA-Signed Certificate* in the `ESX Configuration + Guide <https://www.vmware.com/pdf/vsphere4/r40/vsp_40_esx_server_config.pdf>`__ + +Connection problems +~~~~~~~~~~~~~~~~~~~ + +There are also other causes for connection problems than the `HTTPS +certificate <#certificates>`__ related ones. 
+ +- As stated before the ESX driver doesn't need the `remote transport + mechanism <remote.html>`__ provided by the remote driver and libvirtd, nor + does the ESX driver support it. Therefore, using an URI including a transport + in the scheme won't work. Only `URIs as described <#uriformat>`__ are + supported by the ESX driver. Here's a collection of possible error messages: + + :: + + $ virsh -c esx+tcp://example.com/ + error: unable to connect to libvirtd at 'example.com': Connection refused + + :: + + $ virsh -c esx+tls://example.com/ + error: Cannot access CA certificate '/etc/pki/CA/cacert.pem': No such file or directory + + :: + + $ virsh -c esx+ssh://example.com/ + error: cannot recv data: ssh: connect to host example.com port 22: Connection refused + + :: + + $ virsh -c esx+ssh://example.com/ + error: cannot recv data: Resource temporarily unavailable + +- :since:`Since 0.7.0` libvirt contains the ESX driver. Earlier versions of + libvirt will report a misleading error about missing certificates when you + try to connect to an ESX server. + + :: + + $ virsh -c esx://example.com/ + error: Cannot access CA certificate '/etc/pki/CA/cacert.pem': No such file or directory + + Don't let this error message confuse you. Setting up certificates as + described on the `remote transport + mechanism <remote.html#Remote_certificates>`__ page does not help, as this is + not a certificate related problem. + + To fix this problem you need to update your libvirt to 0.7.0 or newer. You + may also see this error when you use a libvirt version that contains the ESX + driver but you or your distro disabled the ESX driver during compilation. + :since:`Since 0.8.3` the error message has been improved in this case: + + :: + + $ virsh -c esx://example.com/ + error: invalid argument in libvirt was built without the 'esx' driver + +Questions blocking tasks +------------------------ + +Some methods of the VI API start tasks, for example ``PowerOnVM_Task()``. 
Such +tasks may be blocked by questions if the ESX server detects an issue with the +domain that requires user interaction. The ESX driver cannot prompt the user to +answer a question, libvirt doesn't have an API for something like this. + +The VI API provides the ``AnswerVM()`` method to programmatically answer a +questions. So the driver has two options how to handle such a situation: either +answer the questions with the default answer or report the question as an error +and cancel the blocked task if possible. The `auto_answer <#uriformat>`__ query +parameter controls the answering behavior. + +Specialities in the domain XML config +------------------------------------- + +There are several specialities in the domain XML config for ESX domains. + +Restrictions +~~~~~~~~~~~~ + +There are some restrictions for some values of the domain XML config. The driver +will complain if this restrictions are violated. + +- Memory size has to be a multiple of 4096 +- Number of virtual CPU has to be 1 or a multiple of 2. :since:`Since 4.10.0` + any number of vCPUs is supported. +- Valid MAC address prefixes are ``00:0c:29`` and ``00:50:56``. :since:`Since + 0.7.6` arbitrary `MAC addresses <#macaddresses>`__ are supported. + +Datastore references +~~~~~~~~~~~~~~~~~~~~ + +Storage is managed in datastores. VMware uses a special path format to reference +files in a datastore. Basically, the datastore name is put into squared braces +in front of the path. + +:: + + [datastore] directory/filename + +To define a new domain the driver converts the domain XML into a VMware VMX file +and uploads it to a datastore known to the ESX server. Because multiple +datastores may be known to an ESX server the driver needs to decide to which +datastore the VMX file should be uploaded. The driver deduces this information +from the path of the source of the first file-based harddisk listed in the +domain XML. 
+ +MAC addresses +~~~~~~~~~~~~~ + +VMware has registered two MAC address prefixes for domains: ``00:0c:29`` and +``00:50:56``. These prefixes are split into ranges for different purposes. + ++--------------------------------------+--------------------------------------+ +| Range | Purpose | ++======================================+======================================+ +| ``00:0c:29:00:00:00`` - | An ESX server autogenerates MAC | +| ``00:0c:29:ff:ff:ff`` | addresses from this range if the VMX | +| | file doesn't contain a MAC address | +| | when trying to start a domain. | ++--------------------------------------+--------------------------------------+ +| ``00:50:56:00:00:00`` - | MAC addresses from this range can by | +| ``00:50:56:3f:ff:ff`` | manually assigned by the user in the | +| | VI client. | ++--------------------------------------+--------------------------------------+ +| ``00:50:56:80:00:00`` - | A VI client autogenerates MAC | +| ``00:50:56:bf:ff:ff`` | addresses from this range for newly | +| | defined domains. | ++--------------------------------------+--------------------------------------+ + +The VMX files generated by the ESX driver always contain a MAC address, because +libvirt generates a random one if an interface element in the domain XML file +lacks a MAC address. :since:`Since 0.7.6` the ESX driver sets the prefix for +generated MAC addresses to ``00:0c:29``. Before 0.7.6 the ``00:50:56`` prefix +was used. Sometimes this resulted in the generation of out-of-range MAC address +that were rejected by the ESX server. + +Also :since:`since 0.7.6` every MAC address outside this ranges can be used. For +such MAC addresses the ESX server-side check is disabled in the VMX file to stop +the ESX server from rejecting out-of-predefined-range MAC addresses. 
+ +:: + + ethernet0.checkMACAddress = "false" + +:since:`Since 6.6.0` , one can force libvirt to keep the provided MAC address +when it's in the reserved VMware range by adding a ``type="static"`` attribute +to the ``<mac/>`` element. Note that this attribute is useless if the provided +MAC address is outside of the reserved VMWare ranges. + +Available hardware +~~~~~~~~~~~~~~~~~~ + +VMware ESX supports different models of SCSI controllers and network cards. + +SCSI controller models +^^^^^^^^^^^^^^^^^^^^^^ + +``auto`` + This isn't an actual controller model. If specified the ESX driver tries to + detect the SCSI controller model referenced in the ``.vmdk`` file and use it. + Autodetection fails when a SCSI controller has multiple disks attached and + the SCSI controller models referenced in the ``.vmdk`` files are + inconsistent. :since:`Since 0.8.3` +``buslogic`` + BusLogic SCSI controller for older guests. +``lsilogic`` + LSI Logic SCSI controller for recent guests. +``lsisas1068`` + LSI Logic SAS 1068 controller. :since:`Since 0.8.0` +``vmpvscsi`` + Special VMware Paravirtual SCSI controller, requires VMware tools inside the + guest. See `VMware KB1010398 <https://kb.vmware.com/kb/1010398>`__ for + details. :since:`Since 0.8.3` + +Here a domain XML snippet: + +:: + + ... + <disk type='file' device='disk'> + <source file='[local-storage] Fedora11/Fedora11.vmdk'/> + <target dev='sda' bus='scsi'/> + <address type='drive' controller='0' bus='0' unit='0'/> + </disk> + <controller type='scsi' index='0' model='lsilogic'/> + ... + +The controller element is supported :since:`since 0.8.2` . Prior to this +``<driver name='lsilogic'/>`` was abused to specify the SCSI controller model. +This attribute usage is deprecated now. + +:: + + ... + <disk type='file' device='disk'> + <driver name='lsilogic'/> + <source file='[local-storage] Fedora11/Fedora11.vmdk'/> + <target dev='sda' bus='scsi'/> + </disk> + ... 
+ +Network card models +^^^^^^^^^^^^^^^^^^^ + +``vlance`` + AMD PCnet32 network card for older guests. +``vmxnet``, ``vmxnet2``, ``vmxnet3`` + Special VMware VMXnet network card, requires VMware tools inside the guest. + See `VMware KB1001805 <https://kb.vmware.com/kb/1001805>`__ for details. +``e1000`` + Intel E1000 network card for recent guests. + +Here a domain XML snippet: + +:: + + ... + <interface type='bridge'> + <mac address='00:50:56:25:48:c7'/> + <source bridge='VM Network'/> + <model type='e1000'/> + </interface> + ... + +Import and export of domain XML configs +--------------------------------------- + +The ESX driver currently supports a native config format known as ``vmware-vmx`` +to handle VMware VMX configs. + +Converting from VMware VMX config to domain XML config +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``virsh domxml-from-native`` provides a way to convert an existing VMware +VMX config into a domain XML config that can then be used by libvirt. 
+ +:: + + $ cat > demo.vmx << EOF + #!/usr/bin/vmware + config.version = "8" + virtualHW.version = "4" + floppy0.present = "false" + nvram = "Fedora11.nvram" + deploymentPlatform = "windows" + virtualHW.productCompatibility = "hosted" + tools.upgrade.policy = "useGlobal" + powerType.powerOff = "default" + powerType.powerOn = "default" + powerType.suspend = "default" + powerType.reset = "default" + displayName = "Fedora11" + extendedConfigFile = "Fedora11.vmxf" + scsi0.present = "true" + scsi0.sharedBus = "none" + scsi0.virtualDev = "lsilogic" + memsize = "1024" + scsi0:0.present = "true" + scsi0:0.fileName = "/vmfs/volumes/498076b2-02796c1a-ef5b-000ae484a6a3/Fedora11/Fedora11.vmdk" + scsi0:0.deviceType = "scsi-hardDisk" + ide0:0.present = "true" + ide0:0.clientDevice = "true" + ide0:0.deviceType = "cdrom-raw" + ide0:0.startConnected = "false" + ethernet0.present = "true" + ethernet0.networkName = "VM Network" + ethernet0.addressType = "vpx" + ethernet0.generatedAddress = "00:50:56:91:48:c7" + chipset.onlineStandby = "false" + guestOSAltName = "Red Hat Enterprise Linux 5 (32-Bit)" + guestOS = "rhel5" + uuid.bios = "50 11 5e 16 9b dc 49 d7-f1 71 53 c4 d7 f9 17 10" + snapshot.action = "keep" + sched.cpu.min = "0" + sched.cpu.units = "mhz" + sched.cpu.shares = "normal" + sched.mem.minsize = "0" + sched.mem.shares = "normal" + toolScripts.afterPowerOn = "true" + toolScripts.afterResume = "true" + toolScripts.beforeSuspend = "true" + toolScripts.beforePowerOff = "true" + scsi0:0.redo = "" + tools.syncTime = "false" + uuid.location = "56 4d b5 06 a2 bd fb eb-ae 86 f7 d8 49 27 d0 c4" + sched.cpu.max = "unlimited" + sched.swap.derivedName = "/vmfs/volumes/498076b2-02796c1a-ef5b-000ae484a6a3/Fedora11/Fedora11-7de040d8.vswp" + tools.remindInstall = "TRUE" + EOF + + $ virsh -c esx://example.com domxml-from-native vmware-vmx demo.vmx + Enter username for example.com [root]: + Enter root password for example.com: + <domain type='vmware'> + <name>Fedora11</name> + 
<uuid>50115e16-9bdc-49d7-f171-53c4d7f91710</uuid> + <memory>1048576</memory> + <currentMemory>1048576</currentMemory> + <vcpu>1</vcpu> + <os> + <type arch='i686'>hvm</type> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <disk type='file' device='disk'> + <source file='[local-storage] Fedora11/Fedora11.vmdk'/> + <target dev='sda' bus='scsi'/> + <address type='drive' controller='0' bus='0' unit='0'/> + </disk> + <controller type='scsi' index='0' model='lsilogic'/> + <interface type='bridge'> + <mac address='00:50:56:91:48:c7'/> + <source bridge='VM Network'/> + </interface> + </devices> + </domain> + +Converting from domain XML config to VMware VMX config +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``virsh domxml-to-native`` provides a way to convert a domain XML config +into a VMware VMX config. + +:: + + $ cat > demo.xml << EOF + <domain type='vmware'> + <name>Fedora11</name> + <uuid>50115e16-9bdc-49d7-f171-53c4d7f91710</uuid> + <memory>1048576</memory> + <currentMemory>1048576</currentMemory> + <vcpu>1</vcpu> + <os> + <type arch='x86_64'>hvm</type> + </os> + <devices> + <disk type='file' device='disk'> + <source file='[local-storage] Fedora11/Fedora11.vmdk'/> + <target dev='sda' bus='scsi'/> + <address type='drive' controller='0' bus='0' unit='0'/> + </disk> + <controller type='scsi' index='0' model='lsilogic'/> + <interface type='bridge'> + <mac address='00:50:56:25:48:c7'/> + <source bridge='VM Network'/> + </interface> + </devices> + </domain> + EOF + + $ virsh -c esx://example.com domxml-to-native vmware-vmx demo.xml + Enter username for example.com [root]: + Enter root password for example.com: + config.version = "8" + virtualHW.version = "4" + guestOS = "other-64" + uuid.bios = "50 11 5e 16 9b dc 49 d7-f1 71 53 c4 d7 f9 17 10" + displayName = "Fedora11" + memsize = "1024" + numvcpus = "1" + scsi0.present = "true" + scsi0.virtualDev = 
"lsilogic" + scsi0:0.present = "true" + scsi0:0.deviceType = "scsi-hardDisk" + scsi0:0.fileName = "/vmfs/volumes/local-storage/Fedora11/Fedora11.vmdk" + ethernet0.present = "true" + ethernet0.networkName = "VM Network" + ethernet0.connectionType = "bridged" + ethernet0.addressType = "static" + ethernet0.address = "00:50:56:25:48:C7" + +Example domain XML configs +-------------------------- + +Fedora11 on x86_64 +~~~~~~~~~~~~~~~~~~ + +:: + + <domain type='vmware'> + <name>Fedora11</name> + <uuid>50115e16-9bdc-49d7-f171-53c4d7f91710</uuid> + <memory>1048576</memory> + <currentMemory>1048576</currentMemory> + <vcpu>1</vcpu> + <os> + <type arch='x86_64'>hvm</type> + </os> + <devices> + <disk type='file' device='disk'> + <source file='[local-storage] Fedora11/Fedora11.vmdk'/> + <target dev='sda' bus='scsi'/> + <address type='drive' controller='0' bus='0' unit='0'/> + </disk> + <controller type='scsi' index='0'/> + <interface type='bridge'> + <mac address='00:50:56:25:48:c7'/> + <source bridge='VM Network'/> + </interface> + </devices> + </domain> + +Migration +--------- + +A migration cannot be initiated on an ESX server directly, a VMware vCenter is +necessary for this. The ``vcenter`` query parameter must be set either to the +hostname or IP address of the vCenter managing the ESX server or to ``*``. +Setting it to ``*`` causes the driver to connect to the vCenter known to the ESX +server. If the ESX server is not managed by a vCenter an error is reported. + +:: + + esx://example.com/?vcenter=example-vcenter.com + +Here's an example how to migrate the domain ``Fedora11`` from ESX server +``example-src.com`` to ESX server ``example-dst.com`` implicitly involving +vCenter ``example-vcenter.com`` using ``virsh``. 
+ +:: + + $ virsh -c esx://example-src.com/?vcenter=* migrate Fedora11 esx://example-dst.com/?vcenter=* + Enter username for example-src.com [root]: + Enter root password for example-src.com: + Enter username for example-vcenter.com [administrator]: + Enter administrator password for example-vcenter.com: + Enter username for example-dst.com [root]: + Enter root password for example-dst.com: + Enter username for example-vcenter.com [administrator]: + Enter administrator password for example-vcenter.com: + +:since:`Since 0.8.3` you can directly connect to a vCenter. This simplifies +migration a bit. Here's the same migration as above but using ``vpx://`` +connections and assuming both ESX server are in datacenter ``dc1`` and aren't +part of a cluster. + +:: + + $ virsh -c vpx://example-vcenter.com/dc1/example-src.com migrate Fedora11 vpx://example-vcenter.com/dc1/example-dst.com + Enter username for example-vcenter.com [administrator]: + Enter administrator password for example-vcenter.com: + Enter username for example-vcenter.com [administrator]: + Enter administrator password for example-vcenter.com: + +Scheduler configuration +----------------------- + +The driver exposes the ESX CPU scheduler. The parameters listed below are +available to control the scheduler. + +``reservation`` + The amount of CPU resource in MHz that is guaranteed to be available to the + domain. Valid values are 0 and greater. +``limit`` + The CPU utilization of the domain will be limited to this value in MHz, even + if more CPU resources are available. If the limit is set to -1, the CPU + utilization of the domain is unlimited. If the limit is not set to -1, it + must be greater than or equal to the reservation. +``shares`` + Shares are used to determine relative CPU allocation between domains. In + general, a domain with more shares gets proportionally more of the CPU + resource. Valid values are 0 and greater. 
The special values -1, -2 and -3 + represent the predefined shares level ``low``, ``normal`` and ``high``. + +VMware tools +------------ + +Some actions require installed VMware tools. If the VMware tools are not +installed in the guest and one of the actions below is to be performed the ESX +server raises an error and the driver reports it. + +- ``virDomainGetHostname`` +- ``virDomainInterfaceAddresses`` (only for the + ``VIR_DOMAIN_INTERFACE_ADDRESSES_SRC_AGENT`` source) +- ``virDomainReboot`` +- ``virDomainShutdown`` + +Links +----- + +- `VMware vSphere Web Services SDK + Documentation <https://www.vmware.com/support/developer/vc-sdk/>`__ +- `The Role of Memory in VMware ESX Server + 3 <https://www.vmware.com/pdf/esx3_memory.pdf>`__ +- `VMware VMX config parameters <https://www.sanbarrow.com/vmx.html>`__ +- `VMware ESX 4.0 PVSCSI Storage + Performance <https://www.vmware.com/pdf/vsp_4_pvscsi_perf.pdf>`__ diff --git a/docs/meson.build b/docs/meson.build index a6c3077f25..0465c22274 100644 --- a/docs/meson.build +++ b/docs/meson.build @@ -22,7 +22,6 @@ docs_html_in_files = [ 'csharp', 'dbus', 'docs', - 'drvesx', 'drvhyperv', 'drvlxc', 'drvnodedev', @@ -80,6 +79,7 @@ docs_rst_files = [ 'drivers', 'drvbhyve', 'drvch', + 'drvesx', 'drvqemu', 'errors', 'formatbackup', -- 2.35.1
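As an illustration of the URI format documented in the patch above, the following Python sketch assembles ESX driver URIs from their parts. The helper name `build_esx_uri` and its parameters are hypothetical and not part of libvirt. Only the URI layout and the query parameter names (`transport`, `vcenter`, `no_verify`, `auto_answer`, `proxy`) are taken from the documentation.

```python
from urllib.parse import quote, urlencode

# Schemes the documentation above lists for the driver.
KNOWN_SCHEMES = ("esx", "gsx", "vpx")


def build_esx_uri(scheme, hostname, path=(), username=None, port=None, **extra):
    """Hypothetical helper that assembles
    type://[username@]hostname[:port]/[path][?extraparameters]."""
    if scheme not in KNOWN_SCHEMES:
        raise ValueError(f"unsupported scheme: {scheme}")
    authority = hostname
    if username:
        authority = f"{quote(username, safe='')}@{authority}"
    if port is not None:
        authority = f"{authority}:{port}"
    # Path components are folders, datacenter, cluster and server, in that order.
    uri = f"{scheme}://{authority}/" + "/".join(quote(p, safe="") for p in path)
    if extra:
        # Keep ':', '/' and '*' readable, as in the documented examples.
        uri += "?" + urlencode(extra, safe=":/*")
    return uri


if __name__ == "__main__":
    print(build_esx_uri("vpx", "example-vcenter.com",
                        path=("dc1", "cluster1", "example-esx.com")))
    print(build_esx_uri("esx", "example-esx.com", no_verify=1))
    print(build_esx_uri("esx", "example.com", vcenter="*"))
```

The sketch only builds strings. It does not check that the server, datacenter or cluster exists, and it does not open a connection.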

...
++-----------------+-----------------------------+-----------------------------+
+| ``auto_answer`` | ``0`` or ``1``              | If set to 1, the driver     |
+|                 |                             | answers all                 |
+|                 |                             | `questions <#questions>`__  |
+|                 |                             | with the default answer. If |
+|                 |                             | set to 0, questions are     |
+|                 |                             | reported as errors. The     |
+|                 |                             | default value is 0.         |
+|                 |                             | :since:`Since 0.7.5` .      |
++-----------------+-----------------------------+-----------------------------+
+| ``proxy``       | [type://]hostname[:port]    | Allows to specify a proxy   |
^This one failed to convert to verbatim and needs to be tweaked by hand. I tried it and it looks like nothing broke.

Reviewed-by: Erik Skultety <eskultet@redhat.com>

On Fri, Apr 01, 2022 at 13:46:42 +0200, Erik Skultety wrote:
...
++-----------------+-----------------------------+-----------------------------+
+| ``auto_answer`` | ``0`` or ``1``              | If set to 1, the driver     |
+|                 |                             | answers all                 |
+|                 |                             | `questions <#questions>`__  |
+|                 |                             | with the default answer. If |
+|                 |                             | set to 0, questions are     |
+|                 |                             | reported as errors. The     |
+|                 |                             | default value is 0.         |
+|                 |                             | :since:`Since 0.7.5` .      |
++-----------------+-----------------------------+-----------------------------+
+| ``proxy``       | [type://]hostname[:port]    | Allows to specify a proxy   |
^This one failed to convert to verbatim and needs to be tweaked by hand. I tried it and it looks like nothing broke.
I think the issue was that it's too long once you add backticks to turn it into verbatim. Anyway, I can change 'hostname' to 'host' to make it fit, if you agree.

On Fri, Apr 01, 2022 at 04:26:49PM +0200, Peter Krempa wrote:
On Fri, Apr 01, 2022 at 13:46:42 +0200, Erik Skultety wrote:
...
++-----------------+-----------------------------+-----------------------------+
+| ``auto_answer`` | ``0`` or ``1``              | If set to 1, the driver     |
+|                 |                             | answers all                 |
+|                 |                             | `questions <#questions>`__  |
+|                 |                             | with the default answer. If |
+|                 |                             | set to 0, questions are     |
+|                 |                             | reported as errors. The     |
+|                 |                             | default value is 0.         |
+|                 |                             | :since:`Since 0.7.5` .      |
++-----------------+-----------------------------+-----------------------------+
+| ``proxy``       | [type://]hostname[:port]    | Allows to specify a proxy   |
^This one failed to convert to verbatim and needs to be tweaked by hand. I tried it and it looks like nothing broke.
I think the issue was that it's too long once you add backticks to turn it into verbatim. Anyway, I can change 'hostname' to 'host' to make it fit, if you agree.
Yes. It works even without the spaces, but your suggestion works too, so go ahead.

Erik
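The width question discussed in this exchange can be checked with a little arithmetic. The sketch below assumes that the "Values" column border in the quoted table is 29 dashes wide and that a cell normally carries one space of padding on each side. The helper `fits` is hypothetical and is not part of docutils or libvirt.

```python
COLUMN_WIDTH = 29  # assumed number of dashes in the "Values" column border


def fits(cell_text, padding=1, width=COLUMN_WIDTH):
    """Return True if the text plus padding on both sides fits the column."""
    return len(cell_text) + 2 * padding <= width


plain = "[type://]hostname[:port]"
verbatim = f"``{plain}``"               # the same text marked up as verbatim
shortened = "``[type://]host[:port]``"  # 'hostname' changed to 'host'

for text in (plain, verbatim, shortened):
    print(f"{text}: {len(text)} chars, "
          f"fits with padding: {fits(text)}, "
          f"fits without padding: {fits(text, padding=0)}")
```

Under these assumptions the verbatim form is 28 characters, so it fits only when the padding spaces are dropped, while the shortened form leaves room for both.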

Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/drvhyperv.html.in | 150 ----------------------------------------- docs/drvhyperv.rst | 121 +++++++++++++++++++++++++++++++++ docs/meson.build | 2 +- 3 files changed, 122 insertions(+), 151 deletions(-) delete mode 100644 docs/drvhyperv.html.in create mode 100644 docs/drvhyperv.rst diff --git a/docs/drvhyperv.html.in b/docs/drvhyperv.html.in deleted file mode 100644 index bce4e4128b..0000000000 --- a/docs/drvhyperv.html.in +++ /dev/null @@ -1,150 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html> -<html xmlns="http://www.w3.org/1999/xhtml"> - <body> - <h1>Microsoft Hyper-V hypervisor driver</h1> - <ul id="toc"></ul> - <p> - The libvirt Microsoft Hyper-V driver can manage Hyper-V 2012 R2 and newer. - </p> - - - <h2><a id="project">Project Links</a></h2> - <ul> - <li> - The <a href="http://www.microsoft.com/hyper-v-server/">Microsoft Hyper-V</a> - hypervisor - </li> - </ul> - - - <h2><a id="uri">Connections to the Microsoft Hyper-V driver</a></h2> - <p> - Some example remote connection URIs for the driver are: - </p> -<pre> -hyperv://example-hyperv.com (over HTTPS) -hyperv://example-hyperv.com/?transport=http (over HTTP) -</pre> - <p> - <strong>Note</strong>: In contrast to other drivers, the Hyper-V driver - is a client-side-only driver. It connects to the Hyper-V server using - WS-Management over HTTP(S). Therefore, the - <a href="remote.html">remote transport mechanism</a> provided by the - remote driver and libvirtd will not work, and you cannot use URIs like - <code>hyperv+ssh://example.com</code>. - </p> - - - <h3><a id="uriformat">URI Format</a></h3> - <p> - URIs have this general form (<code>[...]</code> marks an optional part). - </p> -<pre> -hyperv://[username@]hostname[:port]/[?extraparameters] -</pre> - <p> - The default HTTPS ports is 5986. If the port parameter is given, it - overrides the default port. 
- </p> - - - <h4><a id="extraparams">Extra parameters</a></h4> - <p> - Extra parameters can be added to a URI as part of the query string - (the part following <code>?</code>). A single parameter is formed by a - <code>name=value</code> pair. Multiple parameters are separated by - <code>&</code>. - </p> -<pre> -?transport=http -</pre> - <p> - The driver understands the extra parameters shown below. - </p> - <table class="top_table"> - <tr> - <th>Name</th> - <th>Values</th> - <th>Meaning</th> - </tr> - <tr> - <td> - <code>transport</code> - </td> - <td> - <code>http</code> or <code>https</code> - </td> - <td> - Overrides the default HTTPS transport. The default HTTP port - is 5985. - </td> - </tr> - </table> - - - <h3><a id="auth">Authentication</a></h3> - <p> - In order to perform any useful operation the driver needs to log into - the Hyper-V server. Therefore, only <code>virConnectOpenAuth</code> can - be used to connect to an Hyper-V server, <code>virConnectOpen</code> and - <code>virConnectOpenReadOnly</code> don't work. - To log into an Hyper-V server the driver will request credentials using - the callback passed to the <code>virConnectOpenAuth</code> function. - The driver passes the hostname as challenge parameter to the callback. - </p> - <p> - <strong>Note</strong>: Currently only <code>Basic</code> authentication - is supported by libvirt. This method is disabled by default on the - Hyper-V server and can be enabled via the WinRM commandline tool. - </p> -<pre> -winrm set winrm/config/service/auth @{Basic="true"} -</pre> - <p> - To allow <code>Basic</code> authentication with HTTP transport WinRM - needs to allow unencrypted communication. This can be enabled via the - WinRM commandline tool. However, this is not the recommended - communication mode. 
- </p> -<pre> -winrm set winrm/config/service @{AllowUnencrypted="true"} -</pre> - - - <h2><a id="versions">Version Numbers</a></h2> - <p> - Since Microsoft's build numbers are almost always over 1000, this driver - needs to pack the value differently compared to the format defined by - <code>virConnectGetVersion</code>. - To preserve all of the digits, the following format is used: - </p> - <pre>major * 100000000 + minor * 1000000 + micro</pre> - <p> - This results in <code>virsh version</code> producing unexpected output. - </p> - <table class="top_table"> - <thead> - <th>Windows Release</th> - <th>Kernel Version</th> - <th>libvirt Representation</th> - </thead> - <tr> - <td>Windows Server 2012 R2</td> - <td>6.3.9600</td> - <td>603.9.600</td> - </tr> - <tr> - <td>Windows Server 2016</td> - <td>10.0.14393</td> - <td>1000.14.393</td> - </tr> - <tr> - <td>Windows Server 2019</td> - <td>10.0.17763</td> - <td>1000.17.763</td> - </tr> - </table> - - -</body></html> diff --git a/docs/drvhyperv.rst b/docs/drvhyperv.rst new file mode 100644 index 0000000000..17d620f29c --- /dev/null +++ b/docs/drvhyperv.rst @@ -0,0 +1,121 @@ +=================================== +Microsoft Hyper-V hypervisor driver +=================================== + +.. contents:: + +The libvirt Microsoft Hyper-V driver can manage Hyper-V 2012 R2 and newer. + +Project Links +------------- + +- The `Microsoft Hyper-V <http://www.microsoft.com/hyper-v-server/>`__ + hypervisor + +Connections to the Microsoft Hyper-V driver +------------------------------------------- + +Some example remote connection URIs for the driver are: + +:: + + hyperv://example-hyperv.com (over HTTPS) + hyperv://example-hyperv.com/?transport=http (over HTTP) + +**Note**: In contrast to other drivers, the Hyper-V driver is a client-side-only +driver. It connects to the Hyper-V server using WS-Management over HTTP(S). 
+Therefore, the `remote transport mechanism <remote.html>`__ provided by the +remote driver and libvirtd will not work, and you cannot use URIs like +``hyperv+ssh://example.com``. + +URI Format +~~~~~~~~~~ + +URIs have this general form (``[...]`` marks an optional part). + +:: + + hyperv://[username@]hostname[:port]/[?extraparameters] + +The default HTTPS ports is 5986. If the port parameter is given, it overrides +the default port. + +Extra parameters +^^^^^^^^^^^^^^^^ + +Extra parameters can be added to a URI as part of the query string (the part +following ``?``). A single parameter is formed by a ``name=value`` pair. +Multiple parameters are separated by ``&``. + +:: + + ?transport=http + +The driver understands the extra parameters shown below. + ++---------------+-----------------------+-------------------------------------+ +| Name | Values | Meaning | ++===============+=======================+=====================================+ +| ``transport`` | ``http`` or ``https`` | Overrides the default HTTPS | +| | | transport. The default HTTP port is | +| | | 5985. | ++---------------+-----------------------+-------------------------------------+ + +Authentication +~~~~~~~~~~~~~~ + +In order to perform any useful operation the driver needs to log into the +Hyper-V server. Therefore, only ``virConnectOpenAuth`` can be used to connect to +an Hyper-V server, ``virConnectOpen`` and ``virConnectOpenReadOnly`` don't work. +To log into an Hyper-V server the driver will request credentials using the +callback passed to the ``virConnectOpenAuth`` function. The driver passes the +hostname as challenge parameter to the callback. + +**Note**: Currently only ``Basic`` authentication is supported by libvirt. This +method is disabled by default on the Hyper-V server and can be enabled via the +WinRM commandline tool. 
+ +:: + + winrm set winrm/config/service/auth @{Basic="true"} + +To allow ``Basic`` authentication with HTTP transport WinRM needs to allow +unencrypted communication. This can be enabled via the WinRM commandline tool. +However, this is not the recommended communication mode. + +:: + + winrm set winrm/config/service @{AllowUnencrypted="true"} + +Version Numbers +--------------- + +Since Microsoft's build numbers are almost always over 1000, this driver needs +to pack the value differently compared to the format defined by +``virConnectGetVersion``. To preserve all of the digits, the following format is +used: + +:: + + major * 100000000 + minor * 1000000 + micro + +This results in ``virsh version`` producing unexpected output. + +.. list-table:: + :header-rows: 1 + + * - Windows Release + - Kernel Version + - libvirt Representation + + * - Windows Server 2012 R2 + - 6.3.9600 + - 603.9.600 + + * - Windows Server 2016 + - 10.0.14393 + - 1000.14.393 + + * - Windows Server 2019 + - 10.0.17763 + - 1000.17.763 diff --git a/docs/meson.build b/docs/meson.build index 0465c22274..cfbde2a58d 100644 --- a/docs/meson.build +++ b/docs/meson.build @@ -22,7 +22,6 @@ docs_html_in_files = [ 'csharp', 'dbus', 'docs', - 'drvhyperv', 'drvlxc', 'drvnodedev', 'drvopenvz', @@ -80,6 +79,7 @@ docs_rst_files = [ 'drvbhyve', 'drvch', 'drvesx', + 'drvhyperv', 'drvqemu', 'errors', 'formatbackup', -- 2.35.1
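The "Version Numbers" section of the patch above can be illustrated with a short Python sketch. It packs a Windows version with the formula given in the document, then decodes the result with the usual libvirt layout of `major * 1000000 + minor * 1000 + micro`, which is why the output looks unexpected. The function names are hypothetical and are not libvirt APIs.

```python
def pack_hyperv_version(major, minor, micro):
    """Pack a version with the formula from the Hyper-V driver documentation."""
    return major * 100000000 + minor * 1000000 + micro


def decode_standard(version):
    """Decode assuming the usual layout: major * 1000000 + minor * 1000 + micro."""
    return version // 1000000, (version // 1000) % 1000, version % 1000


RELEASES = {
    "Windows Server 2012 R2": (6, 3, 9600),
    "Windows Server 2016": (10, 0, 14393),
    "Windows Server 2019": (10, 0, 17763),
}

if __name__ == "__main__":
    for name, kernel in RELEASES.items():
        packed = pack_hyperv_version(*kernel)
        shown = ".".join(str(part) for part in decode_standard(packed))
        print(f"{name}: kernel {'.'.join(map(str, kernel))} -> {packed} -> {shown}")
```

For the three releases in the table this reproduces the "libvirt Representation" column: 603.9.600, 1000.14.393 and 1000.17.763.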

Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/drvlxc.html.in | 822 -------------------------------------------- docs/drvlxc.rst | 670 ++++++++++++++++++++++++++++++++++++ docs/meson.build | 2 +- 3 files changed, 671 insertions(+), 823 deletions(-) delete mode 100644 docs/drvlxc.html.in create mode 100644 docs/drvlxc.rst diff --git a/docs/drvlxc.html.in b/docs/drvlxc.html.in deleted file mode 100644 index 23aae991a2..0000000000 --- a/docs/drvlxc.html.in +++ /dev/null @@ -1,822 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html> -<html xmlns="http://www.w3.org/1999/xhtml"> - <body> - <h1>LXC container driver</h1> - - <ul id="toc"></ul> - -<p> -The libvirt LXC driver manages "Linux Containers". At their simplest, containers -can just be thought of as a collection of processes, separated from the main -host processes via a set of resource namespaces and constrained via control -groups resource tunables. The libvirt LXC driver has no dependency on the LXC -userspace tools hosted on sourceforge.net. It directly utilizes the relevant -kernel features to build the container environment. This allows for sharing -of many libvirt technologies across both the QEMU/KVM and LXC drivers. In -particular sVirt for mandatory access control, auditing of operations, -integration with control groups and many other features. -</p> - -<h2><a id="cgroups">Control groups Requirements</a></h2> - -<p> -In order to control the resource usage of processes inside containers, the -libvirt LXC driver requires that certain cgroups controllers are mounted on -the host OS. The minimum required controllers are 'cpuacct', 'memory' and -'devices', while recommended extra controllers are 'cpu', 'freezer' and -'blkio'. Libvirt will not mount the cgroups filesystem itself, leaving -this up to the init system to take care of. Systemd will do the right thing -in this respect, while for other init systems the <code>cgconfig</code> -init service will be required. 
For further information, consult the general -libvirt <a href="cgroups.html">cgroups documentation</a>. -</p> - -<h2><a id="namespaces">Namespace requirements</a></h2> - -<p> -In order to separate processes inside a container from those in the -primary "host" OS environment, the libvirt LXC driver requires that -certain kernel namespaces are compiled in. Libvirt currently requires -the 'mount', 'ipc', 'pid', and 'uts' namespaces to be available. If -separate network interfaces are desired, then the 'net' namespace is -required. If the guest configuration declares a -<a href="formatdomain.html#elementsOSContainer">UID or GID mapping</a>, -the 'user' namespace will be enabled to apply these. <strong>A suitably -configured UID/GID mapping is a pre-requisite to making containers -secure, in the absence of sVirt confinement.</strong> -</p> - -<h2><a id="init">Default container setup</a></h2> - -<h3><a id="cliargs">Command line arguments</a></h3> - -<p> -When the container "init" process is started, it will typically -not be given any command line arguments (eg the equivalent of -the bootloader args visible in <code>/proc/cmdline</code>). If -any arguments are desired, then must be explicitly set in the -container XML configuration via one or more <code>initarg</code> -elements. For example, to run <code>systemd --unit emergency.service</code> -would use the following XML -</p> - -<pre> -<os> - <type arch='x86_64'>exe</type> - <init>/bin/systemd</init> - <initarg>--unit</initarg> - <initarg>emergency.service</initarg> -</os> -</pre> - -<h3><a id="envvars">Environment variables</a></h3> - -<p> -When the container "init" process is started, it will be given several useful -environment variables. The following standard environment variables are mandated -by <a href="https://www.freedesktop.org/wiki/Software/systemd/ContainerInterface">systemd container interface</a> -to be provided by all container technologies on Linux. 
-</p> - -<dl> -<dt><code>container</code></dt> -<dd>The fixed string <code>libvirt-lxc</code> to identify libvirt as the creator</dd> -<dt><code>container_uuid</code></dt> -<dd>The UUID assigned to the container by libvirt</dd> -<dt><code>PATH</code></dt> -<dd>The fixed string <code>/bin:/usr/bin</code></dd> -<dt><code>TERM</code></dt> -<dd>The fixed string <code>linux</code></dd> -<dt><code>HOME</code></dt> -<dd>The fixed string <code>/</code></dd> -</dl> - -<p> -In addition to the standard variables, the following libvirt specific -environment variables are also provided -</p> - -<dl> -<dt><code>LIBVIRT_LXC_NAME</code></dt> -<dd>The name assigned to the container by libvirt</dd> -<dt><code>LIBVIRT_LXC_UUID</code></dt> -<dd>The UUID assigned to the container by libvirt</dd> -<dt><code>LIBVIRT_LXC_CMDLINE</code></dt> -<dd>The unparsed command line arguments specified in the container configuration. -Use of this is discouraged, in favour of passing arguments directly to the -container init process via the <code>initarg</code> config element.</dd> -</dl> - -<h3><a id="fsmounts">Filesystem mounts</a></h3> - -<p> -In the absence of any explicit configuration, the container will -inherit the host OS filesystem mounts. A number of mount points will -be made read only, or re-mounted with new instances to provide -container specific data. 
The following special mounts are setup -by libvirt -</p> - -<ul> -<li><code>/dev</code> a new "tmpfs" pre-populated with authorized device nodes</li> -<li><code>/dev/pts</code> a new private "devpts" instance for console devices</li> -<li><code>/sys</code> the host "sysfs" instance remounted read-only</li> -<li><code>/proc</code> a new instance of the "proc" filesystem</li> -<li><code>/proc/sys</code> the host "/proc/sys" bind-mounted read-only</li> -<li><code>/sys/fs/selinux</code> the host "selinux" instance remounted read-only</li> -<li><code>/sys/fs/cgroup/NNNN</code> the host cgroups controllers bind-mounted to -only expose the sub-tree associated with the container</li> -<li><code>/proc/meminfo</code> a FUSE backed file reflecting memory limits of the container</li> -</ul> - - -<h3><a id="devnodes">Device nodes</a></h3> - -<p> -The container init process will be started with <code>CAP_MKNOD</code> -capability removed and blocked from re-acquiring it. As such it will -not be able to create any device nodes in <code>/dev</code> or anywhere -else in its filesystems. Libvirt itself will take care of pre-populating -the <code>/dev</code> filesystem with any devices that the container -is authorized to use. 
The current devices that will be made available -to all containers are -</p> - -<ul> -<li><code>/dev/zero</code></li> -<li><code>/dev/null</code></li> -<li><code>/dev/full</code></li> -<li><code>/dev/random</code></li> -<li><code>/dev/urandom</code></li> -<li><code>/dev/stdin</code> symlinked to <code>/proc/self/fd/0</code></li> -<li><code>/dev/stdout</code> symlinked to <code>/proc/self/fd/1</code></li> -<li><code>/dev/stderr</code> symlinked to <code>/proc/self/fd/2</code></li> -<li><code>/dev/fd</code> symlinked to <code>/proc/self/fd</code></li> -<li><code>/dev/ptmx</code> symlinked to <code>/dev/pts/ptmx</code></li> -<li><code>/dev/console</code> symlinked to <code>/dev/pts/0</code></li> -</ul> - -<p> -In addition, for every console defined in the guest configuration, -a symlink will be created from <code>/dev/ttyN</code> symlinked to -the corresponding <code>/dev/pts/M</code> pseudo TTY device. The -first console will be <code>/dev/tty1</code>, with further consoles -numbered incrementally from there. -</p> - -<p> -Since /dev/ttyN and /dev/console are linked to the pts devices. The -tty device of login program is pts device. The pam module securetty -may prevent root user from logging in container. If you want root -user to log in container successfully, add the pts device to the file -/etc/securetty of container. -</p> - -<p> -Further block or character devices will be made available to containers -depending on their configuration. -</p> - -<h2><a id="security">Security considerations</a></h2> - -<p> -The libvirt LXC driver is fairly flexible in how it can be configured, -and as such does not enforce a requirement for strict security -separation between a container and the host. This allows it to be used -in scenarios where only resource control capabilities are important, -and resource sharing is desired. Applications wishing to ensure secure -isolation between a container and the host must ensure that they are -writing a suitable configuration. 
-</p> - -<h3><a id="securenetworking">Network isolation</a></h3> - -<p> -If the guest configuration does not list any network interfaces, -the <code>network</code> namespace will not be activated, and thus -the container will see all the host's network interfaces. This will -allow apps in the container to bind to/connect from TCP/UDP addresses -and ports from the host OS. It also allows applications to access -UNIX domain sockets associated with the host OS, which are in the -abstract namespace. If access to UNIX domains sockets in the abstract -namespace is not wanted, then applications should set the -<code><privnet/></code> flag in the -<code><features>....</features></code> element. -</p> - -<h3><a id="securefs">Filesystem isolation</a></h3> - -<p> -If the guest configuration does not list any filesystems, then -the container will be set up with a root filesystem that matches -the host's root filesystem. As noted earlier, only a few locations -such as <code>/dev</code>, <code>/proc</code> and <code>/sys</code> -will be altered. This means that, in the absence of restrictions -from sVirt, a process running as user/group N:M inside the container -will be able to access almost exactly the same files as a process -running as user/group N:M in the host. -</p> - -<p> -There are multiple options for restricting this. It is possible to -simply map the existing root filesystem through to the container in -read-only mode. Alternatively a completely separate root filesystem -can be configured for the guest. In both cases, further sub-mounts -can be applied to customize the content that is made visible. Note -that in the absence of sVirt controls, it is still possible for the -root user in a container to unmount any sub-mounts applied. The user -namespace feature can also be used to restrict access to files based -on the UID/GID mappings. 
-</p> - -<p> -Sharing the host filesystem tree, also allows applications to access -UNIX domains sockets associated with the host OS, which are in the -filesystem namespaces. It should be noted that a number of init -systems including at least <code>systemd</code> and <code>upstart</code> -have UNIX domain socket which are used to control their operation. -Thus, if the directory/filesystem holding their UNIX domain socket is -exposed to the container, it will be possible for a user in the container -to invoke operations on the init service in the same way it could if -outside the container. This also applies to other applications in the -host which use UNIX domain sockets in the filesystem, such as DBus, -Libvirtd, and many more. If this is not desired, then applications -should either specify the UID/GID mapping in the configuration to -enable user namespaces and thus block access to the UNIX domain socket -based on permissions, or should ensure the relevant directories have -a bind mount to hide them. This is particularly important for the -<code>/run</code> or <code>/var/run</code> directories. -</p> - - -<h3><a id="secureusers">User and group isolation</a></h3> - -<p> -If the guest configuration does not list any ID mapping, then the -user and group IDs used inside the container will match those used -outside the container. In addition, the capabilities associated with -a process in the container will infer the same privileges they would -for a process in the host. This has obvious implications for security, -since a root user inside the container will be able to access any -file owned by root that is visible to the container, and perform more -or less any privileged kernel operation. In the absence of additional -protection from sVirt, this means that the root user inside a container -is effectively as powerful as the root user in the host. There is no -security isolation of the root user. 
-</p> - -<p> -The ID mapping facility was introduced to allow for stricter control -over the privileges of users inside the container. It allows apps to -define rules such as "user ID 0 in the container maps to user ID 1000 -in the host". In addition the privileges associated with capabilities -are somewhat reduced so that they cannot be used to escape from the -container environment. A full description of user namespaces is outside -the scope of this document, however LWN has -<a href="https://lwn.net/Articles/532593/">a good write-up on the topic</a>. -From the libvirt point of view, the key thing to remember is that defining -an ID mapping for users and groups in the container XML configuration -causes libvirt to activate the user namespace feature. -</p> - - -<h2><a id="configFiles">Location of configuration files</a></h2> - -<p> -The LXC driver comes with sane default values. However, during its -initialization it reads a configuration file which offers system -administrator to override some of that default. The file is located -under <code>/etc/libvirt/lxc.conf</code> -</p> - - -<h2><a id="activation">Systemd Socket Activation Integration</a></h2> - -<p> -The libvirt LXC driver provides the ability to pass across pre-opened file -descriptors when starting LXC guests. This allows for libvirt LXC to support -systemd's <a href="http://0pointer.de/blog/projects/socket-activated-containers.html">socket -activation capability</a>, where an incoming client connection -in the host OS will trigger the startup of a container, which runs another -copy of systemd which gets passed the server socket, and then activates the -actual service handler in the container. -</p> - -<p> -Let us assume that you already have a LXC guest created, running -a systemd instance as PID 1 inside the container, which has an -SSHD service configured. The goal is to automatically activate -the container when the first SSH connection is made. 
The first -step is to create a couple of unit files for the host OS systemd -instance. The <code>/etc/systemd/system/mycontainer.service</code> -unit file specifies how systemd will start the libvirt LXC container -</p> - -<pre> -[Unit] -Description=My little container - -[Service] -ExecStart=/usr/bin/virsh -c lxc:///system start --pass-fds 3 mycontainer -ExecStop=/usr/bin/virsh -c lxc:///system destroy mycontainer -Type=oneshot -RemainAfterExit=yes -KillMode=none -</pre> - -<p> -The <code>--pass-fds 3</code> argument specifies that the file -descriptor number 3 that <code>virsh</code> inherits from systemd, -is to be passed into the container. Since <code>virsh</code> will -exit immediately after starting the container, the <code>RemainAfterExit</code> -and <code>KillMode</code> settings must be altered from their defaults. -</p> - -<p> -Next, the <code>/etc/systemd/system/mycontainer.socket</code> unit -file is created to get the host systemd to listen on port 23 for -TCP connections. When this unit file is activated by the first -incoming connection, it will cause the <code>mycontainer.service</code> -unit to be activated with the FD corresponding to the listening TCP -socket passed in as FD 3. -</p> - -<pre> -[Unit] -Description=The SSH socket of my little container - -[Socket] -ListenStream=23 -</pre> - -<p> -Port 23 was picked here so that the container doesn't conflict -with the host's SSH which is on the normal port 22. That's it -in terms of host side configuration. -</p> - -<p> -Inside the container, the <code>/etc/systemd/system/sshd.socket</code> -unit file must be created -</p> - -<pre> -[Unit] -Description=SSH Socket for Per-Connection Servers - -[Socket] -ListenStream=23 -Accept=yes -</pre> - -<p> -The <code>ListenStream</code> value listed in this unit file, must -match the value used in the host file. 
When systemd in the container -receives the pre-opened FD from libvirt during container startup, it -looks at the <code>ListenStream</code> values to figure out which -FD to give to which service. The actual service to start is defined -by a correspondingly named <code>/etc/systemd/system/sshd@.service</code> -</p> - -<pre> -[Unit] -Description=SSH Per-Connection Server for %I - -[Service] -ExecStart=-/usr/sbin/sshd -i -StandardInput=socket -</pre> - -<p> -Finally, make sure this SSH service is set to start on boot of the container, -by running the following command inside the container: -</p> - -<pre> -# mkdir -p /etc/systemd/system/sockets.target.wants/ -# ln -s /etc/systemd/system/sshd.socket /etc/systemd/system/sockets.target.wants/ -</pre> - -<p> -This example shows how to activate the container based on an incoming -SSH connection. If the container was also configured to have an httpd -service, it may be desirable to activate it upon either an httpd or a -sshd connection attempt. In this case, the <code>mycontainer.socket</code> -file in the host would simply list multiple socket ports. Inside the -container a separate <code>xxxxx.socket</code> file would need to be -created for each service, with a corresponding <code>ListenStream</code> -value set. -</p> - -<!-- -<h2>Container configuration</h2> - -<h3>Init process</h3> - -<h3>Console devices</h3> - -<h3>Filesystem devices</h3> - -<h3>Disk devices</h3> - -<h3>Block devices</h3> - -<h3>USB devices</h3> - -<h3>Character devices</h3> - -<h3>Network devices</h3> ---> - -<h2>Container security</h2> - -<h3>sVirt SELinux</h3> - -<p> -In the absence of the "user" namespace being used, containers cannot -be considered secure against exploits of the host OS. The sVirt SELinux -driver provides a way to secure containers even when the "user" namespace -is not used. The cost is that writing a policy to allow execution of -arbitrary OS is not practical. 
The SELinux sVirt policy is typically -tailored to work with a simpler application confinement use case, -as provided by the "libvirt-sandbox" project. -</p> - -<h3>Auditing</h3> - -<p> -The LXC driver is integrated with libvirt's auditing subsystem, which -causes audit messages to be logged whenever there is an operation -performed against a container which has impact on host resources. -So for example, start/stop, device hotplug will all log audit messages -providing details about what action occurred and any resources -associated with it. There are the following 3 types of audit messages -</p> - -<ul> -<li><code>VIRT_MACHINE_ID</code> - details of the SELinux process and -image security labels assigned to the container.</li> -<li><code>VIRT_CONTROL</code> - details of an action / operation -performed against a container. There are the following types of -operation - <ul> - <li><code>op=start</code> - a container has been started. Provides - the machine name, uuid and PID of the <code>libvirt_lxc</code> - controller process</li> - <li><code>op=init</code> - the init PID of the container has been - started. Provides the machine name, uuid and PID of the - <code>libvirt_lxc</code> controller process and PID of the - init process (in the host PID namespace)</li> - <li><code>op=stop</code> - a container has been stopped. Provides - the machine name, uuid</li> - </ul> -</li> -<li><code>VIRT_RESOURCE</code> - details of a host resource -associated with a container action.</li> -</ul> - -<h3>Device access</h3> - -<p> -All containers are launched with the CAP_MKNOD capability cleared -and removed from the bounding set. Libvirt will ensure that the -/dev filesystem is pre-populated with all devices that a container -is allowed to use. In addition, the cgroup "device" controller is -configured to block read/write/mknod from all devices except those -that a container is authorized to use. 
-</p> - -<h2><a id="exconfig">Example configurations</a></h2> - -<h3>Example config version 1</h3> -<p></p> -<pre> -<domain type='lxc'> - <name>vm1</name> - <memory>500000</memory> - <os> - <type>exe</type> - <init>/bin/sh</init> - </os> - <vcpu>1</vcpu> - <clock offset='utc'/> - <on_poweroff>destroy</on_poweroff> - <on_reboot>restart</on_reboot> - <on_crash>destroy</on_crash> - <devices> - <emulator>/usr/libexec/libvirt_lxc</emulator> - <interface type='network'> - <source network='default'/> - </interface> - <console type='pty' /> - </devices> -</domain> -</pre> - -<p> -In the <emulator> element, be sure you specify the correct path -to libvirt_lxc, if it does not live in /usr/libexec on your system. -</p> - -<p> -The next example assumes there is a private root filesystem -(perhaps hand-crafted using busybox, or installed from media, -debootstrap, whatever) under /opt/vm-1-root: -</p> -<p></p> -<pre> -<domain type='lxc'> - <name>vm1</name> - <memory>32768</memory> - <os> - <type>exe</type> - <init>/init</init> - </os> - <vcpu>1</vcpu> - <clock offset='utc'/> - <on_poweroff>destroy</on_poweroff> - <on_reboot>restart</on_reboot> - <on_crash>destroy</on_crash> - <devices> - <emulator>/usr/libexec/libvirt_lxc</emulator> - <filesystem type='mount'> - <source dir='/opt/vm-1-root'/> - <target dir='/'/> - </filesystem> - <interface type='network'> - <source network='default'/> - </interface> - <console type='pty' /> - </devices> -</domain> -</pre> - -<h2><a id="capabilities">Altering the available capabilities</a></h2> - -<p> -By default the libvirt LXC driver drops some capabilities among which CAP_MKNOD. -However <span class="since">since 1.2.6</span> libvirt can be told to keep or -drop some capabilities using a domain configuration like the following: -</p> -<pre> -... -<features> - <capabilities policy='default'> - <mknod state='on'/> - <sys_chroot state='off'/> - </capabilities> -</features> -... 
-</pre> -<p> -The capabilities children elements are named after the capabilities as defined in -<code>man 7 capabilities</code>. An <code>off</code> state tells libvirt to drop the -capability, while an <code>on</code> state will force to keep the capability even though -this one is dropped by default. -</p> -<p> -The <code>policy</code> attribute can be one of <code>default</code>, <code>allow</code> -or <code>deny</code>. It defines the default rules for capabilities: either keep the -default behavior that is dropping a few selected capabilities, or keep all capabilities -or drop all capabilities. The interest of <code>allow</code> and <code>deny</code> is that -they guarantee that all capabilities will be kept (or removed) even if new ones are added -later. -</p> -<p> -The following example, drops all capabilities but CAP_MKNOD: -</p> -<pre> -... -<features> - <capabilities policy='deny'> - <mknod state='on'/> - </capabilities> -</features> -... -</pre> -<p> -Note that allowing capabilities that are normally dropped by default can seriously -affect the security of the container and the host. -</p> - -<h2><a id="share">Inherit namespaces</a></h2> - -<p> -Libvirt allows you to inherit the namespace from container/process just like lxc tools -or docker provides to share the network namespace. The following can be used to share -required namespaces. If we want to share only one then the other namespaces can be ignored. -The netns option is specific to sharenet. It can be used in cases we want to use existing network namespace -rather than creating new network namespace for the container. In this case privnet option will be -ignored. -</p> -<pre> -<domain type='lxc' xmlns:lxc='http://libvirt.org/schemas/domain/lxc/1.0'> -... 
-<lxc:namespace> - <lxc:sharenet type='netns' value='red'/> - <lxc:shareuts type='name' value='container1'/> - <lxc:shareipc type='pid' value='12345'/> -</lxc:namespace> -</domain> -</pre> - -<p> -The use of namespace passthrough requires libvirt >= 1.2.19 -</p> - -<h2><a id="usage">Container usage / management</a></h2> - -<p> -As with any libvirt virtualization driver, LXC containers can be -managed via a wide variety of libvirt based tools. At the lowest -level the <code>virsh</code> command can be used to perform many -tasks, by passing the <code>-c lxc:///system</code> argument. As an -alternative to repeating the URI with every command, the <code>LIBVIRT_DEFAULT_URI</code> -environment variable can be set to <code>lxc:///system</code>. The -examples that follow outline some common operations with virsh -and LXC. For further details about usage of virsh consult its -manual page. -</p> - -<h3><a id="usageSave">Defining (saving) container configuration</a></h3> - -<p> -The <code>virsh define</code> command takes an XML configuration -document and loads it into libvirt, saving the configuration on disk -</p> - -<pre> -# virsh -c lxc:///system define myguest.xml -</pre> - -<h3><a id="usageView">Viewing container configuration</a></h3> - -<p> -The <code>virsh dumpxml</code> command can be used to view the -current XML configuration of a container. By default the XML -output reflects the current state of the container. 
If the -container is running, it is possible to explicitly request the -persistent configuration, instead of the current live configuration -using the <code>--inactive</code> flag -</p> - -<pre> -# virsh -c lxc:///system dumpxml myguest -</pre> - -<h3><a id="usageStart">Starting containers</a></h3> - -<p> -The <code>virsh start</code> command can be used to start a -container from a previously defined persistent configuration -</p> - -<pre> -# virsh -c lxc:///system start myguest -</pre> - -<p> -It is also possible to start so called "transient" containers, -which do not require a persistent configuration to be saved -by libvirt, using the <code>virsh create</code> command. -</p> - -<pre> -# virsh -c lxc:///system create myguest.xml -</pre> - - -<h3><a id="usageStop">Stopping containers</a></h3> - -<p> -The <code>virsh shutdown</code> command can be used -to request a graceful shutdown of the container. By default -this command will first attempt to send a message to the -init process via the <code>/dev/initctl</code> device node. -If no such device node exists, then it will send SIGTERM -to PID 1 inside the container. -</p> - -<pre> -# virsh -c lxc:///system shutdown myguest -</pre> - -<p> -If the container does not respond to the graceful shutdown -request, it can be forcibly stopped using the <code>virsh destroy</code> -</p> - -<pre> -# virsh -c lxc:///system destroy myguest -</pre> - - -<h3><a id="usageReboot">Rebooting a container</a></h3> - -<p> -The <code>virsh reboot</code> command can be used -to request a graceful shutdown of the container. By default -this command will first attempt to send a message to the -init process via the <code>/dev/initctl</code> device node. -If no such device node exists, then it will send SIGHUP -to PID 1 inside the container. 
-</p> - -<pre> -# virsh -c lxc:///system reboot myguest -</pre> - -<h3><a id="usageDelete">Undefining (deleting) a container configuration</a></h3> - -<p> -The <code>virsh undefine</code> command can be used to delete the -persistent configuration of a container. If the guest is currently -running, this will turn it into a "transient" guest. -</p> - -<pre> -# virsh -c lxc:///system undefine myguest -</pre> - -<h3><a id="usageConnect">Connecting to a container console</a></h3> - -<p> -The <code>virsh console</code> command can be used to connect -to the text console associated with a container. -</p> - -<pre> -# virsh -c lxc:///system console myguest -</pre> - -<p> -If the container has been configured with multiple console devices, -then the <code>--devname</code> argument can be used to choose the -console to connect to. -In LXC, multiple consoles will be named -as 'console0', 'console1', 'console2', etc. -</p> - -<pre> -# virsh -c lxc:///system console myguest --devname console1 -</pre> - -<h3><a id="usageEnter">Running commands in a container</a></h3> - -<p> -The <code>virsh lxc-enter-namespace</code> command can be used -to enter the namespaces and security context of a container -and then execute an arbitrary command. 
-</p> - -<pre> -# virsh -c lxc:///system lxc-enter-namespace myguest -- /bin/ls -al /dev -</pre> - -<h3><a id="usageTop">Monitoring container utilization</a></h3> - -<p> -The <code>virt-top</code> command can be used to monitor the -activity and resource utilization of all containers on a -host -</p> - -<pre> -# virt-top -c lxc:///system -</pre> - -<h3><a id="usageConvert">Converting LXC container configuration</a></h3> - -<p> -The <code>virsh domxml-from-native</code> command can be used to convert -most of the LXC container configuration into a domain XML fragment -</p> - -<pre> -# virsh -c lxc:///system domxml-from-native lxc-tools /var/lib/lxc/myguest/config -</pre> - -<p> -This conversion has some limitations due to the fact that the -domxml-from-native command output has to be independent of the host. Here -are a few things to take care of before converting: -</p> - -<ul> -<li> -Replace the fstab file referenced by <tt>lxc.mount</tt> by the corresponding -lxc.mount.entry lines. -</li> -<li> -Replace all relative sizes of tmpfs mount entries to absolute sizes. Also -make sure that tmpfs entries all have a size option (default is 50%). -</li> -<li> -Define <tt>lxc.cgroup.memory.limit_in_bytes</tt> to properly limit the memory -available to the container. The conversion will use 64MiB as the default. -</li> -</ul> - - </body> -</html> diff --git a/docs/drvlxc.rst b/docs/drvlxc.rst new file mode 100644 index 0000000000..b88323b165 --- /dev/null +++ b/docs/drvlxc.rst @@ -0,0 +1,670 @@ +.. role:: since + +==================== +LXC container driver +==================== + +.. contents:: + +The libvirt LXC driver manages "Linux Containers". At their simplest, containers +can just be thought of as a collection of processes, separated from the main +host processes via a set of resource namespaces and constrained via control +groups resource tunables. The libvirt LXC driver has no dependency on the LXC +userspace tools hosted on sourceforge.net. 
It directly utilizes the relevant
+kernel features to build the container environment. This allows for sharing of
+many libvirt technologies across both the QEMU/KVM and LXC drivers, in
+particular sVirt for mandatory access control, auditing of operations,
+integration with control groups and many other features.
+
+Control groups requirements
+---------------------------
+
+In order to control the resource usage of processes inside containers, the
+libvirt LXC driver requires that certain cgroups controllers are mounted on the
+host OS. The minimum required controllers are 'cpuacct', 'memory' and 'devices',
+while recommended extra controllers are 'cpu', 'freezer' and 'blkio'. Libvirt
+will not mount the cgroups filesystem itself, leaving this up to the init system
+to take care of. Systemd will do the right thing in this respect, while for
+other init systems the ``cgconfig`` init service will be required. For further
+information, consult the general libvirt `cgroups
+documentation <cgroups.html>`__.
+
+Namespace requirements
+----------------------
+
+In order to separate processes inside a container from those in the primary
+"host" OS environment, the libvirt LXC driver requires that certain kernel
+namespaces are compiled in. Libvirt currently requires the 'mount', 'ipc',
+'pid', and 'uts' namespaces to be available. If separate network interfaces are
+desired, then the 'net' namespace is required. If the guest configuration
+declares a `UID or GID mapping <formatdomain.html#elementsOSContainer>`__, the
+'user' namespace will be enabled to apply these. **A suitably configured UID/GID
+mapping is a pre-requisite to making containers secure, in the absence of sVirt
+confinement.**
+
+Default container setup
+-----------------------
+
+Command line arguments
+~~~~~~~~~~~~~~~~~~~~~~
+
+When the container "init" process is started, it will typically not be given any
+command line arguments (e.g. the equivalent of the bootloader args visible in
+``/proc/cmdline``).
If any arguments are desired, they must be explicitly set in +the container XML configuration via one or more ``initarg`` elements. For +example, to run ``systemd --unit emergency.service``, use the following XML + +:: + + <os> + <type arch='x86_64'>exe</type> + <init>/bin/systemd</init> + <initarg>--unit</initarg> + <initarg>emergency.service</initarg> + </os> + +Environment variables +~~~~~~~~~~~~~~~~~~~~~ + +When the container "init" process is started, it will be given several useful +environment variables. The following standard environment variables are mandated +by `systemd container +interface <https://www.freedesktop.org/wiki/Software/systemd/ContainerInterface>`__ +to be provided by all container technologies on Linux. + +``container`` + The fixed string ``libvirt-lxc`` to identify libvirt as the creator +``container_uuid`` + The UUID assigned to the container by libvirt +``PATH`` + The fixed string ``/bin:/usr/bin`` +``TERM`` + The fixed string ``linux`` +``HOME`` + The fixed string ``/`` + +In addition to the standard variables, the following libvirt specific +environment variables are also provided + +``LIBVIRT_LXC_NAME`` + The name assigned to the container by libvirt +``LIBVIRT_LXC_UUID`` + The UUID assigned to the container by libvirt +``LIBVIRT_LXC_CMDLINE`` + The unparsed command line arguments specified in the container configuration. + Use of this is discouraged, in favour of passing arguments directly to the + container init process via the ``initarg`` config element. + +Filesystem mounts +~~~~~~~~~~~~~~~~~ + +In the absence of any explicit configuration, the container will inherit the +host OS filesystem mounts. A number of mount points will be made read only, or +re-mounted with new instances to provide container specific data. 
The following +special mounts are set up by libvirt + +- ``/dev`` a new "tmpfs" pre-populated with authorized device nodes +- ``/dev/pts`` a new private "devpts" instance for console devices +- ``/sys`` the host "sysfs" instance remounted read-only +- ``/proc`` a new instance of the "proc" filesystem +- ``/proc/sys`` the host "/proc/sys" bind-mounted read-only +- ``/sys/fs/selinux`` the host "selinux" instance remounted read-only +- ``/sys/fs/cgroup/NNNN`` the host cgroups controllers bind-mounted to only + expose the sub-tree associated with the container +- ``/proc/meminfo`` a FUSE backed file reflecting memory limits of the + container + +Device nodes +~~~~~~~~~~~~ + +The container init process will be started with the ``CAP_MKNOD`` capability removed +and blocked from re-acquiring it. As such it will not be able to create any +device nodes in ``/dev`` or anywhere else in its filesystems. Libvirt itself +will take care of pre-populating the ``/dev`` filesystem with any devices that +the container is authorized to use. The current devices that will be made +available to all containers are + +- ``/dev/zero`` +- ``/dev/null`` +- ``/dev/full`` +- ``/dev/random`` +- ``/dev/urandom`` +- ``/dev/stdin`` symlinked to ``/proc/self/fd/0`` +- ``/dev/stdout`` symlinked to ``/proc/self/fd/1`` +- ``/dev/stderr`` symlinked to ``/proc/self/fd/2`` +- ``/dev/fd`` symlinked to ``/proc/self/fd`` +- ``/dev/ptmx`` symlinked to ``/dev/pts/ptmx`` +- ``/dev/console`` symlinked to ``/dev/pts/0`` + +In addition, for every console defined in the guest configuration, a symlink +will be created from ``/dev/ttyN`` to the corresponding ``/dev/pts/M`` +pseudo TTY device. The first console will be ``/dev/tty1``, with further +consoles numbered incrementally from there. + +Since /dev/ttyN and /dev/console are linked to the pts devices, the tty device +seen by the login program is a pts device. The pam module securetty may therefore +prevent the root user from logging in to the container. 
If you want the root user to be able to log in to the +container, add the pts device to the container's /etc/securetty file. + +Further block or character devices will be made available to containers +depending on their configuration. + +Security considerations +----------------------- + +The libvirt LXC driver is fairly flexible in how it can be configured, and as +such does not enforce a requirement for strict security separation between a +container and the host. This allows it to be used in scenarios where only +resource control capabilities are important, and resource sharing is desired. +Applications wishing to ensure secure isolation between a container and the host +must ensure that they write a suitable configuration. + +Network isolation +~~~~~~~~~~~~~~~~~ + +If the guest configuration does not list any network interfaces, the ``network`` +namespace will not be activated, and thus the container will see all the host's +network interfaces. This will allow apps in the container to bind to/connect +from TCP/UDP addresses and ports from the host OS. It also allows applications +to access UNIX domain sockets associated with the host OS, which are in the +abstract namespace. If access to UNIX domain sockets in the abstract namespace +is not wanted, then applications should set the ``<privnet/>`` flag in the +``<features>....</features>`` element. + +Filesystem isolation +~~~~~~~~~~~~~~~~~~~~ + +If the guest configuration does not list any filesystems, then the container +will be set up with a root filesystem that matches the host's root filesystem. +As noted earlier, only a few locations such as ``/dev``, ``/proc`` and ``/sys`` +will be altered. This means that, in the absence of restrictions from sVirt, a +process running as user/group N:M inside the container will be able to access +almost exactly the same files as a process running as user/group N:M in the +host. + +There are multiple options for restricting this. 
It is possible to simply map +the existing root filesystem through to the container in read-only mode. +Alternatively a completely separate root filesystem can be configured for the +guest. In both cases, further sub-mounts can be applied to customize the content +that is made visible. Note that in the absence of sVirt controls, it is still +possible for the root user in a container to unmount any sub-mounts applied. The +user namespace feature can also be used to restrict access to files based on the +UID/GID mappings. + +Sharing the host filesystem tree also allows applications to access UNIX +domain sockets associated with the host OS, which are in the filesystem +namespace. It should be noted that a number of init systems including at least +``systemd`` and ``upstart`` have UNIX domain sockets which are used to control +their operation. Thus, if the directory/filesystem holding their UNIX domain +socket is exposed to the container, it will be possible for a user in the +container to invoke operations on the init service in the same way it could if +outside the container. This also applies to other applications in the host which +use UNIX domain sockets in the filesystem, such as DBus, Libvirtd, and many +more. If this is not desired, then applications should either specify the +UID/GID mapping in the configuration to enable user namespaces and thus block +access to the UNIX domain socket based on permissions, or should ensure the +relevant directories have a bind mount to hide them. This is particularly +important for the ``/run`` or ``/var/run`` directories. + +User and group isolation +~~~~~~~~~~~~~~~~~~~~~~~~ + +If the guest configuration does not list any ID mapping, then the user and group +IDs used inside the container will match those used outside the container. In +addition, the capabilities associated with a process in the container will confer +the same privileges they would for a process in the host. 
This has obvious +implications for security, since a root user inside the container will be able +to access any file owned by root that is visible to the container, and perform +more or less any privileged kernel operation. In the absence of additional +protection from sVirt, this means that the root user inside a container is +effectively as powerful as the root user in the host. There is no security +isolation of the root user. + +The ID mapping facility was introduced to allow for stricter control over the +privileges of users inside the container. It allows apps to define rules such as +"user ID 0 in the container maps to user ID 1000 in the host". In addition the +privileges associated with capabilities are somewhat reduced so that they cannot +be used to escape from the container environment. A full description of user +namespaces is outside the scope of this document, however LWN has `a good +write-up on the topic <https://lwn.net/Articles/532593/>`__. From the libvirt +point of view, the key thing to remember is that defining an ID mapping for +users and groups in the container XML configuration causes libvirt to activate +the user namespace feature. + +Location of configuration files +------------------------------- + +The LXC driver comes with sane default values. However, during its +initialization it reads a configuration file which allows the system administrator +to override some of those defaults. The file is located at +``/etc/libvirt/lxc.conf`` + +Systemd Socket Activation Integration +------------------------------------- + +The libvirt LXC driver provides the ability to pass across pre-opened file +descriptors when starting LXC guests. 
This allows for libvirt LXC to support +systemd's `socket activation +capability <http://0pointer.de/blog/projects/socket-activated-containers.html>`__, +where an incoming client connection in the host OS will trigger the startup of a +container, which runs another copy of systemd which gets passed the server +socket, and then activates the actual service handler in the container. + +Let us assume that you already have a LXC guest created, running a systemd +instance as PID 1 inside the container, which has an SSHD service configured. +The goal is to automatically activate the container when the first SSH +connection is made. The first step is to create a couple of unit files for the +host OS systemd instance. The ``/etc/systemd/system/mycontainer.service`` unit +file specifies how systemd will start the libvirt LXC container + +:: + + [Unit] + Description=My little container + + [Service] + ExecStart=/usr/bin/virsh -c lxc:///system start --pass-fds 3 mycontainer + ExecStop=/usr/bin/virsh -c lxc:///system destroy mycontainer + Type=oneshot + RemainAfterExit=yes + KillMode=none + +The ``--pass-fds 3`` argument specifies that the file descriptor number 3 that +``virsh`` inherits from systemd, is to be passed into the container. Since +``virsh`` will exit immediately after starting the container, the +``RemainAfterExit`` and ``KillMode`` settings must be altered from their +defaults. + +Next, the ``/etc/systemd/system/mycontainer.socket`` unit file is created to get +the host systemd to listen on port 23 for TCP connections. When this unit file +is activated by the first incoming connection, it will cause the +``mycontainer.service`` unit to be activated with the FD corresponding to the +listening TCP socket passed in as FD 3. + +:: + + [Unit] + Description=The SSH socket of my little container + + [Socket] + ListenStream=23 + +Port 23 was picked here so that the container doesn't conflict with the host's +SSH which is on the normal port 22. 
That's it in terms of host side +configuration. + +Inside the container, the ``/etc/systemd/system/sshd.socket`` unit file must be +created + +:: + + [Unit] + Description=SSH Socket for Per-Connection Servers + + [Socket] + ListenStream=23 + Accept=yes + +The ``ListenStream`` value listed in this unit file, must match the value used +in the host file. When systemd in the container receives the pre-opened FD from +libvirt during container startup, it looks at the ``ListenStream`` values to +figure out which FD to give to which service. The actual service to start is +defined by a correspondingly named ``/etc/systemd/system/sshd@.service`` + +:: + + [Unit] + Description=SSH Per-Connection Server for %I + + [Service] + ExecStart=-/usr/sbin/sshd -i + StandardInput=socket + +Finally, make sure this SSH service is set to start on boot of the container, by +running the following command inside the container: + +:: + + # mkdir -p /etc/systemd/system/sockets.target.wants/ + # ln -s /etc/systemd/system/sshd.socket /etc/systemd/system/sockets.target.wants/ + +This example shows how to activate the container based on an incoming SSH +connection. If the container was also configured to have an httpd service, it +may be desirable to activate it upon either an httpd or a sshd connection +attempt. In this case, the ``mycontainer.socket`` file in the host would simply +list multiple socket ports. Inside the container a separate ``xxxxx.socket`` +file would need to be created for each service, with a corresponding +``ListenStream`` value set. + +Container security +------------------ + +sVirt SELinux +~~~~~~~~~~~~~ + +In the absence of the "user" namespace being used, containers cannot be +considered secure against exploits of the host OS. The sVirt SELinux driver +provides a way to secure containers even when the "user" namespace is not used. +The cost is that writing a policy to allow execution of arbitrary OS is not +practical. 
The SELinux sVirt policy is typically tailored to work with a simpler +application confinement use case, as provided by the "libvirt-sandbox" project. + +Auditing +~~~~~~~~ + +The LXC driver is integrated with libvirt's auditing subsystem, which causes +audit messages to be logged whenever there is an operation performed against a +container which has an impact on host resources. For example, start, stop and device +hotplug operations all log audit messages providing details about what action occurred +and any resources associated with it. There are the following 3 types of audit +messages + +- ``VIRT_MACHINE_ID`` - details of the SELinux process and image security + labels assigned to the container. +- ``VIRT_CONTROL`` - details of an action / operation performed against a + container. There are the following types of operation + + - ``op=start`` - a container has been started. Provides the machine name, + uuid and PID of the ``libvirt_lxc`` controller process + - ``op=init`` - the init process of the container has been started. Provides the + machine name, uuid and PID of the ``libvirt_lxc`` controller process and + PID of the init process (in the host PID namespace) + - ``op=stop`` - a container has been stopped. Provides the machine name, + uuid + +- ``VIRT_RESOURCE`` - details of a host resource associated with a container + action. + +Device access +~~~~~~~~~~~~~ + +All containers are launched with the CAP_MKNOD capability cleared and removed +from the bounding set. Libvirt will ensure that the /dev filesystem is +pre-populated with all devices that a container is allowed to use. In addition, +the cgroup "device" controller is configured to block read/write/mknod on all +devices except those that a container is authorized to use. 
+ +Example configurations +---------------------- + +Example config version 1 +~~~~~~~~~~~~~~~~~~~~~~~~ + +:: + + <domain type='lxc'> + <name>vm1</name> + <memory>500000</memory> + <os> + <type>exe</type> + <init>/bin/sh</init> + </os> + <vcpu>1</vcpu> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/libexec/libvirt_lxc</emulator> + <interface type='network'> + <source network='default'/> + </interface> + <console type='pty' /> + </devices> + </domain> + +In the <emulator> element, be sure you specify the correct path to libvirt_lxc, +if it does not live in /usr/libexec on your system. + +The next example assumes there is a private root filesystem (perhaps +hand-crafted using busybox, or installed from media, debootstrap, whatever) +under /opt/vm-1-root: + +:: + + <domain type='lxc'> + <name>vm1</name> + <memory>32768</memory> + <os> + <type>exe</type> + <init>/init</init> + </os> + <vcpu>1</vcpu> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/libexec/libvirt_lxc</emulator> + <filesystem type='mount'> + <source dir='/opt/vm-1-root'/> + <target dir='/'/> + </filesystem> + <interface type='network'> + <source network='default'/> + </interface> + <console type='pty' /> + </devices> + </domain> + +Altering the available capabilities +----------------------------------- + +By default the libvirt LXC driver drops some capabilities among which CAP_MKNOD. +However :since:`since 1.2.6` libvirt can be told to keep or drop some +capabilities using a domain configuration like the following: + +:: + + ... + <features> + <capabilities policy='default'> + <mknod state='on'/> + <sys_chroot state='off'/> + </capabilities> + </features> + ... + +The capabilities children elements are named after the capabilities as defined +in ``man 7 capabilities``. 
An ``off`` state tells libvirt to drop the +capability, while an ``on`` state forces libvirt to keep the capability even if +it is dropped by default. + +The ``policy`` attribute can be one of ``default``, ``allow`` or ``deny``. It +defines the default rules for capabilities: either keep the default behavior, +which drops a few selected capabilities, keep all capabilities, or drop +all capabilities. The benefit of ``allow`` and ``deny`` is that they guarantee +that all capabilities will be kept (or removed) even if new ones are added +later. + +The following example drops all capabilities except CAP_MKNOD: + +:: + + ... + <features> + <capabilities policy='deny'> + <mknod state='on'/> + </capabilities> + </features> + ... + +Note that allowing capabilities that are normally dropped by default can +seriously affect the security of the container and the host. + +Inherit namespaces +------------------ + +Libvirt allows a container to inherit namespaces from another container or process, +similar to the network namespace sharing offered by the lxc tools or docker. The following can be +used to share the required namespaces. If only one namespace is to be shared, the other +elements can be omitted. The netns option is specific to sharenet. It can be +used when an existing network namespace should be used rather than creating a new +network namespace for the container. In this case the privnet option will be +ignored. + +:: + + <domain type='lxc' xmlns:lxc='http://libvirt.org/schemas/domain/lxc/1.0'> + ... + <lxc:namespace> + <lxc:sharenet type='netns' value='red'/> + <lxc:shareuts type='name' value='container1'/> + <lxc:shareipc type='pid' value='12345'/> + </lxc:namespace> + </domain> + +The use of namespace passthrough requires libvirt >= 1.2.19 + +Container usage / management +---------------------------- + +As with any libvirt virtualization driver, LXC containers can be managed via a +wide variety of libvirt based tools. 
At the lowest level the ``virsh`` command +can be used to perform many tasks, by passing the ``-c lxc:///system`` argument. +As an alternative to repeating the URI with every command, the +``LIBVIRT_DEFAULT_URI`` environment variable can be set to ``lxc:///system``. +The examples that follow outline some common operations with virsh and LXC. For +further details about usage of virsh consult its manual page. + +Defining (saving) container configuration +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``virsh define`` command takes an XML configuration document and loads it +into libvirt, saving the configuration on disk + +:: + + # virsh -c lxc:///system define myguest.xml + +Viewing container configuration +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``virsh dumpxml`` command can be used to view the current XML configuration +of a container. By default the XML output reflects the current state of the +container. If the container is running, it is possible to explicitly request the +persistent configuration, instead of the current live configuration using the +``--inactive`` flag + +:: + + # virsh -c lxc:///system dumpxml myguest + +Starting containers +~~~~~~~~~~~~~~~~~~~ + +The ``virsh start`` command can be used to start a container from a previously +defined persistent configuration + +:: + + # virsh -c lxc:///system start myguest + +It is also possible to start so called "transient" containers, which do not +require a persistent configuration to be saved by libvirt, using the +``virsh create`` command. + +:: + + # virsh -c lxc:///system create myguest.xml + +Stopping containers +~~~~~~~~~~~~~~~~~~~ + +The ``virsh shutdown`` command can be used to request a graceful shutdown of the +container. By default this command will first attempt to send a message to the +init process via the ``/dev/initctl`` device node. If no such device node +exists, then it will send SIGTERM to PID 1 inside the container. 
+ +:: + + # virsh -c lxc:///system shutdown myguest + +If the container does not respond to the graceful shutdown request, it can be +forcibly stopped using the ``virsh destroy`` command + +:: + + # virsh -c lxc:///system destroy myguest + +Rebooting a container +~~~~~~~~~~~~~~~~~~~~~ + +The ``virsh reboot`` command can be used to request a graceful reboot of the +container. By default this command will first attempt to send a message to the +init process via the ``/dev/initctl`` device node. If no such device node +exists, then it will send SIGHUP to PID 1 inside the container. + +:: + + # virsh -c lxc:///system reboot myguest + +Undefining (deleting) a container configuration +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``virsh undefine`` command can be used to delete the persistent +configuration of a container. If the guest is currently running, this will turn +it into a "transient" guest. + +:: + + # virsh -c lxc:///system undefine myguest + +Connecting to a container console +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``virsh console`` command can be used to connect to the text console +associated with a container. + +:: + + # virsh -c lxc:///system console myguest + +If the container has been configured with multiple console devices, then the +``--devname`` argument can be used to choose the console to connect to. In LXC, +multiple consoles are named 'console0', 'console1', 'console2', etc. + +:: + + # virsh -c lxc:///system console myguest --devname console1 + +Running commands in a container +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``virsh lxc-enter-namespace`` command can be used to enter the namespaces +and security context of a container and then execute an arbitrary command. 
+ +:: + + # virsh -c lxc:///system lxc-enter-namespace myguest -- /bin/ls -al /dev + +Monitoring container utilization +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``virt-top`` command can be used to monitor the activity and resource +utilization of all containers on a host + +:: + + # virt-top -c lxc:///system + +Converting LXC container configuration +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``virsh domxml-from-native`` command can be used to convert most of the LXC +container configuration into a domain XML fragment + +:: + + # virsh -c lxc:///system domxml-from-native lxc-tools /var/lib/lxc/myguest/config + +This conversion has some limitations due to the fact that the domxml-from-native +command output has to be independent of the host. Here are a few things to take +care of before converting: + +- Replace the fstab file referenced by lxc.mount by the corresponding + lxc.mount.entry lines. +- Replace all relative sizes of tmpfs mount entries to absolute sizes. Also + make sure that tmpfs entries all have a size option (default is 50%). +- Define lxc.cgroup.memory.limit_in_bytes to properly limit the memory + available to the container. The conversion will use 64MiB as the default. diff --git a/docs/meson.build b/docs/meson.build index cfbde2a58d..fdf1714da8 100644 --- a/docs/meson.build +++ b/docs/meson.build @@ -22,7 +22,6 @@ docs_html_in_files = [ 'csharp', 'dbus', 'docs', - 'drvlxc', 'drvnodedev', 'drvopenvz', 'drvsecret', @@ -80,6 +79,7 @@ docs_rst_files = [ 'drvch', 'drvesx', 'drvhyperv', + 'drvlxc', 'drvqemu', 'errors', 'formatbackup', -- 2.35.1

Fix one cross link anchor along with the conversion. Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/drvnodedev.html.in | 383 ---------------------------------------- docs/drvnodedev.rst | 348 ++++++++++++++++++++++++++++++++++++ docs/formatdomain.rst | 3 +- docs/meson.build | 2 +- 4 files changed, 351 insertions(+), 385 deletions(-) delete mode 100644 docs/drvnodedev.html.in create mode 100644 docs/drvnodedev.rst diff --git a/docs/drvnodedev.html.in b/docs/drvnodedev.html.in deleted file mode 100644 index ee75eeb25c..0000000000 --- a/docs/drvnodedev.html.in +++ /dev/null @@ -1,383 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html> -<html xmlns="http://www.w3.org/1999/xhtml"> - <body> - <h1>Host device management</h1> - - <p> - Libvirt provides management of both physical and virtual host devices - (historically also referred to as node devices) like USB, PCI, SCSI, and - network devices. This also includes various virtualization capabilities - which the aforementioned devices provide for utilization, for example - SR-IOV, NPIV, MDEV, DRM, etc. - </p> - - <p> - The node device driver provides means to list and show details about host - devices (<code>virsh nodedev-list</code>, <code>virsh nodedev-info</code>, - and <code>virsh nodedev-dumpxml</code>), which are generic and can be used - with all devices. It also provides the means to manage virtual devices. - Persistently-defined virtual devices are only supported for mediated - devices, while transient devices are supported by both mediated devices - and NPIV (<a href="https://wiki.libvirt.org/page/NPIV_in_libvirt">more - info about NPIV)</a>). - </p> - <p> - Persistent virtual devices are managed with - <code>virsh nodedev-define</code> and <code>virsh nodedev-undefine</code>. - Persistent devices can be configured to start manually or automatically - using <code>virsh nodedev-autostart</code>. Inactive devices can be made - active with <code>virsh nodedev-start</code>. 
- </p> - <p> - Transient virtual devices are started and stopped with the commands - <code>virsh nodedev-create</code> and <code>virsh nodedev-destroy</code>. - </p> - <p> - Devices on the host system are arranged in a tree-like hierarchy, with - the root node being called <code>computer</code>. The node device driver - supports udev backend (HAL backend was removed in <code>6.8.0</code>). - </p> - - <p> - Details of the XML format of a host device can be found <a - href="formatnode.html">here</a>. Of particular interest is the - <code>capability</code> element, which describes features supported by - the device. Some specific device types are addressed in more detail - below. - </p> - <h2>Basic structure of a node device</h2> - <pre> -<device> - <name>pci_0000_00_17_0</name> - <path>/sys/devices/pci0000:00/0000:00:17.0</path> - <parent>computer</parent> - <driver> - <name>ahci</name> - </driver> - <capability type='pci'> -... - </capability> -</device></pre> - - <ul id="toc"/> - - <h2><a id="PCI">PCI host devices</a></h2> - <dl> - <dt><code>capability</code></dt> - <dd> - When used as top level element, the supported values for the - <code>type</code> attribute are <code>pci</code> and - <code>phys_function</code> (see <a href="#SRIOVCap">SR-IOV below</a>). 
- </dd> - </dl> - <pre> -<device> - <name>pci_0000_04_00_1</name> - <path>/sys/devices/pci0000:00/0000:00:06.0/0000:04:00.1</path> - <parent>pci_0000_00_06_0</parent> - <driver> - <name>igb</name> - </driver> - <capability type='pci'> - <domain>0</domain> - <bus>4</bus> - <slot>0</slot> - <function>1</function> - <product id='0x10c9'>82576 Gigabit Network Connection</product> - <vendor id='0x8086'>Intel Corporation</vendor> - <iommuGroup number='15'> - <address domain='0x0000' bus='0x04' slot='0x00' function='0x1'/> - </iommuGroup> - <numa node='0'/> - <pci-express> - <link validity='cap' port='1' speed='2.5' width='2'/> - <link validity='sta' speed='2.5' width='2'/> - </pci-express> - </capability> -</device></pre> - - <p> - The XML format for a PCI device stays the same for any further - capabilities it supports, a single nested <code><capability></code> - element will be included for each capability the device supports. - </p> - - <h3><a id="SRIOVCap">SR-IOV capability</a></h3> - <p> - Single root input/output virtualization (SR-IOV) allows sharing of the - PCIe resources by multiple virtual environments. That is achieved by - slicing up a single full-featured physical resource called physical - function (PF) into multiple devices called virtual functions (VFs) sharing - their configuration with the underlying PF. Despite the SR-IOV - specification, the amount of VFs that can be created on a PF varies among - manufacturers. - </p> - - <p> - Suppose the NIC <a href="#PCI">above</a> was also SR-IOV capable, it would - also include a nested - <code><capability></code> element enumerating all virtual - functions available on the physical device (physical port) like in the - example below. - </p> - - <pre> -<capability type='pci'> -... 
- <capability type='virt_functions' maxCount='7'> - <address domain='0x0000' bus='0x04' slot='0x10' function='0x1'/> - <address domain='0x0000' bus='0x04' slot='0x10' function='0x3'/> - <address domain='0x0000' bus='0x04' slot='0x10' function='0x5'/> - <address domain='0x0000' bus='0x04' slot='0x10' function='0x7'/> - <address domain='0x0000' bus='0x04' slot='0x11' function='0x1'/> - <address domain='0x0000' bus='0x04' slot='0x11' function='0x3'/> - <address domain='0x0000' bus='0x04' slot='0x11' function='0x5'/> - </capability> -... -</capability></pre> - <p> - A SR-IOV child device on the other hand, would then report its top level - capability type as a <code>phys_function</code> instead: - </p> - - <pre> -<device> -... - <capability type='phys_function'> - <address domain='0x0000' bus='0x04' slot='0x00' function='0x0'/> - </capability> -... -</device></pre> - - <h3><a id="MDEVCap">MDEV capability</a></h3> - <p> - A device capable of creating mediated devices will include a nested - capability <code>mdev_types</code> which enumerates all supported mdev - types on the physical device, along with the type attributes available - through sysfs. A detailed description of the XML format for the - <code>mdev_types</code> capability can be found - <a href="formatnode.html#MDEVTypesCap">here</a>. - </p> - <p> - The following example shows how we might represent an NVIDIA GPU device - that supports mediated devices. See below for <a href="#MDEV">more - information about mediated devices</a>. - </p> - -<pre> -<device> -... - <driver> - <name>nvidia</name> - </driver> - <capability type='pci'> -... - <capability type='mdev_types'> - <type id='nvidia-11'> - <name>GRID M60-0B</name> - <deviceAPI>vfio-pci</deviceAPI> - <availableInstances>16</availableInstances> - </type> - <!-- Here would come the rest of the available mdev types --> - </capability> -... 
- </capability> -</device></pre> - - <h3><a id="VPDCap">VPD capability</a></h3> - <p> - A device that exposes a PCI/PCIe VPD capability will include a nested - capability <code>vpd</code> which presents data stored in the Vital Product - Data (VPD). VPD provides a device name and a number of other standard-defined - read-only fields (change level, manufacture id, part number, serial number) and - vendor-specific read-only fields. Additionally, if a device supports it, - read-write fields (asset tag, vendor-specific fields or system fields) may - also be present. The VPD capability is optional for PCI/PCIe devices and the - set of exposed fields may vary depending on a device. The XML format follows - the binary format described in "I.3. VPD Definitions" in PCI Local Bus (2.2+) - and the identical format in PCIe 4.0+. At the time of writing, the support for - exposing this capability is only present on Linux-based systems (kernel version - v2.6.26 is the first one to expose VPD via sysfs which Libvirt relies on). - Reading the VPD contents requires root privileges, therefore, - <code>virsh nodedev-dumpxml</code> must be executed accordingly. - A description of the XML format for the <code>vpd</code> capability can - be found <a href="formatnode.html#VPDCap">here</a>. - </p> - <p> - The following example shows a VPD representation for a device that exposes the - VPD capability with read-only and read-write fields. Among other things, - the VPD of this particular device includes a unique board serial number. 
- </p> -<pre> -<device> - <name>pci_0000_42_00_0</name> - <capability type='pci'> - <class>0x020000</class> - <domain>0</domain> - <bus>66</bus> - <slot>0</slot> - <function>0</function> - <product id='0xa2d6'>MT42822 BlueField-2 integrated ConnectX-6 Dx network controller</product> - <vendor id='0x15b3'>Mellanox Technologies</vendor> - <capability type='virt_functions' maxCount='16'/> - <capability type='vpd'> - <name>BlueField-2 DPU 25GbE Dual-Port SFP56, Crypto Enabled, 16GB on-board DDR, 1GbE OOB management, Tall Bracket</name> - <fields access='readonly'> - <change_level>B1</change_level> - <manufacture_id>foobar</manufacture_id> - <part_number>MBF2H332A-AEEOT</part_number> - <serial_number>MT2113X00000</serial_number> - <vendor_field index='0'>PCIeGen4 x8</vendor_field> - <vendor_field index='2'>MBF2H332A-AEEOT</vendor_field> - <vendor_field index='3'>3c53d07eec484d8aab34dabd24fe575aa</vendor_field> - <vendor_field index='A'>MLX:MN=MLNX:CSKU=V2:UUID=V3:PCI=V0:MODL=BF2H332A</vendor_field> - </fields> - <fields access='readwrite'> - <asset_tag>fooasset</asset_tag> - <vendor_field index='0'>vendorfield0</vendor_field> - <vendor_field index='2'>vendorfield2</vendor_field> - <vendor_field index='A'>vendorfieldA</vendor_field> - <system_field index='B'>systemfieldB</system_field> - <system_field index='0'>systemfield0</system_field> - </fields> - </capability> - <iommuGroup number='65'> - <address domain='0x0000' bus='0x42' slot='0x00' function='0x0'/> - </iommuGroup> - <numa node='0'/> - <pci-express> - <link validity='cap' port='0' speed='16' width='8'/> - <link validity='sta' speed='8' width='8'/> - </pci-express> - </capability> -</device> -</pre> - - <h2><a id="MDEV">Mediated devices (MDEVs)</a></h2> - <p> - Mediated devices (<span class="since">Since 3.2.0</span>) are software - devices defining resource allocation on the backing physical device which - in turn allows the parent physical device's resources to be divided into - several mediated devices, thus 
sharing the physical device's performance - among multiple guests. Unlike SR-IOV however, where a PCIe device appears - as multiple separate PCIe devices on the host's PCI bus, mediated devices - only appear on the mdev virtual bus. Therefore, no detach/reattach - procedure from/to the host driver procedure is involved even though - mediated devices are used in a direct device assignment manner. A - detailed description of the XML format for the <code>mdev</code> - capability can be found <a href="formatnode.html#mdev">here</a>. - </p> - - <h3>Example of a mediated device</h3> - <pre> -<device> - <name>mdev_4b20d080_1b54_4048_85b3_a6a62d165c01</name> - <path>/sys/devices/pci0000:00/0000:00:02.0/4b20d080-1b54-4048-85b3-a6a62d165c01</path> - <parent>pci_0000_06_00_0</parent> - <driver> - <name>vfio_mdev</name> - </driver> - <capability type='mdev'> - <type id='nvidia-11'/> - <uuid>4b20d080-1b54-4048-85b3-a6a62d165c01</uuid> - <iommuGroup number='12'/> - </capability> -</device></pre> - - <p> - The support of mediated device's framework in libvirt's node device driver - covers the following features: - </p> - - <ul> - <li> - list available mediated devices on the host - (<span class="since">Since 3.4.0</span>) - </li> - <li> - display device details - (<span class="since">Since 3.4.0</span>) - </li> - <li> - create transient mediated devices - (<span class="since">Since 6.5.0</span>) - </li> - <li> - define persistent mediated devices - (<span class="since">Since 7.3.0</span>) - </li> - </ul> - - <p> - Because mediated devices are instantiated from vendor specific templates, - simply called 'types', information describing these types is contained - within the parent device's capabilities (see the example in <a - href="#PCI">PCI host devices</a>). To list all devices capable of - creating mediated devices, the following command can be used. 
- </p> - <pre>$ virsh nodedev-list --cap mdev_types</pre> - - <p> - To see the supported mediated device types on a specific physical device - use the following: - </p> - - <pre>$ virsh nodedev-dumpxml <device></pre> - - <p> - Before creating a mediated device, unbind the device from the respective - device driver, eg. subchannel I/O driver for a CCW device. Then bind the - device to the respective VFIO driver. For a CCW device, also unbind the - corresponding subchannel of the CCW device from the subchannel I/O driver - and then bind the subchannel (instead of the CCW device) to the vfio_ccw - driver. The below example shows the unbinding and binding steps for a CCW - device. - </p> - - <pre> -device="0.0.1234" -subchannel="0.0.0123" -echo $device > /sys/bus/ccw/devices/$device/driver/unbind -echo $subchannel > /sys/bus/css/devices/$subchannel/driver/unbind -echo $subchannel > /sys/bus/css/drivers/vfio_ccw/bind - </pre> - - <p> - To instantiate a transient mediated device, create an XML file representing the - device. See above for information about the mediated device xml format. - </p> - - <pre>$ virsh nodedev-create <xml-file> -Node device '<device-name>' created from '<xml-file>'</pre> - - <p> - If you would like to persistently define the device so that it will be - maintained across host reboots, use <code>virsh nodedev-define</code> - instead of <code>nodedev-create</code>: - </p> - - <pre>$ virsh nodedev-define <xml-file> -Node device '<device-name>' defined from '<xml-file>'</pre> - - <p> - To start an instance of this device definition, use the following command: - </p> - - <pre>$ virsh nodedev-start <device-name></pre> - <p> - Active mediated device instances can be stopped using <code>virsh - nodedev-destroy</code>, and persistent device definitions can be removed - using <code>virsh nodedev-undefine</code>. 
- </p> - - <p> - If a mediated device is defined persistently, it can also be set to be - automatically started whenever the host reboots or when the parent device - becomes available. In order to autostart a mediated device, use the - following command: - </p> - - <pre>$ virsh nodedev-autostart <device-name></pre> - </body> -</html> diff --git a/docs/drvnodedev.rst b/docs/drvnodedev.rst new file mode 100644 index 0000000000..cddb36c73b --- /dev/null +++ b/docs/drvnodedev.rst @@ -0,0 +1,348 @@ +.. role:: since + +====================== +Host device management +====================== + +.. contents:: + +Libvirt provides management of both physical and virtual host devices +(historically also referred to as node devices) like USB, PCI, SCSI, and network +devices. This also includes various virtualization capabilities which the +aforementioned devices provide for utilization, for example SR-IOV, NPIV, MDEV, +DRM, etc. + +The node device driver provides means to list and show details about host +devices (``virsh nodedev-list``, ``virsh nodedev-info``, and +``virsh nodedev-dumpxml``), which are generic and can be used with all devices. +It also provides the means to manage virtual devices. Persistently-defined +virtual devices are only supported for mediated devices, while transient devices +are supported by both mediated devices and NPIV (`more info about +NPIV) <https://wiki.libvirt.org/page/NPIV_in_libvirt>`__). + +Persistent virtual devices are managed with ``virsh nodedev-define`` and +``virsh nodedev-undefine``. Persistent devices can be configured to start +manually or automatically using ``virsh nodedev-autostart``. Inactive devices +can be made active with ``virsh nodedev-start``. + +Transient virtual devices are started and stopped with the commands +``virsh nodedev-create`` and ``virsh nodedev-destroy``. + +Devices on the host system are arranged in a tree-like hierarchy, with the root +node being called ``computer``. 
The node device driver supports udev backend +(HAL backend was removed in ``6.8.0``). + +Details of the XML format of a host device can be found +`here <formatnode.html>`__. Of particular interest is the ``capability`` +element, which describes features supported by the device. Some specific device +types are addressed in more detail below. + +Basic structure of a node device +-------------------------------- + +:: + + <device> + <name>pci_0000_00_17_0</name> + <path>/sys/devices/pci0000:00/0000:00:17.0</path> + <parent>computer</parent> + <driver> + <name>ahci</name> + </driver> + <capability type='pci'> + ... + </capability> + </device> + +PCI host devices +---------------- + +``capability`` + When used as top level element, the supported values for the ``type`` + attribute are ``pci`` and ``phys_function`` (see `SR-IOV + below <#SRIOVCap>`__). + +:: + + <device> + <name>pci_0000_04_00_1</name> + <path>/sys/devices/pci0000:00/0000:00:06.0/0000:04:00.1</path> + <parent>pci_0000_00_06_0</parent> + <driver> + <name>igb</name> + </driver> + <capability type='pci'> + <domain>0</domain> + <bus>4</bus> + <slot>0</slot> + <function>1</function> + <product id='0x10c9'>82576 Gigabit Network Connection</product> + <vendor id='0x8086'>Intel Corporation</vendor> + <iommuGroup number='15'> + <address domain='0x0000' bus='0x04' slot='0x00' function='0x1'/> + </iommuGroup> + <numa node='0'/> + <pci-express> + <link validity='cap' port='1' speed='2.5' width='2'/> + <link validity='sta' speed='2.5' width='2'/> + </pci-express> + </capability> + </device> + +The XML format for a PCI device stays the same for any further capabilities it +supports, a single nested ``<capability>`` element will be included for each +capability the device supports. + +SR-IOV capability +~~~~~~~~~~~~~~~~~ + +Single root input/output virtualization (SR-IOV) allows sharing of the PCIe +resources by multiple virtual environments. 
That is achieved by slicing up a +single full-featured physical resource called physical function (PF) into +multiple devices called virtual functions (VFs) sharing their configuration with +the underlying PF. Despite the SR-IOV specification, the amount of VFs that can +be created on a PF varies among manufacturers. + +Suppose the NIC `above <#PCI>`__ was also SR-IOV capable, it would also include +a nested ``<capability>`` element enumerating all virtual functions available on +the physical device (physical port) like in the example below. + +:: + + <capability type='pci'> + ... + <capability type='virt_functions' maxCount='7'> + <address domain='0x0000' bus='0x04' slot='0x10' function='0x1'/> + <address domain='0x0000' bus='0x04' slot='0x10' function='0x3'/> + <address domain='0x0000' bus='0x04' slot='0x10' function='0x5'/> + <address domain='0x0000' bus='0x04' slot='0x10' function='0x7'/> + <address domain='0x0000' bus='0x04' slot='0x11' function='0x1'/> + <address domain='0x0000' bus='0x04' slot='0x11' function='0x3'/> + <address domain='0x0000' bus='0x04' slot='0x11' function='0x5'/> + </capability> + ... + </capability> + +A SR-IOV child device on the other hand, would then report its top level +capability type as a ``phys_function`` instead: + +:: + + <device> + ... + <capability type='phys_function'> + <address domain='0x0000' bus='0x04' slot='0x00' function='0x0'/> + </capability> + ... + </device> + +MDEV capability +~~~~~~~~~~~~~~~ + +A device capable of creating mediated devices will include a nested capability +``mdev_types`` which enumerates all supported mdev types on the physical device, +along with the type attributes available through sysfs. A detailed description +of the XML format for the ``mdev_types`` capability can be found +`here <formatnode.html#MDEVTypesCap>`__. + +The following example shows how we might represent an NVIDIA GPU device that +supports mediated devices. See below for `more information about mediated +devices <#MDEV>`__. 
+ +:: + + <device> + ... + <driver> + <name>nvidia</name> + </driver> + <capability type='pci'> + ... + <capability type='mdev_types'> + <type id='nvidia-11'> + <name>GRID M60-0B</name> + <deviceAPI>vfio-pci</deviceAPI> + <availableInstances>16</availableInstances> + </type> + <!-- Here would come the rest of the available mdev types --> + </capability> + ... + </capability> + </device> + +VPD capability +~~~~~~~~~~~~~~ + +A device that exposes a PCI/PCIe VPD capability will include a nested capability +``vpd`` which presents data stored in the Vital Product Data (VPD). VPD provides +a device name and a number of other standard-defined read-only fields (change +level, manufacture id, part number, serial number) and vendor-specific read-only +fields. Additionally, if a device supports it, read-write fields (asset tag, +vendor-specific fields or system fields) may also be present. The VPD capability +is optional for PCI/PCIe devices and the set of exposed fields may vary +depending on a device. The XML format follows the binary format described in +"I.3. VPD Definitions" in PCI Local Bus (2.2+) and the identical format in PCIe +4.0+. At the time of writing, the support for exposing this capability is only +present on Linux-based systems (kernel version v2.6.26 is the first one to +expose VPD via sysfs which Libvirt relies on). Reading the VPD contents requires +root privileges, therefore, ``virsh nodedev-dumpxml`` must be executed +accordingly. A description of the XML format for the ``vpd`` capability can be +found `here <formatnode.html#VPDCap>`__. + +The following example shows a VPD representation for a device that exposes the +VPD capability with read-only and read-write fields. Among other things, the VPD +of this particular device includes a unique board serial number. 
+ +:: + + <device> + <name>pci_0000_42_00_0</name> + <capability type='pci'> + <class>0x020000</class> + <domain>0</domain> + <bus>66</bus> + <slot>0</slot> + <function>0</function> + <product id='0xa2d6'>MT42822 BlueField-2 integrated ConnectX-6 Dx network controller</product> + <vendor id='0x15b3'>Mellanox Technologies</vendor> + <capability type='virt_functions' maxCount='16'/> + <capability type='vpd'> + <name>BlueField-2 DPU 25GbE Dual-Port SFP56, Crypto Enabled, 16GB on-board DDR, 1GbE OOB management, Tall Bracket</name> + <fields access='readonly'> + <change_level>B1</change_level> + <manufacture_id>foobar</manufacture_id> + <part_number>MBF2H332A-AEEOT</part_number> + <serial_number>MT2113X00000</serial_number> + <vendor_field index='0'>PCIeGen4 x8</vendor_field> + <vendor_field index='2'>MBF2H332A-AEEOT</vendor_field> + <vendor_field index='3'>3c53d07eec484d8aab34dabd24fe575aa</vendor_field> + <vendor_field index='A'>MLX:MN=MLNX:CSKU=V2:UUID=V3:PCI=V0:MODL=BF2H332A</vendor_field> + </fields> + <fields access='readwrite'> + <asset_tag>fooasset</asset_tag> + <vendor_field index='0'>vendorfield0</vendor_field> + <vendor_field index='2'>vendorfield2</vendor_field> + <vendor_field index='A'>vendorfieldA</vendor_field> + <system_field index='B'>systemfieldB</system_field> + <system_field index='0'>systemfield0</system_field> + </fields> + </capability> + <iommuGroup number='65'> + <address domain='0x0000' bus='0x42' slot='0x00' function='0x0'/> + </iommuGroup> + <numa node='0'/> + <pci-express> + <link validity='cap' port='0' speed='16' width='8'/> + <link validity='sta' speed='8' width='8'/> + </pci-express> + </capability> + </device> + +Mediated devices (MDEVs) +------------------------ + +Mediated devices ( :since:`Since 3.2.0` ) are software devices defining resource +allocation on the backing physical device which in turn allows the parent +physical device's resources to be divided into several mediated devices, thus +sharing the physical device's 
performance among multiple guests. Unlike SR-IOV +however, where a PCIe device appears as multiple separate PCIe devices on the +host's PCI bus, mediated devices only appear on the mdev virtual bus. Therefore, +no detach/reattach procedure from/to the host driver procedure is involved even +though mediated devices are used in a direct device assignment manner. A +detailed description of the XML format for the ``mdev`` capability can be found +`here <formatnode.html#mdev>`__. + +Example of a mediated device +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +:: + + <device> + <name>mdev_4b20d080_1b54_4048_85b3_a6a62d165c01</name> + <path>/sys/devices/pci0000:00/0000:00:02.0/4b20d080-1b54-4048-85b3-a6a62d165c01</path> + <parent>pci_0000_06_00_0</parent> + <driver> + <name>vfio_mdev</name> + </driver> + <capability type='mdev'> + <type id='nvidia-11'/> + <uuid>4b20d080-1b54-4048-85b3-a6a62d165c01</uuid> + <iommuGroup number='12'/> + </capability> + </device> + +The support of mediated device's framework in libvirt's node device driver +covers the following features: + +- list available mediated devices on the host ( :since:`Since 3.4.0` ) +- display device details ( :since:`Since 3.4.0` ) +- create transient mediated devices ( :since:`Since 6.5.0` ) +- define persistent mediated devices ( :since:`Since 7.3.0` ) + +Because mediated devices are instantiated from vendor specific templates, simply +called 'types', information describing these types is contained within the +parent device's capabilities (see the example in `PCI host devices <#PCI>`__). +To list all devices capable of creating mediated devices, the following command +can be used. + +:: + + $ virsh nodedev-list --cap mdev_types + +To see the supported mediated device types on a specific physical device use the +following: + +:: + + $ virsh nodedev-dumpxml <device> + +Before creating a mediated device, unbind the device from the respective device +driver, eg. subchannel I/O driver for a CCW device. 
Then bind the device to the +respective VFIO driver. For a CCW device, also unbind the corresponding +subchannel of the CCW device from the subchannel I/O driver and then bind the +subchannel (instead of the CCW device) to the vfio_ccw driver. The below example +shows the unbinding and binding steps for a CCW device. + +:: + + device="0.0.1234" + subchannel="0.0.0123" + echo $device > /sys/bus/ccw/devices/$device/driver/unbind + echo $subchannel > /sys/bus/css/devices/$subchannel/driver/unbind + echo $subchannel > /sys/bus/css/drivers/vfio_ccw/bind + +To instantiate a transient mediated device, create an XML file representing the +device. See above for information about the mediated device xml format. + +:: + + $ virsh nodedev-create <xml-file> + Node device '<device-name>' created from '<xml-file>' + +If you would like to persistently define the device so that it will be +maintained across host reboots, use ``virsh nodedev-define`` instead of +``nodedev-create``: + +:: + + $ virsh nodedev-define <xml-file> + Node device '<device-name>' defined from '<xml-file>' + +To start an instance of this device definition, use the following command: + +:: + + $ virsh nodedev-start <device-name> + +Active mediated device instances can be stopped using +``virsh nodedev-destroy``, and persistent device definitions can be +removed using ``virsh nodedev-undefine``. + +If a mediated device is defined persistently, it can also be set to be +automatically started whenever the host reboots or when the parent device +becomes available. In order to autostart a mediated device, use the following +command: + +:: + + $ virsh nodedev-autostart <device-name> diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst index e492532004..4fb2e1a9f4 100644 --- a/docs/formatdomain.rst +++ b/docs/formatdomain.rst @@ -4166,7 +4166,8 @@ or: specifies the device API which determines how the host's vfio driver will expose the device to the guest. 
Currently, ``model='vfio-pci'``, ``model='vfio-ccw'`` ( :since:`Since 4.4.0` ) and ``model='vfio-ap'`` ( - :since:`Since 4.9.0` ) is supported. `MDEV <drvnodedev.html#MDEV>`__ + :since:`Since 4.9.0` ) is supported. + `MDEV <drvnodedev.html#mediated-devices-mdevs>`__ section provides more information about mediated devices as well as how to create mediated devices on the host. :since:`Since 4.6.0 (QEMU 2.12)` an optional ``display`` attribute may be used to enable or disable support diff --git a/docs/meson.build b/docs/meson.build index fdf1714da8..bf5a978b07 100644 --- a/docs/meson.build +++ b/docs/meson.build @@ -22,7 +22,6 @@ docs_html_in_files = [ 'csharp', 'dbus', 'docs', - 'drvnodedev', 'drvopenvz', 'drvsecret', 'drvtest', @@ -80,6 +79,7 @@ docs_rst_files = [ 'drvesx', 'drvhyperv', 'drvlxc', + 'drvnodedev', 'drvqemu', 'errors', 'formatbackup', -- 2.35.1

On Mon, Mar 28, 2022 at 02:10:22PM +0200, Peter Krempa wrote:
> Fix one cross link anchor along with the conversion.
>
> Signed-off-by: Peter Krempa <pkrempa@redhat.com>
> ---

Reviewed-by: Erik Skultety <eskultet@redhat.com>
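
For readers checking the converted 'MDEV capability' section, here is a minimal sketch of how the nested ``mdev_types`` capability might be read out of ``virsh nodedev-dumpxml`` output. It is not part of the patch. The sample XML is trimmed from the example in the converted page, and the helper name ``list_mdev_types`` and the returned dictionary keys are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed from the 'MDEV capability' example in drvnodedev.rst; the elided
# parts ("...") of the original example are simply left out here.
SAMPLE_XML = """
<device>
  <driver>
    <name>nvidia</name>
  </driver>
  <capability type='pci'>
    <capability type='mdev_types'>
      <type id='nvidia-11'>
        <name>GRID M60-0B</name>
        <deviceAPI>vfio-pci</deviceAPI>
        <availableInstances>16</availableInstances>
      </type>
    </capability>
  </capability>
</device>
"""


def list_mdev_types(device_xml):
    """Collect the mdev types advertised in a node device XML document.

    Hypothetical helper: walks every <capability> element and keeps the
    <type> children of those whose type attribute is 'mdev_types'.
    """
    root = ET.fromstring(device_xml)
    types = []
    for cap in root.iter("capability"):
        if cap.get("type") != "mdev_types":
            continue
        for mdev_type in cap.findall("type"):
            avail = mdev_type.findtext("availableInstances")
            types.append({
                "id": mdev_type.get("id"),
                "name": mdev_type.findtext("name"),
                "device_api": mdev_type.findtext("deviceAPI"),
                # Left as None when the element is absent.
                "available_instances": int(avail) if avail is not None else None,
            })
    return types


if __name__ == "__main__":
    for entry in list_mdev_types(SAMPLE_XML):
        print(entry)
```

In practice the XML string would presumably come from ``virsh nodedev-dumpxml <device>`` for a device reported by ``virsh nodedev-list --cap mdev_types``; how that output is captured is left out of the sketch.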

Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/drvopenvz.html.in | 123 ----------------------------------------- docs/drvopenvz.rst | 97 ++++++++++++++++++++++++++++++++ docs/meson.build | 2 +- 3 files changed, 98 insertions(+), 124 deletions(-) delete mode 100644 docs/drvopenvz.html.in create mode 100644 docs/drvopenvz.rst diff --git a/docs/drvopenvz.html.in b/docs/drvopenvz.html.in deleted file mode 100644 index 64a75e3fec..0000000000 --- a/docs/drvopenvz.html.in +++ /dev/null @@ -1,123 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html> -<html xmlns="http://www.w3.org/1999/xhtml"> - <body> - <h1>OpenVZ container driver</h1> - - <ul id="toc"></ul> - - <p> - The OpenVZ driver for libvirt allows use and management of container - based virtualization on a Linux host OS. Prior to using the OpenVZ - driver, the OpenVZ enabled kernel must be installed & booted, and the - OpenVZ userspace tools installed. The libvirt driver has been tested - with OpenVZ 3.0.22, but other 3.0.x versions should also work without - undue trouble. - </p> - - <h2><a id="project">Project Links</a></h2> - - <ul> - <li> - The <a href="https://openvz.org/">OpenVZ</a> Linux container - system - </li> - </ul> - - <h2><a id="connections">Connections to OpenVZ driver</a></h2> - - <p> - The libvirt OpenVZ driver is a single-instance privileged driver, - with a driver name of 'openvz'. Some example connection URIs for - the libvirt driver are: - </p> - -<pre> -openvz:///system (local access) -openvz+unix:///system (local access) -openvz://example.com/system (remote access, TLS/x509) -openvz+tcp://example.com/system (remote access, SASl/Kerberos) -openvz+ssh://root@example.com/system (remote access, SSH tunnelled) -</pre> - - <h2><a id="notes">Notes on bridged networking</a></h2> - - <p> - Bridged networking enables a guest domain (ie container) to have its - network interface connected directly to the host's physical LAN. 
Before - this can be used there are a couple of configuration pre-requisites for - the host OS. - </p> - - <h3><a id="host">Host network devices</a></h3> - - <p> - One or more of the physical devices must be attached to a bridge. The - process for this varies according to the operating system in use, so - for up to date notes consult the <a href="https://wiki.libvirt.org">Wiki</a> - or your operating system's networking documentation. The basic idea is - that the host OS should end up with a bridge device "br0" containing a - physical device "eth0", or a bonding device "bond0". - </p> - - <h3><a id="tools">OpenVZ tools configuration</a></h3> - - <p> - OpenVZ releases later than 3.0.23 ship with a standard network device - setup script that is able to setup bridging, named - <code>/usr/sbin/vznetaddbr</code>. For releases prior to 3.0.23, this - script must be created manually by the host OS administrator. The - simplest way is to just download the latest version of this script - from a newer OpenVZ release, or upstream source repository. Then - a generic configuration file <code>/etc/vz/vznet.conf</code> - must be created containing - </p> - -<pre> -#!/bin/bash -EXTERNAL_SCRIPT="/usr/sbin/vznetaddbr" -</pre> - - <p> - The host OS is now ready to allow bridging of guest containers, which - will work whether the container is started with libvirt, or OpenVZ - tools. - </p> - - - <h2><a id="example">Example guest domain XML configuration</a></h2> - - <p> - The current libvirt OpenVZ driver has a restriction that the - domain names must match the OpenVZ container VEID, which by - convention start at 100, and are incremented from there. The - choice of OS template to use inside the container is determined - by the <code>filesystem</code> tag, and the template source name - matches the templates known to OpenVZ tools. 
- </p> - -<pre> -<domain type='openvz' id='104'> - <name>104</name> - <uuid>86c12009-e591-a159-6e9f-91d18b85ef78</uuid> - <vcpu>3</vcpu> - <os> - <type>exe</type> - <init>/sbin/init</init> - </os> - <devices> - <filesystem type='template'> - <source name='fedora-9-i386-minimal'/> - <target dir='/'/> - </filesystem> - <interface type='bridge'> - <mac address='00:18:51:5b:ea:bf'/> - <source bridge='br0'/> - <target dev='veth101.0'/> - </interface> - </devices> -</domain> -</pre> - - </body> -</html> diff --git a/docs/drvopenvz.rst b/docs/drvopenvz.rst new file mode 100644 index 0000000000..ff6e1f994d --- /dev/null +++ b/docs/drvopenvz.rst @@ -0,0 +1,97 @@ +======================= +OpenVZ container driver +======================= + +.. contents:: + +The OpenVZ driver for libvirt allows use and management of container based +virtualization on a Linux host OS. Prior to using the OpenVZ driver, the OpenVZ +enabled kernel must be installed & booted, and the OpenVZ userspace tools +installed. The libvirt driver has been tested with OpenVZ 3.0.22, but other +3.0.x versions should also work without undue trouble. + +Project Links +------------- + +- The `OpenVZ <https://openvz.org/>`__ Linux container system + +Connections to OpenVZ driver +---------------------------- + +The libvirt OpenVZ driver is a single-instance privileged driver, with a driver +name of 'openvz'. Some example connection URIs for the libvirt driver are: + +:: + + openvz:///system (local access) + openvz+unix:///system (local access) + openvz://example.com/system (remote access, TLS/x509) + openvz+tcp://example.com/system (remote access, SASl/Kerberos) + openvz+ssh://root@example.com/system (remote access, SSH tunnelled) + +Notes on bridged networking +--------------------------- + +Bridged networking enables a guest domain (ie container) to have its network +interface connected directly to the host's physical LAN. 
Before this can be used +there are a couple of configuration pre-requisites for the host OS. + +Host network devices +~~~~~~~~~~~~~~~~~~~~ + +One or more of the physical devices must be attached to a bridge. The process +for this varies according to the operating system in use, so for up to date +notes consult the `Wiki <https://wiki.libvirt.org>`__ or your operating system's +networking documentation. The basic idea is that the host OS should end up with +a bridge device "br0" containing a physical device "eth0", or a bonding device +"bond0". + +OpenVZ tools configuration +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +OpenVZ releases later than 3.0.23 ship with a standard network device setup +script that is able to setup bridging, named ``/usr/sbin/vznetaddbr``. For +releases prior to 3.0.23, this script must be created manually by the host OS +administrator. The simplest way is to just download the latest version of this +script from a newer OpenVZ release, or upstream source repository. Then a +generic configuration file ``/etc/vz/vznet.conf`` must be created containing + +:: + + #!/bin/bash + EXTERNAL_SCRIPT="/usr/sbin/vznetaddbr" + +The host OS is now ready to allow bridging of guest containers, which will work +whether the container is started with libvirt, or OpenVZ tools. + +Example guest domain XML configuration +-------------------------------------- + +The current libvirt OpenVZ driver has a restriction that the domain names must +match the OpenVZ container VEID, which by convention start at 100, and are +incremented from there. The choice of OS template to use inside the container is +determined by the ``filesystem`` tag, and the template source name matches the +templates known to OpenVZ tools. 
+ +:: + + <domain type='openvz' id='104'> + <name>104</name> + <uuid>86c12009-e591-a159-6e9f-91d18b85ef78</uuid> + <vcpu>3</vcpu> + <os> + <type>exe</type> + <init>/sbin/init</init> + </os> + <devices> + <filesystem type='template'> + <source name='fedora-9-i386-minimal'/> + <target dir='/'/> + </filesystem> + <interface type='bridge'> + <mac address='00:18:51:5b:ea:bf'/> + <source bridge='br0'/> + <target dev='veth101.0'/> + </interface> + </devices> + </domain> diff --git a/docs/meson.build b/docs/meson.build index bf5a978b07..d936091091 100644 --- a/docs/meson.build +++ b/docs/meson.build @@ -22,7 +22,6 @@ docs_html_in_files = [ 'csharp', 'dbus', 'docs', - 'drvopenvz', 'drvsecret', 'drvtest', 'drvvbox', @@ -80,6 +79,7 @@ docs_rst_files = [ 'drvhyperv', 'drvlxc', 'drvnodedev', + 'drvopenvz', 'drvqemu', 'errors', 'formatbackup', -- 2.35.1
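
As a side note on the 'Example guest domain XML configuration' section, the restriction that domain names must match the OpenVZ container VEID could be checked before defining a domain. The following is only a sketch under stated assumptions, not something the driver or this patch provides: the function name ``check_openvz_domain`` is hypothetical, the XML is trimmed from the example in the page, and treating "name is a decimal number, conventionally 100 or higher" as the check is my reading of the paragraph rather than the driver's actual validation.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed from the example domain XML in drvopenvz.rst.
DOMAIN_XML = """
<domain type='openvz' id='104'>
  <name>104</name>
  <os>
    <type>exe</type>
    <init>/sbin/init</init>
  </os>
  <devices>
    <filesystem type='template'>
      <source name='fedora-9-i386-minimal'/>
      <target dir='/'/>
    </filesystem>
  </devices>
</domain>
"""


def check_openvz_domain(domain_xml):
    """Return a list of problems; an empty list means none were spotted.

    Hypothetical pre-flight check based on the documentation text only.
    """
    root = ET.fromstring(domain_xml)
    problems = []

    if root.get("type") != "openvz":
        problems.append("domain type is not 'openvz'")

    # The docs say the domain name must match the container VEID.
    name = (root.findtext("name") or "").strip()
    if not re.fullmatch(r"[0-9]+", name):
        problems.append("domain name %r is not a numeric VEID" % name)
    elif int(name) < 100:
        # By convention VEIDs start at 100, so this is reported as well.
        problems.append("VEID %s is below the conventional start of 100" % name)

    # The OS template is selected through the <filesystem> element.
    if root.find("devices/filesystem[@type='template']/source") is None:
        problems.append("no template filesystem source found")

    return problems


if __name__ == "__main__":
    print(check_openvz_domain(DOMAIN_XML))
```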

Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/drvsecret.html.in | 82 ------------------------------------------ docs/drvsecret.rst | 65 +++++++++++++++++++++++++++++++++ docs/meson.build | 2 +- 3 files changed, 66 insertions(+), 83 deletions(-) delete mode 100644 docs/drvsecret.html.in create mode 100644 docs/drvsecret.rst diff --git a/docs/drvsecret.html.in b/docs/drvsecret.html.in deleted file mode 100644 index 1bd7d75215..0000000000 --- a/docs/drvsecret.html.in +++ /dev/null @@ -1,82 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html> -<html xmlns="http://www.w3.org/1999/xhtml"> - <body> - <h1>Secret information management</h1> - - <p> - The secrets driver in libvirt provides a simple interface for - storing and retrieving secret information. - </p> - - <h2><a id="uris">Connections to SECRET driver</a></h2> - - <p> - The libvirt SECRET driver is a multi-instance driver, providing a single - system wide privileged driver (the "system" instance), and per-user - unprivileged drivers (the "session" instance). A connection to the secret - driver is automatically available when opening a connection to one of the - stateful primary hypervisor drivers. It is none the less also possible to - explicitly open just the secret driver, using the URI protocol "secret" - Some example connection URIs for the driver are: - </p> - -<pre> -secret:///session (local access to per-user instance) -secret+unix:///session (local access to per-user instance) - -secret:///system (local access to system instance) -secret+unix:///system (local access to system instance) -secret://example.com/system (remote access, TLS/x509) -secret+tcp://example.com/system (remote access, SASl/Kerberos) -secret+ssh://root@example.com/system (remote access, SSH tunnelled) -</pre> - - <h3><a id="uriembedded">Embedded driver</a></h3> - - <p> - Since 6.1.0 the secret driver has experimental support for operating - in an embedded mode. 
In this scenario, rather than connecting to - the libvirtd daemon, the secret driver runs in the client application - process directly. To open the driver in embedded mode the app use the - new URI path and specify a virtual root directory under which the - driver will create content. - </p> - - <pre> - secret:///embed?root=/some/dir - </pre> - - <p> - Under the specified root directory the following locations will - be used - </p> - - <pre> -/some/dir - | - +- etc - | | - | +- secrets - | - +- run - | - +- secrets - </pre> - - <p> - The application is responsible for recursively purging the contents - of this directory tree once they no longer require a connection, - though it can also be left intact for reuse when opening a future - connection. - </p> - - <p> - The range of functionality is intended to be on a par with that - seen when using the traditional system or session libvirt connections - to QEMU. Normal practice would be to open the secret driver in embedded - mode any time one of the other drivers is opened in embedded mode so - that the two drivers can interact in-process. - </p> - </body> -</html> diff --git a/docs/drvsecret.rst b/docs/drvsecret.rst new file mode 100644 index 0000000000..76a9097d2b --- /dev/null +++ b/docs/drvsecret.rst @@ -0,0 +1,65 @@ +============================= +Secret information management +============================= + +The secrets driver in libvirt provides a simple interface for storing and +retrieving secret information. + +Connections to SECRET driver +---------------------------- + +The libvirt SECRET driver is a multi-instance driver, providing a single system +wide privileged driver (the "system" instance), and per-user unprivileged +drivers (the "session" instance). A connection to the secret driver is +automatically available when opening a connection to one of the stateful primary +hypervisor drivers. 
It is none the less also possible to explicitly open just +the secret driver, using the URI protocol "secret" Some example connection URIs +for the driver are: + +:: + + secret:///session (local access to per-user instance) + secret+unix:///session (local access to per-user instance) + + secret:///system (local access to system instance) + secret+unix:///system (local access to system instance) + secret://example.com/system (remote access, TLS/x509) + secret+tcp://example.com/system (remote access, SASl/Kerberos) + secret+ssh://root@example.com/system (remote access, SSH tunnelled) + +Embedded driver +~~~~~~~~~~~~~~~ + +Since 6.1.0 the secret driver has experimental support for operating in an +embedded mode. In this scenario, rather than connecting to the libvirtd daemon, +the secret driver runs in the client application process directly. To open the +driver in embedded mode the app use the new URI path and specify a virtual root +directory under which the driver will create content. + +:: + + secret:///embed?root=/some/dir + +Under the specified root directory the following locations will be used + +:: + + /some/dir + | + +- etc + | | + | +- secrets + | + +- run + | + +- secrets + +The application is responsible for recursively purging the contents of this +directory tree once they no longer require a connection, though it can also be +left intact for reuse when opening a future connection. + +The range of functionality is intended to be on a par with that seen when using +the traditional system or session libvirt connections to QEMU. Normal practice +would be to open the secret driver in embedded mode any time one of the other +drivers is opened in embedded mode so that the two drivers can interact +in-process. 
diff --git a/docs/meson.build b/docs/meson.build index d936091091..8499b3d595 100644 --- a/docs/meson.build +++ b/docs/meson.build @@ -22,7 +22,6 @@ docs_html_in_files = [ 'csharp', 'dbus', 'docs', - 'drvsecret', 'drvtest', 'drvvbox', 'drvvirtuozzo', @@ -81,6 +80,7 @@ docs_rst_files = [ 'drvnodedev', 'drvopenvz', 'drvqemu', + 'drvsecret', 'errors', 'formatbackup', 'formatcheckpoint', -- 2.35.1
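
For readers following the converted drvsecret page, the embedded-mode URI and the directory layout it documents can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `embedded_secret_uri` and `embedded_secret_dirs` are hypothetical and not part of libvirt, and percent-encoding the root path is an assumption about how unusual characters would best be passed in the query string.

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def embedded_secret_uri(root):
    """Build the embedded-mode URI for a virtual root directory.

    Hypothetical helper: it only formats the string shown in the page,
    it does not open a libvirt connection.
    """
    # Assumption: percent-encode unusual characters but keep the slashes.
    return "secret:///embed?root=" + quote(root, safe="/")


def embedded_secret_dirs(root):
    """Return the two locations the page lists under the root directory."""
    base = PurePosixPath(root)
    return [str(base / "etc" / "secrets"), str(base / "run" / "secrets")]


if __name__ == "__main__":
    print(embedded_secret_uri("/some/dir"))
    for path in embedded_secret_dirs("/some/dir"):
        print(path)
```

An application that is done with the connection would be the one to purge these paths, as the page states; the sketch deliberately does not create or delete anything.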

The first sentence was moved up a paragraph to stop treating the first sub-heading as a page subtitle. Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/drvtest.html.in | 27 --------------------------- docs/drvtest.rst | 21 +++++++++++++++++++++ docs/meson.build | 2 +- 3 files changed, 22 insertions(+), 28 deletions(-) delete mode 100644 docs/drvtest.html.in create mode 100644 docs/drvtest.rst diff --git a/docs/drvtest.html.in b/docs/drvtest.html.in deleted file mode 100644 index 6884184e6f..0000000000 --- a/docs/drvtest.html.in +++ /dev/null @@ -1,27 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html> -<html xmlns="http://www.w3.org/1999/xhtml"> - <body> - <h1>Test "mock" driver</h1> - - <h2>Connections to Test driver</h2> - - <p> - The libvirt Test driver is a per-process fake hypervisor driver, - with a driver name of 'test'. The driver maintains all its state - in memory. It can start with a pre-configured default config, or - be given a path to an alternate config. Some example connection URIs - for the libvirt driver are: - </p> - -<pre> -test:///default (local access, default config) -test:///path/to/driver/config.xml (local access, custom config) -test+unix:///default (local access, default config, via daemon) -test://example.com/default (remote access, TLS/x509) -test+tcp://example.com/default (remote access, SASl/Kerberos) -test+ssh://root@example.com/default (remote access, SSH tunnelled) -</pre> - - </body> -</html> diff --git a/docs/drvtest.rst b/docs/drvtest.rst new file mode 100644 index 0000000000..99578a47ba --- /dev/null +++ b/docs/drvtest.rst @@ -0,0 +1,21 @@ +================== +Test "mock" driver +================== + +The libvirt ``test`` driver is a per-process fake hypervisor driver. + +Connections to Test driver +-------------------------- + +The driver maintains all its state in memory. It can start with +a pre-configured default config, or be given a path to an alternate config. 
Some +example connection URIs for the libvirt driver are: + +:: + + test:///default (local access, default config) + test:///path/to/driver/config.xml (local access, custom config) + test+unix:///default (local access, default config, via daemon) + test://example.com/default (remote access, TLS/x509) + test+tcp://example.com/default (remote access, SASl/Kerberos) + test+ssh://root@example.com/default (remote access, SSH tunnelled) diff --git a/docs/meson.build b/docs/meson.build index 8499b3d595..5995b2ec91 100644 --- a/docs/meson.build +++ b/docs/meson.build @@ -22,7 +22,6 @@ docs_html_in_files = [ 'csharp', 'dbus', 'docs', - 'drvtest', 'drvvbox', 'drvvirtuozzo', 'drvvmware', @@ -81,6 +80,7 @@ docs_rst_files = [ 'drvopenvz', 'drvqemu', 'drvsecret', + 'drvtest', 'errors', 'formatbackup', 'formatcheckpoint', -- 2.35.1
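
The connection URIs listed in these driver pages all follow the same `driver[+transport]://[user@][host]/path` shape. A minimal sketch of splitting one apart with the Python standard library is shown here; the function name `split_connection_uri` is hypothetical, and this is not how libvirt itself parses URIs, only a way to read the examples.

```python
from urllib.parse import urlsplit


def split_connection_uri(uri):
    """Split a libvirt-style connection URI into its visible parts.

    Hypothetical helper for illustration; it performs no validation of
    driver or transport names.
    """
    parts = urlsplit(uri)
    # The scheme carries the driver name and, optionally, "+transport".
    driver, _, transport = parts.scheme.partition("+")
    return {
        "driver": driver,
        "transport": transport or None,
        "user": parts.username,
        "host": parts.hostname,
        "path": parts.path,
    }


if __name__ == "__main__":
    for uri in ("test:///default", "test+ssh://root@example.com/default"):
        print(uri, split_connection_uri(uri))
```

A URI without a transport suffix and without a host, such as `test:///default`, yields `None` for transport, user and host, which matches the "local access" annotation in the listing.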

On Mon, Mar 28, 2022 at 02:10:25PM +0200, Peter Krempa wrote:
> The first sentence was moved up a paragraph to stop treating the first sub-heading as a page subtitle.
>
> Signed-off-by: Peter Krempa <pkrempa@redhat.com>
> ---

Reviewed-by: Erik Skultety <eskultet@redhat.com>

Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/drvvbox.html.in | 172 ------------------------------------------- docs/drvvbox.rst | 161 ++++++++++++++++++++++++++++++++++++++++ docs/meson.build | 2 +- 3 files changed, 162 insertions(+), 173 deletions(-) delete mode 100644 docs/drvvbox.html.in create mode 100644 docs/drvvbox.rst diff --git a/docs/drvvbox.html.in b/docs/drvvbox.html.in deleted file mode 100644 index 0c0d14fa6a..0000000000 --- a/docs/drvvbox.html.in +++ /dev/null @@ -1,172 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html> -<html xmlns="http://www.w3.org/1999/xhtml"> - <body> - <h1>VirtualBox hypervisor driver</h1> - <p> - The libvirt VirtualBox driver can manage any VirtualBox version - from version 4.0 onwards - (<span class="since">since libvirt 3.0.0</span>). - </p> - - <h2><a id="project">Project Links</a></h2> - - <ul> - <li> - The <a href="https://www.virtualbox.org/">VirtualBox</a> - hypervisor - </li> - </ul> - - <h2>Connections to VirtualBox driver</h2> - - <p> - The libvirt VirtualBox driver provides per-user drivers (the "session" instance). - The uri of the driver protocol is "vbox". Some example connection URIs for the driver are: - </p> - -<pre> -vbox:///session (local access to per-user instance) -vbox+unix:///session (local access to per-user instance) -vbox+tcp://user@example.com/session (remote access, SASl/Kerberos) -vbox+ssh://user@example.com/session (remote access, SSH tunnelled) -</pre> - - <p> - <strong>NOTE: as of libvirt 1.0.6, the VirtualBox driver will always - run inside the libvirtd daemon, instead of being built-in to the - libvirt.so library directly. This change was required due to the - fact that VirtualBox code is LGPLv2-only licensed, which is not - compatible with the libvirt.so license of LGPLv2-or-later. The - daemon will be auto-started when the first connection to VirtualBox - is requested. 
This change also means that it will not be possible - to use VirtualBox URIs on the Windows platform, until additional - work is completed to get the libvirtd daemon working there.</strong> - </p> - - <h2><a id="xmlconfig">Example domain XML config</a></h2> - -<pre> -<domain type='vbox'> - <name>vbox</name> - <uuid>4dab22b31d52d8f32516782e98ab3fa0</uuid> - - <os> - <type>hvm</type> - <boot dev='cdrom'/> - <boot dev='hd'/> - <boot dev='fd'/> - <boot dev='network'/> - </os> - - <memory>654321</memory> - <vcpu>1</vcpu> - - <features> - <pae/> - <acpi/> - <apic/> - </features> - - <devices> - <!--Set IDE controller model to PIIX4 (default PIIX3)--> - <controller type='ide' model='piix4'/> - - <controller type='scsi' index='0'/> - - <!--VirtualBox SAS Controller--> - <controller type='scsi' index='1' model='lsisas1068'/> - - <disk type='file' device='cdrom'> - <source file='/home/user/Downloads/slax-6.0.9.iso'/> - <target dev='hdc'/> - <readonly/> - </disk> - - <disk type='file' device='disk'> - <source file='/home/user/tmp/vbox.vdi'/> - <target dev='hdd'/> - </disk> - - <!--Attach to the SCSI controller (index=0, default)--> - <disk type='file' device='disk'> - <source file='/home/user/tmp/vbox2.vdi'/> - <target dev='sda' bus='scsi'/> - </disk> - - <!--Attach to the SAS controller (index=1)--> - <disk type='file' device='disk'> - <source file='/home/user/tmp/vbox3.vdi'/> - <target dev='sda' bus='scsi'/> - <address type='drive' controller='1' bus='0' target='0' unit='0'/> - </disk> - - <disk type='file' device='floppy'> - <source file='/home/user/tmp/WIN98C.IMG'/> - <target dev='fda'/> - </disk> - - <filesystem type='mount'> - <source dir='/home/user/stuff'/> - <target dir='my-shared-folder'/> - </filesystem> - - <!--BRIDGE--> - <interface type='bridge'> - <source bridge='eth0'/> - <mac address='00:16:3e:5d:c7:9e'/> - <model type='am79c973'/> - </interface> - - <!--NAT--> - <interface type='user'> - <mac address='56:16:3e:5d:c7:9e'/> - <model type='82540eM'/> - 
</interface> - - <graphics type='desktop'/> - - <!--Activate the VRDE server with a port in 3389-3689 range--> - <graphics type='rdp' autoport='yes' multiUser='yes'/> - - <sound model='sb16'/> - - <parallel type='dev'> - <source path='/dev/pts/1'/> - <target port='0'/> - </parallel> - - <parallel type='dev'> - <source path='/dev/pts/2'/> - <target port='1'/> - </parallel> - - <serial type="dev"> - <source path="/dev/ttyS0"/> - <target port="0"/> - </serial> - - <serial type="pipe"> - <source path="/tmp/serial.txt"/> - <target port="1"/> - </serial> - - <hostdev mode='subsystem' type='usb'> - <source> - <vendor id='0x1234'/> - <product id='0xbeef'/> - </source> - </hostdev> - - <hostdev mode='subsystem' type='usb'> - <source> - <vendor id='0x4321'/> - <product id='0xfeeb'/> - </source> - </hostdev> - </devices> -</domain> -</pre> - - </body> -</html> diff --git a/docs/drvvbox.rst b/docs/drvvbox.rst new file mode 100644 index 0000000000..5154280ca2 --- /dev/null +++ b/docs/drvvbox.rst @@ -0,0 +1,161 @@ +.. role:: since + +============================ +VirtualBox hypervisor driver +============================ + +The libvirt VirtualBox driver can manage any VirtualBox version from version 4.0 +onwards ( :since:`since libvirt 3.0.0` ). + +Project Links +------------- + +- The `VirtualBox <https://www.virtualbox.org/>`__ hypervisor + +Connections to VirtualBox driver +-------------------------------- + +The libvirt VirtualBox driver provides per-user drivers (the "session" +instance). The uri of the driver protocol is "vbox". 
Some example connection +URIs for the driver are: + +:: + + vbox:///session (local access to per-user instance) + vbox+unix:///session (local access to per-user instance) + vbox+tcp://user@example.com/session (remote access, SASl/Kerberos) + vbox+ssh://user@example.com/session (remote access, SSH tunnelled) + +**NOTE: as of libvirt 1.0.6, the VirtualBox driver will always run inside the +libvirtd daemon, instead of being built-in to the libvirt.so library directly. +This change was required due to the fact that VirtualBox code is LGPLv2-only +licensed, which is not compatible with the libvirt.so license of +LGPLv2-or-later. The daemon will be auto-started when the first connection to +VirtualBox is requested. This change also means that it will not be possible to +use VirtualBox URIs on the Windows platform, until additional work is completed +to get the libvirtd daemon working there.** + +Example domain XML config +------------------------- + +:: + + <domain type='vbox'> + <name>vbox</name> + <uuid>4dab22b31d52d8f32516782e98ab3fa0</uuid> + + <os> + <type>hvm</type> + <boot dev='cdrom'/> + <boot dev='hd'/> + <boot dev='fd'/> + <boot dev='network'/> + </os> + + <memory>654321</memory> + <vcpu>1</vcpu> + + <features> + <pae/> + <acpi/> + <apic/> + </features> + + <devices> + <!--Set IDE controller model to PIIX4 (default PIIX3)--> + <controller type='ide' model='piix4'/> + + <controller type='scsi' index='0'/> + + <!--VirtualBox SAS Controller--> + <controller type='scsi' index='1' model='lsisas1068'/> + + <disk type='file' device='cdrom'> + <source file='/home/user/Downloads/slax-6.0.9.iso'/> + <target dev='hdc'/> + <readonly/> + </disk> + + <disk type='file' device='disk'> + <source file='/home/user/tmp/vbox.vdi'/> + <target dev='hdd'/> + </disk> + + <!--Attach to the SCSI controller (index=0, default)--> + <disk type='file' device='disk'> + <source file='/home/user/tmp/vbox2.vdi'/> + <target dev='sda' bus='scsi'/> + </disk> + + <!--Attach to the SAS controller 
(index=1)--> + <disk type='file' device='disk'> + <source file='/home/user/tmp/vbox3.vdi'/> + <target dev='sda' bus='scsi'/> + <address type='drive' controller='1' bus='0' target='0' unit='0'/> + </disk> + + <disk type='file' device='floppy'> + <source file='/home/user/tmp/WIN98C.IMG'/> + <target dev='fda'/> + </disk> + + <filesystem type='mount'> + <source dir='/home/user/stuff'/> + <target dir='my-shared-folder'/> + </filesystem> + + <!--BRIDGE--> + <interface type='bridge'> + <source bridge='eth0'/> + <mac address='00:16:3e:5d:c7:9e'/> + <model type='am79c973'/> + </interface> + + <!--NAT--> + <interface type='user'> + <mac address='56:16:3e:5d:c7:9e'/> + <model type='82540eM'/> + </interface> + + <graphics type='desktop'/> + + <!--Activate the VRDE server with a port in 3389-3689 range--> + <graphics type='rdp' autoport='yes' multiUser='yes'/> + + <sound model='sb16'/> + + <parallel type='dev'> + <source path='/dev/pts/1'/> + <target port='0'/> + </parallel> + + <parallel type='dev'> + <source path='/dev/pts/2'/> + <target port='1'/> + </parallel> + + <serial type="dev"> + <source path="/dev/ttyS0"/> + <target port="0"/> + </serial> + + <serial type="pipe"> + <source path="/tmp/serial.txt"/> + <target port="1"/> + </serial> + + <hostdev mode='subsystem' type='usb'> + <source> + <vendor id='0x1234'/> + <product id='0xbeef'/> + </source> + </hostdev> + + <hostdev mode='subsystem' type='usb'> + <source> + <vendor id='0x4321'/> + <product id='0xfeeb'/> + </source> + </hostdev> + </devices> + </domain> diff --git a/docs/meson.build b/docs/meson.build index 5995b2ec91..954c4e4b96 100644 --- a/docs/meson.build +++ b/docs/meson.build @@ -22,7 +22,6 @@ docs_html_in_files = [ 'csharp', 'dbus', 'docs', - 'drvvbox', 'drvvirtuozzo', 'drvvmware', 'drvxen', @@ -81,6 +80,7 @@ docs_rst_files = [ 'drvqemu', 'drvsecret', 'drvtest', + 'drvvbox', 'errors', 'formatbackup', 'formatcheckpoint', -- 2.35.1

Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/drvvirtuozzo.html.in | 70 --------------------------------------- docs/drvvirtuozzo.rst | 60 +++++++++++++++++++++++++++++++++ docs/meson.build | 2 +- 3 files changed, 61 insertions(+), 71 deletions(-) delete mode 100644 docs/drvvirtuozzo.html.in create mode 100644 docs/drvvirtuozzo.rst diff --git a/docs/drvvirtuozzo.html.in b/docs/drvvirtuozzo.html.in deleted file mode 100644 index e47f72bad1..0000000000 --- a/docs/drvvirtuozzo.html.in +++ /dev/null @@ -1,70 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html> -<html xmlns="http://www.w3.org/1999/xhtml"> - <body> - <h1>Virtuozzo driver</h1> - <ul id="toc"></ul> - <p> - The libvirt vz driver can manage Virtuozzo starting from version 6.0. - </p> - - - <h2><a id="project">Project Links</a></h2> - <ul> - <li> - The <a href="https://www.virtuozzo.com/">Virtuozzo</a> Solution. - </li> - </ul> - - - <h2><a id="uri">Connections to the Virtuozzo driver</a></h2> - <p> - The libvirt Virtuozzo driver is a single-instance privileged driver, with a driver name of 'virtuozzo'. Some example connection URIs for the libvirt driver are: - </p> -<pre> -vz:///system (local access) -vz+unix:///system (local access) -vz://example.com/system (remote access, TLS/x509) -vz+tcp://example.com/system (remote access, SASl/Kerberos) -vz+ssh://root@example.com/system (remote access, SSH tunnelled) -</pre> - - <h2><a id="example">Example guest domain XML configuration</a></h2> - - <p> - Virtuozzo driver require at least one hard disk for new domains - at this time. It is used for defining directory, where VM should - be created. 
- </p> - -<pre> -<domain type='vz'> - <name>demo</name> - <uuid>54cdecad-4492-4e31-a209-33cc21d64057</uuid> - <description>some description</description> - <memory unit='KiB'>1048576</memory> - <currentMemory unit='KiB'>1048576</currentMemory> - <vcpu placement='static'>2</vcpu> - <os> - <type arch='x86_64'>hvm</type> - </os> - <clock offset='utc'/> - <on_poweroff>destroy</on_poweroff> - <on_reboot>destroy</on_reboot> - <on_crash>destroy</on_crash> - <devices> - <disk type='file' device='disk'> - <source file='/storage/vol1'/> - <target dev='hda'/> - </disk> - <video> - <model type='vga' vram='33554432' heads='1'> - <acceleration accel3d='no' accel2d='no'/> - </model> - </video> - </devices> -</domain> - -</pre> - -</body></html> diff --git a/docs/drvvirtuozzo.rst b/docs/drvvirtuozzo.rst new file mode 100644 index 0000000000..fbb6ab0e71 --- /dev/null +++ b/docs/drvvirtuozzo.rst @@ -0,0 +1,60 @@ +================ +Virtuozzo driver +================ + +The libvirt vz driver can manage Virtuozzo starting from version 6.0. + +Project Links +------------- + +- The `Virtuozzo <https://www.virtuozzo.com/>`__ Solution. + +Connections to the Virtuozzo driver +----------------------------------- + +The libvirt Virtuozzo driver is a single-instance privileged driver, with a +driver name of 'virtuozzo'. Some example connection URIs for the libvirt driver +are: + +:: + + vz:///system (local access) + vz+unix:///system (local access) + vz://example.com/system (remote access, TLS/x509) + vz+tcp://example.com/system (remote access, SASl/Kerberos) + vz+ssh://root@example.com/system (remote access, SSH tunnelled) + +Example guest domain XML configuration +-------------------------------------- + +Virtuozzo driver require at least one hard disk for new domains at this time. It +is used for defining directory, where VM should be created. 
+ +:: + + <domain type='vz'> + <name>demo</name> + <uuid>54cdecad-4492-4e31-a209-33cc21d64057</uuid> + <description>some description</description> + <memory unit='KiB'>1048576</memory> + <currentMemory unit='KiB'>1048576</currentMemory> + <vcpu placement='static'>2</vcpu> + <os> + <type arch='x86_64'>hvm</type> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>destroy</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <disk type='file' device='disk'> + <source file='/storage/vol1'/> + <target dev='hda'/> + </disk> + <video> + <model type='vga' vram='33554432' heads='1'> + <acceleration accel3d='no' accel2d='no'/> + </model> + </video> + </devices> + </domain> diff --git a/docs/meson.build b/docs/meson.build index 954c4e4b96..99732f57ba 100644 --- a/docs/meson.build +++ b/docs/meson.build @@ -22,7 +22,6 @@ docs_html_in_files = [ 'csharp', 'dbus', 'docs', - 'drvvirtuozzo', 'drvvmware', 'drvxen', 'firewall', @@ -81,6 +80,7 @@ docs_rst_files = [ 'drvsecret', 'drvtest', 'drvvbox', + 'drvvirtuozzo', 'errors', 'formatbackup', 'formatcheckpoint', -- 2.35.1
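
The Virtuozzo page notes that a new domain needs at least one hard disk. A rough pre-flight check for that requirement could look like the sketch below. The function name `has_hard_disk` and the trimmed `EXAMPLE` document are made up for illustration, and the check reflects only the sentence in the page, not the driver's actual validation. It assumes a `<disk>` element without a `device` attribute counts as a hard disk.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the example domain from the page (illustrative only).
EXAMPLE = """
<domain type='vz'>
  <name>demo</name>
  <devices>
    <disk type='file' device='disk'>
      <source file='/storage/vol1'/>
      <target dev='hda'/>
    </disk>
  </devices>
</domain>
"""


def has_hard_disk(domain_xml):
    """Return True if the domain XML defines at least one hard disk."""
    root = ET.fromstring(domain_xml)
    # Assumption: a missing 'device' attribute is treated as 'disk'.
    return any(
        disk.get("device", "disk") == "disk"
        for disk in root.findall("./devices/disk")
    )


if __name__ == "__main__":
    print(has_hard_disk(EXAMPLE))
```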

Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/drvvmware.html.in | 89 ------------------------------------------ docs/drvvmware.rst | 72 ++++++++++++++++++++++++++++++++++ docs/meson.build | 2 +- 3 files changed, 73 insertions(+), 90 deletions(-) delete mode 100644 docs/drvvmware.html.in create mode 100644 docs/drvvmware.rst diff --git a/docs/drvvmware.html.in b/docs/drvvmware.html.in deleted file mode 100644 index d581ad1d1c..0000000000 --- a/docs/drvvmware.html.in +++ /dev/null @@ -1,89 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html> -<html xmlns="http://www.w3.org/1999/xhtml"> - <body> - <h1>VMware Workstation / Player / Fusion hypervisors driver</h1> - <p> - The libvirt VMware driver should be able to manage any Workstation, - Player, Fusion version supported by the VMware VIX API. See the - compatibility list - <a href="https://www.vmware.com/support/developer/vix-api/vix110_reference/">here</a>. - </p> - <p> - This driver uses the "vmrun" utility which is distributed with the VMware VIX API. - You can download the VIX API - from <a href="https://www.vmware.com/support/developer/vix-api/">here</a>. - </p> - - <h2><a id="project">Project Links</a></h2> - - <ul> - <li> - The <a href="https://www.vmware.com/">VMware Workstation and - Player</a> hypervisors - </li> - <li> - The <a href="https://www.vmware.com/fusion">VMware Fusion</a> - hypervisor - </li> - </ul> - - <h2>Connections to VMware driver</h2> - - <p> - The libvirt VMware driver provides per-user drivers (the "session" instance). 
- Three uris are available: - </p> - <ul> - <li>"vmwareplayer" for VMware Player</li> - <li>"vmwarews" for VMware Workstation</li> - <li>"vmwarefusion" for VMware Fusion</li> - </ul> - <p> - Some example connection URIs for the driver are: - </p> - -<pre> -vmwareplayer:///session (local access to VMware Player per-user instance) -vmwarews:///session (local access to VMware Workstation per-user instance) -vmwarefusion:///session (local access to VMware Fusion per-user instance) -vmwarews+tcp://user@example.com/session (remote access to VMware Workstation, SASl/Kerberos) -vmwarews+ssh://user@example.com/session (remote access to VMware Workstation, SSH tunnelled) -</pre> - - <h2><a id="xmlconfig">Example domain XML config</a></h2> - -<pre> -<domain type='vmware'> - <name>vmware</name> - <uuid>bea92244-8885-4562-828b-3b086731c5b1</uuid> - - <os> - <type>hvm</type> - </os> - - <memory>524288</memory> - <vcpu>1</vcpu> - - <features> - <pae/> - <acpi/> - </features> - - <devices> - <disk type='file' device='disk'> - <source file='/home/user/tmp/disk.vmdk'/> - <target bus='ide' dev='hda'/> - </disk> - - <interface type='bridge'> - <target dev='/dev/vmnet1'/> - <source bridge=''/> - <mac address='00:16:3e:5d:c7:9e'/> - </interface> - </devices> -</domain> -</pre> - - </body> -</html> diff --git a/docs/drvvmware.rst b/docs/drvvmware.rst new file mode 100644 index 0000000000..6db1a78a17 --- /dev/null +++ b/docs/drvvmware.rst @@ -0,0 +1,72 @@ +======================================================= +VMware Workstation / Player / Fusion hypervisors driver +======================================================= + +The libvirt VMware driver should be able to manage any Workstation, Player, +Fusion version supported by the VMware VIX API. See the compatibility list +`here <https://www.vmware.com/support/developer/vix-api/vix110_reference/>`__. + +This driver uses the "vmrun" utility which is distributed with the VMware VIX +API. 
You can download the VIX API from +`here <https://www.vmware.com/support/developer/vix-api/>`__. + +Project Links +------------- + +- The `VMware Workstation and Player <https://www.vmware.com/>`__ hypervisors +- The `VMware Fusion <https://www.vmware.com/fusion>`__ hypervisor + +Connections to VMware driver +---------------------------- + +The libvirt VMware driver provides per-user drivers (the "session" instance). +Three uris are available: + +- "vmwareplayer" for VMware Player +- "vmwarews" for VMware Workstation +- "vmwarefusion" for VMware Fusion + +Some example connection URIs for the driver are: + +:: + + vmwareplayer:///session (local access to VMware Player per-user instance) + vmwarews:///session (local access to VMware Workstation per-user instance) + vmwarefusion:///session (local access to VMware Fusion per-user instance) + vmwarews+tcp://user@example.com/session (remote access to VMware Workstation, SASl/Kerberos) + vmwarews+ssh://user@example.com/session (remote access to VMware Workstation, SSH tunnelled) + +Example domain XML config +------------------------- + +:: + + <domain type='vmware'> + <name>vmware</name> + <uuid>bea92244-8885-4562-828b-3b086731c5b1</uuid> + + <os> + <type>hvm</type> + </os> + + <memory>524288</memory> + <vcpu>1</vcpu> + + <features> + <pae/> + <acpi/> + </features> + + <devices> + <disk type='file' device='disk'> + <source file='/home/user/tmp/disk.vmdk'/> + <target bus='ide' dev='hda'/> + </disk> + + <interface type='bridge'> + <target dev='/dev/vmnet1'/> + <source bridge=''/> + <mac address='00:16:3e:5d:c7:9e'/> + </interface> + </devices> + </domain> diff --git a/docs/meson.build b/docs/meson.build index 99732f57ba..940fbedcfa 100644 --- a/docs/meson.build +++ b/docs/meson.build @@ -22,7 +22,6 @@ docs_html_in_files = [ 'csharp', 'dbus', 'docs', - 'drvvmware', 'drvxen', 'firewall', 'format', @@ -81,6 +80,7 @@ docs_rst_files = [ 'drvtest', 'drvvbox', 'drvvirtuozzo', + 'drvvmware', 'errors', 'formatbackup', 
'formatcheckpoint', -- 2.35.1

Fix the referenced anchor in 'formatdomain.rst' right away. Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/drvxen.html.in | 358 ------------------------------------------ docs/drvxen.rst | 338 +++++++++++++++++++++++++++++++++++++++ docs/formatdomain.rst | 2 +- docs/meson.build | 2 +- 4 files changed, 340 insertions(+), 360 deletions(-) delete mode 100644 docs/drvxen.html.in create mode 100644 docs/drvxen.rst diff --git a/docs/drvxen.html.in b/docs/drvxen.html.in deleted file mode 100644 index 95be36c879..0000000000 --- a/docs/drvxen.html.in +++ /dev/null @@ -1,358 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html> -<html xmlns="http://www.w3.org/1999/xhtml"> - <body> - <h1>libxl hypervisor driver for Xen</h1> - - <ul id="toc"></ul> - - <p> - The libvirt libxl driver provides the ability to manage virtual - machines on any Xen release from 4.6.0 onwards. - </p> - - <h2><a id="project">Project Links</a></h2> - - <ul> - <li> - The <a href="https://www.xenproject.org">Xen</a> - hypervisor on Linux and Solaris hosts - </li> - </ul> - - <h2><a id="prereq">Deployment pre-requisites</a></h2> - - <p> - The libvirt libxl driver uses Xen's libxl API, also known as - libxenlight, to implement libvirt's hypervisor driver - functionality. libxl provides a consolidated interface for - managing a Xen host and its virtual machines, unlike old - versions of Xen where applications often had to communicate - with xend, xenstored, and the hypervisor itself via hypercalls. - With libxl the only pre-requisit is a properly installed Xen - host with the libxl toolstack running in a service domain - (often Domain-0). - </p> - - <h2><a id="uri">Connections to libxl driver</a></h2> - - <p> - The libvirt libxl driver is a single-instance privileged driver, - with a driver name of 'xen'. 
Some example connection URIs for - the libxl driver are: - </p> - -<pre> -xen:///system (local access, direct) -xen+unix:///system (local access, via daemon) -xen://example.com/system (remote access, TLS/x509) -xen+tcp://example.com/system (remote access, SASl/Kerberos) -xen+ssh://root@example.com/system (remote access, SSH tunnelled) -</pre> - - - <h2><a id="configFiles">Location of configuration files</a></h2> - - <p> - The libxl driver comes with sane default values. However, during its - initialization it reads a configuration file which offers system - administrator to override some of that default. The file is located - under <code>/etc/libvirt/libxl.conf</code> - </p> - - - <h2><a id="imex">Import and export of libvirt domain XML configs</a></h2> - - <p> - The libxl driver currently supports three native - config formats. The first, known as <code>xen-xm</code>, is the - original Xen virtual machine config format used by the legacy - xm/xend toolstack. The second, known as <code>xen-sxpr</code>, - is also one of the original formats that was used by xend's - legacy HTTP RPC service (<span class='removed'>removed in 5.6.0</span>) - </p> - - <p> - The third format is <code>xen-xl</code>, which is the virtual - machine config format supported by modern Xen. The <code>xen-xl</code> - format is described in the xl.cfg(5) man page. - </p> - - <h3><a id="xmlimport">Converting from XM config files to domain XML</a></h3> - - <p> - The <code>virsh domxml-from-native</code> provides a way to convert an - existing set of xl, xm, or sxpr config files to libvirt Domain XML, - which can then be used by libvirt. 
- </p> - - <pre>$ virsh -c xen:///system domxml-from-native xen-xm rhel5.cfg -<domain type='xen'> - <name>rhel5pv</name> - <uuid>8f07fe28-753f-2729-d76d-bdbd892f949a</uuid> - <memory>2560000</memory> - <currentMemory>307200</currentMemory> - <vcpu>4</vcpu> - <bootloader>/usr/bin/pygrub</bootloader> - <os> - <type arch='x86_64' machine='xenpv'>linux</type> - </os> - <clock offset='utc'/> - <on_poweroff>destroy</on_poweroff> - <on_reboot>restart</on_reboot> - <on_crash>restart</on_crash> - <devices> - <disk type='file' device='disk'> - <driver name='tap' type='aio'/> - <source file='/var/lib/xen/images/rhel5pv.img'/> - <target dev='xvda' bus='xen'/> - </disk> - <disk type='file' device='disk'> - <driver name='tap' type='qcow'/> - <source file='/root/qcow1-xen.img'/> - <target dev='xvdd' bus='xen'/> - </disk> - <interface type='bridge'> - <mac address='00:16:3e:60:36:ba'/> - <source bridge='xenbr0'/> - </interface> - <console type='pty'> - <target port='0'/> - </console> - <input type='mouse' bus='xen'/> - <graphics type='vnc' port='-1' autoport='yes' listen='0.0.0.0'/> - </devices> -</domain></pre> - - <h3><a id="xmlexport">Converting from domain XML to XM config files</a></h3> - - <p> - The <code>virsh domxml-to-native</code> provides a way to convert a - guest description using libvirt Domain XML into xl, xm, or sxpr config - format. 
- </p> - - <pre>$ virsh -c xen:///system domxml-to-native xen-xm rhel5pv.xml -name = "rhel5pv" -uuid = "8f07fe28-753f-2729-d76d-bdbd892f949a" -maxmem = 2500 -memory = 300 -vcpus = 4 -bootloader = "/usr/bin/pygrub" -kernel = "/var/lib/xen/boot_kernel.0YK-cS" -ramdisk = "/var/lib/xen/boot_ramdisk.vWgrxK" -extra = "ro root=/dev/VolGroup00/LogVol00 rhgb quiet" -on_poweroff = "destroy" -on_reboot = "restart" -on_crash = "restart" -sdl = 0 -vnc = 1 -vncunused = 1 -vnclisten = "0.0.0.0" -disk = [ "tap:aio:/var/lib/xen/images/rhel5pv.img,xvda,w", "tap:qcow:/root/qcow1-xen.img,xvdd,w" ] -vif = [ "mac=00:16:3e:60:36:ba,bridge=virbr0,script=vif-bridge,vifname=vif5.0" ]</pre> - - <h2><a id="xencommand">Pass-through of arbitrary command-line arguments - to the qemu device model</a></h2> - - <p><span class="since">Since 6.7.0</span>, the Xen driver supports passing - arbitrary command-line arguments to the qemu device model used by Xen with - the <code><xen:commandline></code> element under <code>domain</code>. - In order to use command-line pass-through, an XML namespace request must be - issued that pulls in <code>http://libvirt.org/schemas/domain/xen/1.0</code>. - With the namespace in place, it is then possible to add - <code><xen:arg></code>sub-elements to - <code><xen:commandline></code> describing each argument passed to - the device model when starting the domain. - </p> - <p>The following example illustrates passing arguments to the QEMU device - model that define a floppy drive, which Xen does not support through its - public APIs: - </p> - <pre> -<domain type="xen" xmlns:xen="http://libvirt.org/schemas/domain/xen/1.0"> - ... 
- <xen:commandline> - <xen:arg value='-drive'/> - <xen:arg value='file=/path/to/image,format=raw,if=none,id=drive-fdc0-0-0'/> - <xen:arg value='-global'/> - <xen:arg value='isa-fdc.driveA=drive-fdc0-0-0'/> - </xen:commandline> -</domain> - </pre> - - <h2><a id="xmlconfig">Example domain XML config</a></h2> - - <p> - Below are some example XML configurations for Xen guest domains. - For full details of the available options, consult the <a href="formatdomain.html">domain XML format</a> - guide. - </p> - - <h3>Paravirtualized guest bootloader</h3> - - <p> - Using a bootloader allows a paravirtualized guest to be booted using - a kernel stored inside its virtual disk image - </p> - - <pre><domain type='xen' > - <name>fc8</name> - <bootloader>/usr/bin/pygrub</bootloader> - <os> - <type>linux</type> - </os> - <memory>131072</memory> - <vcpu>1</vcpu> - <devices> - <disk type='file'> - <source file='/var/lib/xen/images/fc4.img'/> - <target dev='sda1'/> - </disk> - <interface type='bridge'> - <source bridge='xenbr0'/> - <mac address='aa:00:00:00:00:11'/> - <script path='/etc/xen/scripts/vif-bridge'/> - </interface> - <console tty='/dev/pts/5'/> - </devices> -</domain></pre> - - <h3>Paravirtualized guest direct kernel boot</h3> - - <p> - For installation of paravirtualized guests it is typical to boot the - domain using a kernel and initrd stored in the host OS - </p> - - <pre><domain type='xen' > - <name>fc8</name> - <os> - <type>linux</type> - <kernel>/var/lib/xen/install/vmlinuz-fedora8-x86_64</kernel> - <initrd>/var/lib/xen/install/initrd-vmlinuz-fedora8-x86_64</initrd> - <cmdline> kickstart=http://example.com/myguest.ks </cmdline> - </os> - <memory>131072</memory> - <vcpu>1</vcpu> - <devices> - <disk type='file'> - <source file='/var/lib/xen/images/fc4.img'/> - <target dev='sda1'/> - </disk> - <interface type='bridge'> - <source bridge='xenbr0'/> - <mac address='aa:00:00:00:00:11'/> - <script path='/etc/xen/scripts/vif-bridge'/> - </interface> - <graphics type='vnc' 
port='-1'/> - <console tty='/dev/pts/5'/> - </devices> -</domain></pre> - - <h3>Fullyvirtualized guest BIOS boot</h3> - - <p> - Fullyvirtualized guests use the emulated BIOS to boot off the primary - harddisk, CDROM or Network PXE ROM. - </p> - - <pre><domain type='xen' id='3'> - <name>fv0</name> - <uuid>4dea22b31d52d8f32516782e98ab3fa0</uuid> - <os> - <type>hvm</type> - <loader>/usr/lib/xen/boot/hvmloader</loader> - <boot dev='hd'/> - </os> - <memory>524288</memory> - <vcpu>1</vcpu> - <on_poweroff>destroy</on_poweroff> - <on_reboot>restart</on_reboot> - <on_crash>restart</on_crash> - <features> - <pae/> - <acpi/> - <apic/> - </features> - <clock sync="localtime"/> - <devices> - <emulator>/usr/lib/xen/bin/qemu-dm</emulator> - <interface type='bridge'> - <source bridge='xenbr0'/> - <mac address='00:16:3e:5d:c7:9e'/> - <script path='vif-bridge'/> - </interface> - <disk type='file'> - <source file='/var/lib/xen/images/fv0'/> - <target dev='hda'/> - </disk> - <disk type='file' device='cdrom'> - <source file='/var/lib/xen/images/fc5-x86_64-boot.iso'/> - <target dev='hdc'/> - <readonly/> - </disk> - <disk type='file' device='floppy'> - <source file='/root/fd.img'/> - <target dev='fda'/> - </disk> - <graphics type='vnc' port='5904'/> - </devices> -</domain></pre> - - <h3>Fullyvirtualized guest direct kernel boot</h3> - - <p> - With Xen 3.2.0 or later it is possible to bypass the BIOS and directly - boot a Linux kernel and initrd as a fullyvirtualized domain. This allows - for complete automation of OS installation, for example using the Anaconda - kickstart support. 
- </p> - - <pre><domain type='xen' id='3'> - <name>fv0</name> - <uuid>4dea22b31d52d8f32516782e98ab3fa0</uuid> - <os> - <type>hvm</type> - <loader>/usr/lib/xen/boot/hvmloader</loader> - <kernel>/var/lib/xen/install/vmlinuz-fedora8-x86_64</kernel> - <initrd>/var/lib/xen/install/initrd-vmlinuz-fedora8-x86_64</initrd> - <cmdline> kickstart=http://example.com/myguest.ks </cmdline> - </os> - <memory>524288</memory> - <vcpu>1</vcpu> - <on_poweroff>destroy</on_poweroff> - <on_reboot>restart</on_reboot> - <on_crash>restart</on_crash> - <features> - <pae/> - <acpi/> - <apic/> - </features> - <clock sync="localtime"/> - <devices> - <emulator>/usr/lib/xen/bin/qemu-dm</emulator> - <interface type='bridge'> - <source bridge='xenbr0'/> - <mac address='00:16:3e:5d:c7:9e'/> - <script path='vif-bridge'/> - </interface> - <disk type='file'> - <source file='/var/lib/xen/images/fv0'/> - <target dev='hda'/> - </disk> - <disk type='file' device='cdrom'> - <source file='/var/lib/xen/images/fc5-x86_64-boot.iso'/> - <target dev='hdc'/> - <readonly/> - </disk> - <disk type='file' device='floppy'> - <source file='/root/fd.img'/> - <target dev='fda'/> - </disk> - <graphics type='vnc' port='5904'/> - </devices> -</domain></pre> - - </body> -</html> diff --git a/docs/drvxen.rst b/docs/drvxen.rst new file mode 100644 index 0000000000..c131d52c7a --- /dev/null +++ b/docs/drvxen.rst @@ -0,0 +1,338 @@ +.. role:: since + +=============================== +libxl hypervisor driver for Xen +=============================== + +.. contents:: + +The libvirt libxl driver provides the ability to manage virtual machines on any +Xen release from 4.6.0 onwards. + +Project Links +------------- + +- The `Xen <https://www.xenproject.org>`__ hypervisor on Linux and Solaris + hosts + +Deployment pre-requisites +------------------------- + +The libvirt libxl driver uses Xen's libxl API, also known as libxenlight, to +implement libvirt's hypervisor driver functionality. 
libxl provides a +consolidated interface for managing a Xen host and its virtual machines, unlike +old versions of Xen where applications often had to communicate with xend, +xenstored, and the hypervisor itself via hypercalls. With libxl the only +pre-requisit is a properly installed Xen host with the libxl toolstack running +in a service domain (often Domain-0). + +Connections to libxl driver +--------------------------- + +The libvirt libxl driver is a single-instance privileged driver, with a driver +name of 'xen'. Some example connection URIs for the libxl driver are: + +:: + + xen:///system (local access, direct) + xen+unix:///system (local access, via daemon) + xen://example.com/system (remote access, TLS/x509) + xen+tcp://example.com/system (remote access, SASl/Kerberos) + xen+ssh://root@example.com/system (remote access, SSH tunnelled) + +Location of configuration files +------------------------------- + +The libxl driver comes with sane default values. However, during its +initialization it reads a configuration file which offers system administrator +to override some of that default. The file is located under +``/etc/libvirt/libxl.conf`` + +Import and export of libvirt domain XML configs +----------------------------------------------- + +The libxl driver currently supports three native config formats. The first, +known as ``xen-xm``, is the original Xen virtual machine config format used by +the legacy xm/xend toolstack. The second, known as ``xen-sxpr``, is also one of +the original formats that was used by xend's legacy HTTP RPC service ( +:since:`removed in 5.6.0` ) + +The third format is ``xen-xl``, which is the virtual machine config format +supported by modern Xen. The ``xen-xl`` format is described in the xl.cfg(5) man +page. 
+ +Converting from XM config files to domain XML +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``virsh domxml-from-native`` provides a way to convert an existing set of +xl, xm, or sxpr config files to libvirt Domain XML, which can then be used by +libvirt. + +:: + + $ virsh -c xen:///system domxml-from-native xen-xm rhel5.cfg + <domain type='xen'> + <name>rhel5pv</name> + <uuid>8f07fe28-753f-2729-d76d-bdbd892f949a</uuid> + <memory>2560000</memory> + <currentMemory>307200</currentMemory> + <vcpu>4</vcpu> + <bootloader>/usr/bin/pygrub</bootloader> + <os> + <type arch='x86_64' machine='xenpv'>linux</type> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>restart</on_crash> + <devices> + <disk type='file' device='disk'> + <driver name='tap' type='aio'/> + <source file='/var/lib/xen/images/rhel5pv.img'/> + <target dev='xvda' bus='xen'/> + </disk> + <disk type='file' device='disk'> + <driver name='tap' type='qcow'/> + <source file='/root/qcow1-xen.img'/> + <target dev='xvdd' bus='xen'/> + </disk> + <interface type='bridge'> + <mac address='00:16:3e:60:36:ba'/> + <source bridge='xenbr0'/> + </interface> + <console type='pty'> + <target port='0'/> + </console> + <input type='mouse' bus='xen'/> + <graphics type='vnc' port='-1' autoport='yes' listen='0.0.0.0'/> + </devices> + </domain> + +Converting from domain XML to XM config files +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``virsh domxml-to-native`` provides a way to convert a guest description +using libvirt Domain XML into xl, xm, or sxpr config format. 
+ +:: + + $ virsh -c xen:///system domxml-to-native xen-xm rhel5pv.xml + name = "rhel5pv" + uuid = "8f07fe28-753f-2729-d76d-bdbd892f949a" + maxmem = 2500 + memory = 300 + vcpus = 4 + bootloader = "/usr/bin/pygrub" + kernel = "/var/lib/xen/boot_kernel.0YK-cS" + ramdisk = "/var/lib/xen/boot_ramdisk.vWgrxK" + extra = "ro root=/dev/VolGroup00/LogVol00 rhgb quiet" + on_poweroff = "destroy" + on_reboot = "restart" + on_crash = "restart" + sdl = 0 + vnc = 1 + vncunused = 1 + vnclisten = "0.0.0.0" + disk = [ "tap:aio:/var/lib/xen/images/rhel5pv.img,xvda,w", "tap:qcow:/root/qcow1-xen.img,xvdd,w" ] + vif = [ "mac=00:16:3e:60:36:ba,bridge=virbr0,script=vif-bridge,vifname=vif5.0" ] + +Pass-through of arbitrary command-line arguments to the qemu device model +------------------------------------------------------------------------- + +:since:`Since 6.7.0` , the Xen driver supports passing arbitrary command-line +arguments to the qemu device model used by Xen with the ``<xen:commandline>`` +element under ``domain``. In order to use command-line pass-through, an XML +namespace request must be issued that pulls in +``http://libvirt.org/schemas/domain/xen/1.0``. With the namespace in place, it +is then possible to add ``<xen:arg>``\ sub-elements to ``<xen:commandline>`` +describing each argument passed to the device model when starting the domain. + +The following example illustrates passing arguments to the QEMU device model +that define a floppy drive, which Xen does not support through its public APIs: + +:: + + <domain type="xen" xmlns:xen="http://libvirt.org/schemas/domain/xen/1.0"> + ... + <xen:commandline> + <xen:arg value='-drive'/> + <xen:arg value='file=/path/to/image,format=raw,if=none,id=drive-fdc0-0-0'/> + <xen:arg value='-global'/> + <xen:arg value='isa-fdc.driveA=drive-fdc0-0-0'/> + </xen:commandline> + </domain> + +Example domain XML config +------------------------- + +Below are some example XML configurations for Xen guest domains. 
For full +details of the available options, consult the `domain XML +format <formatdomain.html>`__ guide. + +Paravirtualized guest bootloader +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Using a bootloader allows a paravirtualized guest to be booted using a kernel +stored inside its virtual disk image + +:: + + <domain type='xen' > + <name>fc8</name> + <bootloader>/usr/bin/pygrub</bootloader> + <os> + <type>linux</type> + </os> + <memory>131072</memory> + <vcpu>1</vcpu> + <devices> + <disk type='file'> + <source file='/var/lib/xen/images/fc4.img'/> + <target dev='sda1'/> + </disk> + <interface type='bridge'> + <source bridge='xenbr0'/> + <mac address='aa:00:00:00:00:11'/> + <script path='/etc/xen/scripts/vif-bridge'/> + </interface> + <console tty='/dev/pts/5'/> + </devices> + </domain> + +Paravirtualized guest direct kernel boot +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +For installation of paravirtualized guests it is typical to boot the domain +using a kernel and initrd stored in the host OS + +:: + + <domain type='xen' > + <name>fc8</name> + <os> + <type>linux</type> + <kernel>/var/lib/xen/install/vmlinuz-fedora8-x86_64</kernel> + <initrd>/var/lib/xen/install/initrd-vmlinuz-fedora8-x86_64</initrd> + <cmdline> kickstart=http://example.com/myguest.ks </cmdline> + </os> + <memory>131072</memory> + <vcpu>1</vcpu> + <devices> + <disk type='file'> + <source file='/var/lib/xen/images/fc4.img'/> + <target dev='sda1'/> + </disk> + <interface type='bridge'> + <source bridge='xenbr0'/> + <mac address='aa:00:00:00:00:11'/> + <script path='/etc/xen/scripts/vif-bridge'/> + </interface> + <graphics type='vnc' port='-1'/> + <console tty='/dev/pts/5'/> + </devices> + </domain> + +Fullyvirtualized guest BIOS boot +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Fullyvirtualized guests use the emulated BIOS to boot off the primary harddisk, +CDROM or Network PXE ROM. 
+ +:: + + <domain type='xen' id='3'> + <name>fv0</name> + <uuid>4dea22b31d52d8f32516782e98ab3fa0</uuid> + <os> + <type>hvm</type> + <loader>/usr/lib/xen/boot/hvmloader</loader> + <boot dev='hd'/> + </os> + <memory>524288</memory> + <vcpu>1</vcpu> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>restart</on_crash> + <features> + <pae/> + <acpi/> + <apic/> + </features> + <clock sync="localtime"/> + <devices> + <emulator>/usr/lib/xen/bin/qemu-dm</emulator> + <interface type='bridge'> + <source bridge='xenbr0'/> + <mac address='00:16:3e:5d:c7:9e'/> + <script path='vif-bridge'/> + </interface> + <disk type='file'> + <source file='/var/lib/xen/images/fv0'/> + <target dev='hda'/> + </disk> + <disk type='file' device='cdrom'> + <source file='/var/lib/xen/images/fc5-x86_64-boot.iso'/> + <target dev='hdc'/> + <readonly/> + </disk> + <disk type='file' device='floppy'> + <source file='/root/fd.img'/> + <target dev='fda'/> + </disk> + <graphics type='vnc' port='5904'/> + </devices> + </domain> + +Fullyvirtualized guest direct kernel boot +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +With Xen 3.2.0 or later it is possible to bypass the BIOS and directly boot a +Linux kernel and initrd as a fullyvirtualized domain. This allows for complete +automation of OS installation, for example using the Anaconda kickstart support. 
+ +:: + + <domain type='xen' id='3'> + <name>fv0</name> + <uuid>4dea22b31d52d8f32516782e98ab3fa0</uuid> + <os> + <type>hvm</type> + <loader>/usr/lib/xen/boot/hvmloader</loader> + <kernel>/var/lib/xen/install/vmlinuz-fedora8-x86_64</kernel> + <initrd>/var/lib/xen/install/initrd-vmlinuz-fedora8-x86_64</initrd> + <cmdline> kickstart=http://example.com/myguest.ks </cmdline> + </os> + <memory>524288</memory> + <vcpu>1</vcpu> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>restart</on_crash> + <features> + <pae/> + <acpi/> + <apic/> + </features> + <clock sync="localtime"/> + <devices> + <emulator>/usr/lib/xen/bin/qemu-dm</emulator> + <interface type='bridge'> + <source bridge='xenbr0'/> + <mac address='00:16:3e:5d:c7:9e'/> + <script path='vif-bridge'/> + </interface> + <disk type='file'> + <source file='/var/lib/xen/images/fv0'/> + <target dev='hda'/> + </disk> + <disk type='file' device='cdrom'> + <source file='/var/lib/xen/images/fc5-x86_64-boot.iso'/> + <target dev='hdc'/> + <readonly/> + </disk> + <disk type='file' device='floppy'> + <source file='/root/fd.img'/> + <target dev='fda'/> + </disk> + <graphics type='vnc' port='5904'/> + </devices> + </domain> diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst index 4fb2e1a9f4..95ace2677e 100644 --- a/docs/formatdomain.rst +++ b/docs/formatdomain.rst @@ -8352,5 +8352,5 @@ Example configs Example configurations for each driver are provide on the driver specific pages listed below -- `Xen examples <drvxen.html#xmlconfig>`__ +- `Xen examples <drvxen.html#example-domain-xml-config>`__ - `QEMU/KVM examples <drvqemu.html#example-domain-xml-config>`__ diff --git a/docs/meson.build b/docs/meson.build index 940fbedcfa..6147f85d16 100644 --- a/docs/meson.build +++ b/docs/meson.build @@ -22,7 +22,6 @@ docs_html_in_files = [ 'csharp', 'dbus', 'docs', - 'drvxen', 'firewall', 'format', 'formatcaps', @@ -81,6 +80,7 @@ docs_rst_files = [ 'drvvbox', 'drvvirtuozzo', 'drvvmware', + 'drvxen', 
'errors', 'formatbackup', 'formatcheckpoint', -- 2.35.1

On Mon, Mar 28, 2022 at 02:10:29PM +0200, Peter Krempa wrote:
Fix the referenced anchor in 'formatdomain.rst' right away.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- ...
+Fullyvirtualized guest BIOS boot +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Fullyvirtualized guests use the emulated BIOS to boot off the primary harddisk,
nitpick: even though it has nothing to do with the conversion, I think we could fix "fullyvirtualized" to "fully virtualized". (1 more occurrence a few lines later)
Reviewed-by: Erik Skultety <eskultet@redhat.com>

On Fri, Apr 01, 2022 at 15:07:16 +0200, Erik Skultety wrote:
On Mon, Mar 28, 2022 at 02:10:29PM +0200, Peter Krempa wrote:
Fix the referenced anchor in 'formatdomain.rst' right away.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- ...
+Fullyvirtualized guest BIOS boot +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Fullyvirtualized guests use the emulated BIOS to boot off the primary harddisk,
nitpick: even though it has nothing to do with the conversion, I think we could fix "fullyvirtualized" to "fully virtualized". (1 more occurrence a few lines later)
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Good catch; I'll fix it in a separate commit
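For reference, a rough sketch of what that follow-up might involve. Splitting the word makes each affected heading one character longer, and rST expects the underline to be at least as long as the title, so the adornment lines need to grow too. The helper name `fix_fullyvirtualized` and the `ADORNMENT` pattern are made up for illustration, and the sketch assumes only underline-style headings using `=`, `-` or `~`, as in the converted pages:

```python
import re

# Heading underline: one adornment character repeated across the whole line.
# (Assumption: only '=', '-' and '~' are used, as in the converted pages.)
ADORNMENT = re.compile(r'^([=\-~])\1*$')


def fix_fullyvirtualized(text: str) -> str:
    """Rewrite 'fullyvirtualized' as 'fully virtualized' and lengthen
    any heading underline that follows a title changed by the rewrite."""
    lines = text.split("\n")
    out = []
    for i, line in enumerate(lines):
        prev_old = lines[i - 1] if i > 0 else ""
        prev_new = out[-1] if out else ""
        if (ADORNMENT.match(line) and prev_old != prev_new
                and len(line) < len(prev_new)):
            # The title above grew; extend the underline to match it.
            line = line[0] * len(prev_new)
        else:
            line = line.replace("Fullyvirtualized", "Fully virtualized")
            line = line.replace("fullyvirtualized", "fully virtualized")
        out.append(line)
    return "\n".join(out)
```

Something like this could be run over the contents of docs/drvxen.rst; the result would still want a manual look, since it leaves line wrapping alone and does not handle headings with an overline.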

Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/firewall.html.in | 523 ------------------------------------------ docs/firewall.rst | 506 ++++++++++++++++++++++++++++++++++++++++ docs/meson.build | 2 +- 3 files changed, 507 insertions(+), 524 deletions(-) delete mode 100644 docs/firewall.html.in create mode 100644 docs/firewall.rst diff --git a/docs/firewall.html.in b/docs/firewall.html.in deleted file mode 100644 index 15b4f397be..0000000000 --- a/docs/firewall.html.in +++ /dev/null @@ -1,523 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html> -<html xmlns="http://www.w3.org/1999/xhtml"> - <body> - <h1 >Firewall and network filtering in libvirt</h1> - <p>There are three pieces of libvirt functionality which do network - filtering of some type. - <br /><br /> - At a high level they are: - </p> - <ul> - <li>The virtual network driver - <br /><br /> - This provides an isolated bridge device (ie no physical NICs - attached). Guest TAP devices are attached to this bridge. - Guests can talk to each other and the host, and optionally the - wider world. - <br /><br /> - </li> - <li>The QEMU driver MAC filtering - <br /><br /> - This provides a generic filtering of MAC addresses to prevent - the guest spoofing its MAC address. This is mostly obsoleted by - the next item, so won't be discussed further. - <br /><br /> - </li> - <li>The network filter driver - <br /><br /> - This provides fully configurable, arbitrary network filtering - of traffic on guest NICs. Generic rulesets are defined at the - host level to control traffic in some manner. Rules sets are - then associated with individual NICs of a guest. While not as - expressive as directly using iptables/ebtables, this can still - do nearly everything you would want to on a guest NIC filter. 
- </li> - </ul> - - <h3><a id="fw-virtual-network-driver">The virtual network driver</a> - </h3> - <p>The typical configuration for guests is to use bridging of the - physical NIC on the host to connect the guest directly to the LAN. - In RHEL6 there is also the possibility of using macvtap/sr-iov - and VEPA connectivity. None of this stuff plays nicely with wireless - NICs, since they will typically silently drop any traffic with a - MAC address that doesn't match that of the physical NIC. - </p> - <p>Thus the virtual network driver in libvirt was invented. This takes - the form of an isolated bridge device (ie one with no physical NICs - attached). The TAP devices associated with the guest NICs are attached - to the bridge device. This immediately allows guests on a single host - to talk to each other and to the host OS (modulo host IPtables rules). - </p> - <p>libvirt then uses iptables to control what further connectivity is - available. There are three configurations possible for a virtual - network at time of writing: - </p> - <ul> - <li>isolated: all off-node traffic is completely blocked</li> - <li>nat: outbound traffic to the LAN is allowed, but MASQUERADED</li> - <li>forward: outbound traffic to the LAN is allowed</li> - </ul> - <p>The latter 'forward' case requires the virtual network be on a - separate sub-net from the main LAN, and that the LAN admin has - configured routing for this subnet. In the future we intend to - add support for IP subnetting and/or proxy-arp. This allows for - the virtual network to use the same subnet as the main LAN and - should avoid need for the LAN admin to configure special routing. - </p> - <p>Libvirt will optionally also provide DHCP services to the virtual - network using DNSMASQ. In all cases, we need to allow DNS/DHCP - queries to the host OS. 
Since we can't predict whether the host - firewall setup is already allowing this, we insert 4 rules into - the head of the INPUT chain - </p> - <pre> -target prot opt in out source destination -ACCEPT udp -- virbr0 * 0.0.0.0/0 0.0.0.0/0 udp dpt:53 -ACCEPT tcp -- virbr0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:53 -ACCEPT udp -- virbr0 * 0.0.0.0/0 0.0.0.0/0 udp dpt:67 -ACCEPT tcp -- virbr0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:67</pre> - <p>Note we have restricted our rules to just the bridge associated - with the virtual network, to avoid opening undesirable holes in - the host firewall wrt the LAN/WAN. - </p> - <p>The next rules depend on the type of connectivity allowed, and go - in the main FORWARD chain: - </p> - <ul> - <li>type=isolated - <br /><br /> -Allow traffic between guests. Deny inbound. Deny outbound. - <pre> -target prot opt in out source destination -ACCEPT all -- virbr1 virbr1 0.0.0.0/0 0.0.0.0/0 -REJECT all -- * virbr1 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable -REJECT all -- virbr1 * 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable</pre> - </li> - <li>type=nat - <br /><br /> -Allow inbound related to an established connection. Allow -outbound, but only from our expected subnet. Allow traffic -between guests. Deny all other inbound. Deny all other outbound. - <pre> -target prot opt in out source destination -ACCEPT all -- * virbr0 0.0.0.0/0 192.168.122.0/24 state RELATED,ESTABLISHED -ACCEPT all -- virbr0 * 192.168.122.0/24 0.0.0.0/0 -ACCEPT all -- virbr0 virbr0 0.0.0.0/0 0.0.0.0/0 -REJECT all -- * virbr0 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable -REJECT all -- virbr0 * 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable</pre> - </li> - <li>type=routed - <br /><br /> -Allow inbound, but only to our expected subnet. Allow -outbound, but only from our expected subnet. Allow traffic -between guests. Deny all other inbound. Deny all other outbound. 
- <pre> -target prot opt in out source destination -ACCEPT all -- * virbr2 0.0.0.0/0 192.168.124.0/24 -ACCEPT all -- virbr2 * 192.168.124.0/24 0.0.0.0/0 -ACCEPT all -- virbr2 virbr2 0.0.0.0/0 0.0.0.0/0 -REJECT all -- * virbr2 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable -REJECT all -- virbr2 * 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable</pre> - </li> - <li>Finally, with type=nat, there is also an entry in the POSTROUTING -chain to apply masquerading: - <pre> -target prot opt in out source destination -MASQUERADE all -- * * 192.168.122.0/24 !192.168.122.0/24</pre> - </li> - </ul> - - <h3><a id="fw-firewalld-and-virtual-network-driver">firewalld and the virtual network driver</a> - </h3> - <p> - If <a href="https://firewalld.org">firewalld</a> is active on - the host, libvirt will attempt to place the bridge interface of - a libvirt virtual network into the firewalld zone named - "libvirt" (thus making all guest->host traffic on that network - subject to the rules of the "libvirt" zone). This is done - because, if firewalld is using its nftables backend (available - since firewalld 0.6.0) the default firewalld zone (which would - be used if libvirt didn't explicitly set the zone) prevents - forwarding traffic from guests through the bridge, as well as - preventing DHCP, DNS, and most other traffic from guests to - host. The zone named "libvirt" is installed into the firewalld - configuration by libvirt (not by firewalld), and allows - forwarded traffic through the bridge as well as DHCP, DNS, TFTP, - and SSH traffic to the host - depending on firewalld's backend - this will be implemented via either iptables or nftables - rules. libvirt's own rules outlined above will *always* be - iptables rules regardless of which backend is in use by - firewalld. - </p> - <p> - NB: It is possible to manually set the firewalld zone for a - network's interface with the "zone" attribute of the network's - "bridge" element. 
- </p> - <p> - NB: Prior to libvirt 5.1.0, the firewalld "libvirt" zone did not - exist, and prior to firewalld 0.7.0 a feature crucial to making - the "libvirt" zone operate properly (rich rule priority - settings) was not implemented in firewalld. In cases where one - or the other of the two packages is missing the necessary - functionality, it's still possible to have functional guest - networking by setting the firewalld backend to "iptables" (in - firewalld prior to 0.6.0, this was the only backend available). - </p> - - <h3><a id="fw-network-filter-driver">The network filter driver</a> - </h3> - <p>This driver provides a fully configurable network filtering capability - that leverages ebtables, iptables and ip6tables. This was written by - the libvirt guys at IBM and although its XML schema is defined by libvirt, - the conceptual model is closely aligned with the DMTF CIM schema for - network filtering: - </p> - <p><a href="https://www.dmtf.org/standards/cim/cim_schema_v2230/CIM_Network.pdf">https://www.dmtf.org/standards/cim/cim_schema_v2230/CIM_Network.pdf</a></p> - <p>The filters are managed in libvirt as a top level, standalone object. - This allows the filters to then be referenced by any libvirt object - that requires their functionality, instead tying them only to use - by guest NICs. In the current implementation, filters can be associated - with individual guest NICs via the libvirt domain XML format. In the - future we might allow filters to be associated with the virtual network - objects. Further we're expecting to define a new 'virtual switch' object - to remove the complexity of configuring bridge/sriov/vepa networking - modes. This make also end up making use of network filters. 
- </p> - <p>There are a new set of virsh commands for managing network filters:</p> - <ul> - <li>virsh nwfilter-define - <br /><br /> - define or update a network filter from an XML file - <br /><br /> - </li> - <li>virsh nwfilter-undefine - <br /><br /> - undefine a network filter - <br /><br /> - </li> - <li>virsh nwfilter-dumpxml - <br /><br /> - network filter information in XML - <br /><br /> - </li> - <li>virsh nwfilter-list - <br /><br /> - list network filters - <br /><br /> - </li> - <li>virsh nwfilter-edit - <br /><br /> - edit XML configuration for a network filter - </li> - </ul> - <p>There are equivalently named C APIs for each of these commands.</p> - <p>As with all objects libvirt manages, network filters are configured -using an XML format. At a high level the format looks like this: - </p> -<pre> -<filter name='no-spamming' chain='XXXX'> - <uuid>d217f2d7-5a04-0e01-8b98-ec2743436b74</uuid> - - <rule ...> - .... - </rule> - - <filterref filter='XXXX'/> -</filter></pre> - <p>Every filter has a name and UUID which serve as unique identifiers. - A filter can have zero-or-more <code><rule></code> elements which - are used to actually define network controls. Filters can be arranged - into a DAG, so zero-or-more <code><filterref/></code> elements are - also allowed. Cycles in the graph are not allowed. - </p> - <p>The <code><rule></code> element is where all the interesting stuff - happens. It has three attributes, an action, a traffic direction and an - optional priority. E.g.: - </p> - <pre><rule action='drop' direction='out' priority='500'></pre> - <p>Within the rule there are a wide variety of elements allowed, which - do protocol specific matching. 
Supported protocols currently include - <code>mac</code>, <code>arp</code>, <code>rarp</code>, <code>ip</code>, - <code>ipv6</code>, <code>tcp/ip</code>, <code>icmp/ip</code>, - <code>igmp/ip</code>, <code>udp/ip</code>, <code>udplite/ip</code>, - <code>esp/ip</code>, <code>ah/ip</code>, <code>sctp/ip</code>, - <code>tcp/ipv6</code>, <code>icmp/ipv6</code>, <code>igmp/ipv6</code>, - <code>udp/ipv6</code>, <code>udplite/ipv6</code>, <code>esp/ipv6</code>, - <code>ah/ipv6</code>, <code>sctp/ipv6</code>. Each protocol defines what - is valid inside the <rule> element. The general pattern though is: - </p> - <pre> -<protocol match='yes|no' attribute1='value1' attribute2='value2'/></pre> - <p>So, eg a TCP protocol, matching ports 0-1023 would be expressed as:</p> - <pre><tcp match='yes' srcportstart='0' srcportend='1023'/></pre> - <p>Attributes can included references to variables defined by the - object using the rule. So the guest XML format allows each NIC - to have a MAC address and IP address defined. These are made - available to filters via the variables <code><b>$IP</b></code> and - <code><b>$MAC</b></code>. - </p> - <p>So to define a filter that prevents IP address spoofing we can - simply match on source IP address <code>!= $IP</code> like this: - </p> - <pre> -<filter name='no-ip-spoofing' chain='ipv4'> - <rule action='drop' direction='out'> - <ip match='no' srcipaddr='<b>$IP</b>' /> - </rule> -</filter></pre> - <p>I'm not going to go into details on all the other protocol - matches you can do, because it'll take far too much space. - You can read about the options - <a href="formatnwfilter.html#nwfelemsRulesProto">here</a>. 
- </p> - <p>Out of the box in RHEL6/Fedora rawhide, libvirt ships with a - set of default useful rules: - </p> - <pre> -# virsh nwfilter-list -UUID Name ----------------------------------------------------------------- -15b1ab2b-b1ac-1be2-ed49-2042caba4abb allow-arp -6c51a466-8d14-6d11-46b0-68b1a883d00f allow-dhcp -7517ad6c-bd90-37c8-26c9-4eabcb69848d allow-dhcp-server -7680776c-77aa-496f-90d6-13097664b925 allow-dhcpv6 -9cdaad60-7631-4172-8ccb-ef774be7485b allow-dhcpv6-server -3d38b406-7cf0-8335-f5ff-4b9add35f288 allow-incoming-ipv4 -908543c1-902e-45f6-a6ca-1a0ad35e7599 allow-incoming-ipv6 -5ff06320-9228-2899-3db0-e32554933415 allow-ipv4 -ce8904cc-ad3a-4454-896c-53452882f817 allow-ipv6 -db0b1767-d62b-269b-ea96-0cc8b451144e clean-traffic -6d6ddcc8-1242-4c43-ac63-63af80493132 clean-traffic-gateway -4cf38077-c7d5-4e25-99bb-6c4c9efad294 no-arp-ip-spoofing -0b11a636-ce58-497f-be90-17f63c92487a no-arp-mac-spoofing -f88f1932-debf-4aa1-9fbe-f10d3aa4bc95 no-arp-spoofing -772f112d-52e4-700c-0250-e178a3d91a7a no-ip-multicast -7ee20370-8106-765d-f7ff-8a60d5aaf30b no-ip-spoofing -f8a51c43-a08f-49b3-b9e2-393d54522dc0 no-ipv6-multicast -a7f0afe9-a428-44b8-8566-c8ee2a669271 no-ipv6-spoofing -d5d3c490-c2eb-68b1-24fc-3ee362fc8af3 no-mac-broadcast -fb57c546-76dc-a372-513f-e8179011b48a no-mac-spoofing -dba10ea7-446d-76de-346f-335bd99c1d05 no-other-l2-traffic -f5c78134-9da4-0c60-a9f0-fb37bc21ac1f no-other-rarp-traffic -7637e405-4ccf-42ac-5b41-14f8d03d8cf3 qemu-announce-self -9aed52e7-f0f3-343e-fe5c-7dcb27b594e5 qemu-announce-self-rarp</pre> - <p>Most of these are just building blocks. The interesting one here - is 'clean-traffic'. This pulls together all the building blocks - into one filter that you can then associate with a guest NIC. - This stops the most common bad things a guest might try, IP - spoofing, arp spoofing and MAC spoofing. 
To look at the rules for - any of these just do: - </p> - <pre>virsh nwfilter-dumpxml FILTERNAME|UUID</pre> - <p>They are all stored in <code>/etc/libvirt/nwfilter</code>, but don't - edit the files there directly. Use <code>virsh nwfilter-define</code> - to update them. This ensures the guests have their iptables/ebtables - rules recreated. - </p> - <p>To associate the clean-traffic filter with a guest, edit the - guest XML config and change the <code><interface></code> element - to include a <code><filterref></code> and also specify the - <code><ip address/></code> that the guest is allowed to - use: - </p> - <pre> -<interface type='bridge'> - <mac address='52:54:00:56:44:32'/> - <source bridge='br1'/> - <ip address='10.33.8.131'/> - <target dev='vnet0'/> - <model type='virtio'/> - <filterref filter='clean-traffic'/> -</interface></pre> - <p>If no <code><ip address></code> is included, the network filter - driver will activate its 'learning mode'. This uses libpcap to snoop on - network traffic the guest sends and attempts to identify the - first IP address it uses. It then locks traffic to this address. - Obviously this isn't entirely secure, but it does offer some - protection against the guest being trojaned once up and running. - In the future we intend to enhance the learning mode so that it - looks for DHCPOFFERS from a trusted DHCP server and only allows - the offered IP address to be used. - </p> - <p>Now, how is all this implemented...?</p> - <p>The network filter driver uses a combination of ebtables, iptables and - ip6tables, depending on which protocols are referenced in a filter. The - out of the box 'clean-traffic' filter rules only require use of - ebtables. If you want to do matching at tcp/udp/etc protocols (eg to add - a new filter 'no-email-spamming' to block port 25), then iptables will - also be used. - </p> - <p>The driver attempts to keep its rules separate from those that - the host admin might already have configured. 
So the first thing - it does with ebtables, is to add two hooks in POSTROUTING and - PREROUTING chains, to redirect traffic to custom chains. These - hooks match on the TAP device name of the guest NIC, so they - should not interact badly with any administrator defined rules: - </p> - <pre> -Bridge chain: PREROUTING, entries: 1, policy: ACCEPT --i vnet0 -j libvirt-I-vnet0 - -Bridge chain: POSTROUTING, entries: 1, policy: ACCEPT --o vnet0 -j libvirt-O-vnet0</pre> - <p>To keep things manageable and easy to follow, the driver will then - create further sub-chains for each protocol then it needs to match - against: - </p> - <pre> -Bridge chain: libvirt-I-vnet0, entries: 5, policy: ACCEPT --p IPv4 -j I-vnet0-ipv4 --p ARP -j I-vnet0-arp --p 0x8035 -j I-vnet0-rarp --p 0x835 -j ACCEPT --j DROP - -Bridge chain: libvirt-O-vnet0, entries: 4, policy: ACCEPT --p IPv4 -j O-vnet0-ipv4 --p ARP -j O-vnet0-arp --p 0x8035 -j O-vnet0-rarp --j DROP</pre> - <p>Finally, here comes the actual implementation of the filters. This - example shows the 'clean-traffic' filter implementation. - I'm not going to explain what this is doing now. :-) - </p> - <pre> -Bridge chain: I-vnet0-ipv4, entries: 2, policy: ACCEPT --s ! 52:54:0:56:44:32 -j DROP --p IPv4 --ip-src ! 10.33.8.131 -j DROP - -Bridge chain: O-vnet0-ipv4, entries: 1, policy: ACCEPT --j ACCEPT - -Bridge chain: I-vnet0-arp, entries: 6, policy: ACCEPT --s ! 52:54:0:56:44:32 -j DROP --p ARP --arp-mac-src ! 52:54:0:56:44:32 -j DROP --p ARP --arp-ip-src ! 10.33.8.131 -j DROP --p ARP --arp-op Request -j ACCEPT --p ARP --arp-op Reply -j ACCEPT --j DROP - -Bridge chain: O-vnet0-arp, entries: 5, policy: ACCEPT --p ARP --arp-op Reply --arp-mac-dst ! 52:54:0:56:44:32 -j DROP --p ARP --arp-ip-dst ! 
10.33.8.131 -j DROP --p ARP --arp-op Request -j ACCEPT --p ARP --arp-op Reply -j ACCEPT --j DROP - -Bridge chain: I-vnet0-rarp, entries: 2, policy: ACCEPT --p 0x8035 -s 52:54:0:56:44:32 -d Broadcast --arp-op Request_Reverse --arp-ip-src 0.0.0.0 --arp-ip-dst 0.0.0.0 --arp-mac-src 52:54:0:56:44:32 --arp-mac-dst 52:54:0:56:44:32 -j ACCEPT --j DROP - -Bridge chain: O-vnet0-rarp, entries: 2, policy: ACCEPT --p 0x8035 -d Broadcast --arp-op Request_Reverse --arp-ip-src 0.0.0.0 --arp-ip-dst 0.0.0.0 --arp-mac-src 52:54:0:56:44:32 --arp-mac-dst 52:54:0:56:44:32 -j ACCEPT --j DROP</pre> - <p>NB, we would have liked to include the prefix 'libvirt-' in all - of our chain names, but unfortunately the kernel limits names - to a very short maximum length. So only the first two custom - chains can include that prefix. The others just include the - TAP device name + protocol name. - </p> - <p>If I define a new filter 'no-spamming' and then add this to the - 'clean-traffic' filter, I can illustrate how iptables usage works: - </p> - <pre> -# cat > /root/spamming.xml <<EOF -<filter name='no-spamming' chain='root'> - <uuid>d217f2d7-5a04-0e01-8b98-ec2743436b74</uuid> - <rule action='drop' direction='out' priority='500'> - <tcp dstportstart='25' dstportend='25'/> - </rule> -</filter> -EOF -# virsh nwfilter-define /root/spamming.xml -# virsh nwfilter-edit clean-traffic</pre> - - <p>...add <code><filterref filter='no-spamming'/></code></p> - <p>All active guests immediately have their iptables/ebtables rules - rebuilt. - </p> - <p>The network filter driver deals with iptables in a very similar - way. 
First it separates out its rules from those the admin may - have defined, by adding a couple of hooks into the INPUT/FORWARD - chains: - </p> - <pre> -Chain INPUT (policy ACCEPT 13M packets, 21G bytes) -target prot opt in out source destination -libvirt-host-in all -- * * 0.0.0.0/0 0.0.0.0/0 - -Chain FORWARD (policy ACCEPT 5532K packets, 3010M bytes) -target prot opt in out source destination -libvirt-in all -- * * 0.0.0.0/0 0.0.0.0/0 -libvirt-out all -- * * 0.0.0.0/0 0.0.0.0/0 -libvirt-in-post all -- * * 0.0.0.0/0 0.0.0.0/0</pre> - <p>These custom chains then do matching based on the TAP device - name, so they won't open holes in the admin defined matches for - the LAN/WAN (if any). - </p> - <pre> -Chain libvirt-host-in (1 references) - target prot opt in out source destination - HI-vnet0 all -- * * 0.0.0.0/0 0.0.0.0/0 [goto] PHYSDEV match --physdev-in vnet0 - -Chain libvirt-in (1 references) - target prot opt in out source destination - FI-vnet0 all -- * * 0.0.0.0/0 0.0.0.0/0 [goto] PHYSDEV match --physdev-in vnet0 - -Chain libvirt-in-post (1 references) - target prot opt in out source destination - ACCEPT all -- * * 0.0.0.0/0 0.0.0.0/0 PHYSDEV match --physdev-in vnet0 - -Chain libvirt-out (1 references) - target prot opt in out source destination - FO-vnet0 all -- * * 0.0.0.0/0 0.0.0.0/0 [goto] PHYSDEV match --physdev-out vnet0</pre> - <p>Finally, we can see the interesting bit which is the actual - implementation of my filter to block port 25 access: - </p> - <pre> -Chain FI-vnet0 (1 references) - target prot opt in out source destination - DROP tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:25 - -Chain FO-vnet0 (1 references) - target prot opt in out source destination - DROP tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp spt:25 - -Chain HI-vnet0 (1 references) - target prot opt in out source destination - DROP tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:25</pre> - <p>One thing in looking at this you may notice is that if there - are many guests all using the same filters, we will be 
duplicating - the iptables rules over and over for each guest. This is merely a - limitation of the current rules engine implementation. At the libvirt - object modelling level you can clearly see we've designed the model - so filter rules are defined in one place, and indirectly referenced - by guests. Thus it should be possible to change the implementation in - the future so we can share the actual iptables/ebtables rules for - each guest to create a more scalable system. The stuff in current libvirt - is more or less the very first working implementation we've had of this, - so there's not been much optimization work done yet. - </p> - <p>Also notice that at the XML level we don't expose the fact we - are using iptables or ebtables at all. The rule definition is done in - terms of network protocols. Thus if we ever find a need, we could - plug in an alternative implementation that calls out to a different - firewall implementation instead of ebtables/iptables (providing that - implementation was suitably expressive of course) - </p> - <p>Finally, in terms of problems we have in deployment. The biggest - problem is that if the admin does <code>service iptables restart</code> - all our work gets blown away. We've experimented with using lokkit - to record our custom rules in a persistent config file, but that - caused different problem. Admins who were not using lokkit for - their config found that all their own rules got blown away. So - we threw away our lokkit code. Instead we document that if you - run <code>service iptables restart</code>, you need to send SIGHUP to - libvirt to make it recreate its rules. 
- </p> - <p>More in depth documentation on this is <a href="formatnwfilter.html">here</a>.</p> - </body> -</html> diff --git a/docs/firewall.rst b/docs/firewall.rst new file mode 100644 index 0000000000..adda0ef1f4 --- /dev/null +++ b/docs/firewall.rst @@ -0,0 +1,506 @@ +========================================= +Firewall and network filtering in libvirt +========================================= + +.. contents:: + +There are three pieces of libvirt functionality which do network filtering of +some type. At a high level they are: + +- The virtual network driver + + This provides an isolated bridge device (ie no physical NICs attached). + Guest TAP devices are attached to this bridge. Guests can talk to each + other and the host, and optionally the wider world. + +- The QEMU driver MAC filtering + + This provides a generic filtering of MAC addresses to prevent the guest + spoofing its MAC address. This is mostly obsoleted by the next item, so + won't be discussed further. + +- The network filter driver + + This provides fully configurable, arbitrary network filtering of traffic on + guest NICs. Generic rulesets are defined at the host level to control + traffic in some manner. Rules sets are then associated with individual NICs + of a guest. While not as expressive as directly using iptables/ebtables, + this can still do nearly everything you would want to on a guest NIC + filter. + +The virtual network driver +-------------------------- + +The typical configuration for guests is to use bridging of the physical NIC on +the host to connect the guest directly to the LAN. In RHEL6 there is also the +possibility of using macvtap/sr-iov and VEPA connectivity. None of this stuff +plays nicely with wireless NICs, since they will typically silently drop any +traffic with a MAC address that doesn't match that of the physical NIC. + +Thus the virtual network driver in libvirt was invented. 
This takes the form of +an isolated bridge device (ie one with no physical NICs attached). The TAP +devices associated with the guest NICs are attached to the bridge device. This +immediately allows guests on a single host to talk to each other and to the host +OS (modulo host IPtables rules). + +libvirt then uses iptables to control what further connectivity is available. +There are three configurations possible for a virtual network at time of +writing: + +- isolated: all off-node traffic is completely blocked +- nat: outbound traffic to the LAN is allowed, but MASQUERADED +- forward: outbound traffic to the LAN is allowed + +The latter 'forward' case requires the virtual network be on a separate sub-net +from the main LAN, and that the LAN admin has configured routing for this +subnet. In the future we intend to add support for IP subnetting and/or +proxy-arp. This allows for the virtual network to use the same subnet as the +main LAN and should avoid need for the LAN admin to configure special routing. + +Libvirt will optionally also provide DHCP services to the virtual network using +DNSMASQ. In all cases, we need to allow DNS/DHCP queries to the host OS. Since +we can't predict whether the host firewall setup is already allowing this, we +insert 4 rules into the head of the INPUT chain + +:: + + target prot opt in out source destination + ACCEPT udp -- virbr0 * 0.0.0.0/0 0.0.0.0/0 udp dpt:53 + ACCEPT tcp -- virbr0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:53 + ACCEPT udp -- virbr0 * 0.0.0.0/0 0.0.0.0/0 udp dpt:67 + ACCEPT tcp -- virbr0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:67 + +Note we have restricted our rules to just the bridge associated with the virtual +network, to avoid opening undesirable holes in the host firewall wrt the +LAN/WAN. + +The next rules depend on the type of connectivity allowed, and go in the main +FORWARD chain: + +- | type=isolated + | Allow traffic between guests. Deny inbound. Deny outbound. 
+ + :: + + target prot opt in out source destination + ACCEPT all -- virbr1 virbr1 0.0.0.0/0 0.0.0.0/0 + REJECT all -- * virbr1 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable + REJECT all -- virbr1 * 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable + +- | type=nat + | Allow inbound related to an established connection. Allow outbound, but + only from our expected subnet. Allow traffic between guests. Deny all other + inbound. Deny all other outbound. + + :: + + target prot opt in out source destination + ACCEPT all -- * virbr0 0.0.0.0/0 192.168.122.0/24 state RELATED,ESTABLISHED + ACCEPT all -- virbr0 * 192.168.122.0/24 0.0.0.0/0 + ACCEPT all -- virbr0 virbr0 0.0.0.0/0 0.0.0.0/0 + REJECT all -- * virbr0 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable + REJECT all -- virbr0 * 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable + +- | type=routed + | Allow inbound, but only to our expected subnet. Allow outbound, but only + from our expected subnet. Allow traffic between guests. Deny all other + inbound. Deny all other outbound. 
+ + :: + + target prot opt in out source destination + ACCEPT all -- * virbr2 0.0.0.0/0 192.168.124.0/24 + ACCEPT all -- virbr2 * 192.168.124.0/24 0.0.0.0/0 + ACCEPT all -- virbr2 virbr2 0.0.0.0/0 0.0.0.0/0 + REJECT all -- * virbr2 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable + REJECT all -- virbr2 * 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable + +- Finally, with type=nat, there is also an entry in the POSTROUTING chain to + apply masquerading: + + :: + + target prot opt in out source destination + MASQUERADE all -- * * 192.168.122.0/24 !192.168.122.0/24 + +firewalld and the virtual network driver +---------------------------------------- + +If `firewalld <https://firewalld.org>`__ is active on the host, libvirt will +attempt to place the bridge interface of a libvirt virtual network into the +firewalld zone named "libvirt" (thus making all guest->host traffic on that +network subject to the rules of the "libvirt" zone). This is done because, if +firewalld is using its nftables backend (available since firewalld 0.6.0) the +default firewalld zone (which would be used if libvirt didn't explicitly set the +zone) prevents forwarding traffic from guests through the bridge, as well as +preventing DHCP, DNS, and most other traffic from guests to host. The zone named +"libvirt" is installed into the firewalld configuration by libvirt (not by +firewalld), and allows forwarded traffic through the bridge as well as DHCP, +DNS, TFTP, and SSH traffic to the host - depending on firewalld's backend this +will be implemented via either iptables or nftables rules. libvirt's own rules +outlined above will \*always\* be iptables rules regardless of which backend is +in use by firewalld. + +NB: It is possible to manually set the firewalld zone for a network's interface +with the "zone" attribute of the network's "bridge" element. 
+
+NB: Prior to libvirt 5.1.0, the firewalld "libvirt" zone did not exist, and
+prior to firewalld 0.7.0 a feature crucial to making the "libvirt" zone operate
+properly (rich rule priority settings) was not implemented in firewalld. In
+cases where one or the other of the two packages is missing the necessary
+functionality, it's still possible to have functional guest networking by
+setting the firewalld backend to "iptables" (in firewalld prior to 0.6.0, this
+was the only backend available).
+
+The network filter driver
+-------------------------
+
+This driver provides a fully configurable network filtering capability that
+leverages ebtables, iptables and ip6tables. This was written by the libvirt guys
+at IBM and although its XML schema is defined by libvirt, the conceptual model
+is closely aligned with the DMTF CIM schema for network filtering:
+
+https://www.dmtf.org/standards/cim/cim_schema_v2230/CIM_Network.pdf
+
+The filters are managed in libvirt as a top level, standalone object. This
+allows the filters to then be referenced by any libvirt object that requires
+their functionality, instead of tying them only to use by guest NICs. In the
+current implementation, filters can be associated with individual guest NICs via
+the libvirt domain XML format. In the future we might allow filters to be
+associated with the virtual network objects. Further we're expecting to define a
+new 'virtual switch' object to remove the complexity of configuring
+bridge/sriov/vepa networking modes. This may also end up making use of network
+filters.
+ +There are a new set of virsh commands for managing network filters: + +- ``virsh nwfilter-define`` + define or update a network filter from an XML file +- ``virsh nwfilter-undefine`` + undefine a network filter +- ``virsh nwfilter-dumpxml`` + network filter information in XML +- ``virsh nwfilter-list`` + list network filters +- ``virsh nwfilter-edit`` + edit XML configuration for a network filter + +There are equivalently named C APIs for each of these commands. + +As with all objects libvirt manages, network filters are configured using an XML +format. At a high level the format looks like this: + +:: + + <filter name='no-spamming' chain='XXXX'> + <uuid>d217f2d7-5a04-0e01-8b98-ec2743436b74</uuid> + + <rule ...> + .... + </rule> + + <filterref filter='XXXX'/> + </filter> + +Every filter has a name and UUID which serve as unique identifiers. A filter can +have zero-or-more ``<rule>`` elements which are used to actually define network +controls. Filters can be arranged into a DAG, so zero-or-more ``<filterref/>`` +elements are also allowed. Cycles in the graph are not allowed. + +The ``<rule>`` element is where all the interesting stuff happens. It has three +attributes, an action, a traffic direction and an optional priority. E.g.: + +:: + + <rule action='drop' direction='out' priority='500'> + +Within the rule there are a wide variety of elements allowed, which do protocol +specific matching. Supported protocols currently include ``mac``, ``arp``, +``rarp``, ``ip``, ``ipv6``, ``tcp/ip``, ``icmp/ip``, ``igmp/ip``, ``udp/ip``, +``udplite/ip``, ``esp/ip``, ``ah/ip``, ``sctp/ip``, ``tcp/ipv6``, ``icmp/ipv6``, +``igmp/ipv6``, ``udp/ipv6``, ``udplite/ipv6``, ``esp/ipv6``, ``ah/ipv6``, +``sctp/ipv6``. Each protocol defines what is valid inside the <rule> element. 
+The general pattern though is: + +:: + + <protocol match='yes|no' attribute1='value1' attribute2='value2'/> + +So, eg a TCP protocol, matching ports 0-1023 would be expressed as: + +:: + + <tcp match='yes' srcportstart='0' srcportend='1023'/> + +Attributes can included references to variables defined by the object using the +rule. So the guest XML format allows each NIC to have a MAC address and IP +address defined. These are made available to filters via the variables ``$IP`` +and ``$MAC``. + +So to define a filter that prevents IP address spoofing we can simply match on +source IP address ``!= $IP`` like this: + +:: + + <filter name='no-ip-spoofing' chain='ipv4'> + <rule action='drop' direction='out'> + <ip match='no' srcipaddr='$IP' /> + </rule> + </filter> + +I'm not going to go into details on all the other protocol matches you can do, +because it'll take far too much space. You can read about the options +`here <formatnwfilter.html#nwfelemsRulesProto>`__. + +Out of the box in RHEL6/Fedora rawhide, libvirt ships with a set of default +useful rules: + +:: + + # virsh nwfilter-list + UUID Name + ---------------------------------------------------------------- + 15b1ab2b-b1ac-1be2-ed49-2042caba4abb allow-arp + 6c51a466-8d14-6d11-46b0-68b1a883d00f allow-dhcp + 7517ad6c-bd90-37c8-26c9-4eabcb69848d allow-dhcp-server + 7680776c-77aa-496f-90d6-13097664b925 allow-dhcpv6 + 9cdaad60-7631-4172-8ccb-ef774be7485b allow-dhcpv6-server + 3d38b406-7cf0-8335-f5ff-4b9add35f288 allow-incoming-ipv4 + 908543c1-902e-45f6-a6ca-1a0ad35e7599 allow-incoming-ipv6 + 5ff06320-9228-2899-3db0-e32554933415 allow-ipv4 + ce8904cc-ad3a-4454-896c-53452882f817 allow-ipv6 + db0b1767-d62b-269b-ea96-0cc8b451144e clean-traffic + 6d6ddcc8-1242-4c43-ac63-63af80493132 clean-traffic-gateway + 4cf38077-c7d5-4e25-99bb-6c4c9efad294 no-arp-ip-spoofing + 0b11a636-ce58-497f-be90-17f63c92487a no-arp-mac-spoofing + f88f1932-debf-4aa1-9fbe-f10d3aa4bc95 no-arp-spoofing + 772f112d-52e4-700c-0250-e178a3d91a7a 
no-ip-multicast + 7ee20370-8106-765d-f7ff-8a60d5aaf30b no-ip-spoofing + f8a51c43-a08f-49b3-b9e2-393d54522dc0 no-ipv6-multicast + a7f0afe9-a428-44b8-8566-c8ee2a669271 no-ipv6-spoofing + d5d3c490-c2eb-68b1-24fc-3ee362fc8af3 no-mac-broadcast + fb57c546-76dc-a372-513f-e8179011b48a no-mac-spoofing + dba10ea7-446d-76de-346f-335bd99c1d05 no-other-l2-traffic + f5c78134-9da4-0c60-a9f0-fb37bc21ac1f no-other-rarp-traffic + 7637e405-4ccf-42ac-5b41-14f8d03d8cf3 qemu-announce-self + 9aed52e7-f0f3-343e-fe5c-7dcb27b594e5 qemu-announce-self-rarp + +Most of these are just building blocks. The interesting one here is +'clean-traffic'. This pulls together all the building blocks into one filter +that you can then associate with a guest NIC. This stops the most common bad +things a guest might try, IP spoofing, arp spoofing and MAC spoofing. To look at +the rules for any of these just do: + +:: + + virsh nwfilter-dumpxml FILTERNAME|UUID + +They are all stored in ``/etc/libvirt/nwfilter``, but don't edit the files there +directly. Use ``virsh nwfilter-define`` to update them. This ensures the guests +have their iptables/ebtables rules recreated. + +To associate the clean-traffic filter with a guest, edit the guest XML config +and change the ``<interface>`` element to include a ``<filterref>`` and also +specify the ``<ip address/>`` that the guest is allowed to use: + +:: + + <interface type='bridge'> + <mac address='52:54:00:56:44:32'/> + <source bridge='br1'/> + <ip address='10.33.8.131'/> + <target dev='vnet0'/> + <model type='virtio'/> + <filterref filter='clean-traffic'/> + </interface> + +If no ``<ip address>`` is included, the network filter driver will activate its +'learning mode'. This uses libpcap to snoop on network traffic the guest sends +and attempts to identify the first IP address it uses. It then locks traffic to +this address. Obviously this isn't entirely secure, but it does offer some +protection against the guest being trojaned once up and running. 
In the future +we intend to enhance the learning mode so that it looks for DHCPOFFERS from a +trusted DHCP server and only allows the offered IP address to be used. + +Now, how is all this implemented...? + +The network filter driver uses a combination of ebtables, iptables and +ip6tables, depending on which protocols are referenced in a filter. The out of +the box 'clean-traffic' filter rules only require use of ebtables. If you want +to do matching at tcp/udp/etc protocols (eg to add a new filter +'no-email-spamming' to block port 25), then iptables will also be used. + +The driver attempts to keep its rules separate from those that the host admin +might already have configured. So the first thing it does with ebtables, is to +add two hooks in POSTROUTING and PREROUTING chains, to redirect traffic to +custom chains. These hooks match on the TAP device name of the guest NIC, so +they should not interact badly with any administrator defined rules: + +:: + + Bridge chain: PREROUTING, entries: 1, policy: ACCEPT + -i vnet0 -j libvirt-I-vnet0 + + Bridge chain: POSTROUTING, entries: 1, policy: ACCEPT + -o vnet0 -j libvirt-O-vnet0 + +To keep things manageable and easy to follow, the driver will then create +further sub-chains for each protocol then it needs to match against: + +:: + + Bridge chain: libvirt-I-vnet0, entries: 5, policy: ACCEPT + -p IPv4 -j I-vnet0-ipv4 + -p ARP -j I-vnet0-arp + -p 0x8035 -j I-vnet0-rarp + -p 0x835 -j ACCEPT + -j DROP + + Bridge chain: libvirt-O-vnet0, entries: 4, policy: ACCEPT + -p IPv4 -j O-vnet0-ipv4 + -p ARP -j O-vnet0-arp + -p 0x8035 -j O-vnet0-rarp + -j DROP + +Finally, here comes the actual implementation of the filters. This example shows +the 'clean-traffic' filter implementation. I'm not going to explain what this is +doing now. :-) + +:: + + Bridge chain: I-vnet0-ipv4, entries: 2, policy: ACCEPT + -s ! 52:54:0:56:44:32 -j DROP + -p IPv4 --ip-src ! 
10.33.8.131 -j DROP + + Bridge chain: O-vnet0-ipv4, entries: 1, policy: ACCEPT + -j ACCEPT + + Bridge chain: I-vnet0-arp, entries: 6, policy: ACCEPT + -s ! 52:54:0:56:44:32 -j DROP + -p ARP --arp-mac-src ! 52:54:0:56:44:32 -j DROP + -p ARP --arp-ip-src ! 10.33.8.131 -j DROP + -p ARP --arp-op Request -j ACCEPT + -p ARP --arp-op Reply -j ACCEPT + -j DROP + + Bridge chain: O-vnet0-arp, entries: 5, policy: ACCEPT + -p ARP --arp-op Reply --arp-mac-dst ! 52:54:0:56:44:32 -j DROP + -p ARP --arp-ip-dst ! 10.33.8.131 -j DROP + -p ARP --arp-op Request -j ACCEPT + -p ARP --arp-op Reply -j ACCEPT + -j DROP + + Bridge chain: I-vnet0-rarp, entries: 2, policy: ACCEPT + -p 0x8035 -s 52:54:0:56:44:32 -d Broadcast --arp-op Request_Reverse --arp-ip-src 0.0.0.0 --arp-ip-dst 0.0.0.0 --arp-mac-src 52:54:0:56:44:32 --arp-mac-dst 52:54:0:56:44:32 -j ACCEPT + -j DROP + + Bridge chain: O-vnet0-rarp, entries: 2, policy: ACCEPT + -p 0x8035 -d Broadcast --arp-op Request_Reverse --arp-ip-src 0.0.0.0 --arp-ip-dst 0.0.0.0 --arp-mac-src 52:54:0:56:44:32 --arp-mac-dst 52:54:0:56:44:32 -j ACCEPT + -j DROP + +NB, we would have liked to include the prefix 'libvirt-' in all of our chain +names, but unfortunately the kernel limits names to a very short maximum length. +So only the first two custom chains can include that prefix. The others just +include the TAP device name + protocol name. + +If I define a new filter 'no-spamming' and then add this to the 'clean-traffic' +filter, I can illustrate how iptables usage works: + +:: + + # cat > /root/spamming.xml <<EOF + <filter name='no-spamming' chain='root'> + <uuid>d217f2d7-5a04-0e01-8b98-ec2743436b74</uuid> + <rule action='drop' direction='out' priority='500'> + <tcp dstportstart='25' dstportend='25'/> + </rule> + </filter> + EOF + # virsh nwfilter-define /root/spamming.xml + # virsh nwfilter-edit clean-traffic + +...add ``<filterref filter='no-spamming'/>`` + +All active guests immediately have their iptables/ebtables rules rebuilt. 
+ +The network filter driver deals with iptables in a very similar way. First it +separates out its rules from those the admin may have defined, by adding a +couple of hooks into the INPUT/FORWARD chains: + +:: + + Chain INPUT (policy ACCEPT 13M packets, 21G bytes) + target prot opt in out source destination + libvirt-host-in all -- * * 0.0.0.0/0 0.0.0.0/0 + + Chain FORWARD (policy ACCEPT 5532K packets, 3010M bytes) + target prot opt in out source destination + libvirt-in all -- * * 0.0.0.0/0 0.0.0.0/0 + libvirt-out all -- * * 0.0.0.0/0 0.0.0.0/0 + libvirt-in-post all -- * * 0.0.0.0/0 0.0.0.0/0 + +These custom chains then do matching based on the TAP device name, so they won't +open holes in the admin defined matches for the LAN/WAN (if any). + +:: + + Chain libvirt-host-in (1 references) + target prot opt in out source destination + HI-vnet0 all -- * * 0.0.0.0/0 0.0.0.0/0 [goto] PHYSDEV match --physdev-in vnet0 + + Chain libvirt-in (1 references) + target prot opt in out source destination + FI-vnet0 all -- * * 0.0.0.0/0 0.0.0.0/0 [goto] PHYSDEV match --physdev-in vnet0 + + Chain libvirt-in-post (1 references) + target prot opt in out source destination + ACCEPT all -- * * 0.0.0.0/0 0.0.0.0/0 PHYSDEV match --physdev-in vnet0 + + Chain libvirt-out (1 references) + target prot opt in out source destination + FO-vnet0 all -- * * 0.0.0.0/0 0.0.0.0/0 [goto] PHYSDEV match --physdev-out vnet0 + +Finally, we can see the interesting bit which is the actual implementation of my +filter to block port 25 access: + +:: + + Chain FI-vnet0 (1 references) + target prot opt in out source destination + DROP tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:25 + + Chain FO-vnet0 (1 references) + target prot opt in out source destination + DROP tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp spt:25 + + Chain HI-vnet0 (1 references) + target prot opt in out source destination + DROP tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:25 + +One thing in looking at this you may notice is that if there are many guests all 
+using the same filters, we will be duplicating the iptables rules over and over
+for each guest. This is merely a limitation of the current rules engine
+implementation. At the libvirt object modelling level you can clearly see we've
+designed the model so filter rules are defined in one place, and indirectly
+referenced by guests. Thus it should be possible to change the implementation in
+the future so we can share the actual iptables/ebtables rules for each guest to
+create a more scalable system. The stuff in current libvirt is more or less the
+very first working implementation we've had of this, so there's not been much
+optimization work done yet.
+
+Also notice that at the XML level we don't expose the fact we are using iptables
+or ebtables at all. The rule definition is done in terms of network protocols.
+Thus if we ever find a need, we could plug in an alternative implementation that
+calls out to a different firewall implementation instead of ebtables/iptables
+(providing that implementation was suitably expressive of course).
+
+Finally, in terms of problems we have in deployment, the biggest problem is that
+if the admin does ``service iptables restart`` all our work gets blown away.
+We've experimented with using lokkit to record our custom rules in a persistent
+config file, but that caused a different problem. Admins who were not using lokkit
+for their config found that all their own rules got blown away. So we threw away
+our lokkit code. Instead we document that if you run
+``service iptables restart``, you need to send SIGHUP to libvirt to make it
+recreate its rules.
+
+More in depth documentation on this is `here <formatnwfilter.html>`__.
diff --git a/docs/meson.build b/docs/meson.build
index 6147f85d16..aa8bad89f0 100644
--- a/docs/meson.build
+++ b/docs/meson.build
@@ -22,7 +22,6 @@ docs_html_in_files = [
   'csharp',
   'dbus',
   'docs',
-  'firewall',
   'format',
   'formatcaps',
   'formatdomaincaps',
@@ -82,6 +81,7 @@ docs_rst_files = [
   'drvvmware',
   'drvxen',
   'errors',
+  'firewall',
   'formatbackup',
   'formatcheckpoint',
   'formatdomain',
--
2.35.1
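
The firewall page converted in the patch above defines a 'no-spamming' filter
by writing its XML by hand. As a rough illustration of the same structure, here
is a minimal sketch that builds that XML with Python's standard
xml.etree.ElementTree module. The helper name build_port_drop_filter and its
parameters are hypothetical and not part of libvirt; only the element and
attribute names are taken from the page.

```python
import xml.etree.ElementTree as ET


def build_port_drop_filter(name, uuid, port, priority=500):
    """Return nwfilter XML that drops outbound TCP traffic to one port.

    Hypothetical helper for illustration only; it is not a libvirt API.
    """
    # Root <filter> element; chain='root' matches the example in the page.
    flt = ET.Element("filter", {"name": name, "chain": "root"})
    ET.SubElement(flt, "uuid").text = uuid
    rule = ET.SubElement(
        flt,
        "rule",
        {"action": "drop", "direction": "out", "priority": str(priority)},
    )
    # A single port is expressed as a range whose start and end are equal.
    ET.SubElement(
        rule, "tcp", {"dstportstart": str(port), "dstportend": str(port)}
    )
    return ET.tostring(flt, encoding="unicode")


xml_text = build_port_drop_filter(
    "no-spamming", "d217f2d7-5a04-0e01-8b98-ec2743436b74", 25
)
print(xml_text)
```

The printed XML could be saved to a file and handed to ``virsh nwfilter-define``,
in the same way the page does with /root/spamming.xml.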

Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/format.html.in | 48 --------------------------------------------- docs/format.rst | 35 +++++++++++++++++++++++++++++++++ docs/meson.build | 2 +- 3 files changed, 36 insertions(+), 49 deletions(-) delete mode 100644 docs/format.html.in create mode 100644 docs/format.rst diff --git a/docs/format.html.in b/docs/format.html.in deleted file mode 100644 index 1d2456de6f..0000000000 --- a/docs/format.html.in +++ /dev/null @@ -1,48 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html> -<html xmlns="http://www.w3.org/1999/xhtml"> - <body> - <h1>XML Format</h1> - - - <p> - Objects in the libvirt API are configured using XML documents to allow - for ease of extension in future releases. Each XML document has an - associated Relax-NG schema that can be used to validate documents - prior to usage. - </p> - - - <ul> - <li><a href="formatdomain.html">Domains</a></li> - <li><a href="formatnetwork.html">Networks</a></li> - <li><a href="formatnwfilter.html">Network filtering</a></li> - <li><a href="formatnetworkport.html">Network ports</a></li> - <li><a href="formatstorage.html">Storage</a></li> - <li><a href="formatstorageencryption.html">Storage encryption</a></li> - <li><a href="formatcaps.html">Capabilities</a></li> - <li><a href="formatdomaincaps.html">Domain capabilities</a></li> - <li><a href="formatstoragecaps.html">Storage Pool capabilities</a></li> - <li><a href="formatnode.html">Node devices</a></li> - <li><a href="formatsecret.html">Secrets</a></li> - <li><a href="formatsnapshot.html">Snapshots</a></li> - <li><a href="formatcheckpoint.html">Checkpoints</a></li> - <li><a href="formatbackup.html">Backup jobs</a></li> - </ul> - - <h2>Command line validation</h2> - - <p> - The <code>virt-xml-validate</code> tool provides a simple command line - for validating XML documents prior to giving them to libvirt. It uses - the locally installed RNG schema documents. 
It will auto-detect which - schema to use for validation based on the name of the top level element - in the input document. Thus it merely requires the XML document filename - to be passed on the command line - </p> - - <pre> -$ virt-xml-validate /path/to/XML/file</pre> - - </body> -</html> diff --git a/docs/format.rst b/docs/format.rst new file mode 100644 index 0000000000..a261007e73 --- /dev/null +++ b/docs/format.rst @@ -0,0 +1,35 @@ +========== +XML Format +========== + +Objects in the libvirt API are configured using XML documents to allow for ease +of extension in future releases. Each XML document has an associated Relax-NG +schema that can be used to validate documents prior to usage. + +- `Domains <formatdomain.html>`__ +- `Networks <formatnetwork.html>`__ +- `Network filtering <formatnwfilter.html>`__ +- `Network ports <formatnetworkport.html>`__ +- `Storage <formatstorage.html>`__ +- `Storage encryption <formatstorageencryption.html>`__ +- `Capabilities <formatcaps.html>`__ +- `Domain capabilities <formatdomaincaps.html>`__ +- `Storage Pool capabilities <formatstoragecaps.html>`__ +- `Node devices <formatnode.html>`__ +- `Secrets <formatsecret.html>`__ +- `Snapshots <formatsnapshot.html>`__ +- `Checkpoints <formatcheckpoint.html>`__ +- `Backup jobs <formatbackup.html>`__ + +Command line validation +----------------------- + +The ``virt-xml-validate`` tool provides a simple command line for validating XML +documents prior to giving them to libvirt. It uses the locally installed RNG +schema documents. It will auto-detect which schema to use for validation based +on the name of the top level element in the input document. 
Thus it merely +requires the XML document filename to be passed on the command line + +:: + + $ virt-xml-validate /path/to/XML/file diff --git a/docs/meson.build b/docs/meson.build index aa8bad89f0..acc455c7c7 100644 --- a/docs/meson.build +++ b/docs/meson.build @@ -22,7 +22,6 @@ docs_html_in_files = [ 'csharp', 'dbus', 'docs', - 'format', 'formatcaps', 'formatdomaincaps', 'formatnetwork', @@ -82,6 +81,7 @@ docs_rst_files = [ 'drvxen', 'errors', 'firewall', + 'format', 'formatbackup', 'formatcheckpoint', 'formatdomain', -- 2.35.1
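
The format page converted in the patch above says that virt-xml-validate picks
a schema from the name of the top level element of the input document. The
sketch below shows only that detection step, using Python's standard
xml.etree.ElementTree module. The function name top_level_element is
hypothetical, and how the real tool maps an element name to a schema file is
not shown here because the page does not describe it.

```python
import xml.etree.ElementTree as ET


def top_level_element(xml_text):
    """Return the local name of the document's root element.

    Hypothetical helper for illustration; not part of virt-xml-validate.
    """
    root = ET.fromstring(xml_text)
    # ElementTree reports namespaced tags as '{uri}name'; keep only 'name'.
    return root.tag.rsplit("}", 1)[-1]


print(top_level_element("<domain type='kvm'><name>demo</name></domain>"))
print(top_level_element("<filter name='no-spamming' chain='root'/>"))
```

A validator built on this idea would use the returned name ('domain' and
'filter' in the two calls above) as the key for choosing a schema.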

Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/formatcaps.html.in | 219 ---------------------------------------- docs/formatcaps.rst | 196 +++++++++++++++++++++++++++++++++++ docs/meson.build | 2 +- 3 files changed, 197 insertions(+), 220 deletions(-) delete mode 100644 docs/formatcaps.html.in create mode 100644 docs/formatcaps.rst diff --git a/docs/formatcaps.html.in b/docs/formatcaps.html.in deleted file mode 100644 index 09662f78c8..0000000000 --- a/docs/formatcaps.html.in +++ /dev/null @@ -1,219 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html> -<html xmlns="http://www.w3.org/1999/xhtml"> - <body> - <h1>Driver capabilities XML format</h1> - - <ul id="toc"></ul> - - <h2><a id="elements">Element and attribute overview</a></h2> - - <p>As new virtualization engine support gets added to libvirt, and to - handle cases like QEMU supporting a variety of emulations, a query - interface has been added in 0.2.1 allowing to list the set of supported - virtualization capabilities on the host:</p> - - <pre>char * virConnectGetCapabilities (virConnectPtr conn);</pre> - - <p>The value returned is an XML document listing the virtualization - capabilities of the host and virtualization engine to which - <code>@conn</code> is connected. One can test it using <code>virsh</code> - command line tool command '<code>capabilities</code>', it dumps the XML - associated to the current connection. 
</p> - - <p>As can be seen in the <a href="#elementExamples">example</a>, the - capabilities XML consists of the <code>capabilities</code> element which - have exactly one <code>host</code> child element to report information on - host capabilities, and zero or more <code>guest</code> element to express - the set of architectures the host can run at the moment.</p> - - - <h3><a id="elementHost">Host capabilities</a></h3> - - <p>The <code><host/></code> element consists of the following child - elements:</p> - <dl> - <dt><code>uuid</code></dt> - <dd>The host UUID.</dd> - - <dt><code>cpu</code></dt> - <dd>The host CPU architecture and features.</dd> - - <dt><code>power_management</code></dt> - <dd>whether host is capable of memory suspend, disk hibernation, or - hybrid suspend.</dd> - - <dt><code>migration_features</code></dt> - <dd>This element exposes information on the hypervisor's migration - capabilities, like live migration, supported URI transports, and so - on.</dd> - - <dt><code>topology</code></dt> - <dd>This element embodies the host internal topology. Management - applications may want to learn this information when orchestrating new - guests - e.g. due to reduce inter-NUMA node transfers.</dd> - - <dt><code>secmodel</code></dt> - <dd>To find out default security labels for different security models you - need to parse this element. In contrast with the former elements, this is - repeated for each security model the libvirt daemon currently supports. - </dd> - </dl> - - - <h3><a id="elementGuest">Guest capabilities</a></h3> - - <p>While the <a href="#elementHost">previous section</a> aims at host - capabilities, this one focuses on capabilities available to a guest - using a given hypervisor. The <code><guest/></code> element will - typically wrap up the following elements:</p> - - <dl> - <dt><code>os_type</code></dt> - <dd>This expresses what kind of operating system the hypervisor - is able to run. 
Possible values are: - <dl> - <dt><code>xen</code></dt> - <dd>for XEN PV</dd> - - <dt><code>linux</code></dt> - <dd>legacy alias for <code>xen</code></dd> - - <dt><code>xenpvh</code></dt> - <dd>for XEN PVH</dd> - - <dt><code>hvm</code></dt> - <dd>Unmodified operating system</dd> - - <dt><code>exe</code></dt> - <dd>Container based virtualization</dd> - </dl> - </dd> - - <dt><code>arch</code></dt> - <dd>This element brings some information on supported guest - architecture. Possible subelements are: - <dl> - <dt><code>wordsize</code></dt><dd>Size of CPU word in bits, for example 64.</dd> - <dt><code>emulator</code></dt><dd>Emulator (device model) path, for - use in <a href="formatdomain.html#elementEmulator">emulator</a> - element of domain XML.</dd> - <dt><code>loader</code></dt><dd>Loader path, for use in - <a href="formatdomain.html#elementLoader">loader</a> element of domain - XML.</dd> - <dt><code>machine</code></dt><dd>Machine type, for use in - <a href="formatdomain.html#attributeOSTypeMachine">machine</a> - attribute of os/type element in domain XML. For example Xen - supports <code>xenfv</code> for HVM, <code>xenpv</code> for - PV, or <code>xenpvh</code> for PVH.</dd> - <dt><code>domain</code></dt><dd>The <code>type</code> attribute of - this element specifies the type of hypervisor required to run the - domain. Use in <a href="formatdomain.html#attributeDomainType">type</a> - attribute of the domain root element.</dd> - </dl> - </dd> - - <dt><code>features</code></dt> - <dd>This optional element encases possible features that can be used - with a guest of described type. 
Possible subelements are: - <dl> - <dt><code>pae</code></dt><dd>If present, 32-bit guests can use PAE - address space extensions, <span class="since">since - 0.4.1</span></dd> - <dt><code>nonpae</code></dt><dd>If present, 32-bit guests can be run - without requiring PAE, <span class="since">since - 0.4.1</span></dd> - <dt><code>ia64_be</code></dt><dd>If present, IA64 guests can be run in - big-endian mode, <span class="since">since 0.4.1</span></dd> - <dt><code>acpi</code></dt><dd>If this element is present, - the <code>default</code> attribute describes whether the - hypervisor exposes ACPI to the guest by default, and - the <code>toggle</code> attribute describes whether the - user can override this - default. <span class="since">Since 0.4.1</span></dd> - <dt><code>apic</code></dt><dd>If this element is present, - the <code>default</code> attribute describes whether the - hypervisor exposes APIC to the guest by default, and - the <code>toggle</code> attribute describes whether the - user can override this - default. <span class="since">Since 0.4.1</span></dd> - <dt><code>cpuselection</code></dt><dd>If this element is present, the - hypervisor supports the <code><cpu></code> element - within a domain definition for fine-grained control over - the CPU presented to the - guest. <span class="since">Since 0.7.5</span></dd> - <dt><code>deviceboot</code></dt><dd>If this element is present, - the <code><boot order='...'/></code> element can - be used inside devices, rather than the older boot - specification by category. <span class="since">Since - 0.8.8</span></dd> - <dt><code>disksnapshot</code></dt><dd>If this element is present, - the <code>default</code> attribute describes whether - external disk snapshots are supported. If absent, - external snapshots may still be supported, but it - requires attempting the API and checking for an error to - find out for sure. 
<span class="since">Since - 1.2.3</span></dd> - </dl> - </dd> - </dl> - - <h3><a id="elementExamples">Examples</a></h3> - - <p>For example, in the case of a 64-bit machine with hardware - virtualization capabilities enabled in the chip and - BIOS you will see:</p> - - <pre><capabilities> - <span style="color: #E50000"><host> - <cpu> - <arch>x86_64</arch> - <features> - <vmx/> - </features> - <model>core2duo</model> - <vendor>Intel</vendor> - <topology sockets="1" dies="1" cores="2" threads="1"/> - <feature name="lahf_lm"/> - <feature name='xtpr'/> - ... - </cpu> - <power_management> - <suspend_mem/> - <suspend_disk/> - <suspend_hybrid/> - </power_management> - </host></span> - - <!-- xen-3.0-x86_64 --> - <span style="color: #0000E5"><guest> - <os_type>xen</os_type> - <arch name="x86_64"> - <wordsize>64</wordsize> - <domain type="xen"></domain> - <emulator>/usr/lib64/xen/bin/qemu-dm</emulator> - </arch> - <features> - </features> - </guest></span> - - <!-- hvm-3.0-x86_32 --> - <span style="color: #00B200"><guest> - <os_type>hvm</os_type> - <arch name="i686"> - <wordsize>32</wordsize> - <domain type="xen"></domain> - <emulator>/usr/lib/xen/bin/qemu-dm</emulator> - <machine>pc</machine> - <machine>isapc</machine> - <loader>/usr/lib/xen/boot/hvmloader</loader> - </arch> - <features> - <cpuselection/> - <deviceboot/> - </features> - </guest></span> - ... -</capabilities></pre> - </body> -</html> diff --git a/docs/formatcaps.rst b/docs/formatcaps.rst new file mode 100644 index 0000000000..1ba847cea1 --- /dev/null +++ b/docs/formatcaps.rst @@ -0,0 +1,196 @@ +.. role:: since + +============================== +Driver capabilities XML format +============================== + +.. 
contents:: + +Element and attribute overview +------------------------------ + +As new virtualization engine support gets added to libvirt, and to handle cases +like QEMU supporting a variety of emulations, a query interface has been added +in 0.2.1 allowing to list the set of supported virtualization capabilities on +the host: + +:: + + char * virConnectGetCapabilities (virConnectPtr conn); + +The value returned is an XML document listing the virtualization capabilities of +the host and virtualization engine to which ``@conn`` is connected. One can test +it using ``virsh`` command line tool command '``capabilities``', it dumps the +XML associated to the current connection. + +As can be seen in the `example <#elementExamples>`__, the capabilities XML +consists of the ``capabilities`` element which have exactly one ``host`` child +element to report information on host capabilities, and zero or more ``guest`` +element to express the set of architectures the host can run at the moment. + +Host capabilities +~~~~~~~~~~~~~~~~~ + +The ``<host/>`` element consists of the following child elements: + +``uuid`` + The host UUID. +``cpu`` + The host CPU architecture and features. +``power_management`` + whether host is capable of memory suspend, disk hibernation, or hybrid + suspend. +``migration_features`` + This element exposes information on the hypervisor's migration capabilities, + like live migration, supported URI transports, and so on. +``topology`` + This element embodies the host internal topology. Management applications may + want to learn this information when orchestrating new guests - e.g. due to + reduce inter-NUMA node transfers. +``secmodel`` + To find out default security labels for different security models you need to + parse this element. In contrast with the former elements, this is repeated + for each security model the libvirt daemon currently supports. 
+ +Guest capabilities +~~~~~~~~~~~~~~~~~~ + +While the `previous section <#elementHost>`__ aims at host capabilities, this +one focuses on capabilities available to a guest using a given hypervisor. The +``<guest/>`` element will typically wrap up the following elements: + +``os_type`` + This expresses what kind of operating system the hypervisor is able to run. + Possible values are: + + ``xen`` + for XEN PV + ``linux`` + legacy alias for ``xen`` + ``xenpvh`` + for XEN PVH + ``hvm`` + Unmodified operating system + ``exe`` + Container based virtualization +``arch`` + This element brings some information on supported guest architecture. + Possible subelements are: + + ``wordsize`` + Size of CPU word in bits, for example 64. + ``emulator`` + Emulator (device model) path, for use in + `emulator <formatdomain.html#elementEmulator>`__ element of domain XML. + ``loader`` + Loader path, for use in `loader <formatdomain.html#elementLoader>`__ + element of domain XML. + ``machine`` + Machine type, for use in + `machine <formatdomain.html#attributeOSTypeMachine>`__ attribute of + os/type element in domain XML. For example Xen supports ``xenfv`` for HVM, + ``xenpv`` for PV, or ``xenpvh`` for PVH. + ``domain`` + The ``type`` attribute of this element specifies the type of hypervisor + required to run the domain. Use in + `type <formatdomain.html#attributeDomainType>`__ attribute of the domain + root element. +``features`` + This optional element encases possible features that can be used with a guest + of described type. 
Possible subelements are: + + ``pae`` + If present, 32-bit guests can use PAE address space extensions, + :since:`since 0.4.1` + ``nonpae`` + If present, 32-bit guests can be run without requiring PAE, :since:`since + 0.4.1` + ``ia64_be`` + If present, IA64 guests can be run in big-endian mode, :since:`since + 0.4.1` + ``acpi`` + If this element is present, the ``default`` attribute describes whether + the hypervisor exposes ACPI to the guest by default, and the ``toggle`` + attribute describes whether the user can override this default. + :since:`Since 0.4.1` + ``apic`` + If this element is present, the ``default`` attribute describes whether + the hypervisor exposes APIC to the guest by default, and the ``toggle`` + attribute describes whether the user can override this default. + :since:`Since 0.4.1` + ``cpuselection`` + If this element is present, the hypervisor supports the ``<cpu>`` element + within a domain definition for fine-grained control over the CPU presented + to the guest. :since:`Since 0.7.5` + ``deviceboot`` + If this element is present, the ``<boot order='...'/>`` element can be + used inside devices, rather than the older boot specification by category. + :since:`Since 0.8.8` + ``disksnapshot`` + If this element is present, the ``default`` attribute describes whether + external disk snapshots are supported. If absent, external snapshots may + still be supported, but it requires attempting the API and checking for an + error to find out for sure. :since:`Since 1.2.3` + +Examples +~~~~~~~~ + +For example, in the case of a 64-bit machine with hardware virtualization +capabilities enabled in the chip and BIOS you will see: + +:: + + <capabilities> + <host> + <cpu> + <arch>x86_64</arch> + <features> + <vmx/> + </features> + <model>core2duo</model> + <vendor>Intel</vendor> + <topology sockets="1" dies="1" cores="2" threads="1"/> + <feature name="lahf_lm"/> + <feature name='xtpr'/> + ... 
+ </cpu> + <power_management> + <suspend_mem/> + <suspend_disk/> + <suspend_hybrid/> + </power_management> + </host> + + + <!-- xen-3.0-x86_64 --> + <guest> + <os_type>xen</os_type> + <arch name="x86_64"> + <wordsize>64</wordsize> + <domain type="xen"></domain> + <emulator>/usr/lib64/xen/bin/qemu-dm</emulator> + </arch> + <features> + </features> + </guest> + + + <!-- hvm-3.0-x86_32 --> + <guest> + <os_type>hvm</os_type> + <arch name="i686"> + <wordsize>32</wordsize> + <domain type="xen"></domain> + <emulator>/usr/lib/xen/bin/qemu-dm</emulator> + <machine>pc</machine> + <machine>isapc</machine> + <loader>/usr/lib/xen/boot/hvmloader</loader> + </arch> + <features> + <cpuselection/> + <deviceboot/> + </features> + </guest> + + ... + </capabilities> diff --git a/docs/meson.build b/docs/meson.build index acc455c7c7..95c9babcf5 100644 --- a/docs/meson.build +++ b/docs/meson.build @@ -22,7 +22,6 @@ docs_html_in_files = [ 'csharp', 'dbus', 'docs', - 'formatcaps', 'formatdomaincaps', 'formatnetwork', 'formatnetworkport', @@ -83,6 +82,7 @@ docs_rst_files = [ 'firewall', 'format', 'formatbackup', + 'formatcaps', 'formatcheckpoint', 'formatdomain', 'formatsecret', -- 2.35.1

On Mon, Mar 28, 2022 at 14:10:32 +0200, Peter Krempa wrote:
> Signed-off-by: Peter Krempa <pkrempa@redhat.com>
> ---
>  docs/formatcaps.html.in | 219 ----------------------------------------
>  docs/formatcaps.rst     | 196 +++++++++++++++++++++++++++++++++++
>  docs/meson.build        |   2 +-
>  3 files changed, 197 insertions(+), 220 deletions(-)
>  delete mode 100644 docs/formatcaps.html.in
>  create mode 100644 docs/formatcaps.rst
The html.in file has been modified in the meantime, so I'll drop this patch from the series and re-convert the page in a later pass.

Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/formatdomain.rst | 20 +- docs/formatdomaincaps.html.in | 693 ---------------------------------- docs/formatdomaincaps.rst | 602 +++++++++++++++++++++++++++++ docs/kbase/backing_chains.rst | 2 +- docs/meson.build | 2 +- 5 files changed, 614 insertions(+), 705 deletions(-) delete mode 100644 docs/formatdomaincaps.html.in create mode 100644 docs/formatdomaincaps.rst diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst index 95ace2677e..2dc52baa14 100644 --- a/docs/formatdomain.rst +++ b/docs/formatdomain.rst @@ -1406,15 +1406,15 @@ In case no restrictions need to be put on CPU model and its features, a simpler expected. :since:`Since 3.2.0 and QEMU 2.9.0` this mode works the way it was designed and it is indicated by the ``fallback`` attribute set to ``forbid`` in the host-model CPU definition advertised in `domain - capabilities XML <formatdomaincaps.html#elementsCPU>`__. When ``fallback`` - attribute is set to ``allow`` in the domain capabilities XML, it is - recommended to use ``custom`` mode with just the CPU model from the host - capabilities XML. :since:`Since 1.2.11` PowerISA allows processors to run - VMs in binary compatibility mode supporting an older version of ISA. - Libvirt on PowerPC architecture uses the ``host-model`` to signify a guest - mode CPU running in binary compatibility mode. Example: When a user needs - a power7 VM to run in compatibility mode on a Power8 host, this can be - described in XML as follows : + capabilities XML <formatdomaincaps.html#cpu-configuration>`__. When + ``fallback`` attribute is set to ``allow`` in the domain capabilities + XML, it is recommended to use ``custom`` mode with just the CPU model + from the host capabilities XML. :since:`Since 1.2.11` PowerISA allows + processors to run VMs in binary compatibility mode supporting an older + version of ISA. 
Libvirt on PowerPC architecture uses the ``host-model`` + to signify a guest mode CPU running in binary compatibility mode. + Example: When a user needs a power7 VM to run in compatibility mode on a + Power8 host, this can be described in XML as follows : :: @@ -2902,7 +2902,7 @@ paravirtualized driver is specified via the ``disk`` element. This element describes the backing store used by the disk specified by sibling ``source`` element. :since:`Since 1.2.4.` If the hypervisor driver does not support the - `backingStoreInput <formatdomaincaps.html#featureBackingStoreInput>`__ ( + `backingStoreInput <formatdomaincaps.html#backingstoreinput>`__ ( :since:`Since 5.10.0` ) domain feature the ``backingStore`` is ignored on input and only used for output to describe the detected backing chains of running domains. If ``backingStoreInput`` is supported the ``backingStore`` diff --git a/docs/formatdomaincaps.html.in b/docs/formatdomaincaps.html.in deleted file mode 100644 index 35b8bf3def..0000000000 --- a/docs/formatdomaincaps.html.in +++ /dev/null @@ -1,693 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html> -<html xmlns="http://www.w3.org/1999/xhtml"> - <body> - <h1>Domain capabilities XML format</h1> - - <ul id="toc"></ul> - - <h2><a id="Overview">Overview</a></h2> - - <p>Sometimes, when a new domain is to be created it may come handy to know - the capabilities of the hypervisor so the correct combination of devices and - drivers is used. For example, when management application is considering the - mode for a host device's passthrough there are several options depending not - only on host, but on hypervisor in question too. 
If the hypervisor is qemu - then it needs to be more recent to support VFIO, while legacy KVM is - achievable just fine with older qemus.</p> - - <p>The main difference between - <a href="/html/libvirt-libvirt-host.html#virConnectGetCapabilities"> - <code>virConnectGetCapabilities</code> - </a> - and the emulator capabilities API is, the former one aims more on - the host capabilities (e.g. NUMA topology, security models in - effect, etc.) while the latter one specializes on the hypervisor - capabilities.</p> - - <p>While the <a href="formatcaps.html">Driver Capabilities</a> provides the - host capabilities (e.g NUMA topology, security models in effect, etc.), the - Domain Capabilities provides the hypervisor specific capabilities for - Management Applications to query and make decisions regarding what to - utilize.</p> - - <p>The Domain Capabilities can provide information such as the correct - combination of devices and drivers that are supported. Knowing which host - and hypervisor specific options are available or supported would allow the - management application to choose an appropriate mode for a pass-through - host device as well as which adapter to utilize.</p> - - <p>Some XML elements may be entirely omitted from the domaincapabilities - XML, depending on what the libvirt driver has filled in. Applications - should only act on what is explicitly reported in the domaincapabilities - XML. For example, if <disk supported='yes'/> is present, you can safely - assume the driver supports <disk> devices. If <disk supported='no'/> is - present, you can safely assume the driver does NOT support <disk> - devices. 
If the <disk> block is omitted entirely, the driver is not - indicating one way or the other whether it supports <disk> devices, and - applications should not interpret the missing block to mean any thing in - particular.</p> - - <h2><a id="elements">Element and attribute overview</a></h2> - - <p> A new query interface was added to the virConnect API's to retrieve the - XML listing of the set of domain capabilities (<span class="since">Since - 1.2.7</span>):</p> - -<pre> -<a href="/html/libvirt-libvirt-domain.html#virConnectGetDomainCapabilities">virConnectGetDomainCapabilities</a> -</pre> - - <p>The root element that emulator capability XML document starts with has - name <code>domainCapabilities</code>. It contains at least four direct - child elements:</p> - -<pre> -<domainCapabilities> - <path>/usr/bin/qemu-system-x86_64</path> - <domain>kvm</domain> - <machine>pc-i440fx-2.1</machine> - <arch>x86_64</arch> - ... -</domainCapabilities> -</pre> - <dl> - <dt><code>path</code></dt> - <dd>The full path to the emulator binary.</dd> - - <dt><code>domain</code></dt> - <dd>Describes the <a href="formatdomain.html#elements">virtualization - type</a> (or so called domain type).</dd> - - <dt><code>machine</code></dt> - <dd>The domain's <a href="formatdomain.html#elementsOSBIOS">machine - type</a>. Since not every hypervisor has a sense of machine types - this element might be omitted in such drivers.</dd> - - <dt><code>arch</code></dt> - <dd>The domain's <a href="formatdomain.html#elementsOSBIOS"> - architecture</a>.</dd> - - </dl> - - <h3><a id="elementsCPUAllocation">CPU Allocation</a></h3> - - <p>Before any devices capability occurs, there might be info on domain - wide capabilities, e.g. virtual CPUs:</p> - -<pre> -<domainCapabilities> - ... - <vcpu max='255'/> - ... 
-</domainCapabilities> -</pre> - - <dl> - <dt><code>vcpu</code></dt> - <dd>The maximum number of supported virtual CPUs</dd> - </dl> - - <h3><a id="elementsOSBIOS">BIOS bootloader</a></h3> - - <p>Sometimes users might want to tweak some BIOS knobs or use - UEFI. For cases like that, <a - href="formatdomain.html#elementsOSBIOS"><code>os</code></a> - element exposes what values can be passed to its children.</p> - -<pre> -<domainCapabilities> - ... - <os supported='yes'> - <enum name='firmware'> - <value>bios</value> - <value>efi</value> - </enum> - <loader supported='yes'> - <value>/usr/share/OVMF/OVMF_CODE.fd</value> - <enum name='type'> - <value>rom</value> - <value>pflash</value> - </enum> - <enum name='readonly'> - <value>yes</value> - <value>no</value> - </enum> - <enum name='secure'> - <value>yes</value> - <value>no</value> - </enum> - </loader> - </os> - ... -<domainCapabilities> -</pre> - - <p>The <code>firmware</code> enum corresponds to the - <code>firmware</code> attribute of the <code>os</code> element in - the domain XML. The presence of this enum means libvirt is capable - of the so-called firmware auto-selection feature. And the listed - firmware values represent the accepted input in the domain - XML. Note that the <code>firmware</code> enum reports only those - values for which a firmware "descriptor file" exists on the host. - Firmware descriptor file is a small JSON document that describes - details about a given BIOS or UEFI binary on the host, e.g. the - firmware binary path, its architecture, supported machine types, - NVRAM template, etc. This ensures that the reported values won't - cause a failure on guest boot. - </p> - - <p>For the <code>loader</code> element, the following can occur:</p> - - <dl> - <dt><code>value</code></dt> - <dd>List of known firmware binary paths. Currently this is used - only to advertise the known location of OVMF binaries for - QEMU. 
OVMF binaries will only be listed if they actually exist on - host.</dd> - - <dt><code>type</code></dt> - <dd>Whether the boot loader is a typical BIOS (<code>rom</code>) - or a UEFI firmware (<code>pflash</code>). Each <code>value</code> - sub-element under the <code>type</code> enum represents a possible - value for the <code>type</code> attribute for the <loader/> - element in the domain XML. E.g. the presence - of <code>pfalsh</code> under the <code>type</code> enum means that - a domain XML can use UEFI firmware via: <loader/> - type="pflash" ...>/path/to/the/firmware/binary/</loader>. - </dd> - - <dt><code>readonly</code></dt> - <dd>Options for the <code>readonly</code> attribute of the - <loader/> element in the domain XML.</dd> - - <dt><code>secure</code></dt> - <dd>Options for the <code>secure</code> attribute of the - <loader/> element in the domain XML. Note that the - value <code>yes</code> is listed only if libvirt detects a - firmware descriptor file that has path to an OVMF binary that - supports Secure boot, and lists its architecture and supported - machine type.</dd> - </dl> - - <h3><a id="elementsCPU">CPU configuration</a></h3> - - <p> - The <code>cpu</code> element exposes options usable for configuring - <a href="formatdomain.html#elementsCPU">guest CPUs</a>. - </p> - -<pre> -<domainCapabilities> - ... 
- <cpu> - <mode name='host-passthrough' supported='yes'> - <enum name='hostPassthroughMigratable'> - <value>on</value> - <value>off</value> - </enum> - </mode> - <mode name='maximum' supported='yes'> - <enum name='maximumMigratable'> - <value>on</value> - <value>off</value> - </enum> - </mode> - <mode name='host-model' supported='yes'> - <model fallback='allow'>Broadwell</model> - <vendor>Intel</vendor> - <feature policy='disable' name='aes'/> - <feature policy='require' name='vmx'/> - </mode> - <mode name='custom' supported='yes'> - <model usable='no' deprecated='no'>Broadwell</model> - <model usable='yes' deprecated='no'>Broadwell-noTSX</model> - <model usable='no' deprecated='yes'>Haswell</model> - ... - </mode> - </cpu> - ... -<domainCapabilities> -</pre> - - <p> - Each CPU mode understood by libvirt is described with a - <code>mode</code> element which tells whether the particular mode - is supported and provides (when applicable) more details about it: - </p> - - <dl> - <dt><code>host-passthrough</code></dt> - <dd> - The <code>hostPassthroughMigratable</code> enum shows possible values - of the <code>migratable</code> attribute for the <cpu> element - with <code>mode='host-passthrough'</code> in the domain XML. - </dd> - - <dt><code>host-model</code></dt> - <dd> - If <code>host-model</code> is supported by the hypervisor, the - <code>mode</code> describes the guest CPU which will be used when - starting a domain with <code>host-model</code> CPU. The hypervisor - specifics (such as unsupported CPU models or features, machine type, - etc.) may be accounted for in this guest CPU specification and thus - the CPU can be different from the one shown in host capabilities XML. 
- This is indicated by the <code>fallback</code> attribute of the - <code>model</code> sub element: <code>allow</code> means not all - specifics were accounted for and thus the CPU a guest will see may - be different; <code>forbid</code> indicates that the CPU a guest will - see should match this CPU definition. - </dd> - - <dt><code>custom</code></dt> - <dd> - The <code>mode</code> element contains a list of supported CPU - models, each described by a dedicated <code>model</code> element. - The <code>usable</code> attribute specifies whether the model can - be used directly on the host. When usable='no' the corresponding model - cannot be used without disabling some features that the CPU of such - model is expected to have. A special value <code>unknown</code> - indicates libvirt does not have enough information to provide the - usability data. The <code>deprecated</code> attribute reflects - the hypervisor's policy on usage of this model - <span class="since">(since 7.1.0)</span>. - </dd> - </dl> - - <h3><a id="elementsIothreads">I/O Threads</a></h3> - - <p> - The <code>iothread</code> elements indicates whether or not - <a href="formatdomain.html#elementsIOThreadsAllocation">I/O threads</a> - are supported. - </p> - -<pre> -<domainCapabilities> - ... - <iothread supported='yes'/> - ... -<domainCapabilities> -</pre> - - <h3><a id="elementsMemoryBacking">Memory Backing</a></h3> - - <p> - The <code>memory backing</code> element indicates whether or not - <a href="formatdomain.html#memory-backing">memory backing</a> - is supported. - </p> - -<pre> -<domainCapabilities> - ... - <memoryBacking supported='yes'> - <enum name='sourceType'> - <value>anonymous</value> - <value>file</value> - <value>memfd</value> - </enum> - </memoryBacking> - ... 
-<domainCapabilities> -</pre> - - <dl> - <dt><code>sourceType</code></dt> - <dd>Options for the <code>type</code> attribute of the - <memoryBacking><source> element.</dd> - </dl> - - <h3><a id="elementsDevices">Devices</a></h3> - - <p> - Another set of XML elements describe the supported devices and their - capabilities. All devices occur as children of the main - <code>devices</code> element. - </p> - -<pre> -<domainCapabilities> - ... - <devices> - <disk supported='yes'> - <enum name='diskDevice'> - <value>disk</value> - <value>cdrom</value> - <value>floppy</value> - <value>lun</value> - </enum> - ... - </disk> - <hostdev supported='no'/> - </devices> -</domainCapabilities> -</pre> - - <p>Reported capabilities are expressed as an enumerated list of available - options for each of the element or attribute. For example, the - <disk/> element has an attribute <code>device</code> which can - support the values <code>disk</code>, <code>cdrom</code>, - <code>floppy</code>, or <code>lun</code>.</p> - - <h4><a id="elementsDisks">Hard drives, floppy disks, CDROMs</a></h4> - <p>Disk capabilities are exposed under the <code>disk</code> element. For - instance:</p> - -<pre> -<domainCapabilities> - ... - <devices> - <disk supported='yes'> - <enum name='diskDevice'> - <value>disk</value> - <value>cdrom</value> - <value>floppy</value> - <value>lun</value> - </enum> - <enum name='bus'> - <value>ide</value> - <value>fdc</value> - <value>scsi</value> - <value>virtio</value> - <value>xen</value> - <value>usb</value> - <value>sata</value> - <value>sd</value> - </enum> - </disk> - ... 
- </devices> -</domainCapabilities> -</pre> - - <dl> - <dt><code>diskDevice</code></dt> - <dd>Options for the <code>device</code> attribute of the <disk/> - element.</dd> - - <dt><code>bus</code></dt> - <dd>Options for the <code>bus</code> attribute of the <target/> - element for a <disk/>.</dd> - </dl> - - - <h4><a id="elementsGraphics">Graphical framebuffers</a></h4> - <p>Graphics device capabilities are exposed under the - <code>graphics</code> element. For instance:</p> - -<pre> -<domainCapabilities> - ... - <devices> - <graphics supported='yes'> - <enum name='type'> - <value>sdl</value> - <value>vnc</value> - <value>spice</value> - </enum> - </graphics> - ... - </devices> -</domainCapabilities> -</pre> - - <dl> - <dt><code>type</code></dt> - <dd>Options for the <code>type</code> attribute of the <graphics/> - element.</dd> - </dl> - - - <h4><a id="elementsVideo">Video device</a></h4> - <p>Video device capabilities are exposed under the - <code>video</code> element. For instance:</p> - -<pre> -<domainCapabilities> - ... - <devices> - <video supported='yes'> - <enum name='modelType'> - <value>vga</value> - <value>cirrus</value> - <value>vmvga</value> - <value>qxl</value> - <value>virtio</value> - </enum> - </video> - ... - </devices> -</domainCapabilities> -</pre> - - <dl> - <dt><code>modelType</code></dt> - <dd>Options for the <code>type</code> attribute of the - <video><model> element.</dd> - </dl> - - - <h4><a id="elementsHostDev">Host device assignment</a></h4> - <p>Some host devices can be passed through to a guest (e.g. USB, PCI and - SCSI). Well, only if the following is enabled:</p> - -<pre> -<domainCapabilities> - ... 
- <devices> - <hostdev supported='yes'> - <enum name='mode'> - <value>subsystem</value> - <value>capabilities</value> - </enum> - <enum name='startupPolicy'> - <value>default</value> - <value>mandatory</value> - <value>requisite</value> - <value>optional</value> - </enum> - <enum name='subsysType'> - <value>usb</value> - <value>pci</value> - <value>scsi</value> - </enum> - <enum name='capsType'> - <value>storage</value> - <value>misc</value> - <value>net</value> - </enum> - <enum name='pciBackend'> - <value>default</value> - <value>kvm</value> - <value>vfio</value> - <value>xen</value> - </enum> - </hostdev> - </devices> -</domainCapabilities> -</pre> - - <dl> - <dt><code>mode</code></dt> - <dd>Options for the <code>mode</code> attribute of the <hostdev/> - element.</dd> - - <dt><code>startupPolicy</code></dt> - <dd>Options for the <code>startupPolicy</code> attribute of the - <hostdev/> element.</dd> - - <dt><code>subsysType</code></dt> - <dd>Options for the <code>type</code> attribute of the <hostdev/> - element in case of <code>mode="subsystem"</code>.</dd> - - <dt><code>capsType</code></dt> - <dd>Options for the <code>type</code> attribute of the <hostdev/> - element in case of <code>mode="capabilities"</code>.</dd> - - <dt><code>pciBackend</code></dt> - <dd>Options for the <code>name</code> attribute of the <driver/> - element.</dd> - </dl> - - - <h4><a id="elementsRNG">RNG device</a></h4> - <p>RNG device capabilities are exposed under the - <code>rng</code> element. For instance:</p> - -<pre> -<domainCapabilities> - ... - <devices> - <rng supported='yes'> - <enum name='model'> - <value>virtio</value> - <value>virtio-transitional</value> - <value>virtio-non-transitional</value> - </enum> - <enum name='backendModel'> - <value>random</value> - <value>egd</value> - <value>builtin</value> - </enum> - </rng> - ... 
- </devices> -</domainCapabilities> -</pre> - - <dl> - <dt><code>model</code></dt> - <dd>Options for the <code>model</code> attribute of the - <rng> element.</dd> - <dt><code>backendModel</code></dt> - <dd>Options for the <code>model</code> attribute of the - <rng><backend> element.</dd> - </dl> - - - <h4><a id="elementsFilesystem">Filesystem device</a></h4> - <p>Filesystem device capabilities are exposed under the - <code>filesystem</code> element. For instance:</p> - -<pre> -<domainCapabilities> - ... - <devices> - <filesystem supported='yes'> - <enum name='driverType'> - <value>default</value> - <value>path</value> - <value>handle</value> - <value>virtiofs</value> - </enum> - </filesystem> - ... - </devices> -</domainCapabilities> -</pre> - - <dl> - <dt><code>driverType</code></dt> - <dd>Options for the <code>type</code> attribute of the - <filesystem><driver> element.</dd> - </dl> - - - <h3><a id="elementsFeatures">Features</a></h3> - - <p>One more set of XML elements describe the supported features and - their capabilities. All features occur as children of the main - <code>features</code> element.</p> - -<pre> -<domainCapabilities> - ... - <features> - <gic supported='yes'> - <enum name='version'> - <value>2</value> - <value>3</value> - </enum> - </gic> - <vmcoreinfo supported='yes'/> - <genid supported='yes'/> - <backingStoreInput supported='yes'/> - <backup supported='yes'/> - <sev> - <cbitpos>47</cbitpos> - <reduced-phys-bits>1</reduced-phys-bits> - </sev> - </features> -</domainCapabilities> -</pre> - - <p>Reported capabilities are expressed as an enumerated list of - possible values for each of the elements or attributes. For example, the - <code>gic</code> element has an attribute <code>version</code> which can - support the values <code>2</code> or <code>3</code>.</p> - - <p>For information about the purpose of each feature, see the - <a href="formatdomain.html#elementsFeatures">relevant section</a> in - the domain XML documentation. 
- </p> - - <h4><a id="elementsGIC">GIC capabilities</a></h4> - - <p>GIC capabilities are exposed under the <code>gic</code> element.</p> - - <dl> - <dt><code>version</code></dt> - <dd>Options for the <code>version</code> attribute of the - <code>gic</code> element.</dd> - </dl> - - <h4><a id="elementsvmcoreinfo">vmcoreinfo</a></h4> - - <p>Reports whether the vmcoreinfo feature can be enabled.</p> - - <h4><a id="elementsgenid">genid</a></h4> - - <p>Reports whether the genid feature can be used by the domain.</p> - - <h4><a id="featureBackingStoreInput">backingStoreInput</a></h4> - - <p>Reports whether the hypervisor will obey the <backingStore> - elements configured for a <disk> when booting the guest, hotplugging - the disk to a running guest, or similar. - <span class="since">(Since 5.10)</span> - </p> - - <h4><a id="featureBackup">backup</a></h4> - - <p>Reports whether the hypervisor supports the backup, checkpoint, and - related features. (<code>virDomainBackupBegin</code>, - <code>virDomainCheckpointCreateXML</code> etc). The presence of the - <code>backup</code> element even if <code>supported='no'</code> implies that - the <code>VIR_DOMAIN_UNDEFINE_CHECKPOINTS_METADATA</code> flag for - <code>virDomainUndefine</code> is supported. - </p> - - <h4><a id="elementsS390PV">s390-pv capability</a></h4> - - <p>Reports whether the hypervisor supports the Protected Virtualization. - In order to use Protected Virtualization with libvirt have a look at the - <a href="formatdomain.html#launchSecurity">launchSecurity element in the - domain XML</a>. For more details on the Protected Virtualization feature - please see <a href="kbase/s390_protected_virt.html">Protected - Virtualization on s390</a>. - </p> - - <h4><a id="elementsSEV">SEV capabilities</a></h4> - - <p>AMD Secure Encrypted Virtualization (SEV) capabilities are exposed under - the <code>sev</code> element. 
- SEV is an extension to the AMD-V architecture which supports running - virtual machines (VMs) under the control of a hypervisor. When supported, - guest owner can create a VM whose memory contents will be transparently - encrypted with a key unique to that VM.</p> - - <p> - For more details on the SEV feature, please follow resources in the - AMD developer's document store. In order to use SEV with libvirt have - a look at <a href="formatdomain.html#launchSecurity">SEV in domain XML</a> - </p> - - <dl> - <dt><code>cbitpos</code></dt> - <dd>When memory encryption is enabled, one of the physical address bits - (aka the C-bit) is utilized to mark if a memory page is protected. The - C-bit position is Hypervisor dependent.</dd> - <dt><code>reducedPhysBits</code></dt> - <dd>When memory encryption is enabled, we lose certain bits in physical - address space. The number of bits we lose is hypervisor dependent.</dd> - <dt><code>maxGuests</code></dt> - <dd>The maximum number of SEV guests that can be launched on the host. - This value may be configurable in the firmware for some hosts.</dd> - <dt><code>maxESGuests</code></dt> - <dd>The maximum number of SEV-ES guests that can be launched on the host. - This value may be configurable in the firmware for some hosts.</dd> - </dl> - - </body> -</html> diff --git a/docs/formatdomaincaps.rst b/docs/formatdomaincaps.rst new file mode 100644 index 0000000000..c07c07da4b --- /dev/null +++ b/docs/formatdomaincaps.rst @@ -0,0 +1,602 @@ +.. role:: since + +============================== +Domain capabilities XML format +============================== + +.. contents:: + +Overview +-------- + +Sometimes, when a new domain is to be created it may come handy to know the +capabilities of the hypervisor so the correct combination of devices and drivers +is used. 
For example, when management application is considering the mode for a +host device's passthrough there are several options depending not only on host, +but on hypervisor in question too. If the hypervisor is qemu then it needs to be +more recent to support VFIO, while legacy KVM is achievable just fine with older +qemus. + +The main difference between +`virConnectGetCapabilities </html/libvirt-libvirt-host.html#virConnectGetCapabilities>`__ +and the emulator capabilities API is, the former one aims more on the host +capabilities (e.g. NUMA topology, security models in effect, etc.) while the +latter one specializes on the hypervisor capabilities. + +While the `Driver Capabilities <formatcaps.html>`__ provides the host +capabilities (e.g NUMA topology, security models in effect, etc.), the Domain +Capabilities provides the hypervisor specific capabilities for Management +Applications to query and make decisions regarding what to utilize. + +The Domain Capabilities can provide information such as the correct combination +of devices and drivers that are supported. Knowing which host and hypervisor +specific options are available or supported would allow the management +application to choose an appropriate mode for a pass-through host device as well +as which adapter to utilize. + +Some XML elements may be entirely omitted from the domaincapabilities XML, +depending on what the libvirt driver has filled in. Applications should only act +on what is explicitly reported in the domaincapabilities XML. For example, if +<disk supported='yes'/> is present, you can safely assume the driver supports +<disk> devices. If <disk supported='no'/> is present, you can safely assume the +driver does NOT support <disk> devices. If the <disk> block is omitted entirely, +the driver is not indicating one way or the other whether it supports <disk> +devices, and applications should not interpret the missing block to mean any +thing in particular. 
+ +Element and attribute overview +------------------------------ + +A new query interface was added to the virConnect API's to retrieve the XML +listing of the set of domain capabilities ( :since:`Since 1.2.7` ): + +:: + + virConnectGetDomainCapabilities + +The root element that emulator capability XML document starts with has name +``domainCapabilities``. It contains at least four direct child elements: + +:: + + <domainCapabilities> + <path>/usr/bin/qemu-system-x86_64</path> + <domain>kvm</domain> + <machine>pc-i440fx-2.1</machine> + <arch>x86_64</arch> + ... + </domainCapabilities> + +``path`` + The full path to the emulator binary. +``domain`` + Describes the `virtualization type <formatdomain.html#elements>`__ (or so + called domain type). +``machine`` + The domain's `machine type <formatdomain.html#elementsOSBIOS>`__. Since not + every hypervisor has a sense of machine types this element might be omitted + in such drivers. +``arch`` + The domain's `architecture <formatdomain.html#elementsOSBIOS>`__. + +CPU Allocation +~~~~~~~~~~~~~~ + +Before any devices capability occurs, there might be info on domain wide +capabilities, e.g. virtual CPUs: + +:: + + <domainCapabilities> + ... + <vcpu max='255'/> + ... + </domainCapabilities> + +``vcpu`` + The maximum number of supported virtual CPUs + +BIOS bootloader +~~~~~~~~~~~~~~~ + +Sometimes users might want to tweak some BIOS knobs or use UEFI. For cases like +that, `os <formatdomain.html#elementsOSBIOS>`__ element exposes what values can +be passed to its children. + +:: + + <domainCapabilities> + ... 
+ <os supported='yes'> + <enum name='firmware'> + <value>bios</value> + <value>efi</value> + </enum> + <loader supported='yes'> + <value>/usr/share/OVMF/OVMF_CODE.fd</value> + <enum name='type'> + <value>rom</value> + <value>pflash</value> + </enum> + <enum name='readonly'> + <value>yes</value> + <value>no</value> + </enum> + <enum name='secure'> + <value>yes</value> + <value>no</value> + </enum> + </loader> + </os> + ... + <domainCapabilities> + +The ``firmware`` enum corresponds to the ``firmware`` attribute of the ``os`` +element in the domain XML. The presence of this enum means libvirt is capable of +the so-called firmware auto-selection feature. And the listed firmware values +represent the accepted input in the domain XML. Note that the ``firmware`` enum +reports only those values for which a firmware "descriptor file" exists on the +host. Firmware descriptor file is a small JSON document that describes details +about a given BIOS or UEFI binary on the host, e.g. the firmware binary path, +its architecture, supported machine types, NVRAM template, etc. This ensures +that the reported values won't cause a failure on guest boot. + +For the ``loader`` element, the following can occur: + +``value`` + List of known firmware binary paths. Currently this is used only to advertise + the known location of OVMF binaries for QEMU. OVMF binaries will only be + listed if they actually exist on host. +``type`` + Whether the boot loader is a typical BIOS (``rom``) or a UEFI firmware + (``pflash``). Each ``value`` sub-element under the ``type`` enum represents a + possible value for the ``type`` attribute for the <loader/> element in the + domain XML. E.g. the presence of ``pfalsh`` under the ``type`` enum means + that a domain XML can use UEFI firmware via: <loader/> type="pflash" + ...>/path/to/the/firmware/binary/</loader>. +``readonly`` + Options for the ``readonly`` attribute of the <loader/> element in the domain + XML. 
+``secure`` + Options for the ``secure`` attribute of the <loader/> element in the domain + XML. Note that the value ``yes`` is listed only if libvirt detects a firmware + descriptor file that has path to an OVMF binary that supports Secure boot, + and lists its architecture and supported machine type. + +CPU configuration +~~~~~~~~~~~~~~~~~ + +The ``cpu`` element exposes options usable for configuring `guest +CPUs <formatdomain.html#elementsCPU>`__. + +:: + + <domainCapabilities> + ... + <cpu> + <mode name='host-passthrough' supported='yes'> + <enum name='hostPassthroughMigratable'> + <value>on</value> + <value>off</value> + </enum> + </mode> + <mode name='maximum' supported='yes'> + <enum name='maximumMigratable'> + <value>on</value> + <value>off</value> + </enum> + </mode> + <mode name='host-model' supported='yes'> + <model fallback='allow'>Broadwell</model> + <vendor>Intel</vendor> + <feature policy='disable' name='aes'/> + <feature policy='require' name='vmx'/> + </mode> + <mode name='custom' supported='yes'> + <model usable='no' deprecated='no'>Broadwell</model> + <model usable='yes' deprecated='no'>Broadwell-noTSX</model> + <model usable='no' deprecated='yes'>Haswell</model> + ... + </mode> + </cpu> + ... + <domainCapabilities> + +Each CPU mode understood by libvirt is described with a ``mode`` element which +tells whether the particular mode is supported and provides (when applicable) +more details about it: + +``host-passthrough`` + The ``hostPassthroughMigratable`` enum shows possible values of the + ``migratable`` attribute for the <cpu> element with + ``mode='host-passthrough'`` in the domain XML. +``host-model`` + If ``host-model`` is supported by the hypervisor, the ``mode`` describes the + guest CPU which will be used when starting a domain with ``host-model`` CPU. + The hypervisor specifics (such as unsupported CPU models or features, machine + type, etc.) 
may be accounted for in this guest CPU specification and thus the + CPU can be different from the one shown in host capabilities XML. This is + indicated by the ``fallback`` attribute of the ``model`` sub element: + ``allow`` means not all specifics were accounted for and thus the CPU a guest + will see may be different; ``forbid`` indicates that the CPU a guest will see + should match this CPU definition. +``custom`` + The ``mode`` element contains a list of supported CPU models, each described + by a dedicated ``model`` element. The ``usable`` attribute specifies whether + the model can be used directly on the host. When usable='no' the + corresponding model cannot be used without disabling some features that the + CPU of such model is expected to have. A special value ``unknown`` indicates + libvirt does not have enough information to provide the usability data. The + ``deprecated`` attribute reflects the hypervisor's policy on usage of this + model :since:`(since 7.1.0)` . + +I/O Threads +~~~~~~~~~~~ + +The ``iothread`` elements indicates whether or not `I/O +threads <formatdomain.html#elementsIOThreadsAllocation>`__ are supported. + +:: + + <domainCapabilities> + ... + <iothread supported='yes'/> + ... + <domainCapabilities> + +Memory Backing +~~~~~~~~~~~~~~ + +The ``memory backing`` element indicates whether or not `memory +backing <formatdomain.html#memory-backing>`__ is supported. + +:: + + <domainCapabilities> + ... + <memoryBacking supported='yes'> + <enum name='sourceType'> + <value>anonymous</value> + <value>file</value> + <value>memfd</value> + </enum> + </memoryBacking> + ... + <domainCapabilities> + +``sourceType`` + Options for the ``type`` attribute of the <memoryBacking><source> element. + +Devices +~~~~~~~ + +Another set of XML elements describe the supported devices and their +capabilities. All devices occur as children of the main ``devices`` element. + +:: + + <domainCapabilities> + ... 
+ <devices> + <disk supported='yes'> + <enum name='diskDevice'> + <value>disk</value> + <value>cdrom</value> + <value>floppy</value> + <value>lun</value> + </enum> + ... + </disk> + <hostdev supported='no'/> + </devices> + </domainCapabilities> + +Reported capabilities are expressed as an enumerated list of available options +for each of the element or attribute. For example, the <disk/> element has an +attribute ``device`` which can support the values ``disk``, ``cdrom``, +``floppy``, or ``lun``. + +Hard drives, floppy disks, CDROMs +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Disk capabilities are exposed under the ``disk`` element. For instance: + +:: + + <domainCapabilities> + ... + <devices> + <disk supported='yes'> + <enum name='diskDevice'> + <value>disk</value> + <value>cdrom</value> + <value>floppy</value> + <value>lun</value> + </enum> + <enum name='bus'> + <value>ide</value> + <value>fdc</value> + <value>scsi</value> + <value>virtio</value> + <value>xen</value> + <value>usb</value> + <value>sata</value> + <value>sd</value> + </enum> + </disk> + ... + </devices> + </domainCapabilities> + +``diskDevice`` + Options for the ``device`` attribute of the <disk/> element. +``bus`` + Options for the ``bus`` attribute of the <target/> element for a <disk/>. + +Graphical framebuffers +^^^^^^^^^^^^^^^^^^^^^^ + +Graphics device capabilities are exposed under the ``graphics`` element. For +instance: + +:: + + <domainCapabilities> + ... + <devices> + <graphics supported='yes'> + <enum name='type'> + <value>sdl</value> + <value>vnc</value> + <value>spice</value> + </enum> + </graphics> + ... + </devices> + </domainCapabilities> + +``type`` + Options for the ``type`` attribute of the <graphics/> element. + +Video device +^^^^^^^^^^^^ + +Video device capabilities are exposed under the ``video`` element. For instance: + +:: + + <domainCapabilities> + ... 
+ <devices> + <video supported='yes'> + <enum name='modelType'> + <value>vga</value> + <value>cirrus</value> + <value>vmvga</value> + <value>qxl</value> + <value>virtio</value> + </enum> + </video> + ... + </devices> + </domainCapabilities> + +``modelType`` + Options for the ``type`` attribute of the <video><model> element. + +Host device assignment +^^^^^^^^^^^^^^^^^^^^^^ + +Some host devices can be passed through to a guest (e.g. USB, PCI and SCSI). +Well, only if the following is enabled: + +:: + + <domainCapabilities> + ... + <devices> + <hostdev supported='yes'> + <enum name='mode'> + <value>subsystem</value> + <value>capabilities</value> + </enum> + <enum name='startupPolicy'> + <value>default</value> + <value>mandatory</value> + <value>requisite</value> + <value>optional</value> + </enum> + <enum name='subsysType'> + <value>usb</value> + <value>pci</value> + <value>scsi</value> + </enum> + <enum name='capsType'> + <value>storage</value> + <value>misc</value> + <value>net</value> + </enum> + <enum name='pciBackend'> + <value>default</value> + <value>kvm</value> + <value>vfio</value> + <value>xen</value> + </enum> + </hostdev> + </devices> + </domainCapabilities> + +``mode`` + Options for the ``mode`` attribute of the <hostdev/> element. +``startupPolicy`` + Options for the ``startupPolicy`` attribute of the <hostdev/> element. +``subsysType`` + Options for the ``type`` attribute of the <hostdev/> element in case of + ``mode="subsystem"``. +``capsType`` + Options for the ``type`` attribute of the <hostdev/> element in case of + ``mode="capabilities"``. +``pciBackend`` + Options for the ``name`` attribute of the <driver/> element. + +RNG device +^^^^^^^^^^ + +RNG device capabilities are exposed under the ``rng`` element. For instance: + +:: + + <domainCapabilities> + ... 
+ <devices> + <rng supported='yes'> + <enum name='model'> + <value>virtio</value> + <value>virtio-transitional</value> + <value>virtio-non-transitional</value> + </enum> + <enum name='backendModel'> + <value>random</value> + <value>egd</value> + <value>builtin</value> + </enum> + </rng> + ... + </devices> + </domainCapabilities> + +``model`` + Options for the ``model`` attribute of the <rng> element. +``backendModel`` + Options for the ``model`` attribute of the <rng><backend> element. + +Filesystem device +^^^^^^^^^^^^^^^^^ + +Filesystem device capabilities are exposed under the ``filesystem`` element. For +instance: + +:: + + <domainCapabilities> + ... + <devices> + <filesystem supported='yes'> + <enum name='driverType'> + <value>default</value> + <value>path</value> + <value>handle</value> + <value>virtiofs</value> + </enum> + </filesystem> + ... + </devices> + </domainCapabilities> + +``driverType`` + Options for the ``type`` attribute of the <filesystem><driver> element. + +Features +~~~~~~~~ + +One more set of XML elements describe the supported features and their +capabilities. All features occur as children of the main ``features`` element. + +:: + + <domainCapabilities> + ... + <features> + <gic supported='yes'> + <enum name='version'> + <value>2</value> + <value>3</value> + </enum> + </gic> + <vmcoreinfo supported='yes'/> + <genid supported='yes'/> + <backingStoreInput supported='yes'/> + <backup supported='yes'/> + <sev> + <cbitpos>47</cbitpos> + <reduced-phys-bits>1</reduced-phys-bits> + </sev> + </features> + </domainCapabilities> + +Reported capabilities are expressed as an enumerated list of possible values for +each of the elements or attributes. For example, the ``gic`` element has an +attribute ``version`` which can support the values ``2`` or ``3``. + +For information about the purpose of each feature, see the `relevant +section <formatdomain.html#elementsFeatures>`__ in the domain XML documentation. 
+ +GIC capabilities +^^^^^^^^^^^^^^^^ + +GIC capabilities are exposed under the ``gic`` element. + +``version`` + Options for the ``version`` attribute of the ``gic`` element. + +vmcoreinfo +^^^^^^^^^^ + +Reports whether the vmcoreinfo feature can be enabled. + +genid +^^^^^ + +Reports whether the genid feature can be used by the domain. + +backingStoreInput +^^^^^^^^^^^^^^^^^ + +Reports whether the hypervisor will obey the <backingStore> elements configured +for a <disk> when booting the guest, hotplugging the disk to a running guest, or +similar. :since:`(Since 5.10)` + +backup +^^^^^^ + +Reports whether the hypervisor supports the backup, checkpoint, and related +features. (``virDomainBackupBegin``, ``virDomainCheckpointCreateXML`` etc). The +presence of the ``backup`` element even if ``supported='no'`` implies that the +``VIR_DOMAIN_UNDEFINE_CHECKPOINTS_METADATA`` flag for ``virDomainUndefine`` is +supported. + +s390-pv capability +^^^^^^^^^^^^^^^^^^ + +Reports whether the hypervisor supports the Protected Virtualization. In order +to use Protected Virtualization with libvirt have a look at the `launchSecurity +element in the domain XML <formatdomain.html#launchSecurity>`__. For more +details on the Protected Virtualization feature please see `Protected +Virtualization on s390 <kbase/s390_protected_virt.html>`__. + +SEV capabilities +^^^^^^^^^^^^^^^^ + +AMD Secure Encrypted Virtualization (SEV) capabilities are exposed under the +``sev`` element. SEV is an extension to the AMD-V architecture which supports +running virtual machines (VMs) under the control of a hypervisor. When +supported, guest owner can create a VM whose memory contents will be +transparently encrypted with a key unique to that VM. + +For more details on the SEV feature, please follow resources in the AMD +developer's document store. 
In order to use SEV with libvirt have a look at `SEV +in domain XML <formatdomain.html#launchSecurity>`__ + +``cbitpos`` + When memory encryption is enabled, one of the physical address bits (aka the + C-bit) is utilized to mark if a memory page is protected. The C-bit position + is Hypervisor dependent. +``reducedPhysBits`` + When memory encryption is enabled, we lose certain bits in physical address + space. The number of bits we lose is hypervisor dependent. +``maxGuests`` + The maximum number of SEV guests that can be launched on the host. This value + may be configurable in the firmware for some hosts. +``maxESGuests`` + The maximum number of SEV-ES guests that can be launched on the host. This + value may be configurable in the firmware for some hosts. diff --git a/docs/kbase/backing_chains.rst b/docs/kbase/backing_chains.rst index 89920a61b1..38a9a2337b 100644 --- a/docs/kbase/backing_chains.rst +++ b/docs/kbase/backing_chains.rst @@ -97,7 +97,7 @@ specification can be used: </disk> This makes libvirt follow the settings as configured in the XML. Note that this -is supported only when the https://libvirt.org/formatdomaincaps.html#featureBackingStoreInput +is supported only when the https://libvirt.org/formatdomaincaps.html#backingstoreinput capability is present. An empty ``<backingStore/>`` element signals the end of the chain. Using this diff --git a/docs/meson.build b/docs/meson.build index 95c9babcf5..81f348398d 100644 --- a/docs/meson.build +++ b/docs/meson.build @@ -22,7 +22,6 @@ docs_html_in_files = [ 'csharp', 'dbus', 'docs', - 'formatdomaincaps', 'formatnetwork', 'formatnetworkport', 'formatnode', @@ -85,6 +84,7 @@ docs_rst_files = [ 'formatcaps', 'formatcheckpoint', 'formatdomain', + 'formatdomaincaps', 'formatsecret', 'formatsnapshot', 'formatstorage', -- 2.35.1

On Mon, Mar 28, 2022 at 02:10:33PM +0200, Peter Krempa wrote:
Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- ...
+Element and attribute overview +------------------------------ + +A new query interface was added to the virConnect API's to retrieve the XML +listing of the set of domain capabilities ( :since:`Since 1.2.7` ): + +:: + + virConnectGetDomainCapabilities
Previously, ^this was both verbatim and an href, which rST still cannot do IIRC, so how about we use
`virConnectGetDomainCapabilities </html/libvirt-libvirt.host.html#virConnectGetDomainCapabilities>`__
instead of verbatim?

Reviewed-by: Erik Skultety <eskultet@redhat.com>
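
For illustration only, the hunk could then read roughly as below. This is an untested sketch, not a tested replacement: the link target is copied verbatim from the suggestion above and has not been checked against the generated API pages, and the sentence is reworded slightly (an assumption on my side) so that the link fits inline and the separate literal block can be dropped.

    +A new query interface,
    +`virConnectGetDomainCapabilities </html/libvirt-libvirt.host.html#virConnectGetDomainCapabilities>`__,
    +was added to the virConnect API's to retrieve the XML listing of the set of
    +domain capabilities ( :since:`Since 1.2.7` ).

The anonymous-hyperlink form (trailing double underscore) is the same one the converted page already uses for virConnectGetCapabilities in the Overview section, so no new markup convention would be introduced.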

Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/formatnetworkport.html.in | 223 --------------------------------- docs/formatnetworkport.rst | 175 ++++++++++++++++++++++++++ docs/meson.build | 2 +- 3 files changed, 176 insertions(+), 224 deletions(-) delete mode 100644 docs/formatnetworkport.html.in create mode 100644 docs/formatnetworkport.rst diff --git a/docs/formatnetworkport.html.in b/docs/formatnetworkport.html.in deleted file mode 100644 index 2d41552618..0000000000 --- a/docs/formatnetworkport.html.in +++ /dev/null @@ -1,223 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html> -<html xmlns="http://www.w3.org/1999/xhtml"> - <body> - <h1>Network XML format</h1> - - <ul id="toc"> - </ul> - - <p> - This page provides an introduction to the network port XML format. - This stores information about the connection between a virtual - interface of a virtual domain, and the virtual network it is - attached to. - </p> - - <h2><a id="elements">Element and attribute overview</a></h2> - - <p> - The root element required for all virtual network ports is - named <code>networkport</code> and has no configurable attributes - The network port XML format is available <span class="since">since - 5.5.0</span> - </p> - - <h3><a id="elementsMetadata">General metadata</a></h3> - - <p> - The first elements provide basic metadata about the virtual - network port. - </p> - - <pre> -<networkport> - <uuid>7ae63b5f-fe96-4af0-a7c3-da04ba1b3f54</uuid> - <owner> - <uuid>06578fc1-c686-46fa-bc2c-220893b466a6</uuid> - <name>myguest</name> - </owner> - <group>webfront</group> - <mac address='52:54:0:7b:35:93'/> - ...</pre> - - <dl> - <dt><code>uuid</code></dt> - <dd>The content of the <code>uuid</code> element provides - a globally unique identifier for the virtual network port. - The format must be RFC 4122 compliant, eg <code>3e3fce45-4f53-4fa7-bb32-11f34168b82b</code>. 
- If omitted when defining/creating a new network port, a random - UUID is generated.</dd> - <dd>The <code>owner</code> node records the domain object that - is the owner of the network port. It contains two child nodes: - <dl> - <dt><code>uuid</code></dt> - <dd>The content of the <code>uuid</code> element provides - a globally unique identifier for the virtual domain.</dd> - <dt><code>name</code></dt> - <dd>The unique name of the virtual domain</dd> - </dl> - </dd> - <dt><code>group</code></dt> - <dd>The port group in the virtual network to which the - port belongs. Can be omitted if no port groups are - defined on the network.</dd> - <dt><code>mac</code></dt> - <dd>The <code>address</code> attribute provides the MAC - address of the virtual port that will be see by the - guest. The MAC address must not start with 0xFE as this - byte is reserved for use on the host side of the port. - </dd> - </dl> - - <h3><a id="elementsCommon">Common elements</a></h3> - - <p> - The following elements are common to one or more of the plug - types listed later - </p> - - <pre> - ... - <bandwidth> - <inbound average='1000' peak='5000' floor='200' burst='1024'/> - <outbound average='128' peak='256' burst='256'/> - </bandwidth> - <rxfilters trustGuest='yes'/> - <port isolated='yes'/> - <virtualport type='802.1Qbg'> - <parameters managerid='11' typeid='1193047' typeidversion='2'/> - </virtualport> - ...</pre> - - <dl> - <dt><code>bandwidth</code></dt> - <dd>This part of the network port XML provides setting quality of service. - Incoming and outgoing traffic can be shaped independently. - The <code>bandwidth</code> element and its child elements are described - in the <a href="formatnetwork.html#elementQoS">QoS</a> section of - the Network XML. In addition the <code>classID</code> attribute may - exist to provide the ID of the traffic shaping class that is active. 
- </dd> - <dt><code>rxfilters</code></dt> - <dd>The <code>rxfilters</code> element property - <code>trustGuest</code> provides the - capability for the host to detect and trust reports from the - guest regarding changes to the interface mac address and receive - filters by setting the attribute to <code>yes</code>. The default - setting for the attribute is <code>no</code> for security - reasons and support depends on the guest network device model as - well as the type of connection on the host - currently it is - only supported for the virtio device model and for macvtap - connections on the host. - </dd> - <dt><code>port</code></dt> - <dd> <span class="since">Since 6.1.0.</span> - The <code>port</code> element property - <code>isolated</code>, when set to <code>yes</code> (default - setting is <code>no</code>) is used to isolate this port's - network traffic from other ports on the same network that also - have <code><port isolated='yes'/></code>. This setting - is only supported for emulated network devices connected to a - Linux host bridge via a standard tap device. - </dd> - <dt><code>virtualport</code></dt> - <dd>The <code>virtualport</code> element describes metadata that - needs to be provided to the underlying network subsystem. It - is described in the domain XML - <a href="formatdomain.html#elementsNICS">interface documentation</a>. - </dd> - </dl> - - - <h3><a id="elementsPlug">Plugs</a></h3> - - <p> - The <code>plug</code> element has varying content depending - on the value of the <code>type</code> attribute. - </p> - - <h4><a id="elementsPlugNetwork">Network</a></h4> - - <p> - The <code>network</code> plug type refers to a managed virtual - network plug that is based on a traditional software bridge - device privately managed by libvirt. - </p> - - <pre> - ... 
- <plug type='network' bridge='virbr0'/> - ...</pre> - - <p> - The <code>bridge</code> attribute provides the name of the - privately managed bridge device associated with the virtual - network. - </p> - - <h4><a id="elementsPlugNetwork">Bridge</a></h4> - - <p> - The <code>bridge</code> plug type refers to an externally - managed traditional software bridge. - </p> - - <pre> - ... - <plug type='bridge' bridge='br2'/> - ...</pre> - - <p> - The <code>bridge</code> attribute provides the name of the - externally managed bridge device associated with the virtual - network. - </p> - - <h4><a id="elementsPlugNetwork">Direct</a></h4> - - <p> - The <code>direct</code> plug type refers to a connection - directly to a physical network interface. - </p> - - <pre> - ... - <plug type='direct' dev='ens3' mode='vepa'/> - ...</pre> - - <p> - The <code>dev</code> attribute provides the name of the - physical network interface to which the port will be - connected. The <code>mode</code> attribute describes - how the connection will be setup and takes the same - values described in the - <a href="formatdomain.html#elementsNICSDirect">domain XML</a>. - </p> - - <h4><a id="elementsPlugNetwork">Host PCI</a></h4> - - <p> - The <code>hostdev-pci</code> plug type refers to the - passthrough of a physical PCI device rather than emulation. - </p> - - <pre> - ... - <plug type='hostdev-pci' managed='yes'> - <driver name='vfio'/> - <address domain='0x0001' bus='0x02' slot='0x03' function='0x4'/> - </plug> - ...</pre> - - <p> - The <code>managed</code> attribute indicates who is responsible for - managing the PCI device in the host. When set to the value <code>yes</code> - libvirt is responsible for automatically detaching the device from host - drivers and resetting it if needed. If the value is <code>no</code>, - some other party must ensure the device is not attached to any - host drivers. 
- </p> - - </body> -</html> diff --git a/docs/formatnetworkport.rst b/docs/formatnetworkport.rst new file mode 100644 index 0000000000..a85888907d --- /dev/null +++ b/docs/formatnetworkport.rst @@ -0,0 +1,175 @@ +.. role:: since + +================== +Network XML format +================== + +.. contents:: + +This page provides an introduction to the network port XML format. This stores +information about the connection between a virtual interface of a virtual +domain, and the virtual network it is attached to. + +Element and attribute overview +------------------------------ + +The root element required for all virtual network ports is named ``networkport`` +and has no configurable attributes The network port XML format is available +:since:`since 5.5.0` + +General metadata +~~~~~~~~~~~~~~~~ + +The first elements provide basic metadata about the virtual network port. + +:: + + <networkport> + <uuid>7ae63b5f-fe96-4af0-a7c3-da04ba1b3f54</uuid> + <owner> + <uuid>06578fc1-c686-46fa-bc2c-220893b466a6</uuid> + <name>myguest</name> + </owner> + <group>webfront</group> + <mac address='52:54:0:7b:35:93'/> + ... + +``uuid`` + The content of the ``uuid`` element provides a globally unique identifier for + the virtual network port. The format must be RFC 4122 compliant, eg + ``3e3fce45-4f53-4fa7-bb32-11f34168b82b``. If omitted when defining/creating a + new network port, a random UUID is generated. + The ``owner`` node records the domain object that is the owner of the network + port. It contains two child nodes: + + ``uuid`` + The content of the ``uuid`` element provides a globally unique identifier + for the virtual domain. + ``name`` + The unique name of the virtual domain +``group`` + The port group in the virtual network to which the port belongs. Can be + omitted if no port groups are defined on the network. +``mac`` + The ``address`` attribute provides the MAC address of the virtual port that + will be see by the guest. 
The MAC address must not start with 0xFE as this + byte is reserved for use on the host side of the port. + +Common elements +~~~~~~~~~~~~~~~ + +The following elements are common to one or more of the plug types listed later + +:: + + ... + <bandwidth> + <inbound average='1000' peak='5000' floor='200' burst='1024'/> + <outbound average='128' peak='256' burst='256'/> + </bandwidth> + <rxfilters trustGuest='yes'/> + <port isolated='yes'/> + <virtualport type='802.1Qbg'> + <parameters managerid='11' typeid='1193047' typeidversion='2'/> + </virtualport> + ... + +``bandwidth`` + This part of the network port XML provides setting quality of service. + Incoming and outgoing traffic can be shaped independently. The ``bandwidth`` + element and its child elements are described in the + `QoS <formatnetwork.html#elementQoS>`__ section of the Network XML. In + addition the ``classID`` attribute may exist to provide the ID of the traffic + shaping class that is active. +``rxfilters`` + The ``rxfilters`` element property ``trustGuest`` provides the capability for + the host to detect and trust reports from the guest regarding changes to the + interface mac address and receive filters by setting the attribute to + ``yes``. The default setting for the attribute is ``no`` for security reasons + and support depends on the guest network device model as well as the type of + connection on the host - currently it is only supported for the virtio device + model and for macvtap connections on the host. +``port`` + :since:`Since 6.1.0.` The ``port`` element property ``isolated``, when set to + ``yes`` (default setting is ``no``) is used to isolate this port's network + traffic from other ports on the same network that also have + ``<port isolated='yes'/>``. This setting is only supported for emulated + network devices connected to a Linux host bridge via a standard tap device. 
+``virtualport`` + The ``virtualport`` element describes metadata that needs to be provided to + the underlying network subsystem. It is described in the domain XML + `interface documentation <formatdomain.html#elementsNICS>`__. + +Plugs +~~~~~ + +The ``plug`` element has varying content depending on the value of the ``type`` +attribute. + +Network +^^^^^^^ + +The ``network`` plug type refers to a managed virtual network plug that is based +on a traditional software bridge device privately managed by libvirt. + +:: + + ... + <plug type='network' bridge='virbr0'/> + ... + +The ``bridge`` attribute provides the name of the privately managed bridge +device associated with the virtual network. + +Bridge +^^^^^^ + +The ``bridge`` plug type refers to an externally managed traditional software +bridge. + +:: + + ... + <plug type='bridge' bridge='br2'/> + ... + +The ``bridge`` attribute provides the name of the externally managed bridge +device associated with the virtual network. + +Direct +^^^^^^ + +The ``direct`` plug type refers to a connection directly to a physical network +interface. + +:: + + ... + <plug type='direct' dev='ens3' mode='vepa'/> + ... + +The ``dev`` attribute provides the name of the physical network interface to +which the port will be connected. The ``mode`` attribute describes how the +connection will be setup and takes the same values described in the `domain +XML <formatdomain.html#elementsNICSDirect>`__. + +Host PCI +^^^^^^^^ + +The ``hostdev-pci`` plug type refers to the passthrough of a physical PCI device +rather than emulation. + +:: + + ... + <plug type='hostdev-pci' managed='yes'> + <driver name='vfio'/> + <address domain='0x0001' bus='0x02' slot='0x03' function='0x4'/> + </plug> + ... + +The ``managed`` attribute indicates who is responsible for managing the PCI +device in the host. When set to the value ``yes`` libvirt is responsible for +automatically detaching the device from host drivers and resetting it if needed. 
+If the value is ``no``, some other party must ensure the device is not attached +to any host drivers. diff --git a/docs/meson.build b/docs/meson.build index 81f348398d..bb1359aacd 100644 --- a/docs/meson.build +++ b/docs/meson.build @@ -23,7 +23,6 @@ docs_html_in_files = [ 'dbus', 'docs', 'formatnetwork', - 'formatnetworkport', 'formatnode', 'formatnwfilter', 'formatstoragecaps', @@ -85,6 +84,7 @@ docs_rst_files = [ 'formatcheckpoint', 'formatdomain', 'formatdomaincaps', + 'formatnetworkport', 'formatsecret', 'formatsnapshot', 'formatstorage', -- 2.35.1
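
For anyone checking the converted page against real data, the elements it describes can be read with a few lines of Python. This is a minimal sketch and not part of the patch: the sample document is stitched together from the snippets on the page (the closing tag and the choice of a ``bridge`` plug are assumptions made for illustration), and ``summarize_networkport`` is a hypothetical helper name, not a libvirt API.

```python
import xml.etree.ElementTree as ET

# Sample assembled from the page's snippets; combining them into one
# document and adding the closing tag is an assumption for illustration.
NETWORKPORT_XML = """
<networkport>
  <uuid>7ae63b5f-fe96-4af0-a7c3-da04ba1b3f54</uuid>
  <owner>
    <uuid>06578fc1-c686-46fa-bc2c-220893b466a6</uuid>
    <name>myguest</name>
  </owner>
  <group>webfront</group>
  <mac address='52:54:0:7b:35:93'/>
  <port isolated='yes'/>
  <plug type='bridge' bridge='br2'/>
</networkport>
"""


def summarize_networkport(xml_text):
    """Hypothetical helper: collect the fields described on the page."""
    root = ET.fromstring(xml_text)
    if root.tag != "networkport":
        raise ValueError("root element must be 'networkport'")

    mac_elem = root.find("mac")
    if mac_elem is None:
        raise ValueError("missing 'mac' element")
    mac = mac_elem.get("address")

    # The page states that the MAC address must not start with 0xFE.
    if int(mac.split(":")[0], 16) == 0xFE:
        raise ValueError("MAC address must not start with 0xFE")

    port = root.find("port")
    plug = root.find("plug")
    return {
        "uuid": root.findtext("uuid"),
        "owner": root.findtext("owner/name"),
        "group": root.findtext("group"),
        "mac": mac,
        # 'isolated' defaults to 'no' when the element is absent.
        "isolated": port is not None and port.get("isolated", "no") == "yes",
        "plug_type": plug.get("type") if plug is not None else None,
    }


summary = summarize_networkport(NETWORKPORT_XML)
print(summary)
```

The helper only mirrors the rules spelled out in the text (root element name, the 0xFE restriction, the ``isolated`` default); it does not attempt to reproduce libvirt's own validation.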

The top level heading didn't contain the word 'port'.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
---
 docs/formatnetworkport.rst | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/formatnetworkport.rst b/docs/formatnetworkport.rst
index a85888907d..86a1bdb60b 100644
--- a/docs/formatnetworkport.rst
+++ b/docs/formatnetworkport.rst
@@ -1,8 +1,8 @@
 .. role:: since
 
-==================
-Network XML format
-==================
+=======================
+Network port XML format
+=======================
 
 .. contents::
 
--
2.35.1

On Mon, Mar 28, 2022 at 02:10:35PM +0200, Peter Krempa wrote:
The top level heading didn't contain the word 'port'.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
---

Reviewed-by: Erik Skultety <eskultet@redhat.com>

Note that if we want to preserve the link from the code block hilighting 'virConnectGetStoragePoolCapabilities' we'd need to re-stylize it as rST doesn't support nesting of links into preformatted code blocks. Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/formatstoragecaps.html.in | 95 ---------------------------------- docs/formatstoragecaps.rst | 81 +++++++++++++++++++++++++++++ docs/meson.build | 2 +- 3 files changed, 82 insertions(+), 96 deletions(-) delete mode 100644 docs/formatstoragecaps.html.in create mode 100644 docs/formatstoragecaps.rst diff --git a/docs/formatstoragecaps.html.in b/docs/formatstoragecaps.html.in deleted file mode 100644 index a9ecc371fa..0000000000 --- a/docs/formatstoragecaps.html.in +++ /dev/null @@ -1,95 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html> -<html xmlns="http://www.w3.org/1999/xhtml"> - <body> - <h1>Storage Pool Capabilities XML format</h1> - - <ul id="toc"></ul> - - <h2><a id="Overview">Overview</a></h2> - - <p>The Storage Pool Capabilities XML will provide the information - to determine what types of Storage Pools exist, whether the pool is - supported, and if relevant the source format types, the required - source elements, and the target volume format types. </p> - - <h2><a id="elements">Element and attribute overview</a></h2> - - <p>A query interface was added to the virConnect API's to retrieve the - XML listing of the set of Storage Pool Capabilities - (<span class="since">Since 5.2.0</span>):</p> - -<pre> -<a href="/html/libvirt-libvirt-storage.html#virConnectGetStoragePoolCapabilities">virConnectGetStoragePoolCapabilities</a> -</pre> - - <p>The root element that emulator capability XML document starts with is - named <code>storagepoolCapabilities</code>. There will be any number of - <code>pool</code> child elements with two attributes <code>type</code> - and <code>supported</code>. 
Each <code>pool</code> element may have - a <code>poolOptions</code> or <code>volOptions</code> subelements to - describe the available features. Sample XML output is:</p> - -<pre> -<storagepoolCapabilities> - <pool type='dir' supported='yes'> - <volOptions> - <defaultFormat type='raw'</> - <enum name='targetFormatType'> - <value>none</value> - <value>raw</value> - ... - </enum> - </volOptions> - </pool> - <pool type='fs' supported='yes'> - <poolOptions> - <defaultFormat type='auto'</> - <enum name='sourceFormatType'> - <value>auto</value> - <value>ext2</value> - ... - </enum> - </poolOptions> - <volOptions> - <defaultFormat type='raw'</> - <enum name='targetFormatType'> - <value>none</value> - <value>raw</value> - ... - </enum> - </volOptions> - </pool> - ... -</storagepoolCapabilities> -</pre> - - <p>The following section describes subelements of the - <code>poolOptions</code> and <code>volOptions</code> subelements </p>: - - <dl> - <dt><code>defaultFormat</code></dt> - <dd>For the <code>poolOptions</code>, the <code>type</code> attribute - describes the default format name used for the pool source. For the - <code>volOptions</code>, the <code>type</code> attribute describes - the default volume name used for each volume. - </dd> - <dl> - <dt><code>enum</code></dt> - <dd>Each enum uses a name from the list below with any number of - <code>value</code> value subelements describing the valid values. - <dl> - <dt><code>sourceFormatType</code></dt> - <dd>Lists all the possible <code>poolOptions</code> source - pool format types. - </dd> - <dt><code>targetFormatType</code></dt> - <dd>Lists all the possible <code>volOptions</code> target volume - format types. - </dd> - </dl> - </dd> - </dl> - </dl> - </body> -</html> diff --git a/docs/formatstoragecaps.rst b/docs/formatstoragecaps.rst new file mode 100644 index 0000000000..32cd392931 --- /dev/null +++ b/docs/formatstoragecaps.rst @@ -0,0 +1,81 @@ +.. 
role:: since + +==================================== +Storage Pool Capabilities XML format +==================================== + +.. contents:: + +Overview +-------- + +The Storage Pool Capabilities XML will provide the information to determine what +types of Storage Pools exist, whether the pool is supported, and if relevant the +source format types, the required source elements, and the target volume format +types. + +Element and attribute overview +------------------------------ + +A query interface was added to the virConnect API's to retrieve the XML listing +of the set of Storage Pool Capabilities ( :since:`Since 5.2.0` ): + +:: + + virConnectGetStoragePoolCapabilities + +The root element that emulator capability XML document starts with is named +``storagepoolCapabilities``. There will be any number of ``pool`` child elements +with two attributes ``type`` and ``supported``. Each ``pool`` element may have a +``poolOptions`` or ``volOptions`` subelements to describe the available +features. Sample XML output is: + +:: + + <storagepoolCapabilities> + <pool type='dir' supported='yes'> + <volOptions> + <defaultFormat type='raw'</> + <enum name='targetFormatType'> + <value>none</value> + <value>raw</value> + ... + </enum> + </volOptions> + </pool> + <pool type='fs' supported='yes'> + <poolOptions> + <defaultFormat type='auto'</> + <enum name='sourceFormatType'> + <value>auto</value> + <value>ext2</value> + ... + </enum> + </poolOptions> + <volOptions> + <defaultFormat type='raw'</> + <enum name='targetFormatType'> + <value>none</value> + <value>raw</value> + ... + </enum> + </volOptions> + </pool> + ... + </storagepoolCapabilities> + +The following section describes subelements of the ``poolOptions`` and +``volOptions`` subelements + +``defaultFormat`` + For the ``poolOptions``, the ``type`` attribute describes the default format + name used for the pool source. 
For the ``volOptions``, the ``type`` attribute + describes the default volume name used for each volume. +``enum`` + Each enum uses a name from the list below with any number of ``value`` value + subelements describing the valid values. + + ``sourceFormatType`` + Lists all the possible ``poolOptions`` source pool format types. + ``targetFormatType`` + Lists all the possible ``volOptions`` target volume format types. diff --git a/docs/meson.build b/docs/meson.build index bb1359aacd..b911292480 100644 --- a/docs/meson.build +++ b/docs/meson.build @@ -25,7 +25,6 @@ docs_html_in_files = [ 'formatnetwork', 'formatnode', 'formatnwfilter', - 'formatstoragecaps', 'formatstorageencryption', 'hooks', 'index', @@ -88,6 +87,7 @@ docs_rst_files = [ 'formatsecret', 'formatsnapshot', 'formatstorage', + 'formatstoragecaps', 'glib-adoption', 'goals', 'governance', -- 2.35.1
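
As a quick way to sanity-check the structure described on this page, the capabilities XML can be walked with Python's standard library. This is a minimal sketch under stated assumptions: the sample is the page's own example with the elided parts dropped, ``parse_pool_caps`` is a hypothetical helper name, and the ``defaultFormat`` elements are written in self-closing form because the ``<defaultFormat type='raw'</>`` spelling shown in the page is not well-formed XML and a parser would reject it.

```python
import xml.etree.ElementTree as ET

# The page's sample output, with '...' placeholders removed and
# defaultFormat written as a well-formed self-closing element.
CAPS_XML = """
<storagepoolCapabilities>
  <pool type='dir' supported='yes'>
    <volOptions>
      <defaultFormat type='raw'/>
      <enum name='targetFormatType'>
        <value>none</value>
        <value>raw</value>
      </enum>
    </volOptions>
  </pool>
  <pool type='fs' supported='yes'>
    <poolOptions>
      <defaultFormat type='auto'/>
      <enum name='sourceFormatType'>
        <value>auto</value>
        <value>ext2</value>
      </enum>
    </poolOptions>
    <volOptions>
      <defaultFormat type='raw'/>
      <enum name='targetFormatType'>
        <value>none</value>
        <value>raw</value>
      </enum>
    </volOptions>
  </pool>
</storagepoolCapabilities>
"""


def parse_pool_caps(xml_text):
    """Hypothetical helper: map each pool type to its options."""
    root = ET.fromstring(xml_text)
    pools = {}
    for pool in root.findall("pool"):
        info = {"supported": pool.get("supported") == "yes"}
        # Each pool may carry poolOptions, volOptions, both, or neither.
        for section in ("poolOptions", "volOptions"):
            opts = pool.find(section)
            if opts is None:
                continue
            default = opts.find("defaultFormat")
            info[section] = {
                "default": default.get("type") if default is not None else None,
                "enums": {
                    enum.get("name"): [v.text for v in enum.findall("value")]
                    for enum in opts.findall("enum")
                },
            }
        pools[pool.get("type")] = info
    return pools


caps = parse_pool_caps(CAPS_XML)
print(caps["fs"]["poolOptions"]["enums"]["sourceFormatType"])
```

In real use the input would be the XML returned by virConnectGetStoragePoolCapabilities rather than a literal string.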

On Mon, Mar 28, 2022 at 02:10:36PM +0200, Peter Krempa wrote:
Note that if we want to preserve the link from the code block highlighting 'virConnectGetStoragePoolCapabilities', we'd need to re-stylize it, as rST doesn't support nesting of links into preformatted code blocks.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
---
...
+A query interface was added to the virConnect API's to retrieve the XML listing
+of the set of Storage Pool Capabilities ( :since:`Since 5.2.0` ):
+
+::
+
+   virConnectGetStoragePoolCapabilities
Here too it should IMO be a ref instead of verbatim.

Reviewed-by: Erik Skultety <eskultet@redhat.com>

Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/formatstorageencryption.html.in | 181 --------------------------- docs/formatstorageencryption.rst | 142 +++++++++++++++++++++ docs/meson.build | 2 +- 3 files changed, 143 insertions(+), 182 deletions(-) delete mode 100644 docs/formatstorageencryption.html.in create mode 100644 docs/formatstorageencryption.rst diff --git a/docs/formatstorageencryption.html.in b/docs/formatstorageencryption.html.in deleted file mode 100644 index 395a7269b1..0000000000 --- a/docs/formatstorageencryption.html.in +++ /dev/null @@ -1,181 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html> -<html xmlns="http://www.w3.org/1999/xhtml"> - <body> - <h1>Storage volume encryption XML format</h1> - - <ul id="toc"></ul> - - <h2><a id="StorageEncryption">Storage volume encryption XML</a></h2> - - <p> - Storage volumes may be encrypted, the XML snippet described below is used - to represent the details of the encryption. It can be used as a part - of a domain or storage configuration. - </p> - <p> - The top-level tag of volume encryption specification - is <code>encryption</code>, with a mandatory - attribute <code>format</code>. Currently defined values - of <code>format</code> are <code>default</code>, <code>qcow</code>, - <code>luks</code>, and <code>luks2</code>. - Each value of <code>format</code> implies some expectations about the - content of the <code>encryption</code> tag. Other format values may be - defined in the future. - </p> - <p> - The <code>encryption</code> tag supports an optional <code>engine</code> - tag, which allows selecting which component actually handles - the encryption. Currently defined values of <code>engine</code> are - <code>qemu</code> and <code>librbd</code>. - Both <code>qemu</code> and <code>librbd</code> require using the qemu - driver. - The <code>librbd</code> engine requires qemu version >= 6.1.0, both - ceph cluster and librbd1 >= 16.1.0, and is only applicable for RBD - network disks. 
- If the engine tag is not specified, the <code>qemu</code> engine will be - used by default (assuming the qemu driver is used). - Note that <code>librbd</code> engine is currently only supported by the - qemu VM driver, and is not supported by the storage driver. Furthermore, - the storage driver currently ignores the <code>engine</code> tag. - </p> - <p> - The <code>encryption</code> tag can currently contain a sequence of - <code>secret</code> tags, each with mandatory attributes <code>type</code> - and either <code>uuid</code> or <code>usage</code> - (<span class="since">since 2.1.0</span>). The only currently defined - value of <code>type</code> is <code>volume</code>. The - <code>uuid</code> is "uuid" of the <code>secret</code> while - <code>usage</code> is the "usage" subelement field. - A secret value can be set in libvirt by the - <a href="html/libvirt-libvirt-secret.html#virSecretSetValue"> - <code>virSecretSetValue</code></a> API. Alternatively, if supported - by the particular volume format and driver, automatically generate a - secret value at the time of volume creation, and store it using the - specified <code>uuid</code>. - </p> - <h3><a id="StorageEncryptionDefault">"default" format</a></h3> - <h3><a id="StorageEncryptionQcow">"qcow" format</a></h3> - <p> - <span class="since">Since 4.5.0,</span> encryption formats - <code>default</code> and <code>qcow</code> may no longer be used - to create an encrypted volume. Usage of qcow encrypted volumes - in QEMU began phasing out in QEMU 2.3 and by QEMU 2.9 creation - of a qcow encrypted volume via qemu-img required usage of secret - objects, but that support was not added to libvirt. - </p> - <h3><a id="StorageEncryptionLuks">"luks" format</a></h3> - <p> - The <code>luks</code> format is specific to a luks encrypted volume - and the secret is used in order to either encrypt during volume creation - or decrypt the volume for usage by the domain. 
A single - <code><secret type='passphrase'...></code> element is expected. - <span class="since">Since 2.1.0</span>. - </p> - <p> - For volume creation, it is possible to specify the encryption - algorithm used to encrypt the luks volume. The following two - optional elements may be provided for that purpose. It is hypervisor - dependent as to which algorithms are supported. The default algorithm - used by the storage driver backend when using qemu-img to create - the volume is 'aes-256-cbc' using 'essiv' for initialization vector - generation and 'sha256' hash algorithm for both the cipher and the - initialization vector generation. - </p> - - <dl> - <dt><code>cipher</code></dt> - <dd>This element describes the cipher algorithm to be used to either - encrypt or decrypt the luks volume. This element has the following - attributes: - <dl> - <dt><code>name</code></dt> - <dd>The name of the cipher algorithm used for data encryption, - such as 'aes', 'des', 'cast5', 'serpent', 'twofish', etc. - Support of the specific algorithm is storage driver - implementation dependent.</dd> - <dt><code>size</code></dt> - <dd>The size of the cipher in bits, such as '256', '192', '128', - etc. Support of the specific size for a specific cipher is - hypervisor dependent.</dd> - <dt><code>mode</code></dt> - <dd>An optional cipher algorithm mode such as 'cbc', 'xts', - 'ecb', etc. Support of the specific cipher mode is - hypervisor dependent.</dd> - <dt><code>hash</code></dt> - <dd>An optional master key hash algorithm such as 'md5', 'sha1', - 'sha256', etc. Support of the specific hash algorithm is - hypervisor dependent.</dd> - </dl> - </dd> - <dt><code>ivgen</code></dt> - <dd>This optional element describes the initialization vector - generation algorithm used in conjunction with the - <code>cipher</code>. If the <code>cipher</code> is not provided, - then an error will be generated by the parser. 
- <dl> - <dt><code>name</code></dt> - <dd>The name of the algorithm, such as 'plain', 'plain64', - 'essiv', etc. Support of the specific algorithm is hypervisor - dependent.</dd> - <dt><code>hash</code></dt> - <dd>An optional hash algorithm such as 'md5', 'sha1', 'sha256', - etc. Support of the specific ivgen hash algorithm is hypervisor - dependent.</dd> - </dl> - </dd> - </dl> - - <h3><a id="StorageEncryptionLuks2">"luks2" format</a></h3> - <p> - The <code>luks2</code> format is currently supported only by the - <code>librbd</code> engine, and can only be applied to RBD network disks - (RBD images). - Since the <code>librbd</code> engine is currently not supported by the - libvirt storage driver, you cannot use it to control such disks. However, - pre-formatted RBD luks2 disks can be loaded to a qemu VM using the qemu - VM driver. - A single - <code><secret type='passphrase'...></code> element is expected. - </p> - - - <h2><a id="example">Examples</a></h2> - - <p> - Assuming a <a href="formatsecret.html#usage-type-volume"> - <code>luks volume type secret</code></a> is already defined, - a simple example specifying use of the <code>luks</code> format - for either volume creation without a specific cipher being defined or - as part of a domain volume definition: - </p> - <pre> -<encryption format='luks'> - <secret type='passphrase' uuid='f52a81b2-424e-490c-823d-6bd4235bc572'/> -</encryption> - </pre> - - <p> - Here is an example specifying use of the <code>luks</code> format for - a specific cipher algorithm for volume creation. - <span class="since">Since 6.10.0,</span> the <code>target</code> format - can also support <code>qcow2</code> type with <code>luks</code> encryption. 
- </p> - <pre> -<volume> - <name>twofish.luks</name> - <capacity unit='G'>5</capacity> - <target> - <path>/var/lib/libvirt/images/demo.luks</path> - <format type='raw'/> - <encryption format='luks'> - <secret type='passphrase' uuid='f52a81b2-424e-490c-823d-6bd4235bc572'/> - <cipher name='twofish' size='256' mode='cbc' hash='sha256'/> - <ivgen name='plain64' hash='sha256'/> - </encryption> - </target> -</volume> - </pre> - - </body> -</html> diff --git a/docs/formatstorageencryption.rst b/docs/formatstorageencryption.rst new file mode 100644 index 0000000000..7b5cccaf5e --- /dev/null +++ b/docs/formatstorageencryption.rst @@ -0,0 +1,142 @@ +.. role:: since + +==================================== +Storage volume encryption XML format +==================================== + +.. contents:: + +Storage volume encryption XML +----------------------------- + +Storage volumes may be encrypted, the XML snippet described below is used to +represent the details of the encryption. It can be used as a part of a domain or +storage configuration. + +The top-level tag of volume encryption specification is ``encryption``, with a +mandatory attribute ``format``. Currently defined values of ``format`` are +``default``, ``qcow``, ``luks``, and ``luks2``. Each value of ``format`` implies +some expectations about the content of the ``encryption`` tag. Other format +values may be defined in the future. + +The ``encryption`` tag supports an optional ``engine`` tag, which allows +selecting which component actually handles the encryption. Currently defined +values of ``engine`` are ``qemu`` and ``librbd``. Both ``qemu`` and ``librbd`` +require using the qemu driver. The ``librbd`` engine requires qemu version >= +6.1.0, both ceph cluster and librbd1 >= 16.1.0, and is only applicable for RBD +network disks. If the engine tag is not specified, the ``qemu`` engine will be +used by default (assuming the qemu driver is used). 
Note that ``librbd`` engine +is currently only supported by the qemu VM driver, and is not supported by the +storage driver. Furthermore, the storage driver currently ignores the ``engine`` +tag. + +The ``encryption`` tag can currently contain a sequence of ``secret`` tags, each +with mandatory attributes ``type`` and either ``uuid`` or ``usage`` ( +:since:`since 2.1.0` ). The only currently defined value of ``type`` is +``volume``. The ``uuid`` is "uuid" of the ``secret`` while ``usage`` is the +"usage" subelement field. A secret value can be set in libvirt by the +`virSecretSetValue <html/libvirt-libvirt-secret.html#virSecretSetValue>`__ API. +Alternatively, if supported by the particular volume format and driver, +automatically generate a secret value at the time of volume creation, and store +it using the specified ``uuid``. + +"default" format +~~~~~~~~~~~~~~~~ + +"qcow" format +~~~~~~~~~~~~~ + +:since:`Since 4.5.0,` encryption formats ``default`` and ``qcow`` may no longer +be used to create an encrypted volume. Usage of qcow encrypted volumes in QEMU +began phasing out in QEMU 2.3 and by QEMU 2.9 creation of a qcow encrypted +volume via qemu-img required usage of secret objects, but that support was not +added to libvirt. + +"luks" format +~~~~~~~~~~~~~ + +The ``luks`` format is specific to a luks encrypted volume and the secret is +used in order to either encrypt during volume creation or decrypt the volume for +usage by the domain. A single ``<secret type='passphrase'...>`` element is +expected. :since:`Since 2.1.0` . + +For volume creation, it is possible to specify the encryption algorithm used to +encrypt the luks volume. The following two optional elements may be provided for +that purpose. It is hypervisor dependent as to which algorithms are supported. 
+The default algorithm used by the storage driver backend when using qemu-img to +create the volume is 'aes-256-cbc' using 'essiv' for initialization vector +generation and 'sha256' hash algorithm for both the cipher and the +initialization vector generation. + +``cipher`` + This element describes the cipher algorithm to be used to either encrypt or + decrypt the luks volume. This element has the following attributes: + + ``name`` + The name of the cipher algorithm used for data encryption, such as 'aes', + 'des', 'cast5', 'serpent', 'twofish', etc. Support of the specific + algorithm is storage driver implementation dependent. + ``size`` + The size of the cipher in bits, such as '256', '192', '128', etc. Support + of the specific size for a specific cipher is hypervisor dependent. + ``mode`` + An optional cipher algorithm mode such as 'cbc', 'xts', 'ecb', etc. + Support of the specific cipher mode is hypervisor dependent. + ``hash`` + An optional master key hash algorithm such as 'md5', 'sha1', 'sha256', + etc. Support of the specific hash algorithm is hypervisor dependent. +``ivgen`` + This optional element describes the initialization vector generation + algorithm used in conjunction with the ``cipher``. If the ``cipher`` is not + provided, then an error will be generated by the parser. + + ``name`` + The name of the algorithm, such as 'plain', 'plain64', 'essiv', etc. + Support of the specific algorithm is hypervisor dependent. + ``hash`` + An optional hash algorithm such as 'md5', 'sha1', 'sha256', etc. Support + of the specific ivgen hash algorithm is hypervisor dependent. + +"luks2" format +~~~~~~~~~~~~~~ + +The ``luks2`` format is currently supported only by the ``librbd`` engine, and +can only be applied to RBD network disks (RBD images). Since the ``librbd`` +engine is currently not supported by the libvirt storage driver, you cannot use +it to control such disks. However, pre-formatted RBD luks2 disks can be loaded +to a qemu VM using the qemu VM driver. 
A single +``<secret type='passphrase'...>`` element is expected. + +Examples +-------- + +Assuming a `luks volume type secret <formatsecret.html#VolumeUsageType>`__ is +already defined, a simple example specifying use of the ``luks`` format for +either volume creation without a specific cipher being defined or as part of a +domain volume definition: + +:: + + <encryption format='luks'> + <secret type='passphrase' uuid='f52a81b2-424e-490c-823d-6bd4235bc572'/> + </encryption> + +Here is an example specifying use of the ``luks`` format for a specific cipher +algorithm for volume creation. :since:`Since 6.10.0,` the ``target`` format can +also support ``qcow2`` type with ``luks`` encryption. + +:: + + <volume> + <name>twofish.luks</name> + <capacity unit='G'>5</capacity> + <target> + <path>/var/lib/libvirt/images/demo.luks</path> + <format type='raw'/> + <encryption format='luks'> + <secret type='passphrase' uuid='f52a81b2-424e-490c-823d-6bd4235bc572'/> + <cipher name='twofish' size='256' mode='cbc' hash='sha256'/> + <ivgen name='plain64' hash='sha256'/> + </encryption> + </target> + </volume> diff --git a/docs/meson.build b/docs/meson.build index b911292480..22eca7d8bd 100644 --- a/docs/meson.build +++ b/docs/meson.build @@ -25,7 +25,6 @@ docs_html_in_files = [ 'formatnetwork', 'formatnode', 'formatnwfilter', - 'formatstorageencryption', 'hooks', 'index', 'internals', @@ -88,6 +87,7 @@ docs_rst_files = [ 'formatsnapshot', 'formatstorage', 'formatstoragecaps', + 'formatstorageencryption', 'glib-adoption', 'goals', 'governance', -- 2.35.1
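
The constraints this page places on the ``encryption`` element can be expressed as a small check. This is a minimal sketch, not libvirt's parser: ``check_encryption`` and ``KNOWN_FORMATS`` are hypothetical names, and only rules stated in the text are covered (mandatory ``format`` attribute with its currently defined values, ``secret`` needing ``type`` plus ``uuid`` or ``usage``, and ``ivgen`` being an error without ``cipher``).

```python
import xml.etree.ElementTree as ET

# Values of 'format' that the page lists as currently defined.
KNOWN_FORMATS = {"default", "qcow", "luks", "luks2"}


def check_encryption(xml_text):
    """Hypothetical helper: return a list of problems found in the snippet."""
    enc = ET.fromstring(xml_text)
    if enc.tag != "encryption":
        return ["top-level tag must be 'encryption'"]

    problems = []
    fmt = enc.get("format")
    if fmt is None:
        problems.append("missing mandatory 'format' attribute")
    elif fmt not in KNOWN_FORMATS:
        problems.append(f"unknown format {fmt!r}")

    for secret in enc.findall("secret"):
        if secret.get("type") is None:
            problems.append("'secret' is missing 'type'")
        if secret.get("uuid") is None and secret.get("usage") is None:
            problems.append("'secret' needs either 'uuid' or 'usage'")

    # The page says the parser reports an error for ivgen without cipher.
    if enc.find("ivgen") is not None and enc.find("cipher") is None:
        problems.append("'ivgen' given without 'cipher'")
    return problems


GOOD = """
<encryption format='luks'>
  <secret type='passphrase' uuid='f52a81b2-424e-490c-823d-6bd4235bc572'/>
  <cipher name='twofish' size='256' mode='cbc' hash='sha256'/>
  <ivgen name='plain64' hash='sha256'/>
</encryption>
"""

# Same snippet with the cipher element removed, which the page treats as an error.
BAD = """
<encryption format='luks'>
  <secret type='passphrase' uuid='f52a81b2-424e-490c-823d-6bd4235bc572'/>
  <ivgen name='plain64' hash='sha256'/>
</encryption>
"""

good_problems = check_encryption(GOOD)
bad_problems = check_encryption(BAD)
print(good_problems, bad_problems)
```

The check deliberately does not validate the value of ``type`` on ``secret``, since the text names ``volume`` as the only defined value while the examples use ``passphrase``.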

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
---
 docs/formatstorageencryption.rst | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/docs/formatstorageencryption.rst b/docs/formatstorageencryption.rst
index 7b5cccaf5e..1c7227040e 100644
--- a/docs/formatstorageencryption.rst
+++ b/docs/formatstorageencryption.rst
@@ -40,9 +40,6 @@ Alternatively, if supported by the particular volume format and driver,
 automatically generate a secret value at the time of volume creation, and store
 it using the specified ``uuid``.
 
-"default" format
-~~~~~~~~~~~~~~~~
-
 "qcow" format
 ~~~~~~~~~~~~~
 
--
2.35.1

Use backticks to force monospace font instead of double quotes.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
---
 docs/formatstorageencryption.rst | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/docs/formatstorageencryption.rst b/docs/formatstorageencryption.rst
index 1c7227040e..11bea53cb8 100644
--- a/docs/formatstorageencryption.rst
+++ b/docs/formatstorageencryption.rst
@@ -40,8 +40,8 @@ Alternatively, if supported by the particular volume format and driver,
 automatically generate a secret value at the time of volume creation, and store
 it using the specified ``uuid``.
 
-"qcow" format
-~~~~~~~~~~~~~
+``qcow`` format
+~~~~~~~~~~~~~~~
 
 :since:`Since 4.5.0,` encryption formats ``default`` and ``qcow`` may no longer
 be used to create an encrypted volume. Usage of qcow encrypted volumes in QEMU
@@ -49,8 +49,8 @@ began phasing out in QEMU 2.3 and by QEMU 2.9 creation of a qcow encrypted
 volume via qemu-img required usage of secret objects, but that support was not
 added to libvirt.
 
-"luks" format
-~~~~~~~~~~~~~
+``luks`` format
+~~~~~~~~~~~~~~~
 
 The ``luks`` format is specific to a luks encrypted volume and the secret is
 used in order to either encrypt during volume creation or decrypt the volume for
@@ -94,8 +94,8 @@ initialization vector generation.
       An optional hash algorithm such as 'md5', 'sha1', 'sha256', etc. Support
       of the specific ivgen hash algorithm is hypervisor dependent.
 
-"luks2" format
-~~~~~~~~~~~~~~
+``luks2`` format
+~~~~~~~~~~~~~~~~
 
 The ``luks2`` format is currently supported only by the ``librbd`` engine, and
 can only be applied to RBD network disks (RBD images). Since the ``librbd``
--
2.35.1

On Mon, Mar 28, 2022 at 02:10:39PM +0200, Peter Krempa wrote:
Use backticks to force monospace font instead of double quotes.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
---

Reviewed-by: Erik Skultety <eskultet@redhat.com>

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
---
 docs/hooks.html.in | 406 -----------------------------------
 docs/hooks.rst     | 518 +++++++++++++++++++++++++++++++++++++++++++++
 docs/meson.build   |   2 +-
 3 files changed, 519 insertions(+), 407 deletions(-)
 delete mode 100644 docs/hooks.html.in
 create mode 100644 docs/hooks.rst

diff --git a/docs/hooks.html.in b/docs/hooks.html.in
deleted file mode 100644
index bbbc414dc4..0000000000
--- a/docs/hooks.html.in
+++ /dev/null
@@ -1,406 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE html>
-<html xmlns="http://www.w3.org/1999/xhtml">
-  <body>
-    <h1>Hooks for specific system management</h1>
-
-    <ul id="toc"></ul>
-
-    <h2><a id="intro">Custom event scripts</a></h2>
-    <p>Beginning with libvirt 0.8.0, specific events on a host system will
-      trigger custom scripts.</p>
-    <p>These custom <b>hook</b> scripts are executed when any of the following
-      actions occur:</p>
-    <ul>
-      <li>The libvirt daemon starts, stops, or reloads its
-        configuration
-        (<span class="since">since 0.8.0</span>)<br/><br/></li>
-      <li>A QEMU guest is started or stopped
-        (<span class="since">since 0.8.0</span>)<br/><br/></li>
-      <li>An LXC guest is started or stopped
-        (<span class="since">since 0.8.0</span>)<br/><br/></li>
-      <li>A libxl-handled Xen guest is started or stopped
-        (<span class="since">since 2.1.0</span>)<br/><br/></li>
-      <li>A network is started or stopped or an interface is
-        plugged/unplugged to/from the network
-        (<span class="since">since 1.2.2</span>)<br/><br/></li>
-    </ul>
-
-    <h2><a id="location">Script location</a></h2>
-    <p>The libvirt hook scripts are located in the directory
-      <code>$SYSCONFDIR/libvirt/hooks/</code>.</p>
-    <ul>
-      <li>In Linux distributions such as Fedora and RHEL, this is
-        <code>/etc/libvirt/hooks/</code>. Other Linux distributions may do
-        this differently.</li>
-      <li>If your installation of libvirt has instead been compiled from
-        source, it is likely to be
-        <code>/usr/local/etc/libvirt/hooks/</code>.</li>
-      <li><span class="since">Since 6.5.0</span>, you can also place several
-        hook scripts in the directories
-        <code>/etc/libvirt/hooks/<driver>.d/</code>.</li>
-    </ul>
-    <p>To use hook scripts, you will need to create this <code>hooks</code>
-      directory manually, place the desired hook scripts inside, then make
-      them executable.</p>
-    <br/>
-
-    <h2><a id="names">Script names</a></h2>
-    <p>At present, there are five hook scripts that can be called:</p>
-    <ul>
-      <li><code>/etc/libvirt/hooks/daemon</code><br/><br/>
-        Executed when the libvirt daemon is started, stopped, or reloads
-        its configuration<br/><br/></li>
-      <li><code>/etc/libvirt/hooks/qemu</code><br/><br/>
-        Executed when a QEMU guest is started, stopped, or migrated<br/><br/></li>
-      <li><code>/etc/libvirt/hooks/lxc</code><br /><br/>
-        Executed when an LXC guest is started or stopped</li>
-      <li><code>/etc/libvirt/hooks/libxl</code><br/><br/>
-        Executed when a libxl-handled Xen guest is started, stopped, or
-        migrated<br/><br/></li>
-      <li><code>/etc/libvirt/hooks/network</code><br/><br/>
-        Executed when a network is started or stopped or an
-        interface is plugged/unplugged to/from the network</li>
-    </ul>
-    <p><span class="since">Since 6.5.0</span>, you can also have
-      several scripts with any name in the directories
-      <code>/etc/libvirt/hooks/<driver>.d/</code>. They are
-      executed in alphabetical order after main script.</p>
-    <br/>
-
-    <h2><a id="structure">Script structure</a></h2>
-    <p>The hook scripts are executed using standard Linux process creation
-      functions. Therefore, they must begin with the declaration of the
-      command interpreter to use.</p>
-    <p>For example:</p>
-    <pre>#!/bin/bash</pre>
-    <p>or:</p>
-    <pre>#!/usr/bin/python</pre>
-    <p>Other command interpreters are equally valid, as is any executable
-      binary, so you are welcome to use your favourite languages.</p>
-    <br/>
-
-    <h2><a id="arguments">Script arguments</a></h2>
-    <p>The hook scripts are called with specific command line arguments,
-      depending upon the script, and the operation being performed.</p>
-    <p>The guest hook scripts, qemu and lxc, are also given the <b>full</b>
-      XML description for the domain on their stdin. This includes items
-      such the UUID of the domain and its storage information, and is
-      intended to provide all the libvirt information the script needs.</p>
-    <p>For all cases, stdin of the network hook script is provided with the
-      full XML description of the network status in the following form:</p>
-
-<pre><hookData>
-  <network>
-    <name>$network_name</name>
-    <uuid>afca425a-2c3a-420c-b2fb-dd7b4950d722</uuid>
-    ...
-  </network>
-</hookData></pre>
-
-    <p>In the case of an network port being created / deleted, the network
-      XML will be followed with the full XML description of the port:</p>
-
-<pre><hookData>
-  <network>
-    <name>$network_name</name>
-    <uuid>afca425a-2c3a-420c-b2fb-dd7b4950d722</uuid>
-    ...
-  </network>
-  <networkport>
-    <uuid>5d744f21-ba4a-4d6e-bdb2-30a35ff3207d</uuid>
-    ...
-    <plug type='direct' dev='ens3' mode='vepa'/>
-  </networkport>
-</hookData></pre>
-
-    <p>Please note that this approach is different from other cases such as
-      <code>daemon</code>, <code>qemu</code> or <code>lxc</code> hook scripts,
-      because two XMLs may be passed here, while in the other cases only a single
-      XML is passed.</p>
-
-    <p>The command line arguments take this approach:</p>
-    <ol>
-      <li>The first argument is the name of the <b>object</b> involved in the
-        operation, or '-' if there is none.<br/><br/>
-        For example, the name of a guest being started.<br/><br/></li>
-      <li>The second argument is the name of the <b>operation</b> being
-        performed.<br/><br/>
-        For example, "start" if a guest is being started.<br/><br/></li>
-      <li>The third argument is a <b>sub-operation</b> indication, or '-' if there
-        is none.<br/><br/></li>
-      <li>The last argument is an <b>extra argument</b> string, or '-' if there is
-        none.</li>
-    </ol>
-
-    <h4><a id="arguments_specifics">Specifics</a></h4>
-    <p>This translates to the following specifics for each hook script:</p>
-
-    <h5><a id="daemon">/etc/libvirt/hooks/daemon</a></h5>
-    <ul>
-      <li>When the libvirt daemon is started, this script is called as:<br/>
-        <pre>/etc/libvirt/hooks/daemon - start - start</pre></li>
-      <li>When the libvirt daemon is shut down, this script is called as:<br/>
-        <pre>/etc/libvirt/hooks/daemon - shutdown - shutdown</pre></li>
-      <li>When the libvirt daemon receives the SIGHUP signal, it reloads its
-        configuration and triggers the hook script as:<br/>
-        <pre>/etc/libvirt/hooks/daemon - reload begin SIGHUP</pre></li>
-    </ul>
-    <p>Please note that when the libvirt daemon is restarted, the <i>daemon</i>
-      hook script is called once with the "shutdown" operation, and then once
-      with the "start" operation. There is no specific operation to indicate
-      a "restart" is occurring.</p>
-
-    <h5><a id="qemu">/etc/libvirt/hooks/qemu</a></h5>
-    <ul>
-      <li>Before a QEMU guest is started, the qemu hook script is
-        called in three locations; if any location fails, the guest
-        is not started. The first location, <span class="since">since
-        0.9.0</span>, is before libvirt performs any resource
-        labeling, and the hook can allocate resources not managed by
-        libvirt such as DRBD or missing bridges. This is called as:<br/>
-        <pre>/etc/libvirt/hooks/qemu guest_name prepare begin -</pre>
-        The second location, available <span class="since">Since
-        0.8.0</span>, occurs after libvirt has finished labeling
-        all resources, but has not yet started the guest, called as:<br/>
-        <pre>/etc/libvirt/hooks/qemu guest_name start begin -</pre>
-        The third location, <span class="since">0.9.13</span>,
-        occurs after the QEMU process has successfully started up:<br/>
-        <pre>/etc/libvirt/hooks/qemu guest_name started begin -</pre>
-      </li>
-      <li>When a QEMU guest is stopped, the qemu hook script is called
-        in two locations, to match the startup.
-        First, <span class="since">since 0.8.0</span>, the hook is
-        called before libvirt restores any labels:<br/>
-        <pre>/etc/libvirt/hooks/qemu guest_name stopped end -</pre>
-        Then, after libvirt has released all resources, the hook is
-        called again, <span class="since">since 0.9.0</span>, to allow
-        any additional resource cleanup:<br/>
-        <pre>/etc/libvirt/hooks/qemu guest_name release end -</pre></li>
-      <li><span class="since">Since 0.9.11</span>, the qemu hook script
-        is also called at the beginning of incoming migration. It is called
-        as: <pre>/etc/libvirt/hooks/qemu guest_name migrate begin -</pre>
-        with domain XML sent to standard input of the script. In this case,
-        the script acts as a filter and is supposed to modify the domain
-        XML and print it out on its standard output. Empty output is
-        identical to copying the input XML without changing it. In case the
-        script returns failure or the output XML is not valid, incoming
-        migration will be canceled. This hook may be used, e.g., to change
-        location of disk images for incoming domains.</li>
-      <li><span class="since">Since 1.2.9</span>, the qemu hook script is
-        also called when restoring a saved image either via the API or
-        automatically when restoring a managed save machine. It is called
-        as: <pre>/etc/libvirt/hooks/qemu guest_name restore begin -</pre>
-        with domain XML sent to standard input of the script. In this case,
-        the script acts as a filter and is supposed to modify the domain
-        XML and print it out on its standard output. Empty output is
-        identical to copying the input XML without changing it. In case the
-        script returns failure or the output XML is not valid, restore of the
-        image will be aborted. This hook may be used, e.g., to change
-        location of disk images for restored domains.</li>
-      <li><span class="since">Since 6.5.0</span>, you can also place several
-        hook scripts in the directory
-        <code>/etc/libvirt/hooks/qemu.d/</code>. They are executed in
-        alphabetical order after main script. In this case each script also
-        acts as filter and can modify the domain XML and print it out on
-        its standard output. This script output is passed to standard input
-        next script in order. Empty output from any script is also identical
-        to copying the input XML without changing it.
-        In case any script returns failure common process will be aborted,
-        but all scripts from the directory will are executed.</li>
-      <li><span class="since">Since 0.9.13</span>, the qemu hook script
-        is also called when the libvirtd daemon restarts and reconnects
-        to previously running QEMU processes. If the script fails, the
-        existing QEMU process will be killed off. It is called as:
-        <pre>/etc/libvirt/hooks/qemu guest_name reconnect begin -</pre>
-      </li>
-      <li><span class="since">Since 0.9.13</span>, the qemu hook script
-        is also called when the QEMU driver is told to attach to an
-        externally launched QEMU process. It is called as:
-        <pre>/etc/libvirt/hooks/qemu guest_name attach begin -</pre>
-      </li>
-    </ul>
-
-    <h5><a id="lxc">/etc/libvirt/hooks/lxc</a></h5>
-    <ul>
-      <li>Before a LXC guest is started, the lxc hook script is
-        called in three locations; if any location fails, the guest
-        is not started. The first location, <span class="since">since
-        0.9.13</span>, is before libvirt performs any resource
-        labeling, and the hook can allocate resources not managed by
-        libvirt such as DRBD or missing bridges. This is called as:<br/>
-        <pre>/etc/libvirt/hooks/lxc guest_name prepare begin -</pre>
-        The second location, available <span class="since">Since
-        0.8.0</span>, occurs after libvirt has finished labeling
-        all resources, but has not yet started the guest, called as:<br/>
-        <pre>/etc/libvirt/hooks/lxc guest_name start begin -</pre>
-        The third location, <span class="since">0.9.13</span>,
-        occurs after the LXC process has successfully started up:<br/>
-        <pre>/etc/libvirt/hooks/lxc guest_name started begin -</pre>
-      </li>
-      <li>When a LXC guest is stopped, the lxc hook script is called
-        in two locations, to match the startup.
-        First, <span class="since">since 0.8.0</span>, the hook is
-        called before libvirt restores any labels:<br/>
-        <pre>/etc/libvirt/hooks/lxc guest_name stopped end -</pre>
-        Then, after libvirt has released all resources, the hook is
-        called again, <span class="since">since 0.9.0</span>, to allow
-        any additional resource cleanup:<br/>
-        <pre>/etc/libvirt/hooks/lxc guest_name release end -</pre></li>
-      <li><span class="since">Since 0.9.13</span>, the lxc hook script
-        is also called when the libvirtd daemon restarts and reconnects
-        to previously running LXC processes. If the script fails, the
-        existing LXC process will be killed off. It is called as:
-        <pre>/etc/libvirt/hooks/lxc guest_name reconnect begin -</pre>
-      </li>
-    </ul>
-
-    <h5><a id="libxl">/etc/libvirt/hooks/libxl</a></h5>
-    <ul>
-      <li>Before a Xen guest is started using libxl driver, the libxl hook
-        script is called in three locations; if any location fails, the guest
-        is not started. The first location, <span class="since">since
-        2.1.0</span>, is before libvirt performs any resource
-        labeling, and the hook can allocate resources not managed by
-        libvirt. This is called as:<br/>
-        <pre>/etc/libvirt/hooks/libxl guest_name prepare begin -</pre>
-        The second location, available <span class="since">Since
-        2.1.0</span>, occurs after libvirt has finished labeling
-        all resources, but has not yet started the guest, called as:<br/>
-        <pre>/etc/libvirt/hooks/libxl guest_name start begin -</pre>
-        The third location, <span class="since">2.1.0</span>,
-        occurs after the domain has successfully started up:<br/>
-        <pre>/etc/libvirt/hooks/libxl guest_name started begin -</pre>
-      </li>
-      <li>When a libxl-handled Xen guest is stopped, the libxl hook script
-        is called in two locations, to match the startup.
-        First, <span class="since">since 2.1.0</span>, the hook is
-        called before libvirt restores any labels:<br/>
-        <pre>/etc/libvirt/hooks/libxl guest_name stopped end -</pre>
-        Then, after libvirt has released all resources, the hook is
-        called again, <span class="since">since 2.1.0</span>, to allow
-        any additional resource cleanup:<br/>
-        <pre>/etc/libvirt/hooks/libxl guest_name release end -</pre></li>
-      <li><span class="since">Since 2.1.0</span>, the libxl hook script
-        is also called at the beginning of incoming migration. It is called
-        as: <pre>/etc/libvirt/hooks/libxl guest_name migrate begin -</pre>
-        with domain XML sent to standard input of the script. In this case,
-        the script acts as a filter and is supposed to modify the domain
-        XML and print it out on its standard output. Empty output is
-        identical to copying the input XML without changing it. In case the
-        script returns failure or the output XML is not valid, incoming
-        migration will be canceled. This hook may be used, e.g., to change
-        location of disk images for incoming domains.</li>
-      <li><span class="since">Since 6.5.0</span>, you can also place several
-        hook scripts in the directory
-        <code>/etc/libvirt/hooks/libxl.d/</code>. They are executed in
-        alphabetical order after main script. In this case each script also
-        acts as filter and can modify the domain XML and print it out on
-        its standard output. This script output is passed to standard input
-        next script in order. Empty output from any script is also identical
-        to copying the input XML without changing it.
-        In case any script returns failure common process will be aborted,
-        but all scripts from the directory will are executed.</li>
-      <li><span class="since">Since 2.1.0</span>, the libxl hook script
-        is also called when the libvirtd daemon restarts and reconnects
-        to previously running Xen domains. If the script fails, the
-        existing Xen domains will be killed off. It is called as:
-        <pre>/etc/libvirt/hooks/libxl guest_name reconnect begin -</pre>
-      </li>
-    </ul>
-
-    <h5><a id="network">/etc/libvirt/hooks/network</a></h5>
-    <ul>
-      <li><span class="since">Since 1.2.2</span>, before a network is started,
-        this script is called as:<br/>
-        <pre>/etc/libvirt/hooks/network network_name start begin -</pre></li>
-      <li>After the network is started, up & running, the script is
-        called as:<br/>
-        <pre>/etc/libvirt/hooks/network network_name started begin -</pre></li>
-      <li>When a network is shut down, this script is called as:<br/>
-        <pre>/etc/libvirt/hooks/network network_name stopped end -</pre></li>
-      <li>Later, when network is started and there's an interface from a
-        domain to be plugged into the network, the hook script is called as:<br/>
-        <pre>/etc/libvirt/hooks/network network_name port-created begin -</pre>
-        Please note, that in this case, the script is passed both network and
-        port XMLs on its stdin.</li>
-      <li>When network is updated, the hook script is called as:<br/>
-        <pre>/etc/libvirt/hooks/network network_name updated begin -</pre></li>
-      <li>When the domain from previous case is shutting down, the interface
-        is unplugged. This leads to another script invocation:<br/>
-        <pre>/etc/libvirt/hooks/network network_name port-deleted begin -</pre>
-        And again, as in previous case, both network and port XMLs are passed
-        onto script's stdin.</li>
-    </ul>
-
-    <br/>
-
-    <h2><a id="execution">Script execution</a></h2>
-    <ul>
-      <li>The "start" operation for the guest and network hook scripts,
-        executes <b>prior</b> to the object (guest or network) being created.
-        This allows the object start operation to be aborted if the script
-        returns indicating failure.<br/><br/></li>
-      <li>The "stopped" operation for the guest and network hook scripts,
-        executes <b>after</b> the object (guest or network) has stopped. If
-        the hook script indicates failure in its return, the shut down of the
-        object cannot be aborted because it has already been performed.
-        <br/><br/></li>
-      <li>Hook scripts execute in a synchronous fashion. Libvirt waits
-        for them to return before continuing the given operation.<br/><br/>
-        This is most noticeable with the guest or network start operation,
-        as a lengthy operation in the hook script can mean an extended wait
-        for the guest or network to be available to end users.<br/><br/></li>
-      <li>For a hook script to be utilised, it must have its execute bit set
-        (e.g. chmod o+rx <i>qemu</i>), and must be present when the libvirt
-        daemon is started.<br/><br/></li>
-      <li>If a hook script is added to a host after the libvirt daemon is
-        already running, it won't be used until the libvirt daemon
-        next starts.</li>
-    </ul>
-    <br/>
-
-    <h2><a id="qemu_migration">QEMU guest migration</a></h2>
-    <p>Migration of a QEMU guest involves running hook scripts on both the
-      source and destination hosts:</p>
-    <ol>
-      <li>At the beginning of the migration, the <i>qemu</i> hook script on
-        the <b>destination</b> host is executed with the "migrate"
-        operation.</li>
-      <li>Before QEMU process is spawned, the two operations ("prepare" and
-        "start") called for domain start are executed on
-        <b>destination</b> host.</li>
-      <li>If both of these hook script executions exit successfully (exit
-        status 0), the migration continues. Any other exit code indicates
-        failure, and the migration is aborted.</li>
-      <li>The QEMU guest is then migrated to the destination host.</li>
-      <li>Unless an error occurs during the migration process, the <i>qemu</i>
-        hook script on the <b>source</b> host is then executed with the
-        "stopped" and "release" operations to indicate it is no longer
-        running on this host. Regardless of the return codes, the
-        migration is not aborted as it has already been performed.</li>
-    </ol>
-    <br/>
-
-    <h2><a id="recursive">Calling libvirt functions from within a hook script</a></h2>
-    <p><b>DO NOT DO THIS!</b></p>
-    <p>A hook script must not call back into libvirt, as the libvirt daemon
-      is already waiting for the script to exit.</p>
-    <p>A deadlock is likely to occur.</p>
-    <br/>
-
-    <h2><a id="return_codes">Return codes and logging</a></h2>
-    <p>If a hook script returns with an exit code of 0, the libvirt daemon
-      regards this as successful and performs no logging of it.</p>
-    <p>However, if a hook script returns with a non zero exit code, the libvirt
-      daemon regards this as a failure, logs its return code, and
-      additionally logs anything on stderr the hook script returns.</p>
-    <p>For example, a hook script might use this code to indicate failure,
-      and send a text string to stderr:</p>
-    <pre>echo "Could not find required XYZZY" >&2
-exit 1</pre>
-    <p>The resulting entry in the libvirt log will appear as:</p>
-    <pre>20:02:40.297: error : virHookCall:285 : Hook script execution failed: internal error Child process (LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
- HOME=/root USER=root LOGNAME=root /etc/libvirt/hooks/qemu qemu prepare begin -) unexpected exit status 1: Could not find required XYZZY</pre>
-  </body>
-</html>
diff --git a/docs/hooks.rst b/docs/hooks.rst
new file mode 100644
index 0000000000..9c5f3ff456
--- /dev/null
+++ b/docs/hooks.rst
@@ -0,0 +1,518 @@
+.. role:: since
+
+====================================
+Hooks for specific system management
+====================================
+
+.. contents::
+
+Custom event scripts
+--------------------
+
+Beginning with libvirt 0.8.0, specific events on a host system will trigger
+custom scripts.
+ +These custom **hook** scripts are executed when any of the following actions +occur: + +- The libvirt daemon starts, stops, or reloads its configuration ( + :since:`since 0.8.0` ) +- A QEMU guest is started or stopped ( :since:`since 0.8.0` ) +- An LXC guest is started or stopped ( :since:`since 0.8.0` ) +- A libxl-handled Xen guest is started or stopped ( :since:`since 2.1.0` ) +- A network is started or stopped or an interface is plugged/unplugged to/from + the network ( :since:`since 1.2.2` ) + +Script location +--------------- + +The libvirt hook scripts are located in the directory +``$SYSCONFDIR/libvirt/hooks/``. + +- In Linux distributions such as Fedora and RHEL, this is + ``/etc/libvirt/hooks/``. Other Linux distributions may do this differently. +- If your installation of libvirt has instead been compiled from source, it is + likely to be ``/usr/local/etc/libvirt/hooks/``. +- :since:`Since 6.5.0` , you can also place several hook scripts in the + directories ``/etc/libvirt/hooks/<driver>.d/``. + +To use hook scripts, you will need to create this ``hooks`` directory manually, +place the desired hook scripts inside, then make them executable. + +Script names +------------ + +At present, there are five hook scripts that can be called: + +- ``/etc/libvirt/hooks/daemon`` + Executed when the libvirt daemon is started, stopped, or reloads its + configuration +- ``/etc/libvirt/hooks/qemu`` + Executed when a QEMU guest is started, stopped, or migrated +- ``/etc/libvirt/hooks/lxc`` + Executed when an LXC guest is started or stopped +- ``/etc/libvirt/hooks/libxl`` + Executed when a libxl-handled Xen guest is started, stopped, or migrated +- ``/etc/libvirt/hooks/network`` + Executed when a network is started or stopped or an interface is + plugged/unplugged to/from the network + +:since:`Since 6.5.0` , you can also have several scripts with any name in the +directories ``/etc/libvirt/hooks/<driver>.d/``. They are executed in +alphabetical order after main script. 
+ +Script structure +---------------- + +The hook scripts are executed using standard Linux process creation functions. +Therefore, they must begin with the declaration of the command interpreter to +use. + +For example: + +:: + + #!/bin/bash + +or: + +:: + + #!/usr/bin/python + +Other command interpreters are equally valid, as is any executable binary, so +you are welcome to use your favourite languages. + +Script arguments +---------------- + +The hook scripts are called with specific command line arguments, depending upon +the script, and the operation being performed. + +The guest hook scripts, qemu and lxc, are also given the **full** XML +description for the domain on their stdin. This includes items such the UUID of +the domain and its storage information, and is intended to provide all the +libvirt information the script needs. + +For all cases, stdin of the network hook script is provided with the full XML +description of the network status in the following form: + +:: + + <hookData> + <network> + <name>$network_name</name> + <uuid>afca425a-2c3a-420c-b2fb-dd7b4950d722</uuid> + ... + </network> + </hookData> + +In the case of an network port being created / deleted, the network XML will be +followed with the full XML description of the port: + +:: + + <hookData> + <network> + <name>$network_name</name> + <uuid>afca425a-2c3a-420c-b2fb-dd7b4950d722</uuid> + ... + </network> + <networkport> + <uuid>5d744f21-ba4a-4d6e-bdb2-30a35ff3207d</uuid> + ... + <plug type='direct' dev='ens3' mode='vepa'/> + </networkport> + </hookData> + +Please note that this approach is different from other cases such as ``daemon``, +``qemu`` or ``lxc`` hook scripts, because two XMLs may be passed here, while in +the other cases only a single XML is passed. + +The command line arguments take this approach: + +#. The first argument is the name of the **object** involved in the operation, + or '-' if there is none. + For example, the name of a guest being started. +#. 
The second argument is the name of the **operation** being performed. + For example, "start" if a guest is being started. +#. The third argument is a **sub-operation** indication, or '-' if there is + none. +#. The last argument is an **extra argument** string, or '-' if there is none. + +Specifics +~~~~~~~~~ + +This translates to the following specifics for each hook script: + +/etc/libvirt/hooks/daemon +^^^^^^^^^^^^^^^^^^^^^^^^^ + +- | When the libvirt daemon is started, this script is called as: + + :: + + /etc/libvirt/hooks/daemon - start - start + +- | When the libvirt daemon is shut down, this script is called as: + + :: + + /etc/libvirt/hooks/daemon - shutdown - shutdown + +- | When the libvirt daemon receives the SIGHUP signal, it reloads its + configuration and triggers the hook script as: + + :: + + /etc/libvirt/hooks/daemon - reload begin SIGHUP + +Please note that when the libvirt daemon is restarted, the *daemon* hook script +is called once with the "shutdown" operation, and then once with the "start" +operation. There is no specific operation to indicate a "restart" is occurring. + +/etc/libvirt/hooks/qemu +^^^^^^^^^^^^^^^^^^^^^^^ + +- | Before a QEMU guest is started, the qemu hook script is called in three + locations; if any location fails, the guest is not started. The first + location, :since:`since 0.9.0` , is before libvirt performs any resource + labeling, and the hook can allocate resources not managed by libvirt such + as DRBD or missing bridges. 
This is called as: + + :: + + /etc/libvirt/hooks/qemu guest_name prepare begin - + + | The second location, available :since:`Since 0.8.0` , occurs after libvirt + has finished labeling all resources, but has not yet started the guest, + called as: + + :: + + /etc/libvirt/hooks/qemu guest_name start begin - + + | The third location, :since:`0.9.13` , occurs after the QEMU process has + successfully started up: + + :: + + /etc/libvirt/hooks/qemu guest_name started begin - + +- | When a QEMU guest is stopped, the qemu hook script is called in two + locations, to match the startup. First, :since:`since 0.8.0` , the hook is + called before libvirt restores any labels: + + :: + + /etc/libvirt/hooks/qemu guest_name stopped end - + + | Then, after libvirt has released all resources, the hook is called again, + :since:`since 0.9.0` , to allow any additional resource cleanup: + + :: + + /etc/libvirt/hooks/qemu guest_name release end - + +- :since:`Since 0.9.11` , the qemu hook script is also called at the beginning + of incoming migration. It is called as: + + :: + + /etc/libvirt/hooks/qemu guest_name migrate begin - + + with domain XML sent to standard input of the script. In this case, the + script acts as a filter and is supposed to modify the domain XML and print it + out on its standard output. Empty output is identical to copying the input + XML without changing it. In case the script returns failure or the output XML + is not valid, incoming migration will be canceled. This hook may be used, + e.g., to change location of disk images for incoming domains. + +- :since:`Since 1.2.9` , the qemu hook script is also called when restoring a + saved image either via the API or automatically when restoring a managed save + machine. It is called as: + + :: + + /etc/libvirt/hooks/qemu guest_name restore begin - + + with domain XML sent to standard input of the script. 
In this case, the + script acts as a filter and is supposed to modify the domain XML and print it + out on its standard output. Empty output is identical to copying the input + XML without changing it. In case the script returns failure or the output XML + is not valid, restore of the image will be aborted. This hook may be used, + e.g., to change location of disk images for restored domains. + +- :since:`Since 6.5.0` , you can also place several hook scripts in the + directory ``/etc/libvirt/hooks/qemu.d/``. They are executed in alphabetical + order after main script. In this case each script also acts as filter and can + modify the domain XML and print it out on its standard output. This script + output is passed to standard input next script in order. Empty output from + any script is also identical to copying the input XML without changing it. In + case any script returns failure common process will be aborted, but all + scripts from the directory will are executed. + +- :since:`Since 0.9.13` , the qemu hook script is also called when the libvirtd + daemon restarts and reconnects to previously running QEMU processes. If the + script fails, the existing QEMU process will be killed off. It is called as: + + :: + + /etc/libvirt/hooks/qemu guest_name reconnect begin - + +- :since:`Since 0.9.13` , the qemu hook script is also called when the QEMU + driver is told to attach to an externally launched QEMU process. It is called + as: + + :: + + /etc/libvirt/hooks/qemu guest_name attach begin - + +/etc/libvirt/hooks/lxc +^^^^^^^^^^^^^^^^^^^^^^ + +- | Before a LXC guest is started, the lxc hook script is called in three + locations; if any location fails, the guest is not started. The first + location, :since:`since 0.9.13` , is before libvirt performs any resource + labeling, and the hook can allocate resources not managed by libvirt such + as DRBD or missing bridges. 
This is called as: + + :: + + /etc/libvirt/hooks/lxc guest_name prepare begin - + + | The second location, available :since:`Since 0.8.0` , occurs after libvirt + has finished labeling all resources, but has not yet started the guest, + called as: + + :: + + /etc/libvirt/hooks/lxc guest_name start begin - + + | The third location, :since:`0.9.13` , occurs after the LXC process has + successfully started up: + + :: + + /etc/libvirt/hooks/lxc guest_name started begin - + +- | When a LXC guest is stopped, the lxc hook script is called in two + locations, to match the startup. First, :since:`since 0.8.0` , the hook is + called before libvirt restores any labels: + + :: + + /etc/libvirt/hooks/lxc guest_name stopped end - + + | Then, after libvirt has released all resources, the hook is called again, + :since:`since 0.9.0` , to allow any additional resource cleanup: + + :: + + /etc/libvirt/hooks/lxc guest_name release end - + +- :since:`Since 0.9.13` , the lxc hook script is also called when the libvirtd + daemon restarts and reconnects to previously running LXC processes. If the + script fails, the existing LXC process will be killed off. It is called as: + + :: + + /etc/libvirt/hooks/lxc guest_name reconnect begin - + +/etc/libvirt/hooks/libxl +^^^^^^^^^^^^^^^^^^^^^^^^ + +- | Before a Xen guest is started using libxl driver, the libxl hook script is + called in three locations; if any location fails, the guest is not started. + The first location, :since:`since 2.1.0` , is before libvirt performs any + resource labeling, and the hook can allocate resources not managed by + libvirt. 
This is called as: + + :: + + /etc/libvirt/hooks/libxl guest_name prepare begin - + + | The second location, available :since:`Since 2.1.0` , occurs after libvirt + has finished labeling all resources, but has not yet started the guest, + called as: + + :: + + /etc/libvirt/hooks/libxl guest_name start begin - + + | The third location, :since:`2.1.0` , occurs after the domain has + successfully started up: + + :: + + /etc/libvirt/hooks/libxl guest_name started begin - + +- | When a libxl-handled Xen guest is stopped, the libxl hook script is called + in two locations, to match the startup. First, :since:`since 2.1.0` , the + hook is called before libvirt restores any labels: + + :: + + /etc/libvirt/hooks/libxl guest_name stopped end - + + | Then, after libvirt has released all resources, the hook is called again, + :since:`since 2.1.0` , to allow any additional resource cleanup: + + :: + + /etc/libvirt/hooks/libxl guest_name release end - + +- :since:`Since 2.1.0` , the libxl hook script is also called at the beginning + of incoming migration. It is called as: + + :: + + /etc/libvirt/hooks/libxl guest_name migrate begin - + + with domain XML sent to standard input of the script. In this case, the + script acts as a filter and is supposed to modify the domain XML and print it + out on its standard output. Empty output is identical to copying the input + XML without changing it. In case the script returns failure or the output XML + is not valid, incoming migration will be canceled. This hook may be used, + e.g., to change location of disk images for incoming domains. + +- :since:`Since 6.5.0` , you can also place several hook scripts in the + directory ``/etc/libvirt/hooks/libxl.d/``. They are executed in alphabetical + order after main script. In this case each script also acts as filter and can + modify the domain XML and print it out on its standard output. This script + output is passed to standard input next script in order. 
Empty output from + any script is also identical to copying the input XML without changing it. In + case any script returns failure, the common process will be aborted, but all + scripts from the directory are still executed. + +- :since:`Since 2.1.0` , the libxl hook script is also called when the libvirtd + daemon restarts and reconnects to previously running Xen domains. If the + script fails, the existing Xen domains will be killed off. It is called as: + + :: + + /etc/libvirt/hooks/libxl guest_name reconnect begin - + +/etc/libvirt/hooks/network +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +- | :since:`Since 1.2.2` , before a network is started, this script is called + as: + + :: + + /etc/libvirt/hooks/network network_name start begin - + +- | After the network is started, up & running, the script is called as: + + :: + + /etc/libvirt/hooks/network network_name started begin - + +- | When a network is shut down, this script is called as: + + :: + + /etc/libvirt/hooks/network network_name stopped end - + +- | Later, when the network is started and there's an interface from a domain to be + plugged into the network, the hook script is called as: + + :: + + /etc/libvirt/hooks/network network_name port-created begin - + + Please note that in this case, the script is passed both network and port + XMLs on its stdin. + +- | When the network is updated, the hook script is called as: + + :: + + /etc/libvirt/hooks/network network_name updated begin - + +- | When the domain from the previous case is shutting down, the interface is + unplugged. This leads to another script invocation: + + :: + + /etc/libvirt/hooks/network network_name port-deleted begin - + + And again, as in the previous case, both network and port XMLs are passed on the + script's stdin. + +Script execution +---------------- + +- The "start" operation for the guest and network hook scripts executes + **prior** to the object (guest or network) being created. 
This allows the + object start operation to be aborted if the script returns indicating + failure. +- The "stopped" operation for the guest and network hook scripts, executes + **after** the object (guest or network) has stopped. If the hook script + indicates failure in its return, the shut down of the object cannot be + aborted because it has already been performed. +- Hook scripts execute in a synchronous fashion. Libvirt waits for them to + return before continuing the given operation. + This is most noticeable with the guest or network start operation, as a + lengthy operation in the hook script can mean an extended wait for the guest + or network to be available to end users. +- For a hook script to be utilised, it must have its execute bit set (e.g. + chmod o+rx *qemu*), and must be present when the libvirt daemon is started. +- If a hook script is added to a host after the libvirt daemon is already + running, it won't be used until the libvirt daemon next starts. + +QEMU guest migration +-------------------- + +Migration of a QEMU guest involves running hook scripts on both the source and +destination hosts: + +#. At the beginning of the migration, the *qemu* hook script on the + **destination** host is executed with the "migrate" operation. +#. Before QEMU process is spawned, the two operations ("prepare" and "start") + called for domain start are executed on **destination** host. +#. If both of these hook script executions exit successfully (exit status 0), + the migration continues. Any other exit code indicates failure, and the + migration is aborted. +#. The QEMU guest is then migrated to the destination host. +#. Unless an error occurs during the migration process, the *qemu* hook script + on the **source** host is then executed with the "stopped" and "release" + operations to indicate it is no longer running on this host. Regardless of + the return codes, the migration is not aborted as it has already been + performed. 
+ +Calling libvirt functions from within a hook script +--------------------------------------------------- + +**DO NOT DO THIS!** + +A hook script must not call back into libvirt, as the libvirt daemon is already +waiting for the script to exit. + +A deadlock is likely to occur. + +Return codes and logging +------------------------ + +If a hook script returns with an exit code of 0, the libvirt daemon regards this +as successful and performs no logging of it. + +However, if a hook script returns with a non zero exit code, the libvirt daemon +regards this as a failure, logs its return code, and additionally logs anything +on stderr the hook script returns. + +For example, a hook script might use this code to indicate failure, and send a +text string to stderr: + +:: + + echo "Could not find required XYZZY" >&2 + exit 1 + +The resulting entry in the libvirt log will appear as: + +:: + + 20:02:40.297: error : virHookCall:285 : Hook script execution failed: internal error Child process (LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin + HOME=/root USER=root LOGNAME=root /etc/libvirt/hooks/qemu qemu prepare begin -) unexpected exit status 1: Could not find required XYZZY diff --git a/docs/meson.build b/docs/meson.build index 22eca7d8bd..a0e96e2453 100644 --- a/docs/meson.build +++ b/docs/meson.build @@ -25,7 +25,6 @@ docs_html_in_files = [ 'formatnetwork', 'formatnode', 'formatnwfilter', - 'hooks', 'index', 'internals', 'java', @@ -92,6 +91,7 @@ docs_rst_files = [ 'goals', 'governance', 'hacking', + 'hooks', 'libvirt-go', 'libvirt-go-xml', 'macos', -- 2.35.1
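For readers following the converted hooks page, the argument order and the return-code convention it describes can be sketched as a small shell function. This is only an illustration under stated assumptions: the function name handle_hook and the REQUIRED_FILE variable (with its default path) are hypothetical and not something libvirt defines. It reports failure on stderr with a non-zero status for the "prepare" and "start" operations, and never calls back into libvirt.

```shell
#!/bin/sh
# Hypothetical resource the hook depends on; used for illustration only.
REQUIRED_FILE="${REQUIRED_FILE:-/etc/libvirt/hooks/xyzzy.conf}"

# Arguments mirror the invocations shown in the hooks page:
#   $1 = object name, $2 = operation, $3 = sub-operation, $4 = extra ("-")
handle_hook() {
    object=$1
    operation=$2

    case "$operation" in
        prepare|start)
            # A non-zero status here reports failure to the daemon, which
            # logs the status together with whatever was written to stderr.
            if [ ! -e "$REQUIRED_FILE" ]; then
                echo "Could not find required $REQUIRED_FILE for $object" >&2
                return 1
            fi
            ;;
        stopped|release)
            # The object has already stopped, so the status cannot abort
            # anything; only cleanup would belong here.
            ;;
    esac
    return 0
}

# A real hook script would end with:
#   handle_hook "$@"
#   exit $?
```

Keeping the logic in a function and exiting with its status at the end of the script keeps the script's only side channels (exit status and stderr) in one place, which are the two things the daemon is described as logging.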

Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/java.html.in | 121 ------------------------------------------- docs/java.rst | 128 ++++++++++++++++++++++++++++++++++++++++++++++ docs/meson.build | 2 +- 3 files changed, 129 insertions(+), 122 deletions(-) delete mode 100644 docs/java.html.in create mode 100644 docs/java.rst diff --git a/docs/java.html.in b/docs/java.html.in deleted file mode 100644 index 1f8c255d26..0000000000 --- a/docs/java.html.in +++ /dev/null @@ -1,121 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html> -<html xmlns="http://www.w3.org/1999/xhtml"> - <body> - <h1>Java API bindings</h1> - -<h2>Presentation</h2> - <p>The Java bindings make use of <a href="https://jna.dev.java.net/">JNA</a> - to expose the C API in a Java friendly way. The bindings are based on - work initiated by Toth Istvan.</p> - -<h2>Getting it</h2> -<p> - The latest versions of the libvirt Java bindings can be downloaded from: -</p> - -<ul> -<li><a href="ftp://libvirt.org/libvirt/java/">libvirt.org FTP server</a></li> -<li><a href="https://libvirt.org/sources/java/">libvirt.org HTTP server</a></li> -</ul> - -<h3>Maven</h3> -<p>A maven repository is located at <a href="https://libvirt.org/maven2/">https://libvirt.org/maven2/</a> -which you can use to include this in your maven projects.</p> - -<h2>GIT source repository</h2> -<p> The Java bindings code source is now maintained in a <a -href="https://git-scm.com/">git</a> repository available on -<a href="https://gitlab.com/libvirt/libvirt-java/">gitlab.com</a>: -</p> -<pre> -git clone https://gitlab.com/libvirt/libvirt-java.git -</pre> - -<h2>Building</h2> -<p>The code is built using ant, and assumes that you have the jna jar installed. 
Once you have downloaded -the code you can build the code with</p> - -<pre> - -% cd libvirt-java -% ant build -</pre> - - -<h2>Content</h2> -<p>The bindings are articulated around a few -classes in the <code>org/libvirt</code> package, notably the -<code>Connect</code>, <code>Domain</code> and <code>Network</code> -ones. Functions in the <a href="html/index.html">C API</a> -taking <code>virConnectPtr</code>, <code>virDomainPtr</code> or -<code>virNetworkPtr</code> as their first argument usually become -methods for the classes, their name is just stripped from the -virConnect or virDomain(Get) prefix and the first letter gets converted to -lower case, for example the C functions:</p> - <p> - <code>int <a href="html/libvirt-libvirt-domain.html#virConnectNumOfDomains">virConnectNumOfDomains</a> -(virConnectPtr conn);</code> - </p> - <p> - <code>int <a href="html/libvirt-libvirt-domain.html#virDomainSetMaxMemory">virDomainSetMaxMemory</a> -(virDomainPtr domain, unsigned long memory);</code> - </p> - <p>become</p> - <p> - <code>virConn.numOfDomains()</code> - </p> - <p> - <code>virDomain.setMaxMemory(long memory)</code> - </p> - <p> There is of course some functions where the mapping is less direct -and using extra classes to map complex arguments. 
The <a href="https://libvirt.org/sources/java/javadoc">Javadoc</a> is available online or as -part of a separate libvirt-java-javadoc package.</p> - <p>So let's look at a simple example inspired from the -<code>test.java</code> test found in <code>src</code> in the source tree:</p> - <pre>import <span style="color: #0071FF; background-color: #FFFFFF">org.libvirt.*</span>; -public class minitest { - public static void main(String[] args) { - Connect conn=null; - try{ - conn = new <span style="color: #0071FF; background-color: #FFFFFF">Connect</span>("test:///default", true); - } catch (<span style="color: #0071FF; background-color: #FFFFFF">LibvirtException</span> e) { - System.out.println("exception caught:"+e); - System.out.println(e.getError()); - } - try{ - <span style="color: #0071FF; background-color: #FFFFFF">Domain</span> testDomain=conn.<span style="color: #007F00; background-color: #FFFFFF">domainLookupByName</span>("test"); - System.out.println("Domain:" + testDomain.<span style="color: #E50073; background-color: #FFFFFF">getName</span>() + " id " + - testDomain.<span style="color: #E50073; background-color: #FFFFFF">getID</span>() + " running " + - testDomain.<span style="color: #E50073; background-color: #FFFFFF">getOSType</span>()); - } catch (<span style="color: #0071FF; background-color: #FFFFFF">LibvirtException</span> e) { - System.out.println("exception caught:"+e); - System.out.println(e.getError()); - } - } -} -</pre> - <p>There is not much to comment about it, it really is a straight mapping -from the C API, the only points to notice are:</p> - <ul> - <li>the import of the modules in the <code><span style="color: #0071FF; background-color: #FFFFFF">org.libvirt</span></code> package</li> - <li>getting a connection to the hypervisor, in that case using the - readonly access to the default test hypervisor.</li> - <li>getting an object representing the test domain using <span style="color: #007F00; background-color: 
#FFFFFF">lookupByName</span></li> - <li>if the domain is not found a LibvirtError exception will be raised</li> - <li>extracting and printing some information about the domain using - various <span style="color: #E50073; background-color: #FFFFFF">methods</span> - associated to the Domain class.</li> - </ul> -<h2>Maven</h2> - <p>Up until version 0.4.7 the Java bindings were available from the central maven repository.</p> - <p>If you want to use 0.4.8 or higher, please add the following repository to your pom.xml</p> - <pre><repositories> - <repository> - <id>libvirt-org</id> - <url>https://libvirt.org/maven2</url> - </repository> -</repositories></pre> - - </body> -</html> diff --git a/docs/java.rst b/docs/java.rst new file mode 100644 index 0000000000..df846c6fc6 --- /dev/null +++ b/docs/java.rst @@ -0,0 +1,128 @@ +================= +Java API bindings +================= + +.. contents:: + +Presentation +------------ + +The Java bindings make use of `JNA <https://jna.dev.java.net/>`__ to expose the +C API in a Java friendly way. The bindings are based on work initiated by Toth +Istvan. + +Getting it +---------- + +The latest versions of the libvirt Java bindings can be downloaded from: + +- `libvirt.org FTP server <ftp://libvirt.org/libvirt/java/>`__ +- `libvirt.org HTTP server <https://libvirt.org/sources/java/>`__ + +A maven repository is located at https://libvirt.org/maven2/ which you can use +to include this in your maven projects. + +GIT source repository +--------------------- + +The Java bindings code source is now maintained in a +`git <https://git-scm.com/>`__ repository available on +`gitlab.com <https://gitlab.com/libvirt/libvirt-java/>`__: + +:: + + git clone https://gitlab.com/libvirt/libvirt-java.git + +Building +-------- + +The code is built using ant, and assumes that you have the jna jar installed. 
+Once you have downloaded the code you can build the code with + +:: + + + % cd libvirt-java + % ant build + +Content +------- + +The bindings are articulated around a few classes in the ``org/libvirt`` +package, notably the ``Connect``, ``Domain`` and ``Network`` ones. Functions in +the `C API <html/index.html>`__ taking ``virConnectPtr``, ``virDomainPtr`` or +``virNetworkPtr`` as their first argument usually become methods for the +classes, their name is just stripped from the virConnect or virDomain(Get) +prefix and the first letter gets converted to lower case, for example the C +functions: + +``int virConnectNumOfDomains (virConnectPtr conn);`` + +``int virDomainSetMaxMemory (virDomainPtr domain, unsigned long memory);`` + +become + +``virConn.numOfDomains()`` + +``virDomain.setMaxMemory(long memory)`` + +There is of course some functions where the mapping is less direct and using +extra classes to map complex arguments. The +`Javadoc <https://libvirt.org/sources/java/javadoc>`__ is available online or as +part of a separate libvirt-java-javadoc package. 
+ +So let's look at a simple example inspired from the ``test.java`` test found in +``src`` in the source tree: + +:: + + import org.libvirt.*; + public class minitest { + public static void main(String[] args) { + Connect conn=null; + try{ + conn = new Connect("test:///default", true); + } catch (LibvirtException e) { + System.out.println("exception caught:"+e); + System.out.println(e.getError()); + } + try{ + Domain testDomain=conn.domainLookupByName("test"); + System.out.println("Domain:" + testDomain.getName() + " id " + + testDomain.getID() + " running " + + testDomain.getOSType()); + } catch (LibvirtException e) { + System.out.println("exception caught:"+e); + System.out.println(e.getError()); + } + } + } + +There is not much to comment about it, it really is a straight mapping from the +C API, the only points to notice are: + +- the import of the modules in the ``org.libvirt`` package +- getting a connection to the hypervisor, in that case using the readonly + access to the default test hypervisor. +- getting an object representing the test domain using ``lookupByName`` +- if the domain is not found a LibvirtError exception will be raised +- extracting and printing some information about the domain using various + methods associated to the Domain class. + +Maven +----- + +Up until version 0.4.7 the Java bindings were available from the central maven +repository. 
+ +If you want to use 0.4.8 or higher, please add the following repository to your +pom.xml + +:: + + <repositories> + <repository> + <id>libvirt-org</id> + <url>https://libvirt.org/maven2</url> + </repository> + </repositories> diff --git a/docs/meson.build b/docs/meson.build index a0e96e2453..e1b618438c 100644 --- a/docs/meson.build +++ b/docs/meson.build @@ -27,7 +27,6 @@ docs_html_in_files = [ 'formatnwfilter', 'index', 'internals', - 'java', 'logging', 'php', 'python', @@ -92,6 +91,7 @@ docs_rst_files = [ 'governance', 'hacking', 'hooks', + 'java', 'libvirt-go', 'libvirt-go-xml', 'macos', -- 2.35.1

On Mon, Mar 28, 2022 at 02:10:41PM +0200, Peter Krempa wrote:
Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- ...
-<h2>Maven</h2> - <p>Up until version 0.4.7 the Java bindings were available from the central maven repository.</p> - <p>If you want to use 0.4.8 or higher, please add the following repository to your pom.xml</p> - <pre><repositories> - <repository> - <id>libvirt-org</id> - <url>https://libvirt.org/maven2 </url>
Have you tuned this by hand? ^This extra space causes another merge conflict. The rest is fine: Reviewed-by: Erik Skultety <eskultet@redhat.com>

Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/logging.html.in | 243 ------------------------------------------- docs/logging.rst | 243 +++++++++++++++++++++++++++++++++++++++++++ docs/meson.build | 2 +- 3 files changed, 244 insertions(+), 244 deletions(-) delete mode 100644 docs/logging.html.in create mode 100644 docs/logging.rst diff --git a/docs/logging.html.in b/docs/logging.html.in deleted file mode 100644 index 1052b763a0..0000000000 --- a/docs/logging.html.in +++ /dev/null @@ -1,243 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!DOCTYPE html> -<html xmlns="http://www.w3.org/1999/xhtml"> - <body> - <h1 >Logging in the library and the daemon</h1> - - <p>Libvirt includes logging facilities starting from version 0.6.0, - this complements the <a href="errors.html">error handling</a> - mechanism and APIs to allow tracing through the execution of the - library as well as in the libvirtd daemon.</p> - - <ul id="toc"/> - - <h2> - <a id="log_library">Logging in the library</a> - </h2> - <p>The logging functionalities in libvirt are based on 3 key concepts, - similar to the one present in other generic logging facilities like - log4j:</p> - <ul> - <li><b>log messages</b>: they are information generated at runtime by - the libvirt code. Each message includes a priority level (DEBUG = 1, - INFO = 2, WARNING = 3, ERROR = 4), a category, function name and - line number, indicating where it originated from, and finally - a formatted message. In addition the library adds a timestamp - at the beginning of the message</li> - <li><b>log filters</b>: a set of patterns and priorities to accept - or reject a log message. If the message category matches a filter, - the message priority is compared to the filter priority, if lower - the message is discarded, if higher the message is output. If - no filter matches, then a general priority level is applied to - all remaining messages. 
This allows, for example, capturing all - debug messages for the QEMU driver, but otherwise only allowing - errors to show up from other parts.</li> - <li><b>log outputs</b>: once a message has gone through filtering a set of - output defines where to send the message, they can also filter - based on the priority, for example it may be useful to output - all messages to a debugging file but only allow errors to be - logged through syslog.</li> - </ul> - - <h2> - <a id="log_config">Configuring logging in the library</a> - </h2> - <p>The library configuration of logging is through 3 environment variables - allowing to control the logging behaviour:</p> - <ul> - <li>LIBVIRT_DEBUG: it can take the four following values: - <ul> - <li>1 or "debug": asking the library to log every message emitted, - though the filters can be used to avoid filling up the output</li> - <li>2 or "info": log all non-debugging information</li> - <li>3 or "warn": log warnings and errors, that's the default value</li> - <li>4 or "error": log only error messages</li> - </ul></li> - <li>LIBVIRT_LOG_FILTERS: defines logging filters</li> - <li>LIBVIRT_LOG_OUTPUTS: defines logging outputs</li> - </ul> - <p>Note that, for example, setting LIBVIRT_DEBUG= is the same as unset. If - you specify an invalid value, it will be ignored with a warning. 
If you - have an error in a filter or output string, some of the settings may be - applied up to the point at which libvirt encountered the error.</p> - <h2> - <a id="log_daemon">Logging in the daemon</a> - </h2> - <p>Similarly the daemon logging behaviour can be tuned using 3 config - variables, stored in the configuration file:</p> - <ul> - <li>log_level: accepts the following values: - <ul> - <li>4: only errors</li> - <li>3: warnings and errors</li> - <li>2: information, warnings and errors</li> - <li>1: debug and everything</li> - </ul></li> - <li>log_filters: defines logging filters</li> - <li>log_outputs: defines logging outputs</li> - </ul> - <p>When starting the libvirt daemon, any logging environment variable - settings will override settings in the config file. Command line options - take precedence over all. If no outputs are defined for libvirtd, it - will try to use</p> - <ul> - <li>0.10.0 or later: systemd journal, if <code>/run/systemd/journal/socket</code> exists</li> - <li>0.9.0 or later: file <code>/var/log/libvirt/libvirtd.log</code> if running as a daemon</li> - <li>before 0.9.0: syslog if running as a daemon</li> - <li>all versions: to stderr stream if running in the foreground</li> - </ul> - <p>Libvirtd does not reload its logging configuration when issued a SIGHUP. 
- If you want to reload the configuration, you must do a <code>service - libvirtd restart</code> or manually stop and restart the daemon - yourself.</p> - <p>Starting from 0.9.0, the daemon can save all the content of the debug - buffer to the defined error channels (or /var/log/libvirt/libvirtd.log - by default) in case of crash, this can also be activated explicitly - for debugging purposes by sending the daemon a USR2 signal:</p> - <pre>killall -USR2 libvirtd</pre> - <h2> - <a id="log_syntax">Syntax for filters and output values</a> - </h2> - <p>The syntax for filters and outputs is the same for both types of - variables.</p> - <p>The format for a filter is:</p> - <pre> -x:name</pre> - <p>where <code>name</code> is a string which is matched against - the category given in the VIR_LOG_INIT() at the top of each - libvirt source file, e.g., <code>remote</code>, <code>qemu</code>, - or <code>util.json</code> (the name in the filter can be a - substring of the full category name, in order to match multiple - similar categories), and <code>x</code> is the minimal - level where matching messages should be logged:</p> - <ul> - <li>1: DEBUG</li> - <li>2: INFO</li> - <li>3: WARNING</li> - <li>4: ERROR</li> - </ul> - <p>Multiple filters can be defined in a single string, they just need to be - separated by spaces, e.g: <code>"3:remote 4:event"</code> to only get - warning or errors from the remote layer and only errors from the event - layer.</p> - <p>If you specify a log priority in a filter that is below the default log - priority level, messages that match that filter will still be logged, - while others will not. 
In order to see those messages, you must also have - an output defined that includes the priority level of your filter.</p> - <p>The format for an output can be one of the following forms:</p> - <ul> - <li><code>x:stderr</code> output goes to stderr</li> - <li><code>x:syslog:name</code> use syslog for the output and use the - given <code>name</code> as the ident</li> - <li><code>x:file:file_path</code> output to a file, with the given - filepath</li> - <li><code>x:journald</code> output goes to systemd journal</li> - </ul> - <p>In all cases the x prefix is the minimal level, acting as a filter:</p> - <ul> - <li>1: DEBUG</li> - <li>2: INFO</li> - <li>3: WARNING</li> - <li>4: ERROR</li> - </ul> - <p>Multiple output can be defined, they just need to be separated by - spaces, e.g.: <code>"3:syslog:libvirtd 1:file:/tmp/libvirt.log"</code> - will log all warnings and errors to syslog under the libvirtd ident - but also log all debug and information included in the - file <code>/tmp/libvirt.log</code></p> - - <h2><a id="journald">Systemd journal fields</a></h2> - - <p> - When logging to the systemd journal, the following fields - are defined, in addition to any automatically recorded - <a href="https://www.freedesktop.org/software/systemd/man/systemd.journal-fields.html">standard fields</a>: - </p> - - <dl> - <dt><code>MESSAGE</code></dt> - <dd>The log message string</dd> - <dt><code>PRIORITY</code></dt> - <dd>The log priority value</dd> - <dt><code>LIBVIRT_SOURCE</code></dt> - <dd>The source type, one of "file", "error", "audit", "trace", "library"</dd> - <dt><code>CODE_FILE</code></dt> - <dd>The name of the file emitting the log record</dd> - <dt><code>CODE_LINE</code></dt> - <dd>The line number of the file emitting the log record</dd> - <dt><code>CODE_FUNC</code></dt> - <dd>The name of the function emitting the log record</dd> - <dt><code>LIBVIRT_DOMAIN</code></dt> - <dd>The libvirt error domain (values from virErrorDomain enum), if LIBVIRT_SOURCE="error"</dd> - 
<dt><code>LIBVIRT_CODE</code></dt> - <dd>The libvirt error code (values from virErrorCode enum), if LIBVIRT_SOURCE="error"</dd> - </dl> - - <h3><a id="journaldids">Well known message ID values</a></h3> - - <p> - Certain areas of the code will emit log records tagged with well known - unique id values, which are guaranteed never to change in the future. - This allows applications to identify critical log events without doing - string matching on the <code>MESSAGE</code> field. - </p> - - <dl> - <dt><code>MESSAGE_ID=8ae2f3fb-2dbe-498e-8fbd-012d40afa361</code></dt> - <dd>Generated by the QEMU driver when it identifies a QEMU system - emulator binary, but is unable to extract information about its - capabilities. This is usually an indicator of a broken QEMU - build or installation. When this is emitted, the <code>LIBVIRT_QEMU_BINARY</code> - message field will provide the full path of the QEMU binary that failed. - </dd> - </dl> - - <p> - The <code>journalctl</code> command can be used to search the journal - matching on specific message ID values - </p> - - <pre> -$ journalctl MESSAGE_ID=8ae2f3fb-2dbe-498e-8fbd-012d40afa361 --output=json -{ ...snip... 
- "LIBVIRT_SOURCE" : "file", - "PRIORITY" : "3", - "CODE_FILE" : "qemu/qemu_capabilities.c", - "CODE_LINE" : "2770", - "CODE_FUNC" : "virQEMUCapsLogProbeFailure", - "MESSAGE_ID" : "8ae2f3fb-2dbe-498e-8fbd-012d40afa361", - "LIBVIRT_QEMU_BINARY" : "/bin/qemu-system-xtensa", - "MESSAGE" : "Failed to probe capabilities for /bin/qemu-system-xtensa:" \ - "internal error: Child process (LC_ALL=C LD_LIBRARY_PATH=/home/berrange" \ - "/src/virt/libvirt/src/.libs PATH=/usr/lib64/ccache:/usr/local/sbin:" \ - "/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin HOME=/root " \ - "USER=root LOGNAME=root /bin/qemu-system-xtensa -help) unexpected " \ - "exit status 127: /bin/qemu-system-xtensa: error while loading shared " \ - "libraries: libglapi.so.0: cannot open shared object file: No such " \ - "file or directory\n" } - </pre> - - <h2> - <a id="log_examples">Examples</a> - </h2> - <p>For example setting up the following:</p> - <pre>export LIBVIRT_DEBUG=1 -export LIBVIRT_LOG_OUTPUTS="1:file:virsh.log"</pre> - <p>and then running virsh will accumulate the logs in the - <code>virsh.log</code> file in a way similar to:</p> - <pre>14:29:04.771: debug : virInitialize:278 : register drivers -14:29:04.771: debug : virRegisterDriver:618 : registering Test as driver 0</pre> - <p>the messages are timestamped, there is also the level recorded, - if debug the name of the function is also printed and then the formatted - message. 
This should be sufficient to at least get a precise idea of - what is happening and where things are going wrong, allowing to then - put the correct breakpoints when running under a debugger.</p> - <p>To activate full debug of the libvirt entry points, utility - functions and the QEMU/KVM driver, set:</p> - <pre>log_filters="1:libvirt 1:util 1:qemu" -log_outputs="1:file:/var/log/libvirt/libvirtd.log"</pre> - <p>in libvirtd.conf and restart the daemon will allow to - gather a copious amount of debugging traces for the operations done - in those areas.</p> - </body> -</html> diff --git a/docs/logging.rst b/docs/logging.rst new file mode 100644 index 0000000000..204176c6f5 --- /dev/null +++ b/docs/logging.rst @@ -0,0 +1,243 @@ +===================================== +Logging in the library and the daemon +===================================== + +.. contents:: + +Libvirt includes logging facilities starting from version 0.6.0, this +complements the `error handling <errors.html>`__ mechanism and APIs to allow +tracing through the execution of the library as well as in the libvirtd daemon. + +Logging in the library +---------------------- + +The logging functionalities in libvirt are based on 3 key concepts, similar to +the one present in other generic logging facilities like log4j: + +- **log messages**: they are information generated at runtime by the libvirt + code. Each message includes a priority level (DEBUG = 1, INFO = 2, WARNING = + 3, ERROR = 4), a category, function name and line number, indicating where it + originated from, and finally a formatted message. In addition the library + adds a timestamp at the beginning of the message +- **log filters**: a set of patterns and priorities to accept or reject a log + message. If the message category matches a filter, the message priority is + compared to the filter priority, if lower the message is discarded, if higher + the message is output. 
If no filter matches, then a general priority level is
+  applied to all remaining messages. This allows, for example, capturing all
+  debug messages for the QEMU driver, but otherwise only allowing errors to
+  show up from other parts.
+- **log outputs**: once a message has gone through filtering a set of output
+  defines where to send the message, they can also filter based on the
+  priority, for example it may be useful to output all messages to a debugging
+  file but only allow errors to be logged through syslog.
+
+Configuring logging in the library
+----------------------------------
+
+The library configuration of logging is through 3 environment variables allowing
+to control the logging behaviour:
+
+- LIBVIRT_DEBUG: it can take the four following values:
+
+  - 1 or "debug": asking the library to log every message emitted, though the
+    filters can be used to avoid filling up the output
+  - 2 or "info": log all non-debugging information
+  - 3 or "warn": log warnings and errors, that's the default value
+  - 4 or "error": log only error messages
+
+- LIBVIRT_LOG_FILTERS: defines logging filters
+- LIBVIRT_LOG_OUTPUTS: defines logging outputs
+
+Note that, for example, setting LIBVIRT_DEBUG= is the same as unset. If you
+specify an invalid value, it will be ignored with a warning. If you have an
+error in a filter or output string, some of the settings may be applied up to
+the point at which libvirt encountered the error.
+
+Logging in the daemon
+---------------------
+
+Similarly the daemon logging behaviour can be tuned using 3 config variables,
+stored in the configuration file:
+
+- log_level: accepts the following values:
+
+  - 4: only errors
+  - 3: warnings and errors
+  - 2: information, warnings and errors
+  - 1: debug and everything
+
+- log_filters: defines logging filters
+- log_outputs: defines logging outputs
+
+When starting the libvirt daemon, any logging environment variable settings will
+override settings in the config file. Command line options take precedence over
+all. If no outputs are defined for libvirtd, it will try to use
+
+- 0.10.0 or later: systemd journal, if ``/run/systemd/journal/socket`` exists
+- 0.9.0 or later: file ``/var/log/libvirt/libvirtd.log`` if running as a daemon
+- before 0.9.0: syslog if running as a daemon
+- all versions: to stderr stream if running in the foreground
+
+Libvirtd does not reload its logging configuration when issued a SIGHUP. If you
+want to reload the configuration, you must do a
+``service libvirtd restart`` or manually stop and restart the daemon
+yourself.
+
+Starting from 0.9.0, the daemon can save all the content of the debug buffer to
+the defined error channels (or /var/log/libvirt/libvirtd.log by default) in case
+of crash, this can also be activated explicitly for debugging purposes by
+sending the daemon a USR2 signal:
+
+::
+
+   killall -USR2 libvirtd
+
+Syntax for filters and output values
+------------------------------------
+
+The syntax for filters and outputs is the same for both types of variables.
+
+The format for a filter is:
+
+::
+
+   x:name
+
+where ``name`` is a string which is matched against the category given in the
+VIR_LOG_INIT() at the top of each libvirt source file, e.g., ``remote``,
+``qemu``, or ``util.json`` (the name in the filter can be a substring of the
+full category name, in order to match multiple similar categories), and ``x`` is
+the minimal level where matching messages should be logged:
+
+- 1: DEBUG
+- 2: INFO
+- 3: WARNING
+- 4: ERROR
+
+Multiple filters can be defined in a single string, they just need to be
+separated by spaces, e.g: ``"3:remote 4:event"`` to only get warning or errors
+from the remote layer and only errors from the event layer.
+
+If you specify a log priority in a filter that is below the default log priority
+level, messages that match that filter will still be logged, while others will
+not. In order to see those messages, you must also have an output defined that
+includes the priority level of your filter.
+
+The format for an output can be one of the following forms:
+
+- ``x:stderr`` output goes to stderr
+- ``x:syslog:name`` use syslog for the output and use the given ``name`` as the
+  ident
+- ``x:file:file_path`` output to a file, with the given filepath
+- ``x:journald`` output goes to systemd journal
+
+In all cases the x prefix is the minimal level, acting as a filter:
+
+- 1: DEBUG
+- 2: INFO
+- 3: WARNING
+- 4: ERROR
+
+Multiple output can be defined, they just need to be separated by spaces, e.g.:
+``"3:syslog:libvirtd 1:file:/tmp/libvirt.log"`` will log all warnings and errors
+to syslog under the libvirtd ident but also log all debug and information
+included in the file ``/tmp/libvirt.log``
+
+Systemd journal fields
+----------------------
+
+When logging to the systemd journal, the following fields are defined, in
+addition to any automatically recorded `standard
+fields <https://www.freedesktop.org/software/systemd/man/systemd.journal-fields.html>`__:
+
+``MESSAGE``
+   The log message string
+``PRIORITY``
+   The log priority value
+``LIBVIRT_SOURCE``
+   The source type, one of "file", "error", "audit", "trace", "library"
+``CODE_FILE``
+   The name of the file emitting the log record
+``CODE_LINE``
+   The line number of the file emitting the log record
+``CODE_FUNC``
+   The name of the function emitting the log record
+``LIBVIRT_DOMAIN``
+   The libvirt error domain (values from virErrorDomain enum), if
+   LIBVIRT_SOURCE="error"
+``LIBVIRT_CODE``
+   The libvirt error code (values from virErrorCode enum), if
+   LIBVIRT_SOURCE="error"
+
+Well known message ID values
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Certain areas of the code will emit log records tagged with well known unique id
+values, which are guaranteed never to change in the future. This allows
+applications to identify critical log events without doing string matching on
+the ``MESSAGE`` field.
+
+``MESSAGE_ID=8ae2f3fb-2dbe-498e-8fbd-012d40afa361``
+   Generated by the QEMU driver when it identifies a QEMU system emulator
+   binary, but is unable to extract information about its capabilities. This is
+   usually an indicator of a broken QEMU build or installation. When this is
+   emitted, the ``LIBVIRT_QEMU_BINARY`` message field will provide the full path
+   of the QEMU binary that failed.
+
+The ``journalctl`` command can be used to search the journal matching on
+specific message ID values
+
+::
+
+   $ journalctl MESSAGE_ID=8ae2f3fb-2dbe-498e-8fbd-012d40afa361 --output=json
+   { ...snip...
+     "LIBVIRT_SOURCE" : "file",
+     "PRIORITY" : "3",
+     "CODE_FILE" : "qemu/qemu_capabilities.c",
+     "CODE_LINE" : "2770",
+     "CODE_FUNC" : "virQEMUCapsLogProbeFailure",
+     "MESSAGE_ID" : "8ae2f3fb-2dbe-498e-8fbd-012d40afa361",
+     "LIBVIRT_QEMU_BINARY" : "/bin/qemu-system-xtensa",
+     "MESSAGE" : "Failed to probe capabilities for /bin/qemu-system-xtensa:" \
+                 "internal error: Child process (LC_ALL=C LD_LIBRARY_PATH=/home/berrange" \
+                 "/src/virt/libvirt/src/.libs PATH=/usr/lib64/ccache:/usr/local/sbin:" \
+                 "/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin HOME=/root " \
+                 "USER=root LOGNAME=root /bin/qemu-system-xtensa -help) unexpected " \
+                 "exit status 127: /bin/qemu-system-xtensa: error while loading shared " \
+                 "libraries: libglapi.so.0: cannot open shared object file: No such " \
+                 "file or directory\n" }
+
+Examples
+--------
+
+For example setting up the following:
+
+::
+
+   export LIBVIRT_DEBUG=1
+   export LIBVIRT_LOG_OUTPUTS="1:file:virsh.log"
+
+and then running virsh will accumulate the logs in the ``virsh.log`` file in a
+way similar to:
+
+::
+
+   14:29:04.771: debug : virInitialize:278 : register drivers
+   14:29:04.771: debug : virRegisterDriver:618 : registering Test as driver 0
+
+the messages are timestamped, there is also the level recorded, if debug the
+name of the function is also printed and then the formatted message. This should
+be sufficient to at least get a precise idea of what is happening and where
+things are going wrong, allowing to then put the correct breakpoints when
+running under a debugger.
+
+To activate full debug of the libvirt entry points, utility functions and the
+QEMU/KVM driver, set:
+
+::
+
+   log_filters="1:libvirt 1:util 1:qemu"
+   log_outputs="1:file:/var/log/libvirt/libvirtd.log"
+
+in libvirtd.conf and restart the daemon will allow to gather a copious amount of
+debugging traces for the operations done in those areas.
diff --git a/docs/meson.build b/docs/meson.build
index e1b618438c..f6e51353f0 100644
--- a/docs/meson.build
+++ b/docs/meson.build
@@ -27,7 +27,6 @@ docs_html_in_files = [
   'formatnwfilter',
   'index',
   'internals',
-  'logging',
   'php',
   'python',
   'remote',
@@ -94,6 +93,7 @@ docs_rst_files = [
   'java',
   'libvirt-go',
   'libvirt-go-xml',
+  'logging',
   'macos',
   'migration',
   'newreposetup',
-- 
2.35.1
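
For readers following the filter syntax described in the converted page, here is a minimal Python sketch of how a string such as "3:remote 4:event" breaks down into level/name pairs, and how a category name could be matched against them by substring. This is not libvirt code: the function names parse_filters and effective_level are hypothetical, and the "first matching filter wins" rule is an assumption made for illustration, since the page only says that the filter name may be a substring of the category.

```python
# Illustrative sketch only; libvirt's real parser is C code inside the library.
LEVELS = {1: "DEBUG", 2: "INFO", 3: "WARNING", 4: "ERROR"}


def parse_filters(spec):
    """Split a space-separated filter string into (level, name) pairs."""
    filters = []
    for item in spec.split():
        level_text, _sep, name = item.partition(":")
        # Each item must look like "x:name" with x in 1..4.
        if level_text not in {"1", "2", "3", "4"} or not name:
            raise ValueError(f"malformed filter: {item!r}")
        filters.append((int(level_text), name))
    return filters


def effective_level(category, filters, default_level):
    """Return the minimal level for a category.

    Assumption: the first filter whose name is a substring of the
    category applies; otherwise the general default level is used.
    """
    for level, name in filters:
        if name in category:
            return level
    return default_level


filters = parse_filters("3:remote 4:event")
print(filters)  # [(3, 'remote'), (4, 'event')]
print(LEVELS[effective_level("util.json", filters, 2)])
```

With the default level set to 2 in this sketch, a category such as util.json matches neither filter and falls back to INFO, while remote is held to WARNING and above.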

The 'debuglogs' knowledge base page has way more info and examples on how to set logging use it instead of the ad-hoc examples.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
---
 docs/logging.rst | 34 +++-------------------------------
 1 file changed, 3 insertions(+), 31 deletions(-)

diff --git a/docs/logging.rst b/docs/logging.rst
index 204176c6f5..c7c14504ae 100644
--- a/docs/logging.rst
+++ b/docs/logging.rst
@@ -210,34 +210,6 @@ specific message ID values
 Examples
 --------
 
-For example setting up the following:
-
-::
-
-   export LIBVIRT_DEBUG=1
-   export LIBVIRT_LOG_OUTPUTS="1:file:virsh.log"
-
-and then running virsh will accumulate the logs in the ``virsh.log`` file in a
-way similar to:
-
-::
-
-   14:29:04.771: debug : virInitialize:278 : register drivers
-   14:29:04.771: debug : virRegisterDriver:618 : registering Test as driver 0
-
-the messages are timestamped, there is also the level recorded, if debug the
-name of the function is also printed and then the formatted message. This should
-be sufficient to at least get a precise idea of what is happening and where
-things are going wrong, allowing to then put the correct breakpoints when
-running under a debugger.
-
-To activate full debug of the libvirt entry points, utility functions and the
-QEMU/KVM driver, set:
-
-::
-
-   log_filters="1:libvirt 1:util 1:qemu"
-   log_outputs="1:file:/var/log/libvirt/libvirtd.log"
-
-in libvirtd.conf and restart the daemon will allow to gather a copious amount of
-debugging traces for the operations done in those areas.
+Examples with useful log settings along with more information on how to properly
+configure logging for various situations can be found in the
+`logging knowledge base article <kbase/debuglogs.html>`__.
-- 
2.35.1

On Mon, Mar 28, 2022 at 02:10:43PM +0200, Peter Krempa wrote:
> The 'debuglogs' knowledge base page has way more info and examples on how to set logging use it instead of the ad-hoc examples.
>
> Signed-off-by: Peter Krempa <pkrempa@redhat.com>
> ---

Reviewed-by: Erik Skultety <eskultet@redhat.com>

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
---
 docs/meson.build |  2 +-
 docs/php.html.in | 28 ----------------------------
 docs/php.rst     | 23 +++++++++++++++++++++++
 3 files changed, 24 insertions(+), 29 deletions(-)
 delete mode 100644 docs/php.html.in
 create mode 100644 docs/php.rst

diff --git a/docs/meson.build b/docs/meson.build
index f6e51353f0..c1def26655 100644
--- a/docs/meson.build
+++ b/docs/meson.build
@@ -27,7 +27,6 @@ docs_html_in_files = [
   'formatnwfilter',
   'index',
   'internals',
-  'php',
   'python',
   'remote',
   'storage',
@@ -100,6 +99,7 @@ docs_rst_files = [
   'nss',
   'pci-addresses',
   'pci-hotplug',
+  'php',
   'platforms',
   'programming-languages',
   'securityprocess',
diff --git a/docs/php.html.in b/docs/php.html.in
deleted file mode 100644
index 9340c81eec..0000000000
--- a/docs/php.html.in
+++ /dev/null
@@ -1,28 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE html>
-<html xmlns="http://www.w3.org/1999/xhtml">
-  <body>
-    <h1>PHP API bindings</h1>
-
-<h2>Presentation</h2>
-  <p>The libvirt-php, originally called php-libvirt, is the PHP API bindings for
-  the libvirt virtualization toolkit originally developed by Radek Hladik.</p>
-
-<h2>Getting the source</h2>
-<p> The PHP bindings code source is now maintained in a <a
-href="https://git-scm.com/">git</a> repository available on
-<a href="https://gitlab.com/libvirt/libvirt-php">gitlab.com</a>:
-</p>
-<pre>
-git clone https://gitlab.com/libvirt/libvirt-php.git
-</pre>
-
-<p></p>
-<h2>Project pages</h2>
-<p>Since February 2011 the project has its own pages hosted at libvirt.org. For more information on the project
-   please refer to <a href="https://libvirt.org/php">https://libvirt.org/php</a>.
-
-</p>
-
-  </body>
-</html>
diff --git a/docs/php.rst b/docs/php.rst
new file mode 100644
index 0000000000..36f7c44bed
--- /dev/null
+++ b/docs/php.rst
@@ -0,0 +1,23 @@
+================
+PHP API bindings
+================
+
+The libvirt-php, originally called php-libvirt, is the PHP API bindings for the
+libvirt virtualization toolkit originally developed by Radek Hladik.
+
+Getting the source
+------------------
+
+The PHP bindings code source is now maintained in a
+`git <https://git-scm.com/>`__ repository available on
+`gitlab.com <https://gitlab.com/libvirt/libvirt-php>`__:
+
+::
+
+   git clone https://gitlab.com/libvirt/libvirt-php.git
+
+Project pages
+-------------
+
+Since February 2011 the project has its own pages hosted at libvirt.org. For
+more information on the project please refer to https://libvirt.org/php.
-- 
2.35.1

Reviewed-by: Erik Skultety <eskultet@redhat.com>

Can you please rebase? The series doesn't apply cleanly anymore.

Isn't this a part 3 actually? Based on [1]... :P

[1] https://listman.redhat.com/archives/libvir-list/2022-March/229208.html

Erik

On Fri, Apr 01, 2022 at 09:23:54 +0200, Erik Skultety wrote:
> Can you please rebase? The series doesn't apply cleanly anymore.
I've noted that patch 17 doesn't apply cleanly. I planned to drop it (and solve the trivial conflict in docs/meson.build in patch 18). Then I'll re-convert it in the next pass. It doesn't make too much sense for me to try to chase and re-apply what has changed to the converted docs.
> Isn't this a part 3 actually? Based on [1]... :P
>
> [1] https://listman.redhat.com/archives/libvir-list/2022-March/229208.html
It is, but that series has been pushed for some time now. The problem was that there was a docs change in one of the files converted in this series. I can re-send with the conflict solved if you'd like me to.