[Libvir] [RFC] Add Container support to libvirt

Greetings,

I'd like to extend libvirt to support Containers. As libvirt already supports Xen, KVM, QEMU and OpenVZ, I think it would be valuable to be able to utilize existing utilities to manage containers. I've spent some time looking through the libvirt api and how this Container support will fit. Based on the XML format section of the libvirt website and some list discussions I put together the following proposed XML format:

<domain type='linuxcontainer'>
  <name>Container123</name>
  <uuid>8dfd44b31e76d8d335150a2d98211ea0</uuid>
  <container>
    <filesystem>
      <mount>/etc = /home/user/lxc_files/etc</mount>
      <mount>/var = /home/user/lxc_files/var</mount>
    </filesystem>
    <application>dbserver</application>
    <network hostname='browndog'>
      <ip address="192.168.1.110" netmask="255.255.255.0">
        <gateway address="192.168.1.1"/>
        <nameserver>192.168.1.1</nameserver>
      </ip>
    </network>
    <cpushare>40</cpushare>
    <memory>65536</memory>
  </container>
  <devices>
    <console tty='/dev/pts/4'/>
  </devices>
</domain>

The clone() function is used with the CLONE_NEWPID and CLONE_NEWNS flags to start a new process within its own process name space. The only processes visible to it will be itself and any processes that it spawns. The process that clone() creates will start out preparing the container environment. This involves setting up any network interfaces, setting up the file system by performing any requested mounts, mounting /proc, setting up a tty device, populating /dev as necessary, and performing any other necessary initializations. It will then start the application(s) requested by the user. The executables started within the container could be an application, a script, or possibly /sbin/init.

The mounts that the user specifies will need to be populated with the appropriate contents for whatever applications they are going to run within the container. cgroup will be used for isolation and association with controllers for cpu and memory resources.

I'm planning to start in on defining a container. All comments and questions are welcome.

Best Regards,
Dave Leskovec
IBM Linux Technology Center
Open Virtualization
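As a rough illustration of the clone()-based startup described above - not actual driver code; the stack size is arbitrary and the mount/init paths are placeholders taken from the XML example - the child setup could look something like this in C:

#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mount.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (512 * 1024)

/* Child entry point: runs inside the new PID and mount namespaces. */
static int container_child(void *arg)
{
    const char *init_path = arg;

    /* Bind the host directories prepared for the container onto its
       tree (the path mirrors a <mount> entry in the XML above). */
    if (mount("/home/user/lxc_files/etc", "/etc", NULL, MS_BIND, NULL) < 0)
        perror("bind mount /etc");

    /* A fresh /proc so tools inside see only the container's processes. */
    if (mount("proc", "/proc", "proc", 0, NULL) < 0)
        perror("mount /proc");

    /* Hand control to the user-specified application or init script. */
    execl(init_path, init_path, (char *)NULL);
    perror("execl");
    return 1;
}

int main(void)
{
    char *stack = malloc(STACK_SIZE);

    /* CLONE_NEWPID | CLONE_NEWNS gives the child private PID and mount
       namespaces; SIGCHLD lets the parent wait on it. */
    pid_t pid = clone(container_child, stack + STACK_SIZE,
                      CLONE_NEWPID | CLONE_NEWNS | SIGCHLD,
                      (void *)"/usr/sbin/container_init");
    if (pid < 0) {
        perror("clone");
        return 1;
    }
    waitpid(pid, NULL, 0);
    return 0;
}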

Hi,

Will definitely be a great enhancement!

On Dec 28, 2007 2:04 PM, Dave Leskovec <dlesko@linux.vnet.ibm.com> wrote: [snip]
<application>dbserver</application>
[snip]

What exactly is the "application" tag used for?

Thanks,
--
Shuveb Hussain
B I N A R Y K A R M A
Chennai, India.
Phone : +91 44-64621656
Mobile: +91 98403-80386
http://www.binarykarma.com

Shuveb,

Thanks! The application tag is intended to be used to specify one or more executables or scripts to run in the container. For simple cases where there is only one or a few things to be started in the container, they can be specified with this tag. Otherwise, it would point to an initialization script that would take care of starting the appropriate items for the container.

Best Regards,
Dave Leskovec
IBM Linux Technology Center
Open Virtualization

Hi,

How do you plan to maintain VM state across base machine reboots? I mean, just as Xen uses XenStore and OpenVZ uses config files to "remember" VM configs. Are there tools being developed to do that?

As I see it, the container infrastructure is going into 2.6.24 as a generic UNIXish mechanism. It is well integrated into the kernel, so I'm really wondering whether generic tools are going to be available to manage the configuration of containers. I don't know if it is a good idea to use libvirt's XML files themselves for container config and start containers during libvirtd start-up. I'm just thinking aloud here. What was your plan?

--shuveb

On Thu, Jan 03, 2008 at 10:07:39AM +0530, Shuveb Hussain wrote:
Hi,
How do you plan to maintain VM state across base machine reboots? I mean, just as Xen uses XenStore and OpenVZ uses config files to "remember" VM configs. Are there tools being developed to do that?
As I see it, the container infrastructure is going into 2.6.24 as a generic UNIXish mechanism. It is well integrated into the kernel, so I'm really wondering whether generic tools are going to be available to manage the configuration of containers.
If there are existing tools available or being developed we should use them, otherwise we should just define the process ourselves.
I don't know if it is a good idea to use libvirt's XML files themselves for container config and start containers during libvirtd start-up. I'm just thinking aloud here. What was your plan?
I think that using the libvirt XML file format would be a good initial starting point, since it's in keeping with the QEMU/KVM driver and there's no other pre-existing config file format available. You could have a dir /etc/libvirt/[your driver] to store them in. You'd be able to re-use much of the code in the existing qemu driver to do this - I do similar re-use of code and file formats for the forthcoming storage API drivers.

Dan.
--
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|
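For illustration only, loading such per-driver XML configs at daemon start-up could be as simple as scanning that directory. The "lxc" name below is just a placeholder for "[your driver]", and the actual XML parsing step is elided:

#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Placeholder for the /etc/libvirt/[your driver] directory. */
#define CONFIG_DIR "/etc/libvirt/lxc"

/* Scan the config directory for XML files at daemon start-up. */
static void load_container_configs(void)
{
    DIR *dir = opendir(CONFIG_DIR);
    struct dirent *ent;

    if (!dir)
        return;
    while ((ent = readdir(dir)) != NULL) {
        const char *ext = strrchr(ent->d_name, '.');
        if (!ext || strcmp(ext, ".xml") != 0)
            continue;
        /* A real driver would parse the XML here and define the
           corresponding inactive container. */
        printf("found config %s/%s\n", CONFIG_DIR, ent->d_name);
    }
    closedir(dir);
}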

There are no existing tools or any being developed, so we'll have to create it ourselves. My plan is to store the configuration in files. I was debating whether to store the information directly in XML or come up with some other format. I see some advantages to storing it in XML - especially since there's an existing implementation (Thanks Dan!).

Best Regards,
Dave Leskovec
IBM Linux Technology Center
Open Virtualization

On Thu, Jan 03, 2008 at 10:49:16AM -0800, Dave Leskovec wrote:
There are no existing tools or any being developed so we'll have to create it ourselves. My plan is to store the configuration in files. I was debating whether to store the information directly in XML or come up with some other format. I see some advantages to storing it in XML - especially since there's an existing implementation (Thanks Dan!).
Yes, I'd really recommend against inventing a new format - we've got enough different config file parsers already :-) Let's try and just stick with the main XML format for any drivers, unless there are existing tools we need to interact with (e.g. as we did for the /etc/xen config files).

Dan.
--
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

Greetings,

Following up on the XML format for the Linux Container support I proposed... I've made the following recommended changes:
* Changed mount tags
* Changed nameserver tag to be consistent with gateway
* Moved cpushare and memory tags outside container tag

This is the updated format:

<domain type='linuxcontainer'>
  <name>Container123</name>
  <uuid>8dfd44b31e76d8d335150a2d98211ea0</uuid>
  <container>
    <filesystem>
      <mount>
        <source dir="/home/user/lxc_files/etc/"/>
        <target dir="/etc/"/>
      </mount>
      <mount>
        <source dir="/home/user/lxc_files/var/"/>
        <target dir="/var/"/>
      </mount>
    </filesystem>
    <application>/usr/sbin/container_init</application>
    <network hostname='browndog'>
      <ip address="192.168.1.110" netmask="255.255.255.0">
        <gateway address="192.168.1.1"/>
        <nameserver address="192.168.1.1"/>
      </ip>
    </network>
  </container>
  <cpushare>40</cpushare>
  <memory>65536</memory>
  <devices>
    <console tty='/dev/pts/4'/>
  </devices>
</domain>

Does this look ok now? All comments and questions are welcome.

--
Best Regards,
Dave Leskovec
IBM Linux Technology Center
Open Virtualization

Dave Leskovec wrote:
Greetings,
Following up on the XML format for the Linux Container support I proposed... I've made the following recommended changes: * Changed mount tags * Changed nameserver tag to be consistent with gateway * Moved cpushare and memory tags outside container tag
You might want to look at https://www.redhat.com/archives/libvir-list/2008-January/msg00097.html which has already gone through this process once before (though no comments have been made on that particular format).
This is the updated format:

<domain type='linuxcontainer'>
  <name>Container123</name>
  <uuid>8dfd44b31e76d8d335150a2d98211ea0</uuid>
  <container>
    <filesystem>
      <mount>
        <source dir="/home/user/lxc_files/etc/"/>
        <target dir="/etc/"/>
      </mount>
      <mount>
        <source dir="/home/user/lxc_files/var/"/>
        <target dir="/var/"/>
      </mount>
    </filesystem>
    <application>/usr/sbin/container_init</application>
    <network hostname='browndog'>
      <ip address="192.168.1.110" netmask="255.255.255.0">
        <gateway address="192.168.1.1"/>
        <nameserver address="192.168.1.1"/>
      </ip>
    </network>
  </container>
  <cpushare>40</cpushare>
This is tuning information and doesn't belong in the XML. (Unless I missed the follow-ups on that discussion.)
-- Daniel Hokka Zakrisson

On Tue, Jan 15, 2008 at 12:26:43AM -0800, Dave Leskovec wrote:
Greetings,
Following up on the XML format for the Linux Container support I proposed... I've made the following recommended changes: * Changed mount tags * Changed nameserver tag to be consistent with gateway * Moved cpushare and memory tags outside container tag
This is the updated format:

<domain type='linuxcontainer'>
  <name>Container123</name>
  <uuid>8dfd44b31e76d8d335150a2d98211ea0</uuid>
  <container>
    <filesystem>
      <mount>
        <source dir="/home/user/lxc_files/etc/"/>
        <target dir="/etc/"/>
      </mount>
      <mount>
        <source dir="/home/user/lxc_files/var/"/>
        <target dir="/var/"/>
      </mount>
    </filesystem>
Comparing this to the Linux-VServer XML that Daniel posted, you're both pretty much representing the same concepts, so we need to make a decision about which format to use for filesystem mounts. OpenVZ also provides a /domain/container/filesystem tag, though it uses a concept of filesystem templates auto-cloned per container rather than explicit mounts. I think I'd like to see

<filesystem type="mount">
  <source dir="/home/user/lxc_files/etc/"/>
  <target dir="/etc/"/>
</filesystem>

For the existing OpenVZ XML, we can augment their <filesystem> tag with an attribute type="template".
<application>/usr/sbin/container_init</application>
<network hostname='browndog'>
  <ip address="192.168.1.110" netmask="255.255.255.0">
    <gateway address="192.168.1.1"/>
    <nameserver address="192.168.1.1"/>
  </ip>
</network>
Again this is pretty similar to needs of VServer / OpenVZ. In the existing OpenVZ XML, the gateway and nameserver tags are immediately within the <network> tag, rather than nested inside the <ip> tag. Aside from that it looks to be a consistent set of information.
</container> <cpushare>40</cpushare>
As Daniel points out, we've thus far explicitly excluded tuning info from the XML. Not that I have any suggestion on where else to put it at this time. This is a minor thing though, easily implemented once we come to a decision.
<memory>65536</memory> <devices> <console tty='/dev/pts/4'/> </devices> </domain>
Does this look ok now? All comments and questions are welcome.
Pretty close.

Dan.
--
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

* Daniel P. Berrange <berrange@redhat.com> [2008-01-15 15:52:13]:
On Tue, Jan 15, 2008 at 12:26:43AM -0800, Dave Leskovec wrote:
Greetings,
Following up on the XML format for the Linux Container support I proposed... I've made the following recommended changes: * Changed mount tags * Changed nameserver tag to be consistent with gateway * Moved cpushare and memory tags outside container tag
This is the updated format:

<domain type='linuxcontainer'>
  <name>Container123</name>
  <uuid>8dfd44b31e76d8d335150a2d98211ea0</uuid>
  <container>
    <filesystem>
      <mount>
        <source dir="/home/user/lxc_files/etc/"/>
        <target dir="/etc/"/>
      </mount>
      <mount>
        <source dir="/home/user/lxc_files/var/"/>
        <target dir="/var/"/>
      </mount>
    </filesystem>
Comparing this to the Linux-VServer XML that Daniel posted, you're both pretty much representing the same concepts so we need to make a decision about which format to use for filesystem mounts.
OpenVZ also provides a /domain/container/filesystem tag, though it uses a concept of filesystem templates auto-cloned per container rather than explicit mounts. I think I'd like to see
<filesystem type="mount">
  <source dir="/home/user/lxc_files/etc/"/>
  <target dir="/etc/"/>
</filesystem>
For the existing OpenVZ XML, we can augment their <filesystem> tag with an attribute type="template".
<application>/usr/sbin/container_init</application>
<network hostname='browndog'>
  <ip address="192.168.1.110" netmask="255.255.255.0">
    <gateway address="192.168.1.1"/>
    <nameserver address="192.168.1.1"/>
  </ip>
</network>
Again this is pretty similar to needs of VServer / OpenVZ. In the existing OpenVZ XML, the gateway and nameserver tags are immediately within the <network> tag, rather than nested inside the <ip> tag. Aside from that it looks to be a consistent set of information.
</container> <cpushare>40</cpushare>
As Daniel points out, we've thus far explicitly excluded tuning info from the XML. Not that I have any suggestion on where else to put it at this time. This is a minor thing though, easily implemented once we come to a decision.
At some point we'll need resource management extensions to libvirt. VServer and OpenVZ both use them, and they will also be useful for containers and kvm/qemu. I think we'll need a resource management feature extension to the XML format.

Currently, resource management is provided through control groups (I can send out links if desired). Ideally, once configured, the control groups should be persistent (visible across reboots), so we need to save state.

Thoughts?
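To make the control-group idea concrete, here is a minimal hedged sketch against the 2.6.24 cgroup filesystem interface. The /cgroups mount point and the group layout are assumptions, and error handling is trimmed; the point is that creating a group is a mkdir and the controller knobs are plain files:

#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>

/* Assumes the cpu controller is mounted somewhere, e.g.:
   mount -t cgroup -o cpu none /cgroups */
static int set_cpu_share(const char *container, unsigned long shares)
{
    char path[256];
    FILE *f;

    /* Creating a control group is just a mkdir in the cgroup fs. */
    snprintf(path, sizeof(path), "/cgroups/%s", container);
    if (mkdir(path, 0755) < 0 && errno != EEXIST)
        return -1;

    /* Controller parameters are exposed as ordinary files. */
    snprintf(path, sizeof(path), "/cgroups/%s/cpu.shares", container);
    if (!(f = fopen(path, "w")))
        return -1;
    fprintf(f, "%lu\n", shares);
    return fclose(f);
}

Since the cgroup filesystem itself does not survive a reboot, a driver would presumably re-run this kind of setup at start-up from whatever configuration it has saved.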
--
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL

Balbir,

I'd appreciate any links... How will this configuration be persisted across reboots? Specifically, once a configuration is set up for a container, who is responsible for storing it? Will a libvirt driver need to store it somewhere in its configuration file(s), or will it be stored somewhere else by a resource management facility such that it can be associated with a container after a reboot?
--
Best Regards,
Dave Leskovec
IBM Linux Technology Center
Open Virtualization

Thanks again for the feedback. I've made the following additional recommended changes:

* Changed filesystem tag for consistency
* Changed network spec to match the most recent OpenVZ format. The latest OpenVZ format I could find was here: https://www.redhat.com/archives/libvir-list/2007-August/msg00209.html
  Shuveb - does this line up with what OpenVZ is now using?
* Removed cpushare and memory tuning parameters

Updated format:

<domain type='linuxcontainer'>
  <name>Container123</name>
  <uuid>8dfd44b31e76d8d335150a2d98211ea0</uuid>
  <container>
    <filesystem type="mount">
      <source dir="/home/user/lxc_files/etc/"/>
      <target dir="/etc/"/>
    </filesystem>
    <filesystem type="mount">
      <source dir="/home/user/lxc_files/var/"/>
      <target dir="/var/"/>
    </filesystem>
    <application>/usr/sbin/container_init</application>
    <network>
      <ipaddress>192.168.1.110</ipaddress>
      <hostname>browndog</hostname>
      <gateway>192.168.1.1</gateway>
      <nameserver>192.168.1.1</nameserver>
      <netmask>255.255.255.0</netmask>
    </network>
  </container>
  <devices>
    <console tty='/dev/pts/4'/>
  </devices>
</domain>

As always, all comments and questions are welcome.
--
Best Regards,
Dave Leskovec
IBM Linux Technology Center
Open Virtualization

Dave Leskovec wrote:
Thanks again for the feedback. I've made the following additional recommended changes: * Changed filesystem tag for consistency * Changed network spec to match most recent OpenVZ format. The latest OpenVZ format I could find was here: https://www.redhat.com/archives/libvir-list/2007-August/msg00209.html Shuveb - does this line up with what OpenVZ is now using? * Removed cpushare and memory tuning parameters
Memory is fine, it's only CPU tuning that's not in the XML.
Updated format:

<domain type='linuxcontainer'>
  <name>Container123</name>
  <uuid>8dfd44b31e76d8d335150a2d98211ea0</uuid>
  <container>
    <filesystem type="mount">
      <source dir="/home/user/lxc_files/etc/"/>
      <target dir="/etc/"/>
    </filesystem>
    <filesystem type="mount">
      <source dir="/home/user/lxc_files/var/"/>
      <target dir="/var/"/>
    </filesystem>
    <application>/usr/sbin/container_init</application>
Could we call this init instead? Or boot?
-- Daniel Hokka Zakrisson

Daniel Hokka Zakrisson wrote:
Dave Leskovec wrote:
Thanks again for the feedback. I've made the following additional recommended changes: * Changed filesystem tag for consistency * Changed network spec to match most recent OpenVZ format. The latest OpenVZ format I could find was here: https://www.redhat.com/archives/libvir-list/2007-August/msg00209.html Shuveb - does this line up with what OpenVZ is now using? * Removed cpushare and memory tuning parameters
Memory is fine, it's only CPU tuning that's not in the XML.
Ah, ok.
Updated format:

<domain type='linuxcontainer'>
  <name>Container123</name>
  <uuid>8dfd44b31e76d8d335150a2d98211ea0</uuid>
  <container>
    <filesystem type="mount">
      <source dir="/home/user/lxc_files/etc/"/>
      <target dir="/etc/"/>
    </filesystem>
    <filesystem type="mount">
      <source dir="/home/user/lxc_files/var/"/>
      <target dir="/var/"/>
    </filesystem>
    <application>/usr/sbin/container_init</application>
Could we call this init instead? Or boot?
The file indicated by this tag can be an init script, as this example would tend to indicate. It could also be a single program to run within the container. Something like this:

<application>/usr/sbin/sshd</application>

Would init or boot make sense in this case as well? I'm open to changing it as long as it makes sense to everyone.
--
Best Regards,
Dave Leskovec
IBM Linux Technology Center
Open Virtualization

DL> Would init or boot make sense in this case as well? I'm open to
DL> changing it as long as it makes sense to everyone.

I think that "init" is a widely-recognized term for the master process in a given process namespace. I would also think that, at least from a libvirt perspective, most people would be interested in having an init-like process structure within their containers. Thus, I would vote for using <init> over <boot> or <application>.

--
Dan Smith
IBM Linux Technology Center
Open Hypervisor Team
email: danms@us.ibm.com

On Wed, Jan 16, 2008 at 07:05:19AM -0800, Dan Smith wrote:
DL> Would init or boot make sense in this case as well? I'm open to
DL> changing it as long as it makes sense to everyone.
I think that "init" is a widely-recognized term for the master process in a given process namespace. I would also think that, at least from a libvirt perspective, most people would be interested in having an init-like process structure within their containers. Thus, I would vote for using <init> over <boot> or <application>.
I agree - particularly as we already use <boot> elsewhere to refer to something else.

Dan.
--
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

Ok, I've made a couple more recommended changes:
* Changed application to init
* Restored the memory tag

<domain type='linuxcontainer'>
  <name>Container123</name>
  <uuid>8dfd44b31e76d8d335150a2d98211ea0</uuid>
  <container>
    <filesystem type="mount">
      <source dir="/home/user/lxc_files/etc/"/>
      <target dir="/etc/"/>
    </filesystem>
    <filesystem type="mount">
      <source dir="/home/user/lxc_files/var/"/>
      <target dir="/var/"/>
    </filesystem>
    <init>/usr/sbin/container_init</init>
    <network>
      <ipaddress>192.168.1.110</ipaddress>
      <hostname>browndog</hostname>
      <gateway>192.168.1.1</gateway>
      <nameserver>192.168.1.1</nameserver>
      <netmask>255.255.255.0</netmask>
    </network>
  </container>
  <memory>65536</memory>
  <devices>
    <console tty='/dev/pts/4'/>
  </devices>
</domain>

Any other comments or questions?

--
Best Regards,
Dave Leskovec
IBM Linux Technology Center
Open Virtualization

DL> <network>
DL> <ipaddress>192.168.1.110</ipaddress>
DL> <hostname>browndog</hostname>
DL> <gateway>192.168.1.1</gateway>
DL> <nameserver>192.168.1.1</nameserver>
DL> <netmask>255.255.255.0</netmask>
DL> </network>

This is me showing my complete ignorance here, but can you enforce the hostname inside a container like that? I would expect that to be done by the init scripts inside a container, much like a real machine.

Also, the above network block seems to imply that a container will only ever have one network interface, which I expect is not the case. The placement of the network block also seems to imply that this bit of network config, and specifically this IP address, is a characteristic property of the container itself. That seems strange to me. Should this be under <devices> so we can represent more than one interface?

--
Dan Smith
IBM Linux Technology Center
Open Hypervisor Team
email: danms@us.ibm.com

On Jan 17, 2008 2:30 AM, Dan Smith <danms@us.ibm.com> wrote:
DL> <network>
DL> <ipaddress>192.168.1.110</ipaddress>
DL> <hostname>browndog</hostname>
DL> <gateway>192.168.1.1</gateway>
DL> <nameserver>192.168.1.1</nameserver>
DL> <netmask>255.255.255.0</netmask>
DL> </network>
This is me showing my complete ignorance here, but can you enforce the hostname inside a container like that? I would expect that to be done by the init scripts inside a container, much like a real machine.
It is most flexible to manage virtual machines from the base machine on which they run. The idea is to run an agent on the base machine alone that is able to configure all virtual machines, without the need to have agents inside of each of the virtual machines. When an OpenVZ VPS is being created, the tools actually set /etc/resolv.conf and also its IP address from its configuration file. In containers, there are no tools that do this (yet). As of now, I too am wondering how the network configuration values in the config are going to be applied to the virtual machine. If it were like a real virtual machine, then these things would get set from the machine's init script itself, like you said.

Thanks,
--
Shuveb Hussain
B I N A R Y K A R M A
Chennai, India.
Phone : +91 44-64621656
Mobile: +91 98403-80386
http://www.binarykarma.com

On Thu, 17 Jan 2008, Shuveb Hussain wrote:
On Jan 17, 2008 2:30 AM, Dan Smith <danms@us.ibm.com> wrote:
DL> <network>
DL> <ipaddress>192.168.1.110</ipaddress>
DL> <hostname>browndog</hostname>
DL> <gateway>192.168.1.1</gateway>
DL> <nameserver>192.168.1.1</nameserver>
DL> <netmask>255.255.255.0</netmask>
DL> </network>
This is me showing my complete ignorance here, but can you enforce the hostname inside a container like that? I would expect that to be done by the init scripts inside a container, much like a real machine.
It is most flexible to manage virtual machines from the base machine on which they run. The idea is to run an agent on the base machine alone that is able to configure all virtual machines without the need to have agents inside of each of the virtual machines. When an OpenVZ VPS is being created, the tools actually set /etc/resolv.conf and also
Containers would similarly be managed from the init namespace. [On a 2.6.24 kernel, ensure you have cgroup and controller support included - the CGROUP_* set of config options. Then mount as follows:

mount -t cgroup -o ns none /containers]

If a container is clone()d now, it will automatically show up in /containers as node_<pid> of the child. One can then rename this to the desired name, as well as pass the desired hostname as part of the configuration for the container, wherein its init task/script can set the hostname.

Am I understanding and correlating the above correctly to what you mention wrt OpenVZ?

Vivek
its IP address from its configuration file. In containers, there are no tools that do this (yet). As of now, I too am wondering how the network configuration values in the config are going to be applied to the virtual machine. If it were like a real virtual machine, then these things would get set from the machine's init script itself, like you said.
__
Vivek Kashyap
Linux Technology Center, IBM
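A hedged sketch of the sequence Vivek describes above - the mount point matches his example, but the pid 1234 is made up; a real tool would use the pid returned by clone():

#include <stdio.h>
#include <sys/mount.h>

int main(void)
{
    /* Equivalent of: mount -t cgroup -o ns none /containers */
    if (mount("none", "/containers", "cgroup", 0, "ns") < 0)
        perror("mount cgroup ns");

    /* After a container is clone()d, the kernel creates a node for it
       automatically; renaming it gives the node the container's
       configured name (Container123 from the XML example). */
    if (rename("/containers/node_1234", "/containers/Container123") < 0)
        perror("rename");
    return 0;
}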

On Jan 17, 2008 12:55 PM, Vivek Kashyap <kashyapv@us.ibm.com> wrote:
On Thu, 17 Jan 2008, Shuveb Hussain wrote:
On Jan 17, 2008 2:30 AM, Dan Smith <danms@us.ibm.com> wrote:
DL> <network>
DL> <ipaddress>192.168.1.110</ipaddress>
DL> <hostname>browndog</hostname>
DL> <gateway>192.168.1.1</gateway>
DL> <nameserver>192.168.1.1</nameserver>
DL> <netmask>255.255.255.0</netmask>
DL> </network>
This is me showing my complete ignorance here, but can you enforce the hostname inside a container like that? I would expect that to be done by the init scripts inside a container, much like a real machine.
It is most flexible to manage virtual machines from the base machine on which they run. The idea is to run an agent on the base machine alone that is able to configure all virtual machines without the need to have agents inside of each of the virtual machines. When an OpenVZ VPS is being created, the tools actually set /etc/resolv.conf and also
Containers would similarly be managed from the init namespace.
[On 2.6.24 kernel ensure you have cgroup and controllers support included - CGROUP_* set of config options. Then mount as follows -
mount -t cgroup -o ns none /containers]
If a Container is clone()ed now it will automatically show up in /containers as node_<pid> of the child. One can then rename this to the desired name as well as pass the desired hostname as part of the configuration for the Container wherein its init task/script can set the hostname.
Am I understanding and correlating the above correctly to what you mention wrt OpenVZ?
You are correct. Also, as mentioned before, the container (its init script) is responsible for picking up its networking parameters and setting them up.

Thanks,
--
Shuveb Hussain
B I N A R Y K A R M A
Chennai, India.
Phone : +91 44-64621656
Mobile: +91 98403-80386
http://www.binarykarma.com

On Fri, Dec 28, 2007 at 12:34:14AM -0800, Dave Leskovec wrote:
I'd like to extend libvirt to support Containers. As libvirt already supports Xen, KVM, QEMU and OpenVZ, I think it would be valuable to be able to utilize existing utilities to manage containers.
When you say 'Containers' are you meaning that in the general sense of the word, or specifically referring to the container support that is being incrementally added to the upstream Linux ?
I've spent some time looking through the libvirt api and how this Container support will fit. Based on the XML format section of the libvirt website and some list discussions I put together the following proposed XML format:
<domain type='linuxcontainer'>
  <name>Container123</name>
  <uuid>8dfd44b31e76d8d335150a2d98211ea0</uuid>
  <container>
    <filesystem>
      <mount>/etc = /home/user/lxc_files/etc</mount>
      <mount>/var = /home/user/lxc_files/var</mount>
    </filesystem>
You have structured data in the CDATA section there, which is bad practice. A couple of possible alternatives ...

<mount node="/etc/">/home/user/lxc_files/etc</mount>

Or

<mount>
  <source dir="/home/user/lxc_files/etc"/>
  <target dir="/etc"/>
</mount>
<application>dbserver</application>
Not sure what this is for - could you elaborate a little ?
<network hostname='browndog'>
  <ip address="192.168.1.110" netmask="255.255.255.0">
    <gateway address="192.168.1.1"/>
    <nameserver>192.168.1.1</nameserver>
For consistency with the <gateway> tag that would be better off as being <nameserver address="192.168.1.1"/>
</ip> </network> <cpushare>40</cpushare> <memory>65536</memory>
This can be in the top level - i.e. outside the '<container>' tag - since the <memory> tag is already there for non-container based virtualization and there's no need to differ here.
</container> <devices> <console tty='/dev/pts/4' /> </devices> </domain>
The clone() function is used with the CLONE_NEWPID and CLONE_NEWNS flags to start a new process within its own process name space. The only processes visible to it will be itself and any processes that it spawns. The process that clone() creates will start out preparing the container environment. This involves setting up any network interfaces, setting up the file system by performing any requested mounts, mounting /proc, setting up a tty device, populating /dev as necessary, and performing any other necessary initializations. It will then start the application(s) requested by the user. The executables started within the container could be an application, a script, or possibly /sbin/init.
Rather than trying to define a syntax for libvirt to start multiple apps within the container, I reckon just having libvirt call a single /sbin/init script would be best - that way admins can use whatever they like for startup inside the container, whether it be a simple shell script, SysV init, or something else entirely.
The mounts that the user specifies will need to be populated with the appropriate contents for whatever applications they are going to run within the container. cgroup will be used for isolation and association with controllers for cpu and memory resources.
I'm planning to start in on defining a container. All comments and questions are welcome.
Ok, it definitely sounds like you're talking specifically about the Linux-based container implementation now. Aside from the existing OpenVZ driver, there was also a proof of concept for Linux-VServer based containers posted a short while back, which may be a useful reference: https://www.redhat.com/archives/libvir-list/2007-October/msg00273.html

I can see that all of the OpenVZ, VServer and Container concepts match up pretty nicely, so we ought to be able to get a good XML representation that works with all of them. One major area we've never successfully tackled, even for non-container based virtualization, is the topic of resource controls / tuning info. I'd say that should be one of our priorities to sort out now that we've got all the NUMA stuff integrated.

Dan.
--
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

Daniel,

Thanks for your comments. I'll make the suggested changes and post an updated version.

The application tag is intended to allow the user to specify one or more executables or scripts to run in the container. A container can be extremely simple, running a single executable or script. In these simple cases, the application tag would allow the user to specify one or a few things to run in the container without having to deal with init scripts or configuration files. In more complex setups, the application tag would be used to specify the initialization script or executable of their choosing. Would your suggestion be to limit the user to specifying a single app, and if they want to run multiple apps, they should specify an init script?

I've looked quite a bit at the OpenVZ driver and will take a look at the VServer reference you provided.

Best Regards,
Dave Leskovec
IBM Linux Technology Center
Open Virtualization
participants (7)
- Balbir Singh
- Dan Smith
- Daniel Hokka Zakrisson
- Daniel P. Berrange
- Dave Leskovec
- Shuveb Hussain
- Vivek Kashyap