On Thu, Mar 26, 2020 at 18:19:20 +0000, Daniel Berrange wrote:
On Wed, Mar 25, 2020 at 03:11:40PM +0800, Shi Lei wrote:
[...]
IOW if we go down the route of generating everything from the RNG schema,
we're quite likely to need to do work to make the RNG schema more accurate.
We'll probably also need to change a fair number of the structs we use. The
domain XML schema is the one most affected by this. All the other XML
schemas are fairly simple in general. None of this is a bad thing, but it
is potentially a lot of work.
What I'm afraid of is making the internal structs closer to the RNG
schema in many cases. I spent some time sanitizing the storage of data
in the internal structs which don't necessarily match the RNG schema.
Making them look more like the schema would be a pretty big regression
in understandability of the code and I'd like to avoid that at all
costs. Specifically since we can't change the schema.
Most notable is the ultra-messy NUMA topology definition, where some
things are split across multiple not very appropriate elements and
fixing that will be unpleasant.
Specifically NUMA topology is a child of <cpu>, properties of numa nodes
are under <numatune> etc ... These must not be changed otherwise it will
be a mess.
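
To make the NUMA split concrete, here is a minimal sketch of the kind of
merging a parser has to do. It is written in Python only for brevity and is
not libvirt code: the NumaNode record and parse_numa() are hypothetical
names, and the XML fragment is a trimmed-down illustration. The point is
that one internal per-node record is filled from two unrelated places in
the document.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Dict, Optional

# Trimmed-down illustrative domain XML: the cells live under <cpu>,
# while the per-node memory policy lives under <numatune>.
DOMAIN_XML = """
<domain type='kvm'>
  <cpu>
    <numa>
      <cell id='0' cpus='0-1' memory='1048576' unit='KiB'/>
      <cell id='1' cpus='2-3' memory='1048576' unit='KiB'/>
    </numa>
  </cpu>
  <numatune>
    <memnode cellid='1' mode='strict' nodeset='0'/>
  </numatune>
</domain>
"""


@dataclass
class NumaNode:
    """Hypothetical internal record: everything about one node in one place."""
    cpus: str
    memory_kib: int
    mem_mode: Optional[str] = None
    mem_nodeset: Optional[str] = None


def parse_numa(xml_text: str) -> Dict[int, NumaNode]:
    root = ET.fromstring(xml_text)
    nodes: Dict[int, NumaNode] = {}
    # First pass: topology from <cpu><numa><cell>.
    for cell in root.findall("./cpu/numa/cell"):
        nodes[int(cell.get("id"))] = NumaNode(
            cpus=cell.get("cpus"),
            memory_kib=int(cell.get("memory")),
        )
    # Second pass: per-node policy from <numatune><memnode>, keyed by cellid.
    for memnode in root.findall("./numatune/memnode"):
        node = nodes[int(memnode.get("cellid"))]
        node.mem_mode = memnode.get("mode")
        node.mem_nodeset = memnode.get("nodeset")
    return nodes
```

A struct generated straight from the schema would keep these as two
separate trees, and every user of the data would have to do the join
itself.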
One other thing to worry about is that schema validation is not
mandatory. Coupled with the guarantee of backwards compatibility, we must
preserve the quirks of the parser, including parsing of stuff which is
not in the schema and this will be extremely hard to preserve.
The other extreme is when the schema is too generic and there's specific
checking in the parser. We've got plenty of that too.
As you've pointed out in terms of the disk schema, there's plenty of the
above going on, including underspecified (we were accepting seclabels
even for disks which don't use them, it came in handy once) and
overspecified schema, and also plenty of non-conformant storage.
Specifically the <disk> element contains the type, which actually
belongs to the storage source. The storage source also has
'backingStore' but in the XML it's not a child but a sibling. Changing
this to the format in the schema would make the code messier and less
maintainable than it is now.
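
A rough sketch of that mismatch, again in Python purely for illustration
and not taken from libvirt: StorageSource and source_from() are
hypothetical names, and the XML is a simplified disk definition. In the
XML the 'type' attribute sits on the holder element (<disk> or
<backingStore>) and the next <backingStore> is a sibling of <source>; in
the internal form both belong to the source itself.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional

# Simplified illustrative disk XML. Note that 'type' is on <disk> and
# <backingStore>, and <backingStore> is a sibling of <source>.
DISK_XML = """
<disk type='file' device='disk'>
  <source file='/images/top.qcow2'/>
  <backingStore type='file'>
    <source file='/images/base.qcow2'/>
    <backingStore/>
  </backingStore>
  <target dev='vda' bus='virtio'/>
</disk>
"""


@dataclass
class StorageSource:
    """Hypothetical internal form: the source owns its type and its backing."""
    type: str
    path: str
    backing: Optional["StorageSource"] = None


def source_from(holder: ET.Element) -> Optional[StorageSource]:
    """Build a StorageSource from a <disk> or <backingStore> element."""
    src = holder.find("source")  # direct child only
    if src is None:
        # An empty <backingStore/> terminates the chain.
        return None
    backing_el = holder.find("backingStore")
    backing = source_from(backing_el) if backing_el is not None else None
    return StorageSource(
        type=holder.get("type"),  # taken from the holder, not from <source>
        path=src.get("file"),
        backing=backing,
    )
```

With this shape, code walking the backing chain only ever follows
StorageSource.backing and never needs to know where in the XML each piece
came from.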
In this regard, the best thing that we could do is to generate the
parser and then hand-write transformation from the XML schema into what
we use internally, but that is not really better than the current state.