On Mon, May 30, 2016 at 09:31:00 +0200, Michal Privoznik wrote:
The apibuild script is a terrifying beast that parses some source
files of ours and produces an XML representation of them. When it
comes to parsing enums we have in some header files, it tries to
be clever and detect a value that an enum member has (or if it is
an alias for a different member). Whilst doing that it has to
deal with values we give to the members in many formats. At some
places we just pass the value in decimal:
VIR_DOMAIN_BLOCK_JOB_TYPE_PULL = 1,
in other places, we use the aliasing:
VIR_CONNECT_GET_ALL_DOMAINS_STATS_ACTIVE = VIR_CONNECT_LIST_DOMAINS_ACTIVE,
and in other places bitwise shifts are used:
VIR_CONNECT_GET_ALL_DOMAINS_STATS_ENFORCE_STATS = 1 << 31, /* enforce requested
stats */
The script tries to parse all of these resulting in the following
tokens: "1", "VIR_CONNECT_LIST_DOMAINS_ACTIVE",
"1<<31"; Then, the
script tries to turn these into integers using python's eval()
function. This function succeeds on the first and the last
tokens. But, if we were to modify the last example so that it's
of the following form:
VIR_CONNECT_GET_ALL_DOMAINS_STATS_ENFORCE_STATS = 1U << 31, /* enforce
requested stats */
the token representing enum's member value will then be "1U<<31".
So our parsing is good. Unfortunately, python is not aware of the
difference between signed and unsigned C types, therefore eval()
fails over this token and the parser falls back thinking it's an
alias to another enum member. Well it's not.
The solution is to transform [0-9]U into [0-9] as for our
purposes here it's the same thing.
I was worried that it might mangle aliases for other enums containing
numbers followed by a capital U but I didn't manage to break it.
ACK