On Wed, Mar 22, 2017 at 21:40:04 -0400, John Ferlan wrote:
On 03/22/2017 12:26 PM, Jiri Denemark wrote:
> On Fri, Mar 17, 2017 at 14:39:00 -0400, John Ferlan wrote:
...
>> Since incoming migrations that don't reach the Finish
stage will be
>> killed in qemuProcessRecoverMigrationIn and the only purpose at that
>> point would be to free memory, it's not necessary to set up any sort
>> of recovery. Additionally, subsequent migrations will check if the
>> migration parameters are set and adjust them appropriately if for
>> some reason libvirtd restarts after setting the Finish marker, but
>> before actually resetting the environment.
>
> Sure, migrations will do that, but what about snapshots, saving a domain
> to a file and similar stuff which internally uses migration too. Do we
> properly reset the parameters for them too? I think we should delete the
> objects and reset migration parameters in qemuProcessRecoverMigrationIn
> anyway.
Why would someone use TLS for snapshots, saving domain to a fail, and
other similar stuff? What am I missing? I don't mind adding the extra
call. If the TLS parameters *only* get adjusted by libvirt during
PrepareAny and Run, but both those phases cause reset of the migration
back to the original source on failure *and* Cancel does the reset -
then I'm missing the connection.
If libvirtd restarts after entering QEMU_MIGRATION_PHASE_FINISH3 but
before completely finishing the migration we need to decide whether to
keep the domain running. The decision is based on the domain state, we kill
the domain unless its state is already VIR_DOMAIN_RUNNING. But the TLS
parameters may still be set at that point. If we don't reset them when
libvirtd starts again they will still be set when someone tries to do a
snapshot (or similar thing which uses "migrate" QMP command). Thus QEMU
will try to use TLS, which will obviously fail. That's why we can't rely
on just PrepareAny/Run to clear them, we need to make sure they are
always cleared when migration finishes.
...
>> @@ -246,47 +283,18 @@ struct _qemuDomainObjPrivate {
>>
>> /* note whether memory device alias does not correspond to slot number */
>> bool memAliasOrderMismatch;
>> -};
>>
>> -# define QEMU_DOMAIN_PRIVATE(vm) \
>> - ((qemuDomainObjPrivatePtr) (vm)->privateData)
>> + /* for migrations using TLS with a secret (not to be saved in our */
>> + /* private XML). */
>> + qemuDomainSecretInfoPtr migSecinfo;
>>
>> -/* Type of domain secret */
>> -typedef enum {
>> - VIR_DOMAIN_SECRET_INFO_TYPE_PLAIN = 0,
>> - VIR_DOMAIN_SECRET_INFO_TYPE_AES, /* utilize GNUTLS_CIPHER_AES_256_CBC */
>> -
>> - VIR_DOMAIN_SECRET_INFO_TYPE_LAST
>> -} qemuDomainSecretInfoType;
>> -
>> -typedef struct _qemuDomainSecretPlain qemuDomainSecretPlain;
>> -typedef struct _qemuDomainSecretPlain *qemuDomainSecretPlainPtr;
>> -struct _qemuDomainSecretPlain {
>> - char *username;
>> - uint8_t *secret;
>> - size_t secretlen;
>> + /* Used when fetching/storing the current 'tls-creds' migration
setting */
>> + /* (not to be saved in our private XML). */
>> + char *migTLSAlias;
>
> I'm not quite sure why we need to store the original value.
>
It determines whether or not TLS is possible... It can ensure that when
not using TLS for migration that we set the params to "" if they aren't
already set that way.
Which is in fact only needed because the check is done in the Begin
phase while the parameters are set in the Perform phase. It could have
been just a bool flag, though. Not a big deal.
...
>> +/* qemuMigrationCheckTLSCreds
>> + * @driver: pointer to qemu driver
>> + * @vm: domain object
>> + * @asyncJob: migration job to join
>> + *
>> + * Query the migration parameters looking for the 'tls-creds'
parameter.
>> + * The parameter was initially supported in QEMU 2.7; however, there was
>> + * no mechanism provided to clear the parameter. For QEMU 2.9, a change
>> + * was made to allow setting the parameter to an empty string in order
>> + * to clear. An additional change was made to initialize the parameter
>> + * to the empty string. Although still not perfect since it's possible
>> + * that a pre-2.9 release set the string to something and we should not
>
> How would this happen? QEMU won't magically set it to something by
> itself. And we don't need to care about someone messing up with the
> monitor or command line manually.
Yes it does - 2.9-rc0 initializes it to "" while it was NULL beforehand.
See QEMU commit '4af245dc3'.
We don't care about TLS migration with QEMU older than 2.9 since
tls-creds won't be listed in the reply from query-migrate-parameters.
All QEMU versions between 2.7 and 2.9 support tls-cred but the default
value disables the use of TLS. That's why any comment mentioning this in
the code is irrelevant. We just need to check whether tls-creds is
there and either set it to something or "" depending on VIR_MIGRATE_TLS.
> Anyway, I don't see a real value in repeating the story over
and over.
> We just ask for the default parameter values and if TLS parameters are
> not there, they are not supported from our point of view. That's pretty
> clear and version agnostic.
Well I always find it difficult to stumble upon code and wonder why
something is being done a particular way. So if I over state what
something does, is it really a bad thing?
It's fine to describe what we do for QEMU >= 2.9 which reports
tls-creds. We don't do anything if tls-creds is not reported so why
should we confuse people reading the code with details about QEMU
versions which support TLS but don't advertise tls-creds? If you don't
want to keep this info just in the commit message, I think the
documentation of this (qemuMigrationCheckTLSCreds) function is the right
place to keep it. But repeating it elsewhere is not very useful.
...
>> +/* qemuMigrationSetEmptyTLSParams
>> + * @priv: Pointer to private domain data
>> + * @migParams: Pointer to a migration parameters block
>> + *
>> + * If the qemuMigrationCheckTLSCreds query finds a non empty alias and it
>> + * is set to the alias that libvirt set, then we need to set the migration
>> + * parameters to "" in order to force clearing the TLS values from
our
>> + * previous migration that may not have been cleared properly if libvirtd
>> + * restarted during the finish phase before the ResetTLSParams was run.
>> + *
>> + * Returns 0 on success, -1 on failure
>> + */
>> +static int
>> +qemuMigrationSetEmptyTLSParams(qemuDomainObjPrivatePtr priv,
>> + qemuMonitorMigrationParamsPtr migParams)
>> +{
>> + char *tlsAlias = NULL;
>> +
>> + if (priv->migTLSAlias) {
>> + if (*priv->migTLSAlias == '\0')
>> + return 0;
>> +
>> + if (!(tlsAlias =
>> + qemuAliasTLSObjFromSrcAlias(QEMU_MIGRATION_TLS_ALIAS_BASE)))
>> + return -1;
>> +
>> + if (STRNEQ(priv->migTLSAlias, tlsAlias)) {
>> + VIR_FREE(tlsAlias);
>> + return 0;
>
> Why? We should just always reset the parameters if VIR_MIGRATE_TLS is
> not set. No matter what was the previous value.
>
Perhaps I'm over thinking... I guess I was going down the path of
clearing them when we didn't set them, but then again - I don't change
the "migTLSAlias" so sure I can remove the above 3 checks.
The checks make sure we don't reset the parameters if we didn't set
them. Which is not OK, we need to always reset them to make sure TLS is
not used if VIR_MIGRATE_TLS is not set.
...
>> @@ -3829,6 +4137,32 @@
qemuMigrationPrepareAny(virQEMUDriverPtr driver,
>> compression, &migParams) < 0)
>> goto stopjob;
>>
>> + /* Migrations using TLS need to add the "tls-creds-x509" object
and
>> + * set the migration TLS parameters */
>> + if (flags & VIR_MIGRATE_TLS) {
>> + cfg = virQEMUDriverGetConfig(driver);
>> + if (qemuMigrationCheckSetupTLS(dconn, driver, cfg, vm,
>> + QEMU_ASYNC_JOB_MIGRATION_IN) < 0)
>> + goto stopjob;
>> +
>> + if (qemuMigrationAddTLSObjects(driver, vm, cfg, true,
>> + QEMU_ASYNC_JOB_MIGRATION_IN,
>> + &tlsAlias, &secAlias,
&migParams) < 0)
>> + goto stopjob;
>> +
>> + /* Force reset of 'tls-hostname', just in case */
>
> The "just in case" part is confusing. What about replacing it with
> something about tls-hostname making no sense on destination?
>
The destination cannot have tls-hostname set - it's a source parameter.
If the target was a source and for some reason the parameter was set,
then just in case we'll be sure to clear it.
Exactly. That's why something like "..., it's a source-only parameter"
would be better than "..., just in case".
Jirka