[libvirt] Fix virGet{User, Group}IDByName when the user/group cannot be found

Hey, I've been having issues recently after upgrading to a newer libvirt on a RHEL6 box. 'virsh start vmname' fails with 'error :internal error Failed to get user record for name '107': No such file or directory ' Looking deeper into this, this comes from 'user not found' errors from getpwnam_r/getgrnam_r not being correctly handled. This patch series attempts to improve that, but I'm not that happy with the seemingly random values of errno being checked there. Christophe

virGetUserIDByName is documented as returning 1 if the username cannot be found. getpwnam_r is documented as returning: « 0 or ENOENT or ESRCH or EBADF or EPERM or ... The given name or uid was not found. » and that: « The formulation given above under "RETURN VALUE" is from POSIX.1-2001. It does not call "not found" an error, hence does not specify what value errno might have in this situation. But that makes it impossible to recognize errors. One might argue that according to POSIX errno should be left unchanged if an entry is not found. Experiments on various UNIX-like systems shows that lots of different values occur in this situation: 0, ENOENT, EBADF, ESRCH, EWOULDBLOCK, EPERM and probably others. » virGetUserIDByName returns an error when the return value of getpwnam_r is non-0. However on my RHEL system, getpwnam_r returns ENOENT when the requested user cannot be found, which then causes virGetUserID not to behave as documented (it returns an error instead of falling back to parsing the passed-in value as an uid). This commit makes virGetUserIDByName ignore the various values listed in getpwnam_r manpage and return a 'user not found' result in such cases. --- src/util/util.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/util/util.c b/src/util/util.c index 2fd0f2c..df1af7e 100644 --- a/src/util/util.c +++ b/src/util/util.c @@ -2530,7 +2530,8 @@ virGetUserIDByName(const char *name, uid_t *uid) } } - if (rc != 0) { + if ((rc != 0) && (rc != ENOENT) && (rc != ESRCH) + && (rc != EBADF) && (rc != EPERM)) { virReportSystemError(rc, _("Failed to get user record for name '%s'"), name); goto cleanup; -- 1.8.0.1

On 2012年12月05日 20:03, Christophe Fergeau wrote:
virGetUserIDByName is documented as returning 1 if the username cannot be found. getpwnam_r is documented as returning: « 0 or ENOENT or ESRCH or EBADF or EPERM or ... The given name or uid was not found. » and that: « The formulation given above under "RETURN VALUE" is from POSIX.1-2001. It does not call "not found" an error, hence does not specify what value errno might have in this situation. But that makes it impossible to recognize errors. One might argue that according to POSIX errno should be left unchanged if an entry is not found. Experiments on various UNIX-like systems shows that lots of different values occur in this situation: 0, ENOENT, EBADF, ESRCH, EWOULDBLOCK, EPERM and probably others. »
virGetUserIDByName returns an error when the return value of getpwnam_r is non-0. However on my RHEL system, getpwnam_r returns ENOENT when the requested user cannot be found, which then causes virGetUserID not to behave as documented (it returns an error instead of falling back to parsing the passed-in value as an uid).
This commit makes virGetUserIDByName ignore the various values listed in getpwnam_r manpage and return a 'user not found' result in such cases. --- src/util/util.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/util/util.c b/src/util/util.c index 2fd0f2c..df1af7e 100644 --- a/src/util/util.c +++ b/src/util/util.c @@ -2530,7 +2530,8 @@ virGetUserIDByName(const char *name, uid_t *uid) } }
- if (rc != 0) { + if ((rc != 0)&& (rc != ENOENT)&& (rc != ESRCH) +&& (rc != EBADF)&& (rc != EPERM)) {
Correct indention should be: if ((rc != 0) && (rc != ENOENT) && (rc != ESRCH) && (rc != EBADF) && (rc != EPERM)) Per the POSIX standard, even these errors are not enough, should we give up using getpwnam_r/getgrnam_r, and create helpers using getpwent/getgrent to do the lookup work ourselves? <...> The getpwent() function returns a pointer to a passwd structure, or NULL if there are no more entries or an error occurs. If an error occurs, errno is set appropriately. If one wants to check errno after the call, it should be set to zero before the call. </...> <...> ERRORS EINTR A signal was caught. EIO I/O error. EMFILE The maximum number (OPEN_MAX) of files was open already in the calling process. ENFILE The maximum number of files was open already in the system. ENOMEM Insufficient memory to allocate passwd structure. ERANGE Insufficient buffer space supplied. </...> I think it's pretty easy to distiguish NOT FOUND and other error then. Regards, Osier

On 12/06/12 10:25, Osier Yang wrote:
On 2012年12月05日 20:03, Christophe Fergeau wrote:
virGetUserIDByName is documented as returning 1 if the username cannot be found. getpwnam_r is documented as returning: « 0 or ENOENT or ESRCH or EBADF or EPERM or ... The given name or uid was not found. » and that: « The formulation given above under "RETURN VALUE" is from POSIX.1-2001. It does not call "not found" an error, hence does not specify what value errno might have in this situation. But that makes it impossible to recognize errors. One might argue that according to POSIX errno should be left unchanged if an entry is not found. Experiments on various UNIX-like systems shows that lots of different values occur in this situation: 0, ENOENT, EBADF, ESRCH, EWOULDBLOCK, EPERM and probably others. »
virGetUserIDByName returns an error when the return value of getpwnam_r is non-0. However on my RHEL system, getpwnam_r returns ENOENT when the requested user cannot be found, which then causes virGetUserID not to behave as documented (it returns an error instead of falling back to parsing the passed-in value as an uid).
This commit makes virGetUserIDByName ignore the various values listed in getpwnam_r manpage and return a 'user not found' result in such cases. --- src/util/util.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/util/util.c b/src/util/util.c index 2fd0f2c..df1af7e 100644 --- a/src/util/util.c +++ b/src/util/util.c @@ -2530,7 +2530,8 @@ virGetUserIDByName(const char *name, uid_t *uid) } }
- if (rc != 0) { + if ((rc != 0)&& (rc != ENOENT)&& (rc != ESRCH) +&& (rc != EBADF)&& (rc != EPERM)) {
Correct indention should be:
if ((rc != 0) && (rc != ENOENT) && (rc != ESRCH) && (rc != EBADF) && (rc != EPERM))
When I was changing the code the first time, I got rid of this as the docs for getpwnam_r seemed clear enough to me that this shouldn't be needed. But I was wrong and I'd rather not reintroduce this ugly and nondeterministic error checking.
Per the POSIX standard, even these errors are not enough, should we give up using getpwnam_r/getgrnam_r, and create helpers using getpwent/getgrent to do the lookup work ourselves?
Hmm, those look fine to me. I didn't do my research properly when changing the code the first time.
<...> The getpwent() function returns a pointer to a passwd structure, or NULL if there are no more entries or an error occurs. If an error occurs, errno is set appropriately. If one wants to check errno after the call, it should be set to zero before the call. </...>
<...> ERRORS EINTR A signal was caught.
EIO I/O error.
EMFILE The maximum number (OPEN_MAX) of files was open already in the calling process.
ENFILE The maximum number of files was open already in the system.
ENOMEM Insufficient memory to allocate passwd structure.
ERANGE Insufficient buffer space supplied. </...>
I think it's pretty easy to distiguish NOT FOUND and other error then.
That's true. I'll look into this and check if it's feasible for us to do in libvirt.
Regards, Osier
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On 12/06/12 10:59, Peter Krempa wrote:
On 12/06/12 10:25, Osier Yang wrote:
On 2012年12月05日 20:03, Christophe Fergeau wrote:
virGetUserIDByName is documented as returning 1 if the username cannot be found. getpwnam_r is documented as returning: « 0 or ENOENT or ESRCH or EBADF or EPERM or ... The given name or uid was not found. » and that: « The formulation given above under "RETURN VALUE" is from POSIX.1-2001. It does not call "not found" an error, hence does not specify what value errno might have in this situation. But that makes it impossible to recognize errors. One might argue that according to POSIX errno should be left unchanged if an entry is not found. Experiments on various UNIX-like systems shows that lots of different values occur in this situation: 0, ENOENT, EBADF, ESRCH, EWOULDBLOCK, EPERM and probably others. »
virGetUserIDByName returns an error when the return value of getpwnam_r is non-0. However on my RHEL system, getpwnam_r returns ENOENT when the requested user cannot be found, which then causes virGetUserID not to behave as documented (it returns an error instead of falling back to parsing the passed-in value as an uid).
This commit makes virGetUserIDByName ignore the various values listed in getpwnam_r manpage and return a 'user not found' result in such cases. --- src/util/util.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/util/util.c b/src/util/util.c index 2fd0f2c..df1af7e 100644 --- a/src/util/util.c +++ b/src/util/util.c @@ -2530,7 +2530,8 @@ virGetUserIDByName(const char *name, uid_t *uid) } }
- if (rc != 0) { + if ((rc != 0)&& (rc != ENOENT)&& (rc != ESRCH) +&& (rc != EBADF)&& (rc != EPERM)) {
Correct indention should be:
if ((rc != 0) && (rc != ENOENT) && (rc != ESRCH) && (rc != EBADF) && (rc != EPERM))
When I was changing the code the first time, I got rid of this as the docs for getpwnam_r seemed clear enough to me that this shouldn't be needed. But I was wrong and I'd rather not reintroduce this ugly and nondeterministic error checking.
Per the POSIX standard, even these errors are not enough, should we give up using getpwnam_r/getgrnam_r, and create helpers using getpwent/getgrent to do the lookup work ourselves?
Hmm, those look fine to me. I didn't do my research properly when changing the code the first time.
<...> The getpwent() function returns a pointer to a passwd structure,
getpwent() isn't reentrant, and so isn't the supposedly reentrant version getpwent_r(): from the manpage: NOTES The function getpwent_r() is not really reentrant since it shares the reading position in the stream with all other threads. I will try to restructure the control flow so that if there's a fallback method (numeric parsing of the input string succeeds) errors to get the entry won't be fatal, but I'd like to avoid whitelisting of the return variables as the definition of them is vague.
or NULL if there are no more entries or an error occurs. If an error occurs, errno is set appropriately. If one wants to check errno after the call, it should be set to zero before the call. </...>
<...> ERRORS EINTR A signal was caught.
EIO I/O error.
EMFILE The maximum number (OPEN_MAX) of files was open already in the calling process.
ENFILE The maximum number of files was open already in the system.
ENOMEM Insufficient memory to allocate passwd structure.
ERANGE Insufficient buffer space supplied. </...>
I think it's pretty easy to distiguish NOT FOUND and other error then.
That's true. I'll look into this and check if it's feasible for us to do in libvirt.
Regards, Osier
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On 12/05/12 13:03, Christophe Fergeau wrote:
virGetUserIDByName is documented as returning 1 if the username cannot be found. getpwnam_r is documented as returning: « 0 or ENOENT or ESRCH or EBADF or EPERM or ... The given name or uid was not found. » and that: « The formulation given above under "RETURN VALUE" is from POSIX.1-2001. It does not call "not found" an error, hence does not specify what value errno might have in this situation. But that makes it impossible to recognize errors. One might argue that according to POSIX errno should be left unchanged if an entry is not found. Experiments on various UNIX-like systems shows that lots of different values occur in this situation: 0, ENOENT, EBADF, ESRCH, EWOULDBLOCK, EPERM and probably others. »
virGetUserIDByName returns an error when the return value of getpwnam_r is non-0. However on my RHEL system, getpwnam_r returns ENOENT when the requested user cannot be found, which then causes virGetUserID not to behave as documented (it returns an error instead of falling back to parsing the passed-in value as an uid).
This commit makes virGetUserIDByName ignore the various values listed in getpwnam_r manpage and return a 'user not found' result in such cases. --- src/util/util.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/util/util.c b/src/util/util.c index 2fd0f2c..df1af7e 100644 --- a/src/util/util.c +++ b/src/util/util.c @@ -2530,7 +2530,8 @@ virGetUserIDByName(const char *name, uid_t *uid) } }
- if (rc != 0) { + if ((rc != 0) && (rc != ENOENT) && (rc != ESRCH) + && (rc != EBADF) && (rc != EPERM)) { virReportSystemError(rc, _("Failed to get user record for name '%s'"), name); goto cleanup;
Hm, this is the most elegant solution to the problem I came across. getpwent_r suggested by Osier isn't reentrant as one would think and other options would require too invasive changes into this code. As we've already had such an error handling previously I'm in favor of taking this change. ACK with formatting fixed as Osier suggested and this comment: /* * From the manpage (terrifying but true): * * ERRORS * 0 or ENOENT or ESRCH or EBADF or EPERM or ... * The given name or uid was not found. */ placed before the call of the getpwnam_r function. (It was there before.) Peter

On Thu, Dec 06, 2012 at 03:07:32PM +0100, Peter Krempa wrote:
On 12/05/12 13:03, Christophe Fergeau wrote:
virGetUserIDByName is documented as returning 1 if the username cannot be found. getpwnam_r is documented as returning: « 0 or ENOENT or ESRCH or EBADF or EPERM or ... The given name or uid was not found. » and that: « The formulation given above under "RETURN VALUE" is from POSIX.1-2001. It does not call "not found" an error, hence does not specify what value errno might have in this situation. But that makes it impossible to recognize errors. One might argue that according to POSIX errno should be left unchanged if an entry is not found. Experiments on various UNIX-like systems shows that lots of different values occur in this situation: 0, ENOENT, EBADF, ESRCH, EWOULDBLOCK, EPERM and probably others. »
virGetUserIDByName returns an error when the return value of getpwnam_r is non-0. However on my RHEL system, getpwnam_r returns ENOENT when the requested user cannot be found, which then causes virGetUserID not to behave as documented (it returns an error instead of falling back to parsing the passed-in value as an uid).
This commit makes virGetUserIDByName ignore the various values listed in getpwnam_r manpage and return a 'user not found' result in such cases. --- src/util/util.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/util/util.c b/src/util/util.c index 2fd0f2c..df1af7e 100644 --- a/src/util/util.c +++ b/src/util/util.c @@ -2530,7 +2530,8 @@ virGetUserIDByName(const char *name, uid_t *uid) } }
- if (rc != 0) { + if ((rc != 0) && (rc != ENOENT) && (rc != ESRCH) + && (rc != EBADF) && (rc != EPERM)) { virReportSystemError(rc, _("Failed to get user record for name '%s'"), name); goto cleanup;
Hm, this is the most elegant solution to the problem I came across. getpwent_r suggested by Osier isn't reentrant as one would think and other options would require too invasive changes into this code.
Another option could be to list the errno values for which we return an error rather than listing the errno values for which we don't want to return an error: if (rc != 0) { if ((rc == EINTR) || (rc == EIO) || (rc == EMFILE) || (rc == ENFILE) || (rc == ENOMEM)) { virReportSystemError(rc, _("Failed to get group record for name %s'"), name); goto cleanup; } }
ACK with formatting fixed as Osier suggested and this comment: /* * From the manpage (terrifying but true): * * ERRORS * 0 or ENOENT or ESRCH or EBADF or EPERM or ... * The given name or uid was not found. */
placed before the call of the getpwnam_r function. (It was there before.)
I'll also add an errno = 0; before the call as mentioned by Osier. I'll send a v2 with these changes once you tell me if you prefer the old filtering or the one suggested in this email. Thanks, Christophe

On 12/06/12 15:16, Christophe Fergeau wrote:
On Thu, Dec 06, 2012 at 03:07:32PM +0100, Peter Krempa wrote:
On 12/05/12 13:03, Christophe Fergeau wrote:
virGetUserIDByName is documented as returning 1 if the username cannot be found. getpwnam_r is documented as returning: « 0 or ENOENT or ESRCH or EBADF or EPERM or ... The given name or uid was not found. » and that: « The formulation given above under "RETURN VALUE" is from POSIX.1-2001. It does not call "not found" an error, hence does not specify what value errno might have in this situation. But that makes it impossible to recognize errors. One might argue that according to POSIX errno should be left unchanged if an entry is not found. Experiments on various UNIX-like systems shows that lots of different values occur in this situation: 0, ENOENT, EBADF, ESRCH, EWOULDBLOCK, EPERM and probably others. »
virGetUserIDByName returns an error when the return value of getpwnam_r is non-0. However on my RHEL system, getpwnam_r returns ENOENT when the requested user cannot be found, which then causes virGetUserID not to behave as documented (it returns an error instead of falling back to parsing the passed-in value as an uid).
This commit makes virGetUserIDByName ignore the various values listed in getpwnam_r manpage and return a 'user not found' result in such cases. --- src/util/util.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/util/util.c b/src/util/util.c index 2fd0f2c..df1af7e 100644 --- a/src/util/util.c +++ b/src/util/util.c @@ -2530,7 +2530,8 @@ virGetUserIDByName(const char *name, uid_t *uid) } }
- if (rc != 0) { + if ((rc != 0) && (rc != ENOENT) && (rc != ESRCH) + && (rc != EBADF) && (rc != EPERM)) { virReportSystemError(rc, _("Failed to get user record for name '%s'"), name); goto cleanup;
Hm, this is the most elegant solution to the problem I came across. getpwent_r suggested by Osier isn't reentrant as one would think and other options would require too invasive changes into this code.
Another option could be to list the errno values for which we return an error rather than listing the errno values for which we don't want to return an error:
Hm, maybe that would be better. Those other errors seem to be specified more deterministic as those nasty tree points at the end of the "not found" case.
if (rc != 0) { if ((rc == EINTR) || (rc == EIO) || (rc == EMFILE) || (rc == ENFILE) || (rc == ENOMEM)) { virReportSystemError(rc, _("Failed to get group record for name %s'"), name); goto cleanup; } }
ACK with formatting fixed as Osier suggested and this comment: /* * From the manpage (terrifying but true): * * ERRORS * 0 or ENOENT or ESRCH or EBADF or EPERM or ... * The given name or uid was not found. */
placed before the call of the getpwnam_r function. (It was there before.)
I'll also add an errno = 0; before the call as mentioned by Osier. I'll
resetting of errno isn't needed, getpwnam_r doesn't use it
send a v2 with these changes once you tell me if you prefer the old filtering or the one suggested in this email. Thanks,
Christophe
Peter
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On 2012年12月06日 22:16, Christophe Fergeau wrote:
On Thu, Dec 06, 2012 at 03:07:32PM +0100, Peter Krempa wrote:
On 12/05/12 13:03, Christophe Fergeau wrote:
virGetUserIDByName is documented as returning 1 if the username cannot be found. getpwnam_r is documented as returning: « 0 or ENOENT or ESRCH or EBADF or EPERM or ... The given name or uid was not found. » and that: « The formulation given above under "RETURN VALUE" is from POSIX.1-2001. It does not call "not found" an error, hence does not specify what value errno might have in this situation. But that makes it impossible to recognize errors. One might argue that according to POSIX errno should be left unchanged if an entry is not found. Experiments on various UNIX-like systems shows that lots of different values occur in this situation: 0, ENOENT, EBADF, ESRCH, EWOULDBLOCK, EPERM and probably others. »
virGetUserIDByName returns an error when the return value of getpwnam_r is non-0. However on my RHEL system, getpwnam_r returns ENOENT when the requested user cannot be found, which then causes virGetUserID not to behave as documented (it returns an error instead of falling back to parsing the passed-in value as an uid).
This commit makes virGetUserIDByName ignore the various values listed in getpwnam_r manpage and return a 'user not found' result in such cases. --- src/util/util.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/util/util.c b/src/util/util.c index 2fd0f2c..df1af7e 100644 --- a/src/util/util.c +++ b/src/util/util.c @@ -2530,7 +2530,8 @@ virGetUserIDByName(const char *name, uid_t *uid) } }
- if (rc != 0) { + if ((rc != 0)&& (rc != ENOENT)&& (rc != ESRCH) +&& (rc != EBADF)&& (rc != EPERM)) { virReportSystemError(rc, _("Failed to get user record for name '%s'"), name); goto cleanup;
Hm, this is the most elegant solution to the problem I came across. getpwent_r suggested by Osier isn't reentrant as one would think and other options would require too invasive changes into this code.
Hmm, okay,
Another option could be to list the errno values for which we return an error rather than listing the errno values for which we don't want to return an error:
if (rc != 0) { if ((rc == EINTR) || (rc == EIO) || (rc == EMFILE) || (rc == ENFILE) || (rc == ENOMEM)) { virReportSystemError(rc, _("Failed to get group record for name %s'"), name); goto cleanup; } }
How about make the logic a bit more clear like: if (!pw) { if ((rc == 0) || (rc == EINTR) || (rc == EIO) || (rc == EMFILE) || (rc == ENFILE) || (rc == ENOMEM)) { VIR_DEBUG("User record for user '%s' does not exist", name); ret = 1; goto cleanup; } else { virReportSystemError(rc, _("Failed to get user record for name '%s'"), name); goto cleanup; } } *uid = pw->pw_uid; ret = 0;
ACK with formatting fixed as Osier suggested and this comment: /* * From the manpage (terrifying but true): * * ERRORS * 0 or ENOENT or ESRCH or EBADF or EPERM or ... * The given name or uid was not found. */
This will be nice. Osier

On Fri, Dec 07, 2012 at 01:40:20AM +0800, Osier Yang wrote:
How about make the logic a bit more clear like:
if (!pw) { if ((rc == 0) || (rc == EINTR) || (rc == EIO) || (rc == EMFILE) || (rc == ENFILE) || (rc == ENOMEM)) {
This does not look right, we want to run the code from the 'else' block when rc is EINTR, EIO, EMFILE, ENFILE or ENOMEM, ie something like if (!pw) { if (rc == 0 || (rc != EINTR && rc != EIO && rc != EMFILE && rc != ENFILE && rc != ENOMEM)) { VIR_DEBUG("User record for user '%s' does not exist", name); ret = 1; goto cleanup; } else { virReportSystemError(rc, _("Failed to get user record for name '%s'"), name); goto cleanup; } } } *uid = pw->pw_uid; ret = 0; I'm not sure this is much clearer than the initial version, though reworking that test would be a good opportunity to remove the extra () that Eric mentioned, so feel free to change it. Christophe

On 12/07/2012 02:02 AM, Christophe Fergeau wrote:
On Fri, Dec 07, 2012 at 01:40:20AM +0800, Osier Yang wrote:
How about make the logic a bit more clear like:
if (!pw) { if ((rc == 0) || (rc == EINTR) || (rc == EIO) || (rc == EMFILE) || (rc == ENFILE) || (rc == ENOMEM)) {
This does not look right, we want to run the code from the 'else' block when rc is EINTR, EIO, EMFILE, ENFILE or ENOMEM, ie something like
if (!pw) { if (rc == 0 ||
If successfully found nothing...
(rc != EINTR && rc != EIO && rc != EMFILE && rc != ENFILE && rc != ENOMEM)) {
or we failed, but the error is not any of the failures that are guaranteed to be transient and therefore unable to inform us of whether we found nothing...
VIR_DEBUG("User record for user '%s' does not exist", name); ret = 1;
then report that we found nothing
goto cleanup; } else { virReportSystemError(rc, _("Failed to get user record for name '%s'"), name); goto cleanup;
otherwise log the error (which might be one of the remaining errors that IS feasibly returned when there is logically nothing, like ENOENT). Yuck. This isn't right either.
} } } *uid = pw->pw_uid; ret = 0;
I'm not sure this is much clearer than the initial version, though reworking that test would be a good opportunity to remove the extra () that Eric mentioned, so feel free to change it.
I would MUCH rather see code that says: if we successfully got no entry, or if we got one of the errors known to indicate a permanent error in being able to resolve names (that is, ENOENT, ESRCH, EBADF, EPERM), then immediately return that we found no entry. If we got a different error that is transient (EINTR, EIO, EMFILE, ENFILE, ENOMEM), then we can't prove that the name will not resolve, and we should return failure. If we get an error that is not documented in the Linux man pages, then we should likewise return failure. That is, I'd much rather see: if (!pw) { if (rc == 0 || rc == ENOENT || rc == ESRCH || rc == EBADF || rc == EPERM) { /* Provably not found - either the function successfully * told us nothing resolves, or this is an indication of a * permanent failure to do any name resolution and future * attempts will likewise fail */ rc = 1; } else { /* Other error; documented possibilities include ENOMEM * which is transient in nature, so we can't prove that * the user does not exist, and are best off failing */ virReportSystemError(); } goto cleanup; } -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 12/07/12 17:34, Eric Blake wrote:
On 12/07/2012 02:02 AM, Christophe Fergeau wrote:
On Fri, Dec 07, 2012 at 01:40:20AM +0800, Osier Yang wrote:
How about make the logic a bit more clear like:
if (!pw) { if ((rc == 0) || (rc == EINTR) || (rc == EIO) || (rc == EMFILE) || (rc == ENFILE) || (rc == ENOMEM)) {
This does not look right, we want to run the code from the 'else' block when rc is EINTR, EIO, EMFILE, ENFILE or ENOMEM, ie something like
if (!pw) { if (rc == 0 ||
If successfully found nothing...
(rc != EINTR && rc != EIO && rc != EMFILE && rc != ENFILE && rc != ENOMEM)) {
or we failed, but the error is not any of the failures that are guaranteed to be transient and therefore unable to inform us of whether we found nothing...
VIR_DEBUG("User record for user '%s' does not exist", name); ret = 1;
then report that we found nothing
goto cleanup; } else { virReportSystemError(rc, _("Failed to get user record for name '%s'"), name); goto cleanup;
otherwise log the error (which might be one of the remaining errors that IS feasibly returned when there is logically nothing, like ENOENT).
Yuck. This isn't right either.
} } } *uid = pw->pw_uid; ret = 0;
I'm not sure this is much clearer than the initial version, though reworking that test would be a good opportunity to remove the extra () that Eric mentioned, so feel free to change it.
Hmm so much hassle about working around a defective-by-design function ....
I would MUCH rather see code that says:
if we successfully got no entry, or if we got one of the errors known to indicate a permanent error in being able to resolve names (that is, ENOENT, ESRCH, EBADF, EPERM), then immediately return that we found no
Okay, this is reasonable, but I wouldn't call the error permanent. I think the variety of return codes originates from the multiple backends where the user entry can come from and the possible failures they can return. For example let's have user entries in LDAP, when the server is inaccesible, we may get a "reliable" not found until the server starts to respond.
entry. If we got a different error that is transient (EINTR, EIO, EMFILE, ENFILE, ENOMEM), then we can't prove that the name will not resolve, and we should return failure. If we get an error that is not documented in the Linux man pages, then we should likewise return failure.
Those three dots in the manpage are tricky. I think they exist as a backdoor to allow returning all different errnos that might happen while trying to reach the user database. The more I think about this, the more I consider if it wouldn't be best to remove error reporting of such cases and just treat all possibilities if "pw" is NULL to be "user not found" cases. And if the username string isn't numeric, the function that parses the number will error out too. This covers all of the possible values of "..." and doesn't really lower the user experience.
That is, I'd much rather see:
if (!pw) { if (rc == 0 || rc == ENOENT || rc == ESRCH || rc == EBADF || rc == EPERM) { /* Provably not found - either the function successfully * told us nothing resolves, or this is an indication of a * permanent failure to do any name resolution and future * attempts will likewise fail */ rc = 1; } else { /* Other error; documented possibilities include ENOMEM * which is transient in nature, so we can't prove that * the user does not exist, and are best off failing */ virReportSystemError(); } goto cleanup; }
Okay, this looks better, but I don't think we should bother with reporting errors from getpwnam_r as it's design sucks and there's no reasonable substitute for that. Peter

virGetGroupIDByName is documented as returning 1 if the groupname cannot be found. getgrnam_r is documented as returning: « 0 or ENOENT or ESRCH or EBADF or EPERM or ... The given name or gid was not found. » and that: « The formulation given above under "RETURN VALUE" is from POSIX.1-2001. It does not call "not found" an error, hence does not specify what value errno might have in this situation. But that makes it impossible to recognize errors. One might argue that according to POSIX errno should be left unchanged if an entry is not found. Experiments on various UNIX-like systems shows that lots of different values occur in this situation: 0, ENOENT, EBADF, ESRCH, EWOULDBLOCK, EPERM and probably others. » virGetGroupIDByName returns an error when the return value of getgrnam_r is non-0. However on my RHEL system, getgrnam_r returns ENOENT when the requested user cannot be found, which then causes virGetGroupID not to behave as documented (it returns an error instead of falling back to parsing the passed-in value as an gid). This commit makes virGetGroupIDByName ignore the various values listed in getgrnam_r manpage and return a 'group not found' result in such cases. --- src/util/util.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/util/util.c b/src/util/util.c index df1af7e..e9474dd 100644 --- a/src/util/util.c +++ b/src/util/util.c @@ -2613,7 +2613,8 @@ virGetGroupIDByName(const char *name, gid_t *gid) } } - if (rc != 0) { + if ((rc != 0) && (rc != ENOENT) && (rc != ESRCH) + && (rc != EBADF) && (rc != EPERM)) { virReportSystemError(rc, _("Failed to get group record for name '%s'"), name); goto cleanup; -- 1.8.0.1

On 12/05/12 13:03, Christophe Fergeau wrote:
virGetGroupIDByName is documented as returning 1 if the groupname cannot be found. getgrnam_r is documented as returning: « 0 or ENOENT or ESRCH or EBADF or EPERM or ... The given name or gid was not found. » and that: « The formulation given above under "RETURN VALUE" is from POSIX.1-2001. It does not call "not found" an error, hence does not specify what value errno might have in this situation. But that makes it impossible to recognize errors. One might argue that according to POSIX errno should be left unchanged if an entry is not found. Experiments on various UNIX-like systems shows that lots of different values occur in this situation: 0, ENOENT, EBADF, ESRCH, EWOULDBLOCK, EPERM and probably others. »
virGetGroupIDByName returns an error when the return value of getgrnam_r is non-0. However on my RHEL system, getgrnam_r returns ENOENT when the requested user cannot be found, which then causes virGetGroupID not to behave as documented (it returns an error instead of falling back to parsing the passed-in value as an gid).
This commit makes virGetGroupIDByName ignore the various values listed in getgrnam_r manpage and return a 'group not found' result in such cases. --- src/util/util.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/util/util.c b/src/util/util.c index df1af7e..e9474dd 100644 --- a/src/util/util.c +++ b/src/util/util.c @@ -2613,7 +2613,8 @@ virGetGroupIDByName(const char *name, gid_t *gid) } }
- if (rc != 0) { + if ((rc != 0) && (rc != ENOENT) && (rc != ESRCH) + && (rc != EBADF) && (rc != EPERM)) { virReportSystemError(rc, _("Failed to get group record for name '%s'"), name); goto cleanup;
ACK with same conditions as in my review of 1/2 in this series. Peter
participants (4)
-
Christophe Fergeau
-
Eric Blake
-
Osier Yang
-
Peter Krempa