[PATCH 0/4] Discourage use of some GLib functions

See 4/4 for the list and explanation. Michal Prívozník (4): qemu_shim: Replace g_file_get_contents() with virFileReadAll() virutil: Do not use g_get_host_name() to obtain hostname syntax-check: Update list of gethostname exceptions docs: Discourage use of some glib functions build-aux/syntax-check.mk | 2 +- docs/glib-adoption.rst | 17 +++++++++++++++++ src/qemu/qemu_shim.c | 18 +++++++++++------- src/util/virutil.c | 12 +++++++++--- 4 files changed, 38 insertions(+), 11 deletions(-) -- 2.26.2

The qemu_shim (compiled into virt-qemu-run-binary) reads several files provided by user (XML definition of secret, value of the secret, XML definition of domain) and it does so using g_file_get_contents(). This is potentially dangerous, because there is no limit on the size of files/buffers. Since this is a standalone binary it's not critical as it can't cause libvirtd to be OOM killed, but it's still worth fixing as I am planning on discouraging people from using the GLib function. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_shim.c | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-) diff --git a/src/qemu/qemu_shim.c b/src/qemu/qemu_shim.c index c10598df4b..1b0c41a9d2 100644 --- a/src/qemu/qemu_shim.c +++ b/src/qemu/qemu_shim.c @@ -31,6 +31,10 @@ #define VIR_FROM_THIS VIR_FROM_QEMU +/* Below, several files are read into buffers. This defines the + * maximum possible size for such buffer. 10MiB should be enough. */ +#define MAX_FILE_SIZE (10 * 1024 * 1024) + static GMutex eventLock; static bool eventPreventQuitFlag; static bool eventQuitFlag; @@ -264,7 +268,7 @@ int main(int argc, char **argv) g_autofree char *sxml = NULL; g_autofree char *value = NULL; virSecretPtr sec; - size_t nvalue; + int nvalue; if (!bits || bits[0] == NULL || bits[1] == NULL) { g_printerr("%s: expected a pair of filenames for --secret argument\n", @@ -276,15 +280,15 @@ int main(int argc, char **argv) g_printerr("%s: %lld: loading secret %s and %s\n", argv[0], deltams(), bits[0], bits[1]); - if (!g_file_get_contents(bits[0], &sxml, NULL, &error)) { + if (virFileReadAll(bits[0], MAX_FILE_SIZE, &sxml) < 0) { g_printerr("%s: cannot read secret XML %s: %s\n", - argv[0], bits[0], error->message); + argv[0], bits[0], virGetLastErrorMessage()); goto cleanup; } - if (!g_file_get_contents(bits[1], &value, &nvalue, &error)) { + if ((nvalue = virFileReadAll(bits[1], MAX_FILE_SIZE, &value)) < 0) { g_printerr("%s: cannot read secret value %s: %s\n", - argv[0], bits[1], error->message); + argv[0], bits[1], virGetLastErrorMessage()); goto cleanup; } @@ -332,9 +336,9 @@ int main(int argc, char **argv) g_printerr("%s: %lld: fetching guest config %s\n", argv[0], deltams(), argv[1]); - if (!g_file_get_contents(argv[1], &xml, NULL, &error)) { + if (virFileReadAll(argv[1], MAX_FILE_SIZE, &xml) < 0) { g_printerr("%s: cannot read %s: %s\n", - argv[0], argv[1], error->message); + argv[0], argv[1], virGetLastErrorMessage()); goto cleanup; } -- 2.26.2

On Tue, Mar 09, 2021 at 03:26:22PM +0100, Michal Privoznik wrote:
The qemu_shim (compiled into virt-qemu-run-binary) reads several files provided by user (XML definition of secret, value of the secret, XML definition of domain) and it does so using g_file_get_contents(). This is potentially dangerous, because there is no limit on the size of files/buffers.
Since this is a standalone binary it's not critical as it can't cause libvirtd to be OOM killed, but it's still worth fixing as I am planning on discouraging people from using the GLib function.
I'm not convinced actually. I think that almost all of the places where we use virFileReadAll with a limit are in fact safe to use with g_file_get_contents. Almost all the files are from trusted sources - files exposed by the kernel, or files only writable by root, or by the same user as libvirtd. Almost no where does libvirt read files that are provided by an untrusted user. Reading disk image headers, or QEMU save files are the two main untrusted scenarios, and in both those cases we wdon't actually want more than the header. I think we probably get rid of virFileReadAll in the long term Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On 3/9/21 3:56 PM, Daniel P. Berrangé wrote:
On Tue, Mar 09, 2021 at 03:26:22PM +0100, Michal Privoznik wrote:
The qemu_shim (compiled into virt-qemu-run-binary) reads several files provided by user (XML definition of secret, value of the secret, XML definition of domain) and it does so using g_file_get_contents(). This is potentially dangerous, because there is no limit on the size of files/buffers.
Since this is a standalone binary it's not critical as it can't cause libvirtd to be OOM killed, but it's still worth fixing as I am planning on discouraging people from using the GLib function.
I'm not convinced actually. I think that almost all of the places where we use virFileReadAll with a limit are in fact safe to use with g_file_get_contents. Almost all the files are from trusted sources - files exposed by the kernel, or files only writable by root, or by the same user as libvirtd.
Almost no where does libvirt read files that are provided by an untrusted user. Reading disk image headers, or QEMU save files are the two main untrusted scenarios, and in both those cases we wdon't actually want more than the header.
And virsh. But that again is standalone so probably no harm there. But we also read TLS keys/certs, but those are safe - given that only root should have access to them. My concern was that if we use g_file_get_contents() somewhere, it may sneak into scenario where we definitely don't want it. But maybe I'm worrying about nothing. Michal

The problem is that g_get_host_name() caches the hostname in a thread local variable. Therefore, it doesn't reflect any subsequent hostname changes. While this might be acceptable for logs where the hostname is printed exactly once when the libvirtd starts up, it is not optimal for virGetHostnameImpl() which is what our public virConnectGetHostname() API calls. If the hostname at the moment of the first API invocation happens to be "localhost" or contains a dot, then no further hostname changes will ever be reflected. This reverts 26d9748ff11, partially. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virutil.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/src/util/virutil.c b/src/util/virutil.c index 227997c7be..118ceec0db 100644 --- a/src/util/virutil.c +++ b/src/util/virutil.c @@ -475,11 +475,17 @@ static char * virGetHostnameImpl(bool quiet) { int r; - const char *hostname; - char *result = NULL; + char hostname[HOST_NAME_MAX+1], *result = NULL; struct addrinfo hints, *info; - hostname = g_get_host_name(); + r = gethostname(hostname, sizeof(hostname)); + if (r == -1) { + if (!quiet) + virReportSystemError(errno, + "%s", _("failed to determine host name")); + return NULL; + } + NUL_TERMINATE(hostname); if (STRPREFIX(hostname, "localhost") || strchr(hostname, '.')) { /* in this case, gethostname returned localhost (meaning we can't -- 2.26.2

On a Tuesday in 2021, Michal Privoznik wrote:
The problem is that g_get_host_name() caches the hostname in a thread local variable. Therefore, it doesn't reflect any subsequent hostname changes. While this might be acceptable for logs where the hostname is printed exactly once when the libvirtd starts up, it is not optimal for virGetHostnameImpl() which is what our public virConnectGetHostname() API calls.
If the hostname at the moment of the first API invocation happens to be
s/be/start with/
"localhost" or contains a dot, then no further hostname changes will ever be reflected.
This reverts 26d9748ff11, partially.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virutil.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

The only place where gethostname() is acceptable is in virGetHostnameImpl() which lives in src/util/virutil.c. Reflect this in the list of exceptions for the syntax-check rule. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- build-aux/syntax-check.mk | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/build-aux/syntax-check.mk b/build-aux/syntax-check.mk index 2f4f932a5b..794ec326e4 100644 --- a/build-aux/syntax-check.mk +++ b/build-aux/syntax-check.mk @@ -1910,7 +1910,7 @@ exclude_file_name_regexp--sc_prohibit_empty_lines_at_EOF = \ exclude_file_name_regexp--sc_prohibit_fork_wrappers = \ (^(src/(util/(vircommand|virdaemon)|lxc/lxc_controller)|tests/testutils)\.c$$) -exclude_file_name_regexp--sc_prohibit_gethostname = ^src/util/vir(util|log)\.c$$ +exclude_file_name_regexp--sc_prohibit_gethostname = ^src/util/virutil\.c$$ exclude_file_name_regexp--sc_prohibit_internal_functions = \ ^src/(util/(viralloc|virutil|virfile)\.[hc]|esx/esx_vi\.c)$$ -- 2.26.2

On a Tuesday in 2021, Michal Privoznik wrote:
The only place where gethostname() is acceptable is in virGetHostnameImpl() which lives in src/util/virutil.c. Reflect this in the list of exceptions for the syntax-check rule.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- build-aux/syntax-check.mk | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

Unfortunately, not all GLib functions provide the level of security we want, or behave how we want. So far, g_file_get_contents() and g_get_host_name() have been identified. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- docs/glib-adoption.rst | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/docs/glib-adoption.rst b/docs/glib-adoption.rst index f969ac80a1..bbc27fed6d 100644 --- a/docs/glib-adoption.rst +++ b/docs/glib-adoption.rst @@ -52,3 +52,20 @@ Objects https://developer.gnome.org/gobject/stable/gobject-The-Base-Object-Type.html Prefer ``GObject`` instead. + + +Forbidden functions +=================== + +There are some functions where Libvirt offers superior quality to GLib. In +those cases, GLib functions must NOT be used and Libvirt functions should be +used instead. + +``g_file_get_contents`` + Use ``virFileReadAll`` instead. The GLib function reads the entire file + into the memory without possibility to provide any limit on the buffer + size. + +``g_get_host_name`` + Prefer ``virGetHostname``. The GLib function caches the hostname and thus + does not reflect (possible) hostname changes. -- 2.26.2
participants (3)
-
Daniel P. Berrangé
-
Ján Tomko
-
Michal Privoznik