On 07/07/2011 08:17 AM, Daniel P. Berrange wrote:
From: "Daniel P. Berrange" <berrange(a)redhat.com>
If libvirtd crashes then the pidfile may not be cleaned up,
making a later restart fail, even though the original process
no longer exists.
Instead of simply using file creation as the test for successful
pidfile acquisition, actually take out a lock on the pidfile.
To avoid races, after locking the pidfile, verify that the
inode hasn't changed.
Also make sure the unprivileged libvirtd now acquires the
pidfile, instead of relying on the UNIX socket to ensure
mutual exclusion
Cool idea.
-static int daemonWritePidFile(const char *pidFile, const char
*argv0)
-{
+static int
+daemonAcquirePidFile(const char *argv0, const char *pidFile) {
Why'd you hoist the { to the previous line? Our convention has been
(slowly) leaning towards a function body starting on column 1.
int fd;
- FILE *fh;
char ebuf[1024];
+ unsigned long long pid = (unsigned long long)getpid();
See my comments in an earlier thread about our existing assumptions that
pid_t fits in int. In fact, I would be okay with:
verify(sizeof(pid_t) <= sizeof(int));
int pid = getpid();
char pidstr[INT_BUFSIZE_BOUND(pid)];
...
snprintf(pidstr, sizeof(pidstr), "%u", pid);
+ char pidstr[INT_BUFSIZE_BOUND(pid)];
if (pidFile[0] == '\0')
return 0;
- if ((fd = open(pidFile, O_WRONLY|O_CREAT|O_EXCL, 0644)) < 0) {
- VIR_ERROR(_("Failed to open pid file '%s' : %s"),
- pidFile, virStrerror(errno, ebuf, sizeof ebuf));
- return -1;
- }
+ while (1) {
+ struct stat a, b;
+ if ((fd = open(pidFile, O_WRONLY|O_CREAT, 0644)) < 0) {
+ VIR_ERROR(_("%s: Failed to open pid file '%s' : %s"),
+ argv0, pidFile, virStrerror(errno, ebuf, sizeof ebuf));
+ return -1;
+ }
+
+ if (fstat(fd, &b) < 0) {
+ VIR_ERROR(_("%s: Pid file '%s' disappeared: %s"),
Misleading error message. fstat can indeed fail (although such failures
are rare), but not because a file disappeared - after all, you have an
fd open to the file.
- if (fprintf(fh, "%lu\n", (unsigned long)getpid()) <
0) {
- VIR_ERROR(_("%s: Failed to write to pid file '%s' : %s"),
- argv0, pidFile, virStrerror(errno, ebuf, sizeof ebuf));
- VIR_FORCE_FCLOSE(fh);
- return -1;
- }
+ snprintf(pidstr, sizeof(pidstr), "%llu", pid);
- if (VIR_FCLOSE(fh) == EOF) {
- VIR_ERROR(_("%s: Failed to close pid file '%s' : %s"),
- argv0, pidFile, virStrerror(errno, ebuf, sizeof ebuf));
- return -1;
+ if (safewrite(fd, pidstr, strlen(pidstr)) < 0) {
Nice conversion from FILE* to write().
@@ -1436,6 +1453,12 @@ int main(int argc, char **argv) {
umask(old_umask);
}
+ /* Try to claim the pidfile, existing if we can't */
s/existing/exiting/
+ if ((pid_file_fd = daemonAcquirePidFile(argv[0], pid_file)) <
0) {
+ ret = VIR_DAEMON_ERR_PIDFILE;
+ goto cleanup;
+ }
+
if (!(srv = virNetServerNew(config->min_workers,
config->max_workers,
config->max_clients,
@@ -1570,8 +1593,10 @@ cleanup:
}
VIR_FORCE_CLOSE(statuswrite);
}
- if (pid_file)
- unlink (pid_file);
+ if (pid_file_fd != -1) {
+ unlink(pid_file);
+ VIR_FORCE_CLOSE(pid_file_fd);
Swap these two lines. Not that flock (or even libvirtd) works on mingw,
but mingw doesn't like unlink() on an open fd.
--
Eric Blake eblake(a)redhat.com +1-801-349-2682
Libvirt virtualization library
http://libvirt.org