Constantine A. Murenin ([info]cnst) wrote,
@ 2007-06-12 12:33:00
Previous Entry  Add to memories!  Tell a Friend  Next Entry
Entry tags:commit2cvs.openbsd.org, make, make/job.c, openbsd, patch

10-year-old pointer-arithmetic bug in make(1) is now gone, thanks to malloc.conf and some debugging
Well, today I'm really proud of myself — I've found and fixed a bug that was introduced no more, no less, but in 1997!

It prevented people from running `make -j` on OpenBSD with the fear of it looping on one the CPUs forever, without actually building anything.

Personally, I've encountered this bug a few times after todd@ told me about the -j option at c2k7. I immediately noticed performance improvements in building kernel with `make -j4` — which was obvious from the fact that now both CPUs were at close to 0% idle time for the duration of the build. However, I've also noticed that shortly after being started with a big -j number, like 16 or 24, the make process would become mysteriously quiet and would start to consume 100% of one of the CPUs (without actually running any jobs) every time I ran it!

Independently (or maybe due to the influence) of the above make(1) bug, one day I've decided to give the malloc options a try, with running `ln -s 'AFGHJPRX<' /etc/malloc.conf` to set it up. After setting these malloc options, I've noticed that make was crashing quite often when used with the -j option!

To make the long story short, I've discovered that the problem had to do with pointer arithmetic. A number of bytes was added as an offset for a pointer of type fd_set, which has a size of 128. I.e. the offset in the number of bytes was erroneously multiplied by a factor of 128, thus doing a memset on the unallocated piece of memory, leaving the allocated part uninitialised!

Anyhow, here is the simplest fix:

Index: job.c
===================================================================
RCS file: /cvs/src/usr.bin/make/job.c,v
retrieving revision 1.61
diff -u -d -p -4 -r1.61 job.c
--- job.c	2007/03/02 04:32:32	1.61
+++ job.c	2007/06/11 20:52:22
@@ -1352,12 +1352,14 @@ JobExec(Job *job, char **argv)
 		int bytes = howmany(job->inPipe+1, NFDBITS) * sizeof(fd_mask);
 		int obytes = howmany(outputsn+1, NFDBITS) * sizeof(fd_mask);
 
 		if (outputsp == NULL || obytes != bytes) {
+			if (outputsp == NULL)
+				obytes = 0;
 			outputsp = realloc(outputsp, bytes);
 			if (outputsp == NULL)
 				return;
-			memset(outputsp + obytes, 0, bytes - obytes);
+			memset((caddr_t)outputsp + obytes, 0, bytes - obytes);
 		}
 		outputsn = job->inPipe;
 	    }
 	    FD_SET(job->inPipe, outputsp);

A variation of this patch, that also solves a potential memory leak due to incorrect realloc usage, was committed to 4.1-current a few minutes ago. Below is the link to my actual commit message.

http://marc.info/?l=openbsd-cvs&m=118166609103190&w=2
http://www.openbsd.org/cgi-bin/cvsweb/src/usr.bin/make/job.c#rev1.62

Happy hacking!
Constantine.



Create an Account
Forgot your login or password?
Login w/ OpenID
English • Español • Deutsch • Русский…