Dear Lazyweb, does anyone understand job control?

I've been banging my head against this all day and it doesn't make any more sense than it did when I started.

I need to write some code that manages job control on a terminal. More specifically, I need to run a single process and push it into the background at will, so that it gets suspended when it tries to read from the terminal. So there's a controller process and a subprocess.

 controller  ----------> subprocess
               manages

The man pages and the glibc info documentation make this look simple: use tcsetattr() to clear TOSTOP if it's set, so that the subprocess can still write to the terminal from the background, then start the subprocess in a new process group (by fork()ing and calling setpgid()). Once it's going, you can put it in the foreground or background by calling tcsetpgrp() to set the terminal's foreground process group to the subprocess's group (to put it in the foreground), or to the controller's group (to put the subprocess in the background). Whenever the subprocess is in the background, it will be sent SIGTTIN if it tries to read from the terminal.
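To make the recipe concrete, here is a minimal sketch of those steps in Python, chosen for brevity; it is not the code from my harness. The helper names (allow_background_writes, spawn_in_own_group, set_foreground) are made up for illustration, and the sketch assumes the caller passes in a file descriptor for its controlling terminal.

```python
import os
import signal
import termios


def allow_background_writes(tty_fd):
    """Clear TOSTOP so background jobs may write to the terminal."""
    attrs = termios.tcgetattr(tty_fd)
    attrs[3] &= ~termios.TOSTOP  # index 3 holds the local-mode flags
    termios.tcsetattr(tty_fd, termios.TCSANOW, attrs)


def spawn_in_own_group(argv):
    """Fork, move the child into a new process group, and exec argv."""
    pid = os.fork()
    if pid == 0:
        try:
            os.setpgid(0, 0)  # child: become leader of a new group
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if setpgid or exec failed
    try:
        os.setpgid(pid, pid)  # parent does the same to close the race
    except OSError:
        pass  # child already exec'd or exited; its own setpgid call won
    return pid


def set_foreground(tty_fd, pgid):
    """Make pgid the terminal's foreground process group."""
    # A caller that is itself in the background would be sent SIGTTOU
    # here, so ignore that signal for the duration of the call.
    old = signal.signal(signal.SIGTTOU, signal.SIG_IGN)
    try:
        os.tcsetpgrp(tty_fd, pgid)
    finally:
        signal.signal(signal.SIGTTOU, old)
```

Assuming fd 0 is the controlling terminal, set_foreground(0, pid) would bring the subprocess to the foreground and set_foreground(0, os.getpgrp()) would hand the terminal back to the controller.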

I have everything working -- except for the very last sentence of that last paragraph. I can see my processes being put into the right process group, and I can see them going into the foreground and the background. For instance, here the first line is the controller process and the rest of the lines are the process group of the subprocess (this one started a few other processes).

STAT CMD                           PID  PGID TPGID  PPID   SID
Ss+  ./src/aptitude              26989 26989 26989 26980 26989
S    ./src/aptitude              26991 26991 26989 26989 26989
S    /bin/sh -c /usr/sbin/dpkg-p 27027 26991 26989 26991 26989
S    /usr/bin/perl -w /usr/sbin/ 27028 26991 26989 27027 26989
Z    [dpkg-preconfigu] <defunct> 27034 26991 26989 27028 26989
S    /bin/sh -e /tmp/lynx-cur.co 27039 26991 26989 27028 26989
S    whiptail --backtitle Packag 27043 26991 26989 27028 26989

If I try to move the background process into the foreground, it is marked as the foreground process in this session:

STAT CMD                           PID  PGID TPGID  PPID   SID
Ss   ./src/aptitude              26989 26989 26991 26980 26989
S+   ./src/aptitude              26991 26991 26991 26989 26989
S+   /bin/sh -c /usr/sbin/dpkg-p 27027 26991 26991 26991 26989
S+   /usr/bin/perl -w /usr/sbin/ 27028 26991 26991 27027 26989
Z+   [dpkg-preconfigu] <defunct> 27034 26991 26991 27028 26989
S+   /bin/sh -e /tmp/lynx-cur.co 27039 26991 26991 27028 26989
S+   whiptail --backtitle Packag 27043 26991 26991 27028 26989

Note that in the first listing the background process is not suspended (its state is S, not T), even though it's busy reading from the tty. If I manually suspend the process group with kill -TTIN -26991, it stops as expected:

STAT CMD                           PID  PGID TPGID  PPID   SID
Ss   ./src/aptitude              26989 26989 26991 26980 26989
T+   ./src/aptitude              26991 26991 26991 26989 26989
T+   /bin/sh -c /usr/sbin/dpkg-p 27027 26991 26991 26991 26989
T+   /usr/bin/perl -w /usr/sbin/ 27028 26991 26991 27027 26989
Z+   [dpkg-preconfigu] <defunct> 27034 26991 26991 27028 26989
T+   /bin/sh -e /tmp/lynx-cur.co 27039 26991 26991 27028 26989
T+   whiptail --backtitle Packag 27043 26991 26991 27028 26989

So the signal isn't being blocked or ignored. I can also run programs in a normal shell, outside my harness (e.g., links &) and watch them auto-suspend, but the same thing doesn't happen when I start them directly under my controller process. It's not even that they're starting as foreground processes: I start them in the background, and they never see a SIGTTIN.
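The same check can be scripted instead of done by hand with kill and ps. This is a rough sketch, not part of my harness: stop_signal_after_ttin is a hypothetical helper that starts a command in its own process group, sends that group SIGTTIN, and uses waitpid() with WUNTRACED to see whether the child actually stopped. It assumes the child inherits the default disposition for SIGTTIN.

```python
import os
import signal


def stop_signal_after_ttin(argv):
    """Start argv in its own group, send the group SIGTTIN, report the result.

    Returns the number of the signal that stopped the child, or None if
    the child exited instead of stopping.
    """
    pid = os.fork()
    if pid == 0:
        try:
            os.setpgid(0, 0)  # child: new process group
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if setpgid or exec failed
    try:
        os.setpgid(pid, pid)  # parent repeats the call to close the race
    except OSError:
        pass  # child already exec'd or exited

    os.killpg(pid, signal.SIGTTIN)  # like: kill -TTIN -<pgid>
    _, status = os.waitpid(pid, os.WUNTRACED)  # also report stopped children
    if os.WIFSTOPPED(status):
        stopped_by = os.WSTOPSIG(status)
        os.kill(pid, signal.SIGKILL)  # clean up the stopped child
        os.waitpid(pid, 0)
        return stopped_by
    return None
```

A return value equal to signal.SIGTTIN would confirm that the signal is delivered and acted on, which points the finger at whatever is supposed to generate it in the first place.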

Does anyone have a clue what's going on? Hopefully it's as simple as a flag I have to set somewhere...

Oh, and for extra fun, this is all happening inside a VTE terminal widget. That shouldn't make a difference (after all, it's what gnome-terminal uses, and SIGTTIN behaved as expected when I tested it there), but who knows, it might be relevant.

[UPDATE] I did a little more testing. It turns out I was bringing the subprocess into the foreground whenever the terminal was active, so I had never actually typed into the terminal while the subprocess was in the background. When I do type while it's in the background, it gets SIGTTIN and suspends. Presumably the subprocess sits in select() until input arrives and only then calls read(). So it looks like select() doesn't count as a read for the purposes of job control. I doubt there's much I can do to get around that. :-(
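As far as I understand the tty layer, that is the expected behaviour: SIGTTIN is generated when a background process group actually read()s from its controlling terminal, while select() only asks whether a read would block. The following Linux-oriented sketch reproduces the difference on a private pty, so it doesn't touch the real terminal. All of the function names are hypothetical, and the 0.2 second timeout is an arbitrary choice.

```python
import fcntl
import os
import pty
import select
import signal
import termios


def _worker(tty_fd, report_fd):
    """Background job: wait in select() on the tty, then really read it."""
    signal.signal(signal.SIGTTIN, signal.SIG_DFL)  # make sure it can stop us
    os.setpgid(0, 0)  # leave the foreground process group
    select.select([tty_fd], [], [], 0.2)  # only polls; times out quietly
    os.write(report_fd, b"select returned\n")
    os.read(tty_fd, 1)  # a real read from the background: SIGTTIN


def _controller(tty_fd, report_fd):
    """Session leader that owns the terminal and watches the worker."""
    os.setsid()  # new session, no terminal yet
    fcntl.ioctl(tty_fd, termios.TIOCSCTTY, 0)  # adopt the pty as controlling tty
    worker = os.fork()
    if worker == 0:
        try:
            _worker(tty_fd, report_fd)
        finally:
            os._exit(0)
    _, status = os.waitpid(worker, os.WUNTRACED)  # wake up on stop or exit
    if os.WIFSTOPPED(status):
        name = signal.Signals(os.WSTOPSIG(status)).name
        os.write(report_fd, f"stopped by {name}\n".encode())
        os.kill(worker, signal.SIGKILL)
        os.waitpid(worker, 0)
    else:
        os.write(report_fd, b"worker exited without stopping\n")


def select_then_read_report():
    """Run the experiment on a private pty and return the reported lines."""
    master, slave = pty.openpty()
    report_r, report_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(report_r)
            _controller(slave, report_w)
        finally:
            os._exit(0)
    os.close(report_w)
    with os.fdopen(report_r, "rb") as report:
        lines = report.read().decode().splitlines()  # until all writers exit
    os.waitpid(pid, 0)
    os.close(master)
    os.close(slave)
    return lines
```

If the worker gets through select() and is only stopped at the read, the report should contain the "select returned" line followed by a "stopped by SIGTTIN" line.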