I'm trying to exec() a program from my server, and attach my socket's IO to it, but I'm not getting all the data across. Why?
Unix Socket FAQ for Network programming
If the program you are running uses printf() etc. (the streams from
stdio.h), you have to deal with two buffers. The kernel buffers all
socket IO, as explained in ``section 2.11''. The second buffer, the
stdio buffer, is the one causing you grief, and the problem was well
explained by Andrew:
(The short answer to this question is that you want to use a pty
rather than a socket; the remainder of this article is an attempt to
explain why.)
Firstly, the socket buffer controlled by setsockopt() has absolutely
nothing to do with stdio buffering. Setting it to 1 is guaranteed to
be the Wrong Thing(tm).
Perhaps the following diagram might make things a little clearer:

    Process A                   Process B
    +---------------------+     +---------------------+
    |                     |     |                     |
    |    mainline code    |     |    mainline code    |
    |          |          |     |          ^          |
    |          v          |     |          |          |
    |       fputc()       |     |       fgetc()       |
    |          |          |     |          ^          |
    |          v          |     |          |          |
    |    +-----------+    |     |    +-----------+    |
    |    |   stdio   |    |     |    |   stdio   |    |
    |    |  buffer   |    |     |    |  buffer   |    |
    |    +-----------+    |     |    +-----------+    |
    |          |          |     |          ^          |
    |          |          |     |          |          |
    |       write()       |     |        read()       |
    |          |          |     |          |          |
    +--------- | ---------+     +--------- | ---------+
               |                           |            User space
  ------------ | ------------------------- | ---------------------------
               |                           |            Kernel space
               v                           |
         +-----------+               +-----------+
         |  socket   |               |  socket   |
         |  buffer   |               |  buffer   |
         +-----------+               +-----------+
               |                           ^
               v                           |
      (AF- and protocol-          (AF- and protocol-
       dependent code)             dependent code)

Assuming these two processes are communicating with each other (I've
deliberately omitted the actual comms mechanisms, which aren't really
relevant), you can see that data written by process A to its stdio
buffer is completely inaccessible to process B. Only once the decision
is made to flush that buffer to the kernel (via write()) can the data
actually be delivered to the other process.
The only guaranteed way to affect the buffering within process A is to
change the code. However, the default buffering for stdout is
controlled by whether the underlying FD refers to a terminal or not;
generally, output to terminals is line-buffered, and output to non-
terminals (including but not limited to files, pipes, sockets, non-tty
devices, etc.) is fully buffered. So the desired effect can usually be
achieved by using a pty device; this, for example, is what the
'expect' program does.
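
As a rough sketch of the pty approach, the server could start the
program on a pty slave and then shuttle bytes between the pty master
and its socket. The helper name spawn_on_pty() is hypothetical, and
the sketch deliberately skips controlling-terminal setup, termios
settings, and reaping the child.

```c
#define _XOPEN_SOURCE 600  /* posix_openpt(), grantpt(), unlockpt(), ptsname() */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: run argv[0] with stdin, stdout and stderr
 * attached to a new pty slave. Returns the master fd, or -1 on error,
 * and stores the child's pid in *child.
 */
int spawn_on_pty(char *const argv[], pid_t *child)
{
    assert(argv != NULL && argv[0] != NULL && child != NULL);

    int master = posix_openpt(O_RDWR | O_NOCTTY);
    if (master < 0)
        return -1;

    const char *name = NULL;
    int slave = -1;
    if (grantpt(master) == 0 && unlockpt(master) == 0 &&
        (name = ptsname(master)) != NULL)
        slave = open(name, O_RDWR | O_NOCTTY);
    if (slave < 0) {
        close(master);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(slave);
        close(master);
        return -1;
    }
    if (pid == 0) {
        /* Child: the slave side becomes its standard streams, so
         * isatty(1) is true and stdout defaults to line buffering. */
        close(master);
        dup2(slave, STDIN_FILENO);
        dup2(slave, STDOUT_FILENO);
        dup2(slave, STDERR_FILENO);
        if (slave > STDERR_FILENO)
            close(slave);
        execvp(argv[0], argv);
        _exit(127);                  /* exec failed */
    }

    close(slave);                    /* parent keeps only the master */
    *child = pid;
    return master;
}
```

Whatever the child prints can then be read() from the returned master
fd and forwarded to the socket. Bear in mind that a pty applies
terminal processing by default (for instance "\n" usually arrives as
"\r\n", and input is echoed), so a real server would probably want to
adjust the termios settings as well.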
Since the stdio buffer (and the FILE structure, and everything else
related to stdio) is user-level data, it is not preserved across an
exec() call, hence trying to use setvbuf() before the exec is
ineffective.
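
If you do have the source of the program being exec'd, the call has
to go into that program itself. A minimal sketch, assuming a helper
(the name use_line_buffered_stdout() is made up here) that the
program calls at the top of its own main() before producing any
output:

```c
#include <stdio.h>

/*
 * Hypothetical helper for the exec'd program: switch stdout to line
 * buffering no matter what fd 1 refers to. Must run before the first
 * output on stdout. Returns 0 on success, nonzero on failure.
 */
int use_line_buffered_stdout(void)
{
    return setvbuf(stdout, NULL, _IOLBF, 0);
}
```

Passing _IONBF instead of _IOLBF would make stdout unbuffered, which
is what setbuf(stdout, NULL) does.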
If it's an option, you can use some standalone program that will just
run something inside a pty and buffer its input/output. I've seen a
package by the name pty.tar.gz that did that; you could search around
for it with archie or AltaVista.
Another option (**warning, evil hack**), if you're on a system that
supports it (SunOS, Solaris and Linux ELF do; I don't know about
others), is to have your main program putenv() the name of a shared
object (*.so) into LD_PRELOAD before the exec(). In that .so you
redefine some commonly used libc function that the program you're
exec'ing is known to call early. That gives you control inside the
running program: the first time your function is called, do a
setbuf(stdout, NULL) on the program's behalf, then call the original
libc function, which you locate with dlopen() + dlsym(). Keep the
dlsym() result in a static variable so that later calls can go
straight to the original.
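
A sketch of what such a preloaded object might look like follows. It
makes several assumptions: puts() is picked only as a stand-in for
"a function the target calls early", the file name unbuf.so is
invented, and the original function is looked up with
dlsym(RTLD_NEXT, ...) (a GNU/BSD extension) rather than the
dlopen() + dlsym() pair described above.

```c
#define _GNU_SOURCE        /* for RTLD_NEXT on glibc */
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/*
 * Interposed puts(). Chosen purely for illustration, on the
 * assumption that the target program calls puts() before it writes
 * anything else to stdout.
 */
int puts(const char *s)
{
    /* Cached pointer to the real libc puts(), filled on first call. */
    static int (*real_puts)(const char *);

    if (real_puts == NULL) {
        /* First call: unbuffer stdout on the program's behalf. */
        setbuf(stdout, NULL);
        real_puts = (int (*)(const char *))dlsym(RTLD_NEXT, "puts");
        assert(real_puts != NULL);
    }
    return real_puts(s);
}
```

Built as a shared object (something like
cc -shared -fPIC -o unbuf.so unbuf.c, plus -ldl on older systems),
it would be activated by having the server put LD_PRELOAD=<path to
unbuf.so> into the environment before the exec(). Whether the target
really calls the chosen function early enough is something you have
to check for each program.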