14.8 Sockets and Standard I/O
In all our examples so far, we have used what is
sometimes called Unix I/O: the
read and write functions and their variants
(recv, send, etc.). These functions work with
descriptors and are normally
implemented as system calls within the Unix kernel.
Another method of performing I/O is the
standard I/O library. It is
specified by the ANSI C standard and is intended to be portable to
non-Unix systems that support ANSI C. The standard I/O library
handles some of the details that we must worry about ourselves when
using the Unix I/O functions, such as automatically buffering the
input and output streams. Unfortunately, its handling of a stream's
buffering can present a new set of problems we must worry about.
Chapter 5 of APUE covers the standard I/O library in detail, and
[Plauger 1992] presents and discusses a complete implementation of
the standard I/O library.
The term stream
is used with the standard I/O library, as in "we open an input
stream" or "we flush the output stream." Do not confuse this with
the STREAMS subsystem, which we will discuss in Chapter 31.
The standard I/O library can be used with
sockets, but there are a few items to consider:
-
A standard I/O stream can be created from any
descriptor by calling the fdopen function. Similarly,
given a standard I/O stream, we can obtain the corresponding
descriptor by calling fileno. Our first encounter with
fileno was in Figure 6.9 when we
wanted to call select on a standard I/O stream.
select works only with descriptors, so we had to obtain
the descriptor for the standard I/O stream.
-
TCP and UDP sockets are full-duplex. Standard
I/O streams can also be full-duplex: we just open the stream with a
type of r+, which means read-write. But on such a stream,
an output function cannot be followed by an input function without
an intervening call to fflush, fseek,
fsetpos, or rewind. Similarly, an input function
cannot be followed by an output function without an intervening
call to fseek, fsetpos, or rewind,
unless the input function encounters an EOF. The problem with these
latter three functions is that they all call lseek, which
fails on a socket.
-
The easiest way to handle this read-write
problem is to open two standard I/O streams for a given socket: one
for reading and one for writing.
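The points above can be sketched as follows, assuming a POSIX system. The helper open_socket_streams and the demo function are hypothetical names, not part of unp.h. Giving the output stream its own descriptor with dup is our choice rather than something the text requires; it lets each stream be closed with fclose independently.

```c
#define _POSIX_C_SOURCE 200809L     /* for fdopen, fileno */
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: wrap one connected socket in two standard I/O
 * streams, one for reading and one for writing.
 * Returns 0 on success, -1 on error (sockfd is left open on error). */
int
open_socket_streams(int sockfd, FILE **fpin, FILE **fpout)
{
    int     wfd;

    /* Assumption: duplicate the descriptor so that fclose() on each
     * stream closes a different descriptor. */
    if ((wfd = dup(sockfd)) < 0)
        return -1;
    if ((*fpout = fdopen(wfd, "w")) == NULL) {
        close(wfd);
        return -1;
    }
    if ((*fpin = fdopen(sockfd, "r")) == NULL) {
        fclose(*fpout);             /* also closes wfd */
        *fpout = NULL;
        return -1;
    }
    return 0;
}

/* Demo: returns 1 if fileno() maps each stream back to its descriptor
 * as expected, 0 otherwise. A socketpair stands in for a TCP socket. */
int
demo_fdopen_fileno(void)
{
    int     sv[2], ok;
    FILE   *fpin, *fpout;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return 0;
    if (open_socket_streams(sv[0], &fpin, &fpout) < 0) {
        close(sv[0]);
        close(sv[1]);
        return 0;
    }
    ok = (fileno(fpin) == sv[0]) && (fileno(fpout) != sv[0]);
    fclose(fpin);                   /* closes sv[0] */
    fclose(fpout);                  /* closes the duplicate */
    close(sv[1]);
    return ok;
}
```

Because each stream is used in only one direction, the fflush/fseek restrictions on r+ streams never come into play.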
Example: str_echo Function Using Standard I/O
We now show an alternate version of our TCP echo
server (Figure 5.3), which
uses standard I/O instead of read and writen.
Figure 14.14 is a version
of our str_echo function that uses standard I/O. (This
version has a problem that we will describe shortly.)
Figure 14.14 str_echo function recoded to use standard I/O.
advio/str_echo_stdio02.c
 1 #include    "unp.h"

 2 void
 3 str_echo(int sockfd)
 4 {
 5     char    line[MAXLINE];
 6     FILE   *fpin, *fpout;

 7     fpin = Fdopen(sockfd, "r");
 8     fpout = Fdopen(sockfd, "w");

 9     while (Fgets(line, MAXLINE, fpin) != NULL)
10         Fputs(line, fpout);
11 }
Convert descriptor into input stream and output stream
7–10
Two standard I/O streams are created by fdopen: one for
input and one for output. The calls to read and
writen are replaced with calls to fgets and
fputs.
If we run our server with this version of
str_echo and then run our client, we see the
following:
hpux % tcpcli02 206.168.112.96
hello, world        we type this line, but nothing is echoed
and hi              and this one, still no echo
hello??             and this one, still no echo
^D                  and our EOF character
hello, world        and then the three echoed lines are output
and hi
hello??
There is a buffering problem here because
nothing is echoed by the server until we enter our EOF character.
The following steps take place:
-
We type the first line of input and it is sent
to the server.
-
The server reads the line with fgets
and echoes it with fputs.
-
The server's standard I/O stream is fully buffered by the standard I/O library.
This means the library copies the echoed line into its standard I/O
buffer for this stream, but does not write the buffer to the
descriptor, because the buffer is not full.
-
We type the second line of input and it is sent
to the server.
-
The server reads the line with fgets
and echoes it with fputs.
-
Again, the server's standard I/O library just
copies the line into its buffer, but does not write the buffer
because it is still not full.
-
The same scenario happens with the third line of
input that we enter.
-
We type our EOF character, and our
str_cli function (Figure 6.13) calls
shutdown, sending a FIN to the server.
-
The server TCP receives the FIN, which
fgets sees as an end-of-file, causing fgets to return a null
pointer.
-
The str_echo function returns to the
server main function (Figure 5.12) and the
child terminates by calling exit.
-
The C library function exit calls the
standard I/O cleanup function (pp. 162–164 of APUE). The output
buffer that was partially filled by our calls to fputs is
now output.
-
The server child process terminates, causing its
connected socket to be closed, sending a FIN to the client,
completing the TCP four-packet termination sequence.
-
The three echoed lines are received by our
str_cli function and output.
-
str_cli then receives an EOF on its
socket, and the client terminates.
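The sequence above can be reproduced in a single process. The following is a minimal sketch, assuming a POSIX system: a Unix domain socketpair stands in for the TCP connection, fclose stands in for the flush performed by the cleanup function that exit calls, and the setvbuf call only makes explicit the full buffering that most implementations choose for a socket by default. The function names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L     /* for fdopen */
#include <poll.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* Return 1 if data is waiting to be read on fd, 0 if not, -1 on error. */
static int
readable_now(int fd)
{
    struct pollfd pfd;
    int     n;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    if ((n = poll(&pfd, 1, 0)) < 0)     /* timeout 0: do not block */
        return -1;
    return (n == 1 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/* Write three lines to a fully buffered stream, counting in *seen_early
 * how many of them the peer could read right after each fputs. Then
 * close the stream and return the number of bytes that arrive at the
 * peer (stored in out), or -1 on error. */
int
demo_delayed_echo(int *seen_early, char *out, size_t outsize)
{
    static const char *lines[] = { "hello, world\n", "and hi\n", "hello??\n" };
    int     sv[2], i;
    ssize_t n;
    FILE   *fpout;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    if ((fpout = fdopen(sv[0], "w")) == NULL) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    setvbuf(fpout, NULL, _IOFBF, BUFSIZ);   /* explicit full buffering */

    *seen_early = 0;
    for (i = 0; i < 3; i++) {
        fputs(lines[i], fpout);             /* copied into the buffer only */
        if (readable_now(sv[1]) == 1)
            (*seen_early)++;
    }

    fclose(fpout);          /* flushes the buffer, as exit's cleanup would */
    n = read(sv[1], out, outsize - 1);
    close(sv[1]);
    if (n < 0)
        return -1;
    out[n] = '\0';
    return (int) n;
}
```

In this sketch the peer sees nothing after any of the three calls to fputs, and all three lines arrive together once the stream is closed, which matches what the client observed.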
The problem here is the buffering performed
automatically by the standard I/O library on the server. There are
three types of buffering performed by the standard I/O library:
-
Fully buffered means that I/O takes place only
when the buffer is full, the process explicitly calls
fflush, or the process terminates by calling
exit. A common size for the standard I/O buffer is 8,192
bytes.
-
Line buffered means that I/O takes place when
a newline is encountered, when the process calls fflush,
or when the process terminates by calling exit.
-
Unbuffered means that I/O takes place each
time a standard I/O output function is called.
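These three types correspond to the _IOFBF, _IOLBF, and _IONBF modes of setvbuf. As a rough sketch, assuming a POSIX system, the hypothetical function below sets a mode on a stream attached to one end of a socketpair, writes some text, and reports whether the other end can read anything yet.

```c
#define _POSIX_C_SOURCE 200809L     /* for fdopen */
#include <poll.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* Return 1 if data is waiting to be read on fd, 0 if not, -1 on error. */
static int
readable_now(int fd)
{
    struct pollfd pfd;
    int     n;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    if ((n = poll(&pfd, 1, 0)) < 0)     /* timeout 0: do not block */
        return -1;
    return (n == 1 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/* Write text to a socket stream using the given setvbuf mode and return
 * 1 if the peer can read data immediately afterward, 0 if the data is
 * still held in the standard I/O buffer, -1 on error. */
int
visible_after_write(int mode, const char *text)
{
    int     sv[2], seen;
    FILE   *fp;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    if ((fp = fdopen(sv[0], "w")) == NULL) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    /* setvbuf must be called before any other operation on the stream. */
    if (setvbuf(fp, NULL, mode, BUFSIZ) != 0) {
        fclose(fp);
        close(sv[1]);
        return -1;
    }
    fputs(text, fp);
    seen = readable_now(sv[1]);
    fclose(fp);                     /* flushes and closes sv[0] */
    close(sv[1]);
    return seen;
}
```

With short inputs, we would expect a fully buffered stream to hold "hi\n", a line-buffered stream to hold "hi" but release "hi\n", and an unbuffered stream to release "hi" at once.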
Most Unix implementations of the standard I/O
library use the following rules:
-
Standard error is always unbuffered.
-
Standard input and standard output are fully
buffered, unless they refer to a terminal device, in which case,
they are line buffered.
-
All other streams are fully buffered unless they
refer to a terminal device, in which case, they are line
buffered.
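These rules can be written down as a small predicate. The sketch below is only a restatement of the typical rules listed above, not a query of what a given library actually does; expected_default_mode and the demo function are hypothetical names, and a socketpair stands in for a network socket.

```c
#define _POSIX_C_SOURCE 200809L     /* for fdopen, fileno */
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* Predict the default buffering mode of a stream under the typical
 * rules: standard error is unbuffered, a stream that refers to a
 * terminal is line buffered, and everything else is fully buffered.
 * Assumption: the implementation follows these rules. */
int
expected_default_mode(FILE *fp)
{
    if (fp == stderr)
        return _IONBF;
    if (isatty(fileno(fp)))
        return _IOLBF;
    return _IOFBF;
}

/* Demo: apply the predicate to a stream opened on a socket.
 * Returns the predicted mode, or -1 on error. */
int
demo_socket_default_mode(void)
{
    int     sv[2], mode;
    FILE   *fp;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    if ((fp = fdopen(sv[0], "w")) == NULL) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    mode = expected_default_mode(fp);
    fclose(fp);                     /* closes sv[0] */
    close(sv[1]);
    return mode;
}
```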
Since a socket is not a terminal device, the
problem seen with our str_echo function in Figure 14.14 is that the output
stream (fpout) is fully buffered. One way around this is
to force the output stream to be line buffered by calling
setvbuf. Another is to force each echoed line to be output
by calling fflush after each call to fputs. But
in practice, either of these solutions is still error-prone and may
interact badly with the Nagle algorithm described in Section
7.9. In most cases, the best solution is to avoid using the
standard I/O library altogether for sockets and operate on buffers
instead of lines, as described in Section 3.9. Using
standard I/O on sockets may make sense when the convenience of
standard I/O streams outweighs the concerns about bugs due to
buffering, but these are rare cases.
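For completeness, here is a sketch of the first workaround, forcing the output stream to be line buffered with setvbuf. It is not one of the book's figures: it uses the plain standard I/O functions rather than the wrapper functions, defines its own MAXLINE as a stand-in for the one in unp.h, duplicates the descriptor for the output stream, and, unlike str_echo in Figure 14.14, closes sockfd before returning. The caveats above still apply.

```c
#define _POSIX_C_SOURCE 200809L     /* for fdopen */
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAXLINE 4096                /* stand-in for the unp.h definition */

/* Echo lines from sockfd back to sockfd using line-buffered output.
 * Always closes sockfd. Returns 0 on EOF, -1 on error. */
int
str_echo_linebuf(int sockfd)
{
    char    line[MAXLINE];
    FILE   *fpin, *fpout;
    int     wfd, rc;

    if ((wfd = dup(sockfd)) < 0) {
        close(sockfd);
        return -1;
    }
    if ((fpout = fdopen(wfd, "w")) == NULL) {
        close(wfd);
        close(sockfd);
        return -1;
    }
    if ((fpin = fdopen(sockfd, "r")) == NULL) {
        fclose(fpout);
        close(sockfd);
        return -1;
    }
    setvbuf(fpout, NULL, _IOLBF, BUFSIZ);   /* flush at each newline */

    while (fgets(line, sizeof(line), fpin) != NULL)
        if (fputs(line, fpout) == EOF)
            break;

    rc = (ferror(fpin) || ferror(fpout)) ? -1 : 0;
    fclose(fpin);                   /* closes sockfd */
    fclose(fpout);                  /* closes the duplicate */
    return rc;
}

/* Demo: send one line and an EOF over a socketpair, run the echo
 * function on the other end, and return the number of bytes echoed
 * back into out, or -1 on error. */
int
demo_echo_one_line(char *out, size_t outsize)
{
    static const char msg[] = "hello, world\n";
    int     sv[2];
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    if (write(sv[1], msg, strlen(msg)) != (ssize_t) strlen(msg) ||
        shutdown(sv[1], SHUT_WR) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (str_echo_linebuf(sv[0]) < 0) {      /* closes sv[0] */
        close(sv[1]);
        return -1;
    }
    n = read(sv[1], out, outsize - 1);
    close(sv[1]);
    if (n < 0)
        return -1;
    out[n] = '\0';
    return (int) n;
}
```

A line longer than the buffer passed to fgets is echoed in pieces, and the pieces without a newline stay buffered until a later newline arrives, which is one example of how line-oriented standard I/O remains error-prone on sockets.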
Be aware that some implementations of the
standard I/O library still have a problem with descriptors greater
than 255. This can be a problem with network servers that handle
lots of descriptors. Check the definition of the FILE
structure in your <stdio.h> header to see what type
of variable holds the descriptor.