11.18 Re-entrant Functions
The gethostbyname function from
Section 11.3
presents an interesting problem that we have not yet examined in
the text: It is not re-entrant. We
will encounter this problem in general when we deal with threads in
Chapter 26,
but it is interesting to examine the problem now (without having to
deal with the concept of threads) and to see how to fix it.
First, let us look at how the function works. If
we look at its source code (which is easy since the source code for
the entire BIND release is publicly available), we see that one
file contains both gethostbyname and
gethostbyaddr, and the file has the following general
outline:
static struct hostent host;     /* result stored here */

struct hostent *
gethostbyname (const char *hostname)
{
    return (gethostbyname2 (hostname, family));
}

struct hostent *
gethostbyname2 (const char *hostname, int family)
{
    /* call DNS functions for A or AAAA query */
    /* fill in host structure */
    return (&host);
}

struct hostent *
gethostbyaddr (const char *addr, socklen_t len, int family)
{
    /* call DNS functions for PTR query in in-addr.arpa domain */
    /* fill in host structure */
    return (&host);
}
The static storage class specifier on the result
structure is the basic problem. The fact that these three
functions share a single host variable presents yet another
problem that we will discuss in Exercise 11.1.
(gethostbyname2 was introduced with the IPv6 support in
BIND 4.9.4. It has since been deprecated; see Section
11.20 for more detail. We will ignore the fact that
gethostbyname2 is involved when we call
gethostbyname, as that doesn't affect this
discussion.)
The re-entrancy problem can occur in a normal
Unix process that calls gethostbyname or
gethostbyaddr from both the main flow of control and from
a signal handler. When the signal handler is called (say it is a
SIGALRM signal that is generated once per second), the
main flow of control of the process is temporarily stopped and the
signal handling function is called. Consider the following:
main ()
{
    struct hostent *hptr;
    ...
    signal (SIGALRM, sig_alrm);
    ...
    hptr = gethostbyname ( ... );
    ...
}

void
sig_alrm (int signo)
{
    struct hostent *hptr;
    ...
    hptr = gethostbyname ( ... );
    ...
}
Suppose the main flow of control is in the middle of
gethostbyname when it is temporarily stopped (say the
function has filled in the host variable and is about to
return), and the signal handler then calls gethostbyname.
Since only one copy of the variable host exists in the
process, it is reused: the values calculated for the call from the
main flow of control are overwritten with the values calculated for
the call from the signal handler.
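The overwrite can be reproduced without signals or the DNS. The following is a minimal sketch, not the BIND code: struct result, lookup_static, and lookup_r are hypothetical names invented for illustration. lookup_static mimics the static-storage pattern in the outline above, while lookup_r shows the usual remedy of having the caller supply the storage.

```c
#include <stdio.h>

/* Hypothetical stand-in for struct hostent: just a name buffer. */
struct result {
    char name[64];
};

static struct result shared;    /* one copy per process, like "host" */

/* Hypothetical non-re-entrant lookup: always returns &shared. */
struct result *
lookup_static(const char *name)
{
    snprintf(shared.name, sizeof(shared.name), "%s", name);
    return (&shared);
}

/* Re-entrant variant: the caller supplies the result storage. */
struct result *
lookup_r(const char *name, struct result *out)
{
    snprintf(out->name, sizeof(out->name), "%s", name);
    return (out);
}
```

Two successive calls to lookup_static return the same pointer, so the result of the first call is lost as soon as the second call runs. With lookup_r, each caller (the main flow of control and the signal handler, for example) can pass its own struct result and neither result disturbs the other.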
If we look at the name and address conversion
functions presented in this chapter, along with the
inet_XXX functions from
Chapter 4,
we note the following:
-
Historically, gethostbyname,
gethostbyaddr, getservbyname, and
getservbyport are not re-entrant because all return a pointer to
a static structure.
Some implementations that support threads
(Solaris 2.x) provide re-entrant versions of these four functions
with names ending with the _r suffix, which we will
describe in the next section.
Alternately, some implementations that support
threads (HP-UX 10.30 and later) provide re-entrant versions of
these functions using thread-specific data (Section
26.5).
-
inet_pton and inet_ntop are
always re-entrant.
-
Historically, inet_ntoa is not
re-entrant, but some implementations that support threads provide a
re-entrant version that uses thread-specific data.
-
getaddrinfo is re-entrant only if it
calls re-entrant functions itself; that is, if it calls re-entrant
versions of gethostbyname for the hostname and
getservbyname for the service name. One reason that all
the memory for the results is dynamically allocated is to allow it
to be re-entrant.
-
getnameinfo is re-entrant only if it
calls re-entrant functions itself; that is, if it calls re-entrant
versions of gethostbyaddr to obtain the hostname and
getservbyport to obtain the service name. Notice that both
result strings (for the hostname and the service name) are
allocated by the caller to allow this re-entrancy.
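The caller-allocated style that makes inet_pton and inet_ntop re-entrant can be sketched as follows. The helper name format_ipv4 is hypothetical; only inet_pton and inet_ntop are real library calls. Because the text form is written into a buffer that the caller owns, two conversions never share storage.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: convert a dotted-decimal string to binary and
 * back, writing the presentation form into the caller's buffer.
 * Returns 0 on success, -1 on failure.
 */
int
format_ipv4(const char *text, char *buf, socklen_t buflen)
{
    struct in_addr addr;

    if (inet_pton(AF_INET, text, &addr) != 1)
        return (-1);            /* not a valid IPv4 address string */
    if (inet_ntop(AF_INET, &addr, buf, buflen) == NULL)
        return (-1);            /* buffer too small or other error */
    return (0);
}
```

A caller that needs two addresses at once declares two buffers of size INET_ADDRSTRLEN and passes one to each call. With inet_ntoa, by contrast, a historical implementation returns a pointer to the same static buffer each time, so the second call would overwrite the first result.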
A similar problem occurs with the variable
errno. Historically, there has been a single copy of this
integer variable per process. If a process makes a system call that
returns an error, an integer error code is stored in this variable.
For example, when the function named close in the standard
C library is called, it might execute something like the following
pseudocode:
-
Put the argument to the system call (an integer
descriptor) into a register
-
Put a value in another register indicating the
close system call is being called
-
Invoke the system call (switch to the kernel
with a special instruction)
-
Test the value of a register to see if an error
occurred
-
If no error, return (0)
-
Store the value of some other register into
errno
-
return (-1)
First, notice that if an error does not occur,
the value of errno is not changed. That is why we cannot
look at the value of errno unless we know that an error
has occurred (normally indicated by the function returning -1).
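The steps above can be modeled in C. This is only a sketch under stated assumptions: fake_kernel_close is a hypothetical stub standing in for the trap into the kernel, and it reports errors as a negated error code, which is one convention some kernels use and not necessarily what any given system does. my_close is likewise a hypothetical name for the library wrapper.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for the kernel trap: returns 0 on success or
 * a negated error code on failure. A real C library would execute a
 * special instruction here; this stub only checks the descriptor's sign.
 */
static long
fake_kernel_close(int fd)
{
    return (fd >= 0 ? 0 : -EBADF);
}

/* Sketch of the library wrapper, following the steps listed above. */
int
my_close(int fd)
{
    long rc;

    rc = fake_kernel_close(fd);     /* "invoke the system call" */
    if (rc >= 0)
        return (0);                 /* no error: errno is untouched */
    errno = (int) -rc;              /* store the error code */
    return (-1);
}
```

If errno holds some value before a successful my_close, it holds the same value afterward. Only the failing path stores into errno, which is why the variable is meaningful only after the function has returned -1.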
Assume a program tests the return value of the
close function and then prints the value of errno
if an error occurred, as in the following:
if (close (fd) < 0) {
    fprintf (stderr, "close error, errno = %d\n", errno);
    exit (1);
}
There is a small window of time between the
storing of the error code into errno when the system call
returns and the printing of this value by the program, during which
another thread of execution within this process (i.e., a signal
handler) can change the value of errno. For example, if,
when the signal handler is called, the main flow of control is
between close and fprintf and the signal handler
calls some other system call that returns an error (say
write), then the errno value stored from the
write system call overwrites the value stored by the
close system call.
Both of these problems arise with signal handlers, and
each has a solution. The problem with gethostbyname
(returning a pointer to a static variable) is solved by not calling
non-re-entrant functions from a signal handler. The problem with
errno (a single global variable that can be changed by the
signal handler) can be avoided by coding the signal handler to save
the value of errno on entry and restore it on return, as
follows:
void
sig_alrm(int signo)
{
    int    errno_save;

    errno_save = errno;    /* save its value on entry */
    if (write( ... ) != nbytes)
        fprintf (stderr, "write error, errno = %d\n", errno);
    errno = errno_save;    /* restore its value on return */
}
In this example code, we also call
fprintf, a standard I/O function, from the signal handler.
This is yet another re-entrancy problem because many versions of
the standard I/O library are non-re-entrant: Standard I/O functions
should not be called from signal handlers.
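A handler that avoids standard I/O altogether might look like the following sketch. The names sig_alrm_safe, handler_ran, and errno_after_signal are hypothetical, and the write to descriptor -1 is there only to force an error (EBADF) inside the handler. The handler reports the failure with write, which is async-signal-safe, and still saves and restores errno.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

static volatile sig_atomic_t handler_ran;

/* Handler that makes a failing system call but restores errno. */
void
sig_alrm_safe(int signo)
{
    static const char msg[] = "write error in handler\n";
    int  errno_save;
    char c = 'x';

    (void) signo;
    errno_save = errno;             /* save its value on entry */
    if (write(-1, &c, 1) != 1)      /* descriptor -1: fails with EBADF */
        (void) write(STDERR_FILENO, msg, sizeof(msg) - 1);
    handler_ran = 1;
    errno = errno_save;             /* restore its value on return */
}

/*
 * Hypothetical driver: install the handler, set errno to a known
 * value, deliver SIGALRM synchronously with raise(), and return
 * errno as the main flow of control sees it afterward.
 */
int
errno_after_signal(int initial)
{
    if (signal(SIGALRM, sig_alrm_safe) == SIG_ERR)
        return (-1);
    errno = initial;
    raise(SIGALRM);
    return (errno);
}
```

Calling errno_after_signal(ENOENT) should return ENOENT even though the handler's write failed in between. Without the save and restore, the main flow of control would instead see the error code left by the handler.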
We will revisit this problem of re-entrancy in
Chapter 26
and we will see how threads handle the problem of the
errno variable. The next section describes some re-entrant
versions of the hostname functions.