26.1 Introduction
In the traditional Unix model, when a process
needs something performed by another entity, it forks a
child process and lets the child perform the processing. Most
network servers under Unix are written this way, as we have seen in
our concurrent server examples: The parent accepts the
connection, forks a child, and the child handles the
client.
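The fork-per-client pattern can be sketched as follows. This is a minimal illustration, not a real server: `socketpair()` stands in for the listen and accept steps, and the names `serve_client` and `fork_per_client_demo` are hypothetical. The parent also plays the client, only so that the sketch fits in one function.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical per-client work: echo bytes back until the peer closes. */
static void serve_client(int connfd)
{
    char buf[256];
    ssize_t n;

    while ((n = read(connfd, buf, sizeof(buf))) > 0)
        if (write(connfd, buf, (size_t)n) != n)
            break;
}

/*
 * Hypothetical demo: returns 0 if a forked child echoed msg back intact.
 * Assumption: socketpair() replaces the listen/accept of a real server.
 */
int fork_per_client_demo(const char *msg)
{
    int sv[2];              /* sv[0]: "client" end, sv[1]: "connected" end */
    char reply[256];
    size_t len = strlen(msg), got = 0;
    pid_t pid;

    if (len == 0 || len > sizeof(reply))
        return -1;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {         /* child: handles the client, then exits */
        close(sv[0]);
        serve_client(sv[1]);
        _exit(0);
    }

    close(sv[1]);           /* parent: the child owns the connection now */

    /* Play the client from the parent to keep the sketch self-contained. */
    if (write(sv[0], msg, len) == (ssize_t)len) {
        while (got < len) {
            ssize_t n = read(sv[0], reply + got, len - got);
            if (n <= 0)
                break;
            got += (size_t)n;
        }
    }
    close(sv[0]);           /* EOF lets the child's read() return 0 */
    waitpid(pid, NULL, 0);  /* reap the child so it does not become a zombie */

    return (got == len && memcmp(reply, msg, len) == 0) ? 0 : -1;
}
```

Calling `fork_per_client_demo("hello")` from a `main` function should return 0 when the child echoes the message back.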
While this paradigm has served well for many
years, there are problems with fork:
-
fork is expensive. Memory is copied
from the parent to the child, all descriptors are duplicated in the
child, and so on. Current implementations use a technique called
copy-on-write, which defers copying
the parent's data space until the child needs its
own copy. Even with this optimization, however, fork is
expensive.
-
IPC is required to pass information between the
parent and child after the
fork. Passing information from the parent to the child
before the fork is easy,
since the child starts with a copy of the parent's data space and
with a copy of all the parent's descriptors. Returning
information from the child to the parent, however, takes more work.
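The asymmetry described in the second item can be sketched with a pipe. The function name `square_in_child` and the global `result` are hypothetical. The child sees `input` because it was set before the fork, but its assignment to `result` changes only its own copy, so the value has to travel back over the pipe.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int result = 0;      /* after fork(), each process has its own copy */

/*
 * Hypothetical demo: the child squares `input` and sends the answer up a
 * pipe. On success, *via_pipe holds the value read from the pipe and the
 * return value is the parent's own (unchanged) copy of `result`.
 * Returns -1 on error.
 */
int square_in_child(int input, int *via_pipe)
{
    int fd[2];
    pid_t pid;
    ssize_t n;

    result = 0;
    if (pipe(fd) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (pid == 0) {                     /* child */
        close(fd[0]);
        result = input * input;         /* modifies the child's copy only */
        n = write(fd[1], &result, sizeof(result));
        _exit(n == (ssize_t)sizeof(result) ? 0 : 1);
    }

    close(fd[1]);                       /* parent reads what the child wrote */
    n = read(fd[0], via_pipe, sizeof(*via_pipe));
    close(fd[0]);
    waitpid(pid, NULL, 0);

    if (n != (ssize_t)sizeof(*via_pipe))
        return -1;
    return result;                      /* still 0 in the parent */
}
```

With an input of 7, the parent's `result` stays 0 while the pipe delivers 49.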
Threads help with both problems. Threads are
sometimes called lightweight
processes since a thread is "lighter weight" than a process.
That is, thread creation can be 10–100 times faster than process
creation.
All threads within a process share the same
global memory. This makes the sharing of information easy between
the threads, but along with this simplicity comes the problem of
synchronization.
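A minimal sketch of the synchronization problem, assuming two threads that increment one shared counter. The names `counter`, `incr`, and `run_two_incrementers` are hypothetical. A Pthreads mutex protects each increment; without it, updates from the two threads could be lost.

```c
#include <pthread.h>
#include <stddef.h>

#define NLOOP 100000

static long counter = 0;    /* shared by every thread in the process */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

static void *incr(void *arg)
{
    (void)arg;
    for (int i = 0; i < NLOOP; i++) {
        pthread_mutex_lock(&counter_lock);
        counter++;          /* read-modify-write, done under the lock */
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Hypothetical demo: returns the final counter value, or -1 on error. */
long run_two_incrementers(void)
{
    pthread_t a, b;

    counter = 0;
    if (pthread_create(&a, NULL, incr, NULL) != 0)
        return -1;
    if (pthread_create(&b, NULL, incr, NULL) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return counter;         /* 2 * NLOOP, because no update was lost */
}
```

Older toolchains may need the program to be linked with the threads library (commonly `-lpthread`).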
More than just the global variables are shared.
All threads within a process share the following:
-
Process instructions
-
Most data
-
Open files (e.g., descriptors)
-
Signal handlers and signal dispositions
-
Current working directory
-
User and group IDs
But each thread has its own
-
Thread ID
-
Set of registers, including program counter and
stack pointer
-
Stack (for local variables and return
addresses)
-
errno
-
Signal mask
-
Priority
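The split between shared and per-thread state can be sketched as follows. All names here (`struct report`, `worker`, `shared_vs_private_demo`) are hypothetical. The worker checks that its thread ID and the address of one of its local variables differ from those of the creating thread, and then writes a global that the creating thread reads after the join.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

struct report {
    pthread_t main_id;      /* in:  ID of the creating thread */
    uintptr_t main_local;   /* in:  address of a local in that thread */
    int       same_id;      /* out: nonzero if the thread IDs compare equal */
    int       same_local;   /* out: nonzero if the local addresses match */
};

static int shared_value = 0;    /* global data: one copy per process */

static void *worker(void *arg)
{
    struct report *r = arg;
    int local = 0;              /* lives on this thread's own stack */

    r->same_id = pthread_equal(pthread_self(), r->main_id) != 0;
    r->same_local = ((uintptr_t)&local == r->main_local);
    shared_value = 42;          /* visible to the creating thread */
    return NULL;
}

/* Hypothetical demo: returns 1 if everything behaves as described. */
int shared_vs_private_demo(void)
{
    int local = 0;
    struct report r = { pthread_self(), (uintptr_t)&local, -1, -1 };
    pthread_t tid;

    shared_value = 0;
    if (pthread_create(&tid, NULL, worker, &r) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;

    return shared_value == 42 && r.same_id == 0 && r.same_local == 0;
}
```

Note that the worker reaches `r` through a pointer even though `r` sits on the creating thread's stack: each thread has its own stack, but all stacks are in the same address space.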
One analogy is to think of signal handlers as a
type of thread, as we discussed in Section 11.18.
That is, in the traditional Unix model, we have the main flow of
execution (one thread) and a signal handler (another thread). If
the main flow is in the middle of updating a linked list when a
signal occurs, and the signal handler also tries to update the
linked list, havoc normally results. The main flow and signal
handler share the same global variables, but each has its own
stack.
In this text, we cover POSIX threads, also
called Pthreads. These were
standardized in 1995 as part of the POSIX.1c standard and most
versions of Unix will support them in the future. We will see that
all the Pthread functions begin with pthread_. This
chapter is an introduction to threads, so that we can use threads
in our network programs. For additional details see [Butenhof
1997].
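As a first taste of the pthread_ functions, the sketch below creates one thread with `pthread_create` and collects its result with `pthread_join`. The names `struct job`, `sum_job`, and `sum_in_thread` are hypothetical. Because the thread shares the caller's address space, the result comes back through an ordinary pointer, with no IPC.

```c
#include <pthread.h>
#include <stddef.h>

struct job {
    const int *vals;    /* in:  values to add up */
    size_t     n;       /* in:  number of values */
    long       sum;     /* out: filled in by the thread */
};

static void *sum_job(void *arg)
{
    struct job *j = arg;

    j->sum = 0;
    for (size_t i = 0; i < j->n; i++)
        j->sum += j->vals[i];
    return j;           /* handed back to the caller by pthread_join() */
}

/* Hypothetical demo: sums vals[0..n-1] in a new thread; -1 on error. */
long sum_in_thread(const int *vals, size_t n)
{
    struct job j = { vals, n, 0 };
    pthread_t tid;
    void *ret;

    if (pthread_create(&tid, NULL, sum_job, &j) != 0)
        return -1;
    if (pthread_join(tid, &ret) != 0)   /* wait for the thread to finish */
        return -1;

    return ((struct job *)ret)->sum;
}
```

Both functions return an error number directly rather than setting errno, which is why the sketch compares their return values with 0.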