11.6 getaddrinfo Function
The gethostbyname and
gethostbyaddr functions only support IPv4. The API for
resolving IPv6 addresses went through several iterations, as will
be described in Section 11.20;
the final result is the getaddrinfo function. The
getaddrinfo function handles both name-to-address and
service-to-port translation, and returns sockaddr
structures instead of a list of addresses. These sockaddr
structures can then be used by the socket functions directly. In
this way, the getaddrinfo function hides all the protocol
dependencies in the library function, which is where they belong.
The application deals only with the socket address structures that
are filled in by getaddrinfo. This function is defined in
the POSIX specification.
The POSIX definition of this function comes from
an earlier proposal by Keith Sklower for a function named
getconninfo. This function was the result of discussions
with Eric Allman, William Durst, Michael Karels, and Steven Wise,
and from an early implementation written by Eric Allman. The
observation that specifying a hostname and a service name would
suffice for connecting to a service independent of protocol details
was made by Marshall Rose in a proposal to X/Open.
#include <netdb.h>

int getaddrinfo(const char *hostname, const char *service,
                const struct addrinfo *hints, struct addrinfo **result);

Returns: 0 if OK, nonzero on error (see Figure 11.7)
Through the result pointer, this function returns a pointer to a
linked list of addrinfo structures. The addrinfo structure is
defined by including <netdb.h>.
struct addrinfo {
int ai_flags; /* AI_PASSIVE, AI_CANONNAME */
int ai_family; /* AF_xxx */
int ai_socktype; /* SOCK_xxx */
int ai_protocol; /* 0 or IPPROTO_xxx for IPv4 and IPv6 */
socklen_t ai_addrlen; /* length of ai_addr */
char *ai_canonname; /* ptr to canonical name for host */
struct sockaddr *ai_addr; /* ptr to socket address structure */
struct addrinfo *ai_next; /* ptr to next structure in linked list */
};
The hostname is
either a hostname or an address string (dotted-decimal for IPv4 or
a hex string for IPv6). The service is either a service name or a decimal
port number string. (See also Exercise 11.4,
where we want to allow an address string for the host or a port
number string for the service.)
hints is either
a null pointer or a pointer to an addrinfo structure that
the caller fills in with hints about the types of information the
caller wants returned. For example, if the specified service is
provided for both TCP and UDP (e.g., the domain service,
which refers to a DNS server), the caller can set the
ai_socktype member of the hints structure to SOCK_DGRAM. The
only information returned will be for datagram sockets.
The members of the hints structure that can be set by the caller
are:
-
ai_flags (zero or more AI_xxx values OR'ed together)
-
ai_family (an AF_xxx value)
-
ai_socktype (a SOCK_xxx value)
-
ai_protocol
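As a minimal sketch of filling in these members, the following zeroes a hints structure and then sets ai_flags, ai_family, ai_socktype, and ai_protocol so that only IPv4 datagram entries are requested. The helper name count_dgram_entries is hypothetical and not part of this text's library; freeaddrinfo and gai_strerror are the standard companions to getaddrinfo declared in <netdb.h>.

```c
#include <assert.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: look up host/service for UDP over IPv4 only and
 * return the number of addrinfo structures found, or -1 on error.
 */
int count_dgram_entries(const char *host, const char *service, int flags)
{
    struct addrinfo hints, *res, *ai;
    int rc, n = 0;

    memset(&hints, 0, sizeof(hints));   /* members not listed stay 0 or NULL */
    hints.ai_flags = flags;             /* zero or more AI_xxx values */
    hints.ai_family = AF_INET;          /* IPv4 only */
    hints.ai_socktype = SOCK_DGRAM;     /* datagram sockets only */
    hints.ai_protocol = 0;              /* any protocol for this socket type */

    rc = getaddrinfo(host, service, &hints, &res);
    if (rc != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
        return -1;
    }
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        assert(ai->ai_socktype == SOCK_DGRAM);  /* sanity check on the hint */
        n++;
    }
    freeaddrinfo(res);                  /* the list is dynamically allocated */
    return n;
}
```

A call such as count_dgram_entries("127.0.0.1", "53", AI_NUMERICHOST | AI_NUMERICSERV) involves no name lookup, which makes it convenient for experimenting with the hints.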
The possible values for the ai_flags
member and their meanings are:
AI_PASSIVE
    The caller will use the socket for a passive open.

AI_CANONNAME
    Tells the function to return the canonical name of the host.

AI_NUMERICHOST
    Prevents any kind of name-to-address mapping; the hostname
    argument must be an address string.

AI_NUMERICSERV
    Prevents any kind of name-to-service mapping; the service
    argument must be a decimal port number string.

AI_V4MAPPED
    If specified along with an ai_family of AF_INET6, then returns
    IPv4-mapped IPv6 addresses corresponding to A records if there
    are no available AAAA records.

AI_ALL
    If specified along with AI_V4MAPPED, then returns IPv4-mapped
    IPv6 addresses in addition to any AAAA records belonging to the
    name.

AI_ADDRCONFIG
    Only looks up addresses for a given IP version if there is one
    or more interface that is not a loopback interface configured
    with an IP address of that version.
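To illustrate how flags are combined, here is a small sketch that ORs AI_NUMERICHOST and AI_NUMERICSERV together so that getaddrinfo acts purely as a parser of address and port strings. The helper name is_numeric_endpoint is hypothetical, and the choice of SOCK_STREAM in the hints is an assumption made only to keep the lookup to one socket type.

```c
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: return 1 if host is an address string and service
 * is a decimal port number string, 0 otherwise. With both flags set, no
 * name-to-address or name-to-service mapping is attempted.
 */
int is_numeric_endpoint(const char *host, const char *service)
{
    struct addrinfo hints, *res;
    int rc;

    assert(host != NULL && service != NULL);
    memset(&hints, 0, sizeof(hints));
    hints.ai_flags = AI_NUMERICHOST | AI_NUMERICSERV;  /* OR'ed together */
    hints.ai_family = AF_UNSPEC;        /* accept IPv4 or IPv6 strings */
    hints.ai_socktype = SOCK_STREAM;

    rc = getaddrinfo(host, service, &hints, &res);
    if (rc != 0)
        return 0;                       /* not numeric, or otherwise invalid */
    freeaddrinfo(res);
    return 1;
}
```

With these flags, a hostname such as "www.example.com" or a service name such as "http" is rejected with an error instead of being looked up.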
If the hints
argument is a null pointer, the function assumes a value of 0 for
ai_flags, ai_socktype, and ai_protocol,
and a value of AF_UNSPEC for ai_family.
If the function returns success (0), the
variable pointed to by the result
argument is filled in with a pointer to a linked list of
addrinfo structures, linked through the ai_next
pointer. There are two ways that multiple structures can be
returned:
-
If there are
multiple addresses associated with the hostname, one structure is returned for each
address that is usable with the requested address family (the
ai_family hint, if specified).
-
If the service
is provided for multiple socket types, one structure can be
returned for each socket type, depending on the
ai_socktype hint. (Note that most getaddrinfo
implementations consider a port number string to be implemented
only by the socket type requested in ai_socktype; if
ai_socktype is not specified, an error is returned
instead.)
For example, if no hints are provided and if the
domain service is looked up for a host with two IP
addresses, four addrinfo structures are returned:
-
One for the first IP address and a socket type
of SOCK_STREAM
-
One for the first IP address and a socket type
of SOCK_DGRAM
-
One for the second IP address and a socket type
of SOCK_STREAM
-
One for the second IP address and a socket type
of SOCK_DGRAM
We show this example in Figure 11.5. There is no guaranteed order of the
structures when multiple items are returned; that is, we cannot
assume that TCP services will be returned before UDP services.
Although not guaranteed, an implementation
should return the IP addresses in the same order as they are
returned by the DNS. Some resolvers allow the administrator to
specify an address sorting order in the /etc/resolv.conf
file. IPv6 specifies address selection rules (RFC 3484 [Draves
2003]), which could affect the order of addresses returned by
getaddrinfo.
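Because the order is not guaranteed, code that inspects the result should walk the whole list through ai_next and examine each structure. The following sketch does that; describe_entry and list_entries are hypothetical helper names, and the "address port socktype" output format is only an assumption chosen for illustration.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Format one entry as "address port socktype"; -1 for an unknown family. */
int describe_entry(const struct addrinfo *ai, char *buf, size_t buflen)
{
    char addr[INET6_ADDRSTRLEN];
    const void *src;
    unsigned port;

    assert(ai != NULL && buf != NULL);
    if (ai->ai_family == AF_INET) {
        const struct sockaddr_in *sin = (const struct sockaddr_in *) ai->ai_addr;
        src = &sin->sin_addr;
        port = ntohs(sin->sin_port);    /* stored in network byte order */
    } else if (ai->ai_family == AF_INET6) {
        const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *) ai->ai_addr;
        src = &sin6->sin6_addr;
        port = ntohs(sin6->sin6_port);
    } else {
        return -1;
    }
    if (inet_ntop(ai->ai_family, src, addr, sizeof(addr)) == NULL)
        return -1;
    snprintf(buf, buflen, "%s %u %s", addr, port,
             ai->ai_socktype == SOCK_STREAM ? "stream" :
             ai->ai_socktype == SOCK_DGRAM ? "dgram" : "other");
    return 0;
}

/* Print every returned entry, in whatever order getaddrinfo chose. */
int list_entries(const char *host, const char *service, int flags, int socktype)
{
    struct addrinfo hints, *res;
    const struct addrinfo *ai;
    char line[128];
    int rc, n = 0;

    memset(&hints, 0, sizeof(hints));
    hints.ai_flags = flags;
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = socktype;       /* 0 means any socket type */

    rc = getaddrinfo(host, service, &hints, &res);
    if (rc != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
        return -1;
    }
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        if (describe_entry(ai, line, sizeof(line)) == 0)
            puts(line);
        n++;
    }
    freeaddrinfo(res);
    return n;                           /* number of structures in the list */
}
```

Calling list_entries with a socktype of 0 shows how many structures a given implementation returns per address when no ai_socktype hint is supplied.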
The information returned in the
addrinfo structures is ready for a call to socket
and then either a call to connect or sendto (for
a client), or bind (for a server). The arguments to
socket are the members ai_family,
ai_socktype, and ai_protocol. The second and
third arguments to either connect or bind are
ai_addr (a pointer to a socket address structure of the
appropriate type, filled in by getaddrinfo) and
ai_addrlen (the length of this socket address
structure).
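The mapping from addrinfo members to socket and connect arguments can be sketched as follows for a TCP client. This is not the tcp_connect function developed later in the chapter; connect_any is a hypothetical name, and its extra_flags parameter is an assumption added so that callers can pass flags such as AI_NUMERICHOST.

```c
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: try each address returned for host/service until a
 * TCP connection succeeds. Returns the connected descriptor, or -1 if the
 * lookup fails or no address accepts the connection.
 */
int connect_any(const char *host, const char *service, int extra_flags)
{
    struct addrinfo hints, *res, *ai;
    int rc, fd = -1;

    assert(host != NULL && service != NULL);
    memset(&hints, 0, sizeof(hints));
    hints.ai_flags = extra_flags;       /* 0 for an ordinary lookup */
    hints.ai_family = AF_UNSPEC;        /* IPv4 or IPv6, whichever is returned */
    hints.ai_socktype = SOCK_STREAM;    /* TCP client */

    rc = getaddrinfo(host, service, &hints, &res);
    if (rc != 0) {
        fprintf(stderr, "getaddrinfo(%s, %s): %s\n",
                host, service, gai_strerror(rc));
        return -1;
    }
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        /* The three socket arguments come straight from the structure. */
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;                   /* family may be unsupported here */
        /* ai_addr and ai_addrlen are the second and third arguments. */
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;                      /* connected */
        close(fd);                      /* this address failed; try the next */
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;
}
```

Nothing in the loop depends on the address family, which is the point of having getaddrinfo fill in the socket address structures.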
If the AI_CANONNAME flag is set in the
hints structure, the
ai_canonname member of the first returned structure points
to the canonical name of the host. In terms of the DNS, this is
normally the FQDN. Programs like telnet commonly use this
flag to be able to print the canonical hostname of the system to
which they are connecting, so that if the user supplied a shortcut
or an alias, he or she knows what got looked up.
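A program that wants to print the canonical name might retrieve it along the following lines. This is only a sketch: canonical_name is a hypothetical helper, its extra_flags parameter is an assumption, and falling back to the caller's string when ai_canonname is a null pointer is a defensive choice rather than documented behavior.

```c
#include <assert.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: copy the canonical name for host into buf.
 * Returns 0 on success or the nonzero getaddrinfo error code.
 */
int canonical_name(const char *host, int extra_flags, char *buf, size_t buflen)
{
    struct addrinfo hints, *res;
    int rc;

    assert(host != NULL && buf != NULL && buflen > 0);
    memset(&hints, 0, sizeof(hints));
    hints.ai_flags = AI_CANONNAME | extra_flags;
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;    /* one entry per address is enough */

    rc = getaddrinfo(host, NULL, &hints, &res);     /* no service needed */
    if (rc != 0)
        return rc;
    /* Only the first structure in the list carries ai_canonname. */
    snprintf(buf, buflen, "%s",
             res->ai_canonname != NULL ? res->ai_canonname : host);
    freeaddrinfo(res);
    return 0;
}
```

A caller would pass the string the user typed as host and then print buf, so that an alias or shortcut is shown in its canonical form.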
Figure
11.5 shows the returned information if we execute the
following:
struct addrinfo hints, *res;
bzero(&hints, sizeof(hints));
hints.ai_flags = AI_CANONNAME;
hints.ai_family = AF_INET;
getaddrinfo("freebsd4", "domain", &hints, &res);
In this figure, everything except the
res variable is dynamically allocated memory (e.g., from
malloc). We assume that the canonical name of the host
freebsd4 is freebsd4.unpbook.com and that this
host has two IPv4 addresses in the DNS.
Port 53 is for the domain service. This
port number will be in network byte order in the socket address
structures. We also show the returned ai_protocol values
as IPPROTO_TCP or IPPROTO_UDP. It would also be acceptable for
getaddrinfo to return an ai_protocol of 0 for the
two SOCK_STREAM structures if that is sufficient to
specify TCP (it is not sufficient if the system implements SCTP,
for example), and an ai_protocol of 0 for the two
SOCK_DGRAM structures if the system doesn't implement any
other SOCK_DGRAM protocols for IP (as of this writing,
none are yet standardized, but two are in development in the IETF).
It is safest for getaddrinfo to always return the specific
protocol.
Figure
11.6 summarizes the number of addrinfo structures
returned for each address that is being returned, based on the
specified service name (which can be a decimal port number) and any
ai_socktype hint.
Multiple addrinfo structures are
returned for each IP address only when no ai_socktype hint
is provided and the service name is supported by multiple transport
protocols (as indicated in the /etc/services file).
If we were to enumerate all 64 possible inputs
to getaddrinfo (there are six input variables), many would
be invalid and some would make little sense. Instead, we will look
at the common cases.
-
Specify the hostname and service. This is normal for a TCP or UDP
client. On return, a TCP client loops through all returned IP
addresses, calling socket and connect for each
one, until the connection succeeds or until all addresses have been
tried. We will show an example of this with our
tcp_connect function in Figure 11.10.
-
For a UDP client, the socket address structure
filled in by getaddrinfo would be used in a call to
sendto or connect. If the client can tell that
the first address doesn't appear to work (either by receiving an
error on a connected UDP socket or by experiencing a timeout on an
unconnected socket), additional addresses can be tried.
-
If the client knows it handles only one type of
socket (e.g., Telnet and FTP clients handle only TCP; TFTP clients
handle only UDP), then the ai_socktype member of the
hints structure should be
specified as either SOCK_STREAM or
SOCK_DGRAM.
-
A typical server specifies the service but not the hostname, and specifies the
AI_PASSIVE flag in the hints structure. The socket address structures
returned should contain an IP address of INADDR_ANY (for
IPv4) or IN6ADDR_ANY_INIT (for IPv6). A TCP server then
calls socket, bind, and listen. If the
server wants to malloc another socket address structure to
obtain the client's address from accept, the returned
ai_addrlen value specifies this size.
-
A UDP server would call socket,
bind, and then recvfrom. If the server wants to
malloc another socket address structure to obtain the
client's address from recvfrom, the returned
ai_addrlen value specifies this size.
-
As with the typical client code, if the server
knows it only handles one type of socket, the ai_socktype
member of the hints structure
should be set to either SOCK_STREAM or
SOCK_DGRAM. This avoids having multiple structures
returned, possibly with the wrong ai_socktype value.
-
The TCP servers that we have shown so far create
one listening socket, and the UDP servers create one datagram
socket. That is what we assume in the previous item. An alternate
server design is for the server to handle multiple sockets using
select or poll. In this scenario, the server
would go through the entire list of structures returned by
getaddrinfo, create one socket per structure, and use
select or poll.
The problem with this technique is that one
reason for getaddrinfo returning multiple structures is
when a service can be handled by IPv4 and IPv6 (Figure 11.8).
But these two protocols are not completely independent, as we will
see in Section 12.2. That
is, if we create a listening IPv6 socket for a given port, there is
no need to also create a listening IPv4 socket for that same port,
because connections arriving from IPv4 clients are automatically
handled by the protocol stack and by the IPv6 listening socket,
assuming that the IPV6_V6ONLY socket option is not
set.
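The typical server case described above (a service but no hostname, AI_PASSIVE set, and a single socket type) can be sketched as follows. This is not the tcp_listen function of the later sections; listen_passive is a hypothetical name, and the family parameter, the use of SO_REUSEADDR, and the SOMAXCONN backlog are assumptions made for illustration.

```c
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a listening TCP socket bound to the wildcard
 * address for the given service and address family. Returns the descriptor
 * or -1 on failure. If addrlenp is not NULL, it receives ai_addrlen, the
 * size to allocate for the client address passed to accept.
 */
int listen_passive(const char *service, int family, socklen_t *addrlenp)
{
    struct addrinfo hints, *res, *ai;
    const int on = 1;
    int rc, fd = -1;

    assert(service != NULL);
    memset(&hints, 0, sizeof(hints));
    hints.ai_flags = AI_PASSIVE;        /* wildcard address, suitable for bind */
    hints.ai_family = family;           /* AF_INET, AF_INET6, or AF_UNSPEC */
    hints.ai_socktype = SOCK_STREAM;    /* avoid entries of the wrong type */

    rc = getaddrinfo(NULL, service, &hints, &res);  /* no hostname */
    if (rc != 0) {
        fprintf(stderr, "getaddrinfo(%s): %s\n", service, gai_strerror(rc));
        return -1;
    }
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;                   /* try the next structure */
        setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
        if (bind(fd, ai->ai_addr, ai->ai_addrlen) == 0 &&
            listen(fd, SOMAXCONN) == 0) {
            if (addrlenp != NULL)
                *addrlenp = ai->ai_addrlen;
            break;                      /* first usable entry wins */
        }
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;
}
```

This sketch stops at the first structure that can be bound, which matches the one-listening-socket design; a server using select or poll over several sockets would instead keep every descriptor it manages to create.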
Despite the fact that getaddrinfo is
"better" than the gethostbyname and getservbyname
functions (it makes it easier to write protocol-independent code;
one function handles both the hostname and the service; and all the
returned information is dynamically allocated, not statically
allocated), it is still not as easy to use as it could be. The
problem is that we must allocate a hints structure, initialize it to 0, fill in
the desired fields, call getaddrinfo, and then traverse a
linked list trying each one. In the next sections, we will provide
some simpler interfaces for the typical TCP and UDP clients and
servers that we will write in the remainder of this text.
getaddrinfo solves the problem of
converting hostnames and service names into socket address
structures. In Section 11.17, we
will describe the reverse function, getnameinfo, which
converts socket address structures into hostnames and service
names.