29.4 Linux: SOCK_PACKET and PF_PACKET
There are two methods of receiving packets from
the datalink layer under Linux. The original method, which is more
widely available but less flexible, is to create a socket of type
SOCK_PACKET. The newer method, which introduces more
filtering and performance features, is to create a socket of family
PF_PACKET. To do either, we must have sufficient
privileges (similar to creating a raw socket), and the third
argument to socket must be a nonzero value specifying the
Ethernet frame type. When using PF_PACKET sockets, the
second argument to socket can be SOCK_DGRAM, for
"cooked" packets with the link-layer header removed, or
SOCK_RAW, for the complete link-layer packet.
SOCK_PACKET sockets return only the complete link-layer
packet. For example, to receive all frames from the datalink, we
write
fd = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL)); /* newer systems */
or
fd = socket(AF_INET, SOCK_PACKET, htons(ETH_P_ALL)); /* older systems */
This would return frames for all protocols that
the datalink receives.
If we wanted only IPv4 frames, the call would
be
fd = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_IP)); /* newer systems */
or
fd = socket(AF_INET, SOCK_PACKET, htons(ETH_P_IP)); /* older systems */
Other constants for the final argument are
ETH_P_ARP and ETH_P_IPV6, for example.
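The calls above can be wrapped as in the following minimal sketch, which assumes a Linux system with the PF_PACKET interface and an Ethernet datalink. The helper names open_raw_frame_socket, read_one_frame, and frame_ethertype are hypothetical and chosen only for illustration. Because a SOCK_RAW packet socket returns the complete link-layer frame, the frame type can be read from bytes 12 and 13 of the Ethernet header.

```c
#include <arpa/inet.h>        /* htons */
#include <linux/if_ether.h>   /* ETH_P_IP, ETH_P_IPV6, ETH_HLEN */
#include <stddef.h>
#include <sys/socket.h>       /* socket, recvfrom, PF_PACKET */
#include <sys/types.h>        /* ssize_t */

/* Open a PF_PACKET/SOCK_RAW socket for one frame type (e.g., ETH_P_IP
 * or ETH_P_ALL). Returns the descriptor, or -1 with errno set; without
 * sufficient privileges the call is expected to fail. */
int open_raw_frame_socket(unsigned short eth_type)
{
    return socket(PF_PACKET, SOCK_RAW, htons(eth_type));
}

/* Read one complete link-layer frame. Each call returns at most one
 * frame; returns its length, or -1 on error. */
ssize_t read_one_frame(int fd, unsigned char *buf, size_t buflen)
{
    return recvfrom(fd, buf, buflen, 0, NULL, NULL);
}

/* Return the frame type of a raw Ethernet frame in host byte order,
 * or -1 if the buffer is shorter than an Ethernet header.
 * Assumption: a plain 14-byte Ethernet header (6 + 6 + 2 bytes). */
int frame_ethertype(const unsigned char *frame, size_t len)
{
    if (len < (size_t)ETH_HLEN)
        return -1;
    return (frame[12] << 8) | frame[13];
}
```

A caller would typically open the socket once, loop on read_one_frame, and compare the result of frame_ethertype against constants such as ETH_P_IP or ETH_P_ARP.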
Specifying a protocol of ETH_P_xxx tells the
datalink which frame types to pass to the socket for the frames the
datalink receives. If the datalink supports promiscuous mode
(e.g., Ethernet) and we want to see frames that are not addressed to
this host, the device must also be put into promiscuous mode. This is done with a
PACKET_ADD_MEMBERSHIP socket option, using a
packet_mreq structure specifying an interface and an
action of PACKET_MR_PROMISC. On older systems, this is
done instead by an ioctl of SIOCGIFFLAGS to fetch
the flags, setting the IFF_PROMISC flag, and then storing
the flags with SIOCSIFFLAGS. Unfortunately, with this
method, multiple promiscuous listeners can interfere with each
other and a buggy program can leave promiscuous mode on even after
it exits.
Some differences are evident when comparing this
Linux feature to BPF and DLPI:
- The Linux feature provides no kernel buffering, and kernel filtering
is available only on newer systems (via the SO_ATTACH_FILTER socket
option). There is a normal socket receive buffer, but multiple frames
cannot be buffered together and passed to the application with a single
read. This increases the overhead of copying potentially voluminous
amounts of data from the kernel to the application.
- SOCK_PACKET provides no filtering by device. (PF_PACKET sockets can
be linked to a device by calling bind.) If ETH_P_IP is specified in the
call to socket, then all IPv4 packets from all devices (Ethernets, PPP
links, SLIP links, and the loopback device, for example) are passed to
the socket. A generic socket address structure is returned by recvfrom,
and the sa_data member contains the device name (e.g., eth0). The
application must then discard data from any device in which it is not
interested. The problem again is that too much data can be returned to
the application, which can get in the way when monitoring a high-speed
network.