Hi all, today I’ll write something about sockets of the AF_UNIX family (aka AF_LOCAL) and, more specifically, about the “abstract namespace” held by the Linux kernel.
Before I start, let me share what I expect you to know before reading this article, and also some references in case you are not familiar with these terms yet. 🙂
- Processes – A process is a running instance of a program: a sequence of instructions executed along a sequential path, together with a private memory space and context information. Unlike threads, different processes don’t share memory, so when two of them need to communicate they have to use one of the so-called “inter-process communication methods”. For more information I recommend Wikipedia and the book Modern Operating Systems, by Andrew Tanenbaum.
- Sockets – In this context, sockets are bi-directional inter-process communication channels. They come in several flavors: some allow communication through network interfaces, others rely only on the system kernel as the carrier, and so forth. They also differ in reliability, preservation of message boundaries, etc. In this article I’ll focus on sockets of the family AF_UNIX (or AF_LOCAL). Most of the information provided here applies to all AF_UNIX sockets, but the examples will be based on sockets of the subtype SOCK_STREAM.
These sockets provide a communication channel between two processes on the same machine. They are not visible from other machines and cannot be used to communicate over a network interface. For the latter case see AF_INET sockets.
Every socket family has its own addressing scheme. While you can identify a specific AF_INET socket by an IP address/port pair, this is not true for AF_UNIX sockets. So, how do I address these? Please keep reading, this is what this post is all about.
Most AF_UNIX socket implementations are reliable, and the Linux implementation is one example. However, the only subtypes that are formally reliable are SOCK_STREAM and SOCK_SEQPACKET. Both are connection-oriented, but they differ in maximum message length and in whether message boundaries are preserved.
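To make the message-boundary difference concrete, here is a minimal sketch using Python's socket.socketpair(), which returns two already-connected AF_UNIX sockets. It assumes a Linux system (SOCK_SEQPACKET is not available for AF_UNIX everywhere), and the payloads are made up for illustration.

```python
import socket

# SOCK_STREAM is a byte stream: the receiver sees bytes, not messages,
# so two sends may be delivered merged or split in any way.
a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
a.sendall(b"ab")
a.sendall(b"cd")
a.close()  # closing the sender lets the reader see end-of-stream

stream_data = b""
while True:
    chunk = b.recv(1024)
    if not chunk:  # empty read means the peer closed the connection
        break
    stream_data += chunk
b.close()

# SOCK_SEQPACKET is connection-oriented too, but each recv() returns
# exactly one message, as it was sent.
c, d = socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET)
c.send(b"ab")
c.send(b"cd")
first = d.recv(1024)
second = d.recv(1024)
c.close()
d.close()
```

With the stream socket, the only safe assumption is that all four bytes arrive in order. With SOCK_SEQPACKET, the two messages stay separate.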
A couple of weeks ago I was dealing with a situation where I had to code a script that would be run from a chroot’ed environment and that would be able to pull files from outside the chroot sandbox. In other situations I could do the opposite, i.e. push files from the outside environment to the chroot folder, or maybe I could mount the source directory inside the chroot folder… but this case was different: I neither had control over the chroot’ing procedure nor knew beforehand in which folder my script would be running. This is because the script would be run in Anaconda’s or YaST’s post-install step, inside a just partitioned and installed system.
Although I could try using some heuristics to parse log files and discover the right target directory, I wanted to do it right and put together something reliable that I wouldn’t have to revisit in the future.
For some time I thought about how to do it… finally I thought of some kind of client/server architecture where I would have a server with access to the files and a client run from the chroot’ed environment; both processes would communicate and the files would make their way through a pipe… or a socket! 😉
Then I started investigating how to code my new idea. First I thought of pipes, but these wouldn’t work, as the processes would be started independently, from different shells, at different moments, and pipes need to be created at once, not in parts. Then I remembered sockets, AF_UNIX sockets.
Sockets create/bind/listen/accept logic
Before you can use sockets as a bi-directional communication channel, some handshaking and setup is necessary. Sockets of type SOCK_STREAM, and other connection-oriented types, go through the following steps until a connection is established:
- Node A creates a new socket.
- If node A wants its socket to be visible to other processes, it must publish it, that is, assign it a unique address. This assignment is called “binding”.
- After the socket is bound to a name, node A must put its socket in the listen state. This creates a backlog queue where new connection requests will be placed and served on a FIFO basis. If the socket is not in the listen state or if its backlog is full, connection attempts will be denied.
- Finally, when node A has a bound, listening socket and a connection request, all it needs to do is accept this request. The accept call can block, which makes the server wait for a client before proceeding. After a successful connection, a new “connected socket” that can be used for communication is created.
On the client side the process is even more straightforward.
- Node B creates a new socket, of the same family and type as the one Node A has created.
- Node B tries to connect to an address it must know somehow. When the connection is established, communication occurs under the scheme defined by the socket type, for instance, as a reliable byte stream when you have a SOCK_STREAM socket.
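The steps for both nodes can be sketched in a single Python script. This is only an illustration under a few assumptions: "node A" runs in a background thread instead of a separate process, the socket is bound to a file path in a temporary directory (the ordinary AF_UNIX addressing, covered in the next section), and the names demo.sock and serve_once are made up for the example.

```python
import os
import socket
import tempfile
import threading

def serve_once(server_sock, received):
    # accept() blocks until a client connects, then returns a new
    # "connected socket" dedicated to that client.
    conn, _ = server_sock.accept()
    with conn:
        received.append(conn.recv(1024))
        conn.sendall(b"pong")

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "demo.sock")  # hypothetical socket path

# Node A: create, bind, listen, accept.
server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server.bind(path)
server.listen(1)  # backlog queue of pending connection requests
received = []
worker = threading.Thread(target=serve_once, args=(server, received))
worker.start()

# Node B: create a socket of the same family and type, then connect.
client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
client.connect(path)
client.sendall(b"ping")
reply = client.recv(1024)
client.close()

# Clean up: wait for the server thread and remove the socket file.
worker.join()
server.close()
os.unlink(path)
os.rmdir(tmpdir)
```

Note that the client can connect as soon as listen() has been called. The request simply waits in the backlog until the server thread reaches accept().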
This procedure is common to most socket families and types, with just two exceptions: datagram-oriented sockets don’t require a connection (so it’s just create, bind and go), and the address scheme varies. All sockets have to be bound to an address before connections are received, but addresses are different for each socket family.
For instance, when a process creates a socket to communicate with another computer on the network, it must bind it to a given port and network interface. This pair, IP and port, is the address the other node will need to use when connecting to that socket. For AF_UNIX sockets, the address is slightly different, as you will see below.
AF_UNIX sockets addressing
In most cases AF_UNIX sockets are bound to filenames; that’s the ordinary address space they use. So the server creates a socket, binds it to a filename somewhere in its filesystem and waits for a connection. It was clear that this wouldn’t solve my problem: if I had a folder where one of my processes could write and the other could read, my problem wouldn’t exist in the first place 🙂 .
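As a small illustration of filename binding, the sketch below binds a socket to a path in a temporary directory and checks what ends up on disk. The file name demo.sock is hypothetical, and the behavior described in the comments is what I would expect on Linux.

```python
import os
import socket
import stat
import tempfile

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "demo.sock")  # hypothetical socket path

sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
sock.bind(path)

# bind() created a special file of type "socket" at that path.
is_socket_file = stat.S_ISSOCK(os.stat(path).st_mode)
bound_name = sock.getsockname()  # the path the socket is bound to

# Closing the socket does not remove the file; the owner has to unlink it.
sock.close()
still_there = os.path.exists(path)

os.unlink(path)
os.rmdir(tmpdir)
```

Both processes need to reach the same path for this to work, which is exactly what a chroot gets in the way of.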
That’s when I found the Abstract Namespace.
The solution – Abstract Namespace
The Linux kernel holds in its memory space a table of open sockets. AF_UNIX sockets can be bound to “names” rather than “files”. It’s that simple: you just give your socket a name, an identifier, and then all other processes can find it by its name, no matter which filesystem they are running in! No matter whether they are running in a chroot’ed environment or not! 😀
In a few minutes I had a functional proof of concept, two processes talking, one inside a chroot’ed environment and the other outside. After that all I had to do was code some extra logic and avoid race conditions.
How to bind a socket to the Abstract Namespace
AF_UNIX socket names are always a sequence of chars, something like a C string, but with some peculiarities and differences in behavior. Instructing the bind function to use a filename or an abstract name is just a matter of crafting the right char sequence.
Ordinary bind names, the ones associated with filenames, are simply ordinary strings: sequences of characters terminated by a null-byte. Calling bind with a “name” in this format will create a file at the path specified by “name”.
The trick to trigger the abstract namespace is simple: start the name with a null-byte. All characters found after the null-byte will then be interpreted as the identifier name, and the socket will be registered by the kernel under the given name.
After that, just use the same string in the client and you will have a working socket.
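Here is a minimal sketch of that recipe in Python, assuming a Linux system (the abstract namespace is a Linux extension). The name "demo-abstract-<pid>" is made up for the example, and the server runs in a thread only to keep everything in one script.

```python
import os
import socket
import threading

# The leading null-byte selects the abstract namespace; the rest is the
# identifier. The PID is appended only to keep the demo name unique.
name = b"\0demo-abstract-%d" % os.getpid()

server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server.bind(name)  # no file is created anywhere
server.listen(1)

def serve_once():
    conn, _ = server.accept()
    with conn:
        conn.sendall(b"hello from the abstract namespace")

worker = threading.Thread(target=serve_once)
worker.start()

# The client uses exactly the same byte sequence as its address.
client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
client.connect(name)
message = b""
while True:
    chunk = client.recv(1024)
    if not chunk:  # server closed the connection
        break
    message += chunk
client.close()

worker.join()
bound = server.getsockname()  # the abstract name, leading null included
server.close()  # the name is released when the socket is closed
```

Since no file is involved, there is nothing to unlink afterwards.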
Note for C programmers
Unlike the high-level bind functions found in languages like Python, the arguments of the bind(2) syscall require some attention. As you may have noticed in the above text, unlike C strings and path names, abstract namespace identifiers are not null-terminated strings; their length must be provided in the socklen_t addrlen argument. This means that the char sequences:
- 0x48454C4C4F00 (“HELLO\0”) and
- 0x48454C4C4F0000 (“HELLO\0\0”)
represent different bind points. So make sure you pass the right length to avoid segmentation faults or undesired behavior, especially when creating clients and servers using different languages.
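The same effect can be observed from Python, which passes the exact length of an abstract name to bind(2). The sketch below assumes Linux and CPython, and uses a made-up name with the PID appended to keep it unique: two names that differ only by a trailing null-byte are treated as different bind points, while reusing the exact same name fails.

```python
import errno
import os
import socket

base = b"\0HELLO-%d" % os.getpid()  # hypothetical abstract name

s1 = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
s2 = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
s3 = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)

s1.bind(base)
# One extra trailing null-byte makes this a different name, so it binds fine.
s2.bind(base + b"\0")

# Binding the exact same byte sequence again is rejected by the kernel.
try:
    s3.bind(base)
    duplicate_errno = None
except OSError as exc:
    duplicate_errno = exc.errno

names = (s1.getsockname(), s2.getsockname())

for s in (s1, s2, s3):
    s.close()
```

A C server that passes sizeof(struct sockaddr_un) as addrlen would therefore register a name padded with null-bytes, which a Python client sending only the visible characters would not find.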
Simple example in Python
BTW, sorry for the bad indentation, I still must find a nice way of posting code using this blog theme. Suggestions are appreciated 🙂 Meanwhile, you can download the code here –> AF_UNIX – Python example.
Server code
from socket import *
# Create an unbound and not-connected socket.
sock = socket(AF_UNIX, SOCK_STREAM)
# Bind the socket to “MyBindName” in the abstract namespace. Note the null-byte.