UNIX Domain Sockets vs. TCP Sockets

mysqli_real_connect(): (HY000/2002): No such file or directory

I cannot remember how long ago I first came across this PHP error message. It was probably around 7 or 8 years ago when I was hacking together my first web site on the LAMP stack. It feels like a really long time ago; back before I knew anything about programming, Linux, the World Wide Web, or the Internet. Back then I couldn't have told you what the error meant, but I got past it and my site went live. Some years have passed since I switched to PostgreSQL for most projects.

Nowadays, when I see this error, it's because a student forgot to start their MySQL database server. I suggest they start the mysql service in their Ubuntu dev environment and then everything works as expected. But how did I know that this was the issue?

How the hell does this error suggest that MySQL is not running?!

At some point I just memorized the possible reasons that this error occurs. So I figured I'd write a little about it.

Computer Processes in Web Applications

Computer processes are the programs running on a computer. You can see which programs are running on a machine by opening Activity Monitor (macOS), Task Manager (Windows), or executing the top command (Linux). Some of these programs need to work together as a system for some larger purpose.

When I teach "full stack" web development, I spend a lot of time talking about a few types of programs and how they work together as the tiers of a Web application: the client, the server, and the data tier.

If you are having connectivity issues, then your first step is to verify that the client, server, and data tiers are all running.

TCP Sockets

When I was self-studying, I slowly pieced together how clients and servers shoot HTTP messages at each other over TCP connections. I don't often interact with TCP connections directly; these are hidden behind client libraries or web frameworks, where HTTP request and response messages are the center of attention.

Here's my simplified summary of an HTTP exchange between a client and a server over TCP:

  1. Assuming the client knows the network address and port of the server, the client can attempt to establish a TCP connection to it.
  2. If the network is up and the server is listening, then the connection is established via a fancy handshake.
  3. The client sends an HTTP Request message over the connection.
  4. The server processes the request and then sends an HTTP Response message over the connection.
  5. The client and server then agree to close the TCP connection with another fancy handshake.
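Here's a minimal sketch of those five steps, assuming Node.js on a machine where the loopback address 127.0.0.1 is available. The buildRequest helper and the response text are made up for illustration, and port 0 simply asks the operating system for any free port.

```javascript
const http = require('http');
const net = require('net');

// Hypothetical helper: builds a minimal HTTP/1.1 request by hand.
// "Connection: close" asks the server to close the TCP connection
// once the response has been sent.
function buildRequest(path, host) {
  return `GET ${path} HTTP/1.1\r\nHost: ${host}\r\nConnection: close\r\n\r\n`;
}

// Server: answers every request with a short plain-text body.
const server = http.createServer((req, res) => {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('hello over TCP\n');
});

// Port 0 lets the operating system pick a free port.
server.listen(0, '127.0.0.1', () => {
  const { port } = server.address();

  // Steps 1-2: connect to the server's address and port.
  const socket = net.connect(port, '127.0.0.1', () => {
    // Step 3: send the HTTP request over the connection.
    socket.write(buildRequest('/', '127.0.0.1'));
  });

  // Step 4: collect the raw response as it arrives.
  let raw = '';
  socket.setEncoding('utf8');
  socket.on('data', (chunk) => { raw += chunk; });

  // Step 5: the server has closed its side of the connection.
  socket.on('end', () => {
    console.log(raw.split('\r\n')[0]); // print only the status line
    server.close();
  });
});
```

The socket variable here is the client's end of the connection. The http module hides the matching server-side socket behind the req and res objects.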

While a TCP connection is open, the client and server programs each receive a special data structure from their respective host operating systems. This data structure is a "socket" and it represents a network communication endpoint. Each program writes to or reads from its socket whenever it wants to exchange data with the program at the other end of the connection.

After I got my head around the ideas of hosts, IP addresses, and ports, my mental model of how clients, servers, and databases communicate with one another firmed up.

Sure, I got tripped up by the occasional error, but these all sort of "fit" within my mental model of network communications.

Inter-process Communication

The term inter-process communication (IPC) is usually used to describe operating system mechanisms that allow processes on the same computer, under the same operating system, to share information with one another. But in reality, it's a more general term that doesn't require processes to be running on the same computer. In fact, network sockets, like those summarized above, are among the various IPC facilities that exist.

UNIX Domain Sockets

This is a topic I've never really looked into until now. I'm often trying to come up with quick and simple ways to make underlying technologies (like TCP) more concrete to my students. So I've been playing around with the net module while writing some Node.js lessons.

Then I saw this:

The net module supports IPC with named pipes on Windows, and UNIX domain sockets on other operating systems.
— nodejs.org/docs

There are a number of ways to start a net.Server, including listening at a UNIX socket path:

server.listen(path[, backlog][, callback])

So after a quick bit of reading I came to understand that a UNIX domain socket can be used to set up bi-directional communication between processes, much like TCP connections, but without the overhead (or security implications) of using any networking facilities. These sockets are represented as files in the filesystem hierarchy as opposed to just file descriptors (see Everything is a file). This means that you can actually see the file once it's been created by the socket server program.
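To make that concrete, here's a small sketch of a server and a client talking over a UNIX domain socket. It assumes Node.js on Linux or macOS (on Windows the net module uses named pipes instead). The temporary directory, the echo.sock file name, and the upper-casing behavior are all invented for the example.

```javascript
const fs = require('fs');
const net = require('net');
const os = require('os');
const path = require('path');

// Hypothetical socket path inside a fresh temporary directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'uds-demo-'));
const socketPath = path.join(dir, 'echo.sock');

// Server: write back whatever the client sends, upper-cased.
const server = net.createServer((conn) => {
  conn.setEncoding('utf8');
  conn.on('data', (text) => conn.write(text.toUpperCase()));
});

// Passing a path instead of a port makes this a UNIX domain socket.
server.listen(socketPath, () => {
  // The socket is visible in the filesystem, with file type "socket".
  console.log('is socket:', fs.statSync(socketPath).isSocket());

  // Client: connect to the same path, send a message, print the reply.
  const client = net.connect(socketPath, () => client.write('hello'));
  client.setEncoding('utf8');
  client.on('data', (reply) => {
    console.log('reply:', reply);
    client.end();
    // Stop listening, then remove the temporary directory.
    server.close(() => fs.rmSync(dir, { recursive: true, force: true }));
  });
});
```

Apart from the path, the code has the same shape as a TCP client and server, which is what makes this such an easy swap.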

If the UNIX domain socket (that is visible as a file system path) is created and used in conjunction with one of Node.js' API abstractions such as net.createServer(), it will be unlinked as part of server.close().
— nodejs.org/docs
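A quick way to watch that lifecycle is to check for the file before listening, while listening, and after closing. This is only a sketch, assuming Node.js on Linux or macOS; the temporary directory and the demo.sock name are placeholders.

```javascript
const fs = require('fs');
const net = require('net');
const os = require('os');
const path = require('path');

// Hypothetical socket path inside a fresh temporary directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'uds-life-'));
const socketPath = path.join(dir, 'demo.sock');

const server = net.createServer();

// Nothing has been created yet.
console.log('before listen:', fs.existsSync(socketPath));

server.listen(socketPath, () => {
  // The server created the socket file when it started listening.
  console.log('while listening:', fs.existsSync(socketPath));

  server.close(() => {
    // According to the docs quoted above, close() unlinks the file.
    console.log('after close:', fs.existsSync(socketPath));
    fs.rmSync(dir, { recursive: true, force: true });
  });
});
```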

Naturally, I had to fool around with it, and I ended up with a little GitHub repository for a barebones "chat" server that doesn't work over the network. Super useful, right?

Still not satisfied with how much time I had wasted, I decided to start an HTTP server with Node.js and have it listen on a socket path so I could try to manually type it HTTP requests via the command line.

For science.
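A rough sketch of that experiment, assuming Node.js on Linux or macOS. The socket path, the /science URL, and the response text are made up, and the client half stands in for typing the request by hand.

```javascript
const fs = require('fs');
const http = require('http');
const net = require('net');
const os = require('os');
const path = require('path');

// Hypothetical socket path inside a fresh temporary directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'uds-http-'));
const socketPath = path.join(dir, 'http.sock');

// The request I would otherwise type by hand, line by line.
const rawRequest =
  'GET /science HTTP/1.1\r\n' +
  'Host: localhost\r\n' +
  'Connection: close\r\n' +
  '\r\n';

// An ordinary HTTP server; only the listen() argument is unusual.
const server = http.createServer((req, res) => {
  res.end(`you asked for ${req.url}\n`);
});

server.listen(socketPath, () => {
  // Connect to the socket file and send the raw request text.
  const client = net.connect(socketPath, () => client.write(rawRequest));

  let raw = '';
  client.setEncoding('utf8');
  client.on('data', (chunk) => { raw += chunk; });

  // The server closes the connection after responding.
  client.on('end', () => {
    console.log(raw);
    server.close(() => fs.rmSync(dir, { recursive: true, force: true }));
  });
});
```

While a server like this is running, curl can also talk to it through its --unix-socket option, given the socket path and any http://localhost/ URL.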

Back to this PHP gem:

mysqli_real_connect(): (HY000/2002): No such file or directory

I've always seen a host or port being passed to mysqli_real_connect() so the above error doesn't really line up with my understanding. What file?

Digging into the documentation for mysqli_real_connect() gives a hint near the bottom.

Note: Specifying the socket parameter will not explicitly determine the type of connection to be used when connecting to the MySQL server. How the connection is made to the MySQL database is determined by the host parameter.
— PHP.net

But I'm usually using 'localhost' as my host parameter when connecting to MySQL. The MySQL docs clear this up.

On Unix, MySQL programs treat the host name localhost specially, in a way that is likely different from what you expect compared to other network-based programs: the client connects using a Unix socket file. The --socket option or the MYSQL_UNIX_PORT environment variable may be used to specify the socket name.
— mysql.com
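To connect this back to the Node.js experiments, here's how I picture that rule. The connectionOptions function is purely hypothetical: it is not part of MySQL, PHP, or Node.js, and real clients have more cases than this (options that force a protocol, for example). The defaults are the port and socket values from my own Ubuntu MySQL configuration.

```javascript
// Hypothetical helper that mimics the rule quoted above: the host name
// decides whether the client uses a UNIX socket file or a TCP connection.
// The defaults are assumptions taken from one Ubuntu MySQL configuration.
function connectionOptions(host, options = {}) {
  const port = options.port || 3306;
  const socket = options.socket || '/var/run/mysqld/mysqld.sock';

  if (host === 'localhost') {
    // Same shape net.connect() accepts for a UNIX domain socket.
    return { path: socket };
  }
  // Same shape net.connect() accepts for a TCP connection.
  return { host, port };
}

console.log(connectionOptions('localhost'));
console.log(connectionOptions('127.0.0.1'));
```

The point is that 'localhost' and '127.0.0.1' are not interchangeable here: the first ends up as a filesystem path and the second as a network address.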

I'd seen this part of the MySQL configuration a long time ago. Even looking at it now, my eyes go straight to the port because I think of database servers as network services. "Oh, yes! That's MySQL's default port!"

# /etc/mysql/mysql.conf.d/mysqld.cnf
[mysqld]
user            = mysql
pid-file        = /var/run/mysqld/mysqld.pid
socket          = /var/run/mysqld/mysqld.sock
port            = 3306
basedir         = /usr
datadir         = /var/lib/mysql
tmpdir          = /tmp
lc-messages-dir = /usr/share/mysql
skip-external-locking

Until now, the part I was fuzzy on was the socket setting.

So as an experiment I ran ls -lah /var/run/mysqld in my Ubuntu Docker container.

total 8.0K
drwxr-xr-x 1 mysql root 4.0K Dec  2 03:01 .
drwxr-xr-x 1 root  root 4.0K Dec  1 16:59 ..

Nothing. Then I ran sudo service mysql start and looked again.

total 16K
drwxr-xr-x 1 mysql root  4.0K Dec  2 02:33 .
drwxr-xr-x 1 root  root  4.0K Dec  1 16:59 ..
-rw-r----- 1 mysql mysql    5 Dec  2 02:33 mysqld.pid
srwxrwxrwx 1 mysql mysql    0 Dec  2 02:33 mysqld.sock
-rw------- 1 mysql mysql    5 Dec  2 02:33 mysqld.sock.lock

Holy shit!

There's an empty file there, named mysqld.sock, and its type is s (for socket!).

When 'localhost' is passed as the host parameter to mysqli_real_connect(), the client assumes that MySQL has been started and that /var/run/mysqld/mysqld.sock exists. When MySQL is stopped, it deletes the socket file, so the client's attempt to connect to that path fails. This explains the error message No such file or directory.
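The same failure is easy to reproduce without MySQL at all. This sketch, assuming Node.js on Linux or macOS, tries to connect to a socket path that no server has created; the no-such-server.sock name is a placeholder.

```javascript
const net = require('net');
const os = require('os');
const path = require('path');

// Hypothetical path that no server is expected to have created.
const missingPath = path.join(os.tmpdir(), 'no-such-server.sock');

// Try to connect to a UNIX domain socket at that path.
const client = net.connect(missingPath);

client.on('connect', () => {
  // Only reached if something really is listening at that path.
  console.log('something is listening after all');
  client.end();
});

client.on('error', (err) => {
  // ENOENT is the error code behind "No such file or directory".
  console.log(err.code, err.message);
});
```

Seen this way, the PHP message is just the operating system reporting that the socket file isn't there.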

Still Learning

While I do think that a top-down approach to learning about technology is workable, things like this keep popping up for me. I may have an intuition for how or why something works, but in the end there are plenty of dots left unconnected. I'll try to continue writing about these gaps in my knowledge as I recognize and try to fill them.