CS456 - Systems Programming

Creating a Unix Shell part 3:

Unix Pipes

A pipe in Unix is a uni-directional data stream. The pipe() system call fills a two-element integer array with two file descriptors, which represent the two ends of the data stream. It returns 0 on success and -1 on error.

#include <unistd.h>

int pipe(int pipefd[2]);

pipefd[0] represents the read end of the data stream and pipefd[1] represents the write end. Reads on pipefd[0] will normally block until data is written to pipefd[1], at which point that data becomes available to be read from the read end.
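As a minimal sketch of the two ends in action, the following writes a short message into a pipe and reads it back within a single process. The helper name pipe_roundtrip() is hypothetical, and the sketch assumes the message is non-empty and small enough to fit in the kernel's pipe buffer, otherwise the read or write could block:

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not part of any library): writes msg into a fresh
 * pipe and reads it back. Returns the number of bytes read, or -1 on error.
 * Assumes msg is non-empty and small enough to fit in the pipe buffer. */
ssize_t pipe_roundtrip(const char *msg, char *out, size_t outlen)
{
    int pipefd[2];

    if (outlen == 0 || pipe(pipefd) == -1)
        return -1;

    /* Data goes in at the write end, pipefd[1]... */
    ssize_t written = write(pipefd[1], msg, strlen(msg));

    /* ...and comes out at the read end, pipefd[0]. */
    ssize_t got = -1;
    if (written > 0)
        got = read(pipefd[0], out, outlen - 1);
    if (got >= 0)
        out[got] = '\0';

    close(pipefd[0]);
    close(pipefd[1]);
    return got;
}
```

Calling pipe_roundtrip("hello", buf, sizeof buf) with a 32-byte buf should leave "hello" in buf. A single process talking to itself is rarely useful; the point is only to show which descriptor is which.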

A pipe is required to connect the output of one process to the input of another process. To do this the pipe must be created prior to a fork(), after which the same pipe exists in both processes (one of the few examples of a truly shared resource between processes after a fork). Thus what is written on the write end of the pipe in one process can be read from the read end in the other process.

Note that because a pipe is a uni-directional data stream, it is not practical to use a single pipe to both send and receive data between processes, unless the processes carefully take turns reading and writing. For two-way communication one needs either two pipes or a socketpair(), in which both descriptors can be read from and written to.

EOF on a pipe

Normally the End Of File (EOF) on a pipe descriptor (i.e. read() returns 0 bytes read) is only delivered once the write end of the pipe has been closed (i.e. there can never be any more data to be read). A gotcha arises after we have forked, however, because there are now two sets of pipe descriptors: both processes have both a write end and a read end. Since an EOF can only occur when every copy of the write end has been closed, it must be closed in both processes. Thus it becomes important to close the write end in the process that will be reading from the pipe. The steps necessary for process A to write to process B through a pipe are:

    Process A (parent)               Process B (child)

    1) Create the pipe               -
    2) Fork the new process          -
    3) Close read end                1) Close write end
    4) Write to write end    ──▷     2) Read from read end
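The steps above can be sketched as follows. The helper name send_until_eof() is hypothetical, and returning the byte count through the child's exit status is purely an illustrative shortcut, which is why the sketch assumes a message shorter than 255 bytes. The child's read loop only terminates because the write end has been closed in both processes:

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: sends msg to a forked child through a pipe and
 * returns how many bytes the child read before it saw EOF, or -1 on error.
 * The count travels back in the exit status, so msg must be < 255 bytes. */
int send_until_eof(const char *msg)
{
    int pipefd[2];

    if (pipe(pipefd) == -1)                  /* A, step 1: create the pipe */
        return -1;

    pid_t pid = fork();                      /* A, step 2: fork */
    if (pid == -1) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }

    if (pid == 0) {
        close(pipefd[1]);                    /* B, step 1: close write end */

        char buf[64];
        int total = 0;
        ssize_t n;
        /* B, step 2: read until read() returns 0, i.e. EOF */
        while ((n = read(pipefd[0], buf, sizeof buf)) > 0)
            total += (int)n;
        _exit(n == 0 ? total : 255);         /* 255 signals a read error */
    }

    close(pipefd[0]);                        /* A, step 3: close read end */
    ssize_t w = write(pipefd[1], msg, strlen(msg));  /* A, step 4: write */
    close(pipefd[1]);   /* last open write end is gone: child now gets EOF */

    int status;
    if (waitpid(pid, &status, 0) == -1 || w == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

If the close(pipefd[1]) in the child is removed, the child's read loop never sees EOF and waitpid() in the parent waits forever, which is exactly the gotcha described above.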

Dup() and Dup2()

To connect the output of one process (its stdout, i.e. descriptor 1) to the input of another process (its stdin, i.e. descriptor 0), it is necessary to replace each process's normal descriptors (normally attached to the TTY) with the appropriate pipe descriptors.

The dup() and dup2() system calls are designed to allow us to duplicate an existing descriptor. dup() always returns the lowest-numbered unused descriptor, so the descriptor to be replaced must be closed prior to calling dup() to make its slot available for the duplicate of the pipe descriptor (oldfd). This can be cumbersome, so dup2() takes a newfd parameter and ensures that the descriptor to be replaced (newfd) is closed if necessary and then replaced with a duplicate of the pipe descriptor (oldfd).

#include <unistd.h>

int dup(int oldfd);
int dup2(int oldfd, int newfd);
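To illustrate both calls together, here is a small sketch that temporarily points descriptor 1 at a pipe, writes through descriptor 1, and then restores the original stdout. The helper name capture_stdout() is hypothetical, and the sketch assumes a short message that fits in the pipe buffer. It uses write() rather than printf() to keep stdio buffering out of the picture:

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: redirects descriptor 1 into a pipe, writes msg to
 * descriptor 1, restores the original stdout, and returns what landed in
 * the pipe. Returns the number of bytes captured, or -1 on error. */
ssize_t capture_stdout(const char *msg, char *out, size_t outlen)
{
    int pipefd[2];

    if (outlen == 0 || pipe(pipefd) == -1)
        return -1;

    /* dup() picks the lowest unused descriptor; we keep it as a backup
     * reference to wherever stdout currently points. */
    int saved = dup(STDOUT_FILENO);
    if (saved == -1 || dup2(pipefd[1], STDOUT_FILENO) == -1) {
        if (saved != -1)
            close(saved);
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }
    /* Descriptor 1 now refers to the write end, so the original
     * write-end descriptor is extraneous. */
    close(pipefd[1]);

    ssize_t w = write(STDOUT_FILENO, msg, strlen(msg));

    /* dup2() closes descriptor 1 (the last write end) before replacing it
     * with the saved stdout, so the read below ends with EOF, not a hang. */
    dup2(saved, STDOUT_FILENO);
    close(saved);

    ssize_t got = (w == -1) ? -1 : read(pipefd[0], out, outlen - 1);
    close(pipefd[0]);
    if (got >= 0)
        out[got] = '\0';
    return got;
}
```

The same save, replace, restore pattern is one way a shell could run a built-in command with redirected output without forking, though that use is an assumption beyond what these notes cover.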

Pipe example:

The following implements A | B where the standard output of the parent (A) is sent to the standard input of the child (B):

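#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

// Buffer size (K was left undefined in the original fragment; 1024 is arbitrary):
#define K 1024

int main(void)
{
  int pipefd[2];

  // Pipe must exist before forking:
  if (pipe(pipefd) == -1) {
    perror("pipe");
    exit(1);
  }

  pid_t pid = fork();

  if (pid == -1) {
    // fork() failed, there is no child:
    perror("fork");
    exit(1);
  } else if (pid > 0) {
    // Parent (A):
    // Closing the read end in parent:
    close(pipefd[0]);
    // Duplicate write end as our stdout:
    dup2(pipefd[1], STDOUT_FILENO);
    // Close the now duplicated and extraneous write end as well:
    close(pipefd[1]);

    printf("This will be sent to the child process.\n");
    // exit() flushes stdout and closes it, which gives the child its EOF:
    exit(0);
  } else {
    // Child (B):
    // Closing the write end in child:
    close(pipefd[1]);
    // Duplicate the read end as our stdin:
    dup2(pipefd[0], STDIN_FILENO);
    // Close the now duplicated read end as well:
    close(pipefd[0]);

    char buf[K];

    // This will read the string printed above:
    if (fgets(buf, K, stdin) == NULL) {
      fprintf(stderr, "Nothing received from the parent.\n");
      exit(1);
    }

    printf("The message from the parent: %s\n", buf);
    exit(0);
  }
}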
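In a shell, neither side of A | B is normally the shell itself: both commands are children, and the shell only creates the pipe and waits. The following is a rough sketch of that arrangement under stated assumptions. The helper name run_pipeline() is hypothetical, the argument vectors are assumed to be NULL-terminated as execvp() requires, and the exit code 127 for a failed exec is a common shell convention rather than something these notes specify:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs "left | right" and returns the exit status of
 * the right-hand command, or -1 on error. */
int run_pipeline(char *const left[], char *const right[])
{
    int pipefd[2];

    if (pipe(pipefd) == -1)
        return -1;

    pid_t lpid = fork();
    if (lpid == -1) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }
    if (lpid == 0) {
        /* Left command: stdout becomes the write end. */
        close(pipefd[0]);
        dup2(pipefd[1], STDOUT_FILENO);
        close(pipefd[1]);
        execvp(left[0], left);
        _exit(127);                 /* only reached if execvp() failed */
    }

    pid_t rpid = fork();
    if (rpid == 0) {
        /* Right command: stdin becomes the read end. */
        close(pipefd[1]);
        dup2(pipefd[0], STDIN_FILENO);
        close(pipefd[0]);
        execvp(right[0], right);
        _exit(127);
    }

    /* The shell must close both ends too; otherwise its copy of the write
     * end keeps the right-hand command from ever seeing EOF. */
    close(pipefd[0]);
    close(pipefd[1]);

    int status = 0;
    waitpid(lpid, NULL, 0);
    if (rpid == -1 || waitpid(rpid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, with left set to {"echo", "hello", NULL} and right set to {"grep", "-q", "hello", NULL}, the call should return 0, since grep finds a match on its stdin (assuming echo and grep are on the PATH).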

Bi-directional pipes - socketpair():

As an aside, socketpair() can be used to create a bi-directional pipe, or more accurately an already connected pair of anonymous Unix Domain Sockets. The pair is created and shared with another process in exactly the same manner as a pipe. The only difference is that each descriptor in the pair is capable of both reading and writing: what is written on one descriptor is available to be read on the other, and vice versa.

A bi-directional pipe could be a useful shell feature, so we'll at least review socketpair.

#include <sys/types.h>
#include <sys/socket.h>

int socketpair(int domain, int type, int protocol, int sv[2]);

Without going into too much detail with respect to sockets, a typical socketpair() invocation with pipe-like behavior would normally look something like:

int sv[2];
// Pass the array itself (it decays to int *), not &sv:
socketpair(AF_UNIX, SOCK_STREAM, 0, sv);

After this call, the descriptors in sv[0] and sv[1] are the two ends of the bi-directional pipe.
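As a sketch of two-way traffic over a single descriptor pair, the following forks a child that reads a message from its end, upper-cases it, and writes the reply back on the same descriptor. The helper name shout_via_child() and the upper-casing task are purely illustrative. The sketch also assumes a short message (under 128 bytes) that arrives in a single read(), which a stream socket does not guarantee in general:

```c
#include <ctype.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: sends msg to a forked child over a socketpair and
 * reads back the child's upper-cased reply on the same descriptor.
 * Returns the reply length, or -1 on error. */
ssize_t shout_via_child(const char *msg, char *reply, size_t replylen)
{
    int sv[2];

    if (replylen == 0 || socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child keeps sv[1] and uses it for both directions. */
        close(sv[0]);
        char buf[128];
        ssize_t n = read(sv[1], buf, sizeof buf);
        for (ssize_t i = 0; i < n; i++)
            buf[i] = (char)toupper((unsigned char)buf[i]);
        if (n > 0 && write(sv[1], buf, (size_t)n) == -1)
            _exit(1);
        _exit(0);
    }

    /* Parent keeps sv[0] and likewise both writes and reads on it. */
    close(sv[1]);
    ssize_t got = -1;
    if (write(sv[0], msg, strlen(msg)) > 0)
        got = read(sv[0], reply, replylen - 1);
    close(sv[0]);
    waitpid(pid, NULL, 0);

    if (got >= 0)
        reply[got] = '\0';
    return got;
}
```

Calling shout_via_child("hello", reply, sizeof reply) should leave "HELLO" in reply. Doing the same with pipes would take two of them, one per direction, and four descriptors to manage instead of two.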