# Queue

A system which includes a random "input" stream of requests (customers, calls, clients) which need "service" , and a mechanism (algorithm) which provides that "service" .

Typical examples of queues are automatic telephone exchanges in which requests, calls of subscribers (the input stream of calls), occur randomly, and the service mechanism consists of a fixed number $n$ of channels (lines, servers, trunks) of communication, each of which may be busy with the service of a call for a random time, equal to the duration of the conversation. If all $n$ channels are busy, then a new call is "lost" . The service mechanism (algorithm) can also include instructions concerns the free line the next call must use, or the suggestion to wait if the required line is busy, etc.

There are systems of other types, where each request must necessarily be served, such as, for example, the flow of aircrafts arriving at an airport for landing, or a flow of problems (a program) which must be processed on a computer.

The "random" part of a queue is conveniently described by random sequences or processes. The simplest queues can be described by two-dimensional random control sequences

$$\{ {\tau _ {j} ^ {e} , \tau _ {j} ^ {s} } : {0 \leq j < \infty } \}$$

of non-negative random variables. The sequence $\{ \tau _ {j} ^ {e} \}$ defines the stream of calls $e$: it gives the random times

$$\tau _ {0} ^ {e} ,\ \tau _ {0} ^ {e} + \tau _ {1} ^ {e} ,\ \tau _ {0} ^ {e} + \tau _ {1} ^ {e} + \tau _ {2} ^ {e} \dots$$

at which requests enter the system. Equivalently, the input stream may be characterized by a random process $\{ {e ( t) } : {t \geq 0 } \}$: the value $e ( t)$ denotes the number of calls entering the system up to time $t$. The second sequence $\{ \tau _ {j} ^ {s} \}$ describes the service process $s$: the random variable $\tau _ {j} ^ {s}$ denotes the time which is spent in service of the call with index $j$. Calls which have been served are removed from the system.

The description of control sequences can be greatly extended using labelled point processes, in which $\tau _ {j} ^ {e}$ represents the interval between the points and $\tau _ {j} ^ {s}$ are the labels of the points.

The specification of a control sequence does not uniquely determine the behaviour of the system. It is necessary to give also the service algorithm — the rule which determines the start of service and the way in which the behaviour of the calls depends on the state of the system.

The various service algorithms produce quite diverse forms of queues. The simplest ones are presented below.

## I. Systems with waiting (systems with a queue, queueing systems).

Calls which enter the system and are not served immediately form a "queue" waiting for service. In what follows, calls are served in order of arrival. The system is busy at time $t$ if at that time there is a queue or there is a call in service. Otherwise the system is called free at time $t$. Two types of queueing systems are distinguished.

$I _ {1}$. Ordinary systems. If the system is free, then it begins to act (serve the queue of calls) immediately upon the arrival of a call. If the system is busy, then the service of the next call begins on termination of the service of the previous call. Such a system is also called completely accessible.

$I _ {2}$. Systems with autonomous service. Here service begins only at times $0 , \tau _ {1} ^ {s} , \tau _ {1} ^ {s} + \tau _ {2} ^ {s} ,\dots$.

## II. Systems with finite waiting room (systems with bounded queues).

Suppose that the length $q ( t)$ of the queue at time $t$ is equal to $n \geq 1$ if at this time there is one call in service and there are $n - 1$ calls awaiting service. Let $q _ {n} = q ( \tau _ {0} ^ {e} + \dots + \tau _ {n-} 1 ^ {e} )$ be the length of the queue at the time of arrival of the $n$- th call (not counting this call). In systems with finite waiting room the $n$- th call is "lost" and removed from the system if at the time of its arrival the queue length $q _ {n}$ is equal to some maximum admissible value $N \geq 1$. The number $N$ is an essential characteristic of the system. If $N = \infty$, ordinary systems with unbounded queues arise.

Systems with randomly-bounded queues and systems with randomly-bounded waiting time are also considered.

## III. Systems with refusals (loss systems).

These are systems with finite waiting room in the case $N = 1$. For a loss system autonomous service is obviously not considered.

In each of these simplest systems the assignment of a control sequence completely determines the evolution of the system. In other words, for each elementary event $\omega$ and any $t$ the state of the system at time $t$ is uniquely determined.

In addition to the above forms of service more complicated systems are possible. These are associated with more complicated control sequences and service algorithms.

## IV. Batch input stream and batch service.

The control of these systems is by a four-dimensional control sequence

$$\{ {\tau _ {j} ^ {e} , \nu _ {j} ^ {e} , \tau _ {j} ^ {s} ,\ \nu _ {j} ^ {s} } : {j \geq 0 } \} ,$$

where $\nu _ {j} ^ {e}$ and $\nu _ {j} ^ {s}$ are non-negative and integer-valued. The meaning of the new variables is as follows: calls enter in batches of sizes $\nu _ {0} ^ {e} , \nu _ {1} ^ {e} ,\dots$( at corresponding times $\tau _ {0} ^ {e} , \tau _ {0} ^ {e} + \tau _ {1} ^ {e} ,\dots$); service also occurs in batches, in the first batch $\nu _ {0} ^ {s}$ calls are served, in the second $\nu _ {1} ^ {s}$, etc. (the size of these batches may be smaller if there are not enough calls in the queue). Here a time $\tau _ {k} ^ {s}$ is spent in the service of the $k$- th batch.

For systems with batch input and batch service the same service algorithms as described above are possible.

## V. Multi-channel systems (multi-server systems).

In these systems service can occur simultaneously over $m \geq 1$ channels, so that the service of the next call (or batch of calls in batch service) can begin before completion of the service of the previous one. Service algorithms for multi-channel systems are similar for all the types of service considered (each channel acts as an independent service mechanism). It is only necessary to supplement these algorithms by indicating to which channel a call must be directed if several channels are free simultaneously. Here, as before, a time $\tau _ {i} ^ {e}$ is spent in the service of the $i$- th batch (of size $\leq \nu _ {i} ^ {s}$).

A multi-channel system is called a loss system if a call which finds all channels busy at the time of its arrival is "lost" , and is thus removed from the system.

Sometimes, in order to simplify the nature of the control sequences for multi-channel systems, it is convenient to assign not two but $m + 1$ two-dimensional control sequences

$$( \tau _ {i} ^ {e} , \nu _ {i} ^ {e} ) ,\ ( \tau _ {i} ^ {s _ {1} } , \nu _ {i} ^ {s _ {1} } ) \dots ( \tau _ {i} ^ {s _ {m} } , \nu _ {i} ^ {s _ {m} } ) ,$$

so that the $k$- th service channel is controlled by the sequence $( \tau _ {i} ^ {s _ {k} } , \nu _ {i} ^ {s _ {k} } )$, $k = 1 \dots m$. For example, $\tau _ {i} ^ {s _ {k} }$ is the service time of the $i$- th batch for calls in the $k$- th channel.

The above classification is far from being exhaustive. For example, systems in which calls are subdivided into two or more types and in which one type has priority of service over the others are widely used (such a situation arises when the cost of waiting for one type of call is higher than for the others). The characterization of such systems requires the introduction of several input streams, corresponding to different types of requests. Also related to systems with priority of service are those in which the service mechanism requires interceptions in operation. The law governing the appearance and length of the interceptions can be characterized by a special input stream.

In the literature on queueing theory other special forms of service systems are discussed. However, one must keep in mind here that:

1) the fundamental and most common types of queues are within the above classification;

2) the methods for investigating queues for various systems, as a rule, are much the same and are fairly well illustrated by the methods of study for the "fundamental" systems; underlying these methods is the apparatus of probability theory, both in its general aspects and in specially developed aspects.

The main aim is to investigate the distributions of various parameters which characterize the behaviour of the system (for example, the queue length, the waiting time until the next service, the probability of loss of a given call, etc.). Of major interest in this connection are ergodic theorems, which describe the behaviour of these characteristics over long time intervals. For example, one characteristic of the efficiency of an automatic telephone exchange is the proportion of calls which are lost, that is, the limit $p$ as $t \rightarrow \infty$( if it exists) of the ratio $r ( t) / e ( t)$ of the number $r ( t)$ of calls which are lost up to time $t$ and the number $e ( t)$ of calls which arrive in the same time period. This limit can, with good reason, be called the loss probability. The parameters characterizing a queueing system are the limits as $n \rightarrow \infty$ of the distributions ${\mathsf P} \{ w _ {n} < x \}$ and ${\mathsf P} \{ q _ {n} < x \}$ of the waiting time of the $n$- th call $w _ {n}$ and the queue length $q _ {n}$ at the time of arrival of the $n$- th call.

The method of investigation often consists of the search for Markov processes or sequences which characterize the state of the system. If, for example, the random variables $\tau _ {j} ^ {e}$ and $\tau _ {j} ^ {s}$ are exponentially distributed and are independent for different indices, then the "queue length process" $q ( t)$ will be Markovian and will have a description in terms of simple differential equations for the stationary distribution. In other cases one usually tries to construct random times $t _ {1} , t _ {2} \dots$ such that $q ( t _ {n} )$, or the values of other characteristics (for example, the waiting time), taken at times $t _ {1} , t _ {2} \dots$ will form a Markov chain. This is the so-called method of the imbedded Markov chain. This method is often used in a modified form when a semi-Markov process is constructed which describes the state of the system of interest.

In more complicated cases it is appropriate to use asymptotic methods (see Queueing theory) or to resort to simulation by a Monte-Carlo method of the random processes which describe the queue.

In the articles Queue with waiting and one service channel; Queue, multi-channel with waiting; Queue with refusals; Queue input stream of calls there are more detailed discussions of the basic service systems and input streams. In these articles the following notations are adopted.

$E$ is the class of sequences of independent random variables with exponential distribution. The notation $\{ \tau _ {j} ^ {e} \} \in E$ means that

$${\mathsf P} \{ \tau _ {j} ^ {e} > x \} = e ^ {- \alpha x } ,\ \ \alpha > 0 .$$

The notation $\{ \xi _ {j} \} \in G _ {I}$ means that the random variables $\xi _ {j}$ are independent and identically distributed (the distribution itself may be arbitrary). Relations of the form $\{ \tau _ {j} ^ {e} \} \in E$ or $\{ \tau _ {j} ^ {e} \} \in G _ {I}$ also usually suppose that the control sequence $\{ \tau _ {j} ^ {e} \}$ does not depend on the remaining control sequences.

The class of sequences which are stationary in the narrow sense is denoted by $G _ {S}$.

This notation may also be applied to multi-dimensional sequences. For example $\{ \tau _ {j} ^ {e} , \tau _ {j} ^ {s} \} \in G _ {I}$ means that the two-dimensional sequence is stationary and composed of independent vectors.

For simplicity, as a rule, the discussion is restricted to "single" input and output processes, $\nu _ {j} ^ {e} \equiv v _ {j} ^ {s} \equiv 1$( calls appear and are served one by one). The possibility of generalizations to the "multiple" case (calls appear or are served in batches: $\nu _ {j} ^ {e} \not\equiv 1$ or $v _ {j} ^ {s} \not\equiv 1$) are mentioned separately.

In addition, the nature of the control sequences will be simple and uniform if one excludes initial conditions. Namely, if a control sequence $\{ \tau _ {j} ^ {e} , \tau _ {j} ^ {e} \}$ for $1 \leq j \leq \infty$ is considered, it is supposed that $q ( 0) = 0$ and that the first call arrives in the system at the moment $\tau _ {0} ^ {e} = 0$. If the control is given by an input process $e ( t)$, then $\tau _ {0} ^ {e}$ is not fixed.

For references see Queueing theory.

There is a shorthand notation introduced by D.G. Kendall [a2] to describe various queueing situations in terms of an inter-arrival time distribution (which describes the process $\{ \tau _ {j} ^ {e} \}$) and a service time distribution (describing the process $\{ \tau _ {j} ^ {s} \}$) and the number of servers. In this notation the symbol $M$ stands for the negative exponential distribution, $H$ for the hyperexponential distribution ( $F( t)= 0$ for $t \leq 0$; $F( t) = \sum _ {k=} 1 ^ {n} a _ {k} ( 1- e ^ {- \lambda _ {k} t } )$ for $t > 0$, $\lambda _ {k} > 0$, $a _ {k} > 0$, $a _ {1} + \dots + a _ {n} = 1$), $E$ for the Erlang distribution, $D$ for the distribution degenerated at a positive value, $K$ for a distribution whose Laplace–Stieltjes transform is a rational function, and $G$ for a general distribution of a non-negative random variable that is not further specified. Thus, $M / G /1$ stands for a single-server queue with negative exponential inter-arrival distribution and a general service distribution. This notation is now in fairly general use.