## Foreword

In this post, we invited Dr. Andrea Tassi (e-mail: a.tassi@bristol.ac.uk, web: http://andreatassi.uk) to present the work he co-authored with Dr. Evgeny Tsimbalo and Dr. Robert J. Piechocki as part the manuscript “Reliability of Multicast under Random Linear Network Coding” [1], currently under review and available online at https://arxiv.org/abs/1709.05477. Andrea’s research addresses the pressing concern of how to assess reliability multicast communications. In particular, his manuscript proposes a novel accurate bound to express the probability that a multicast group as a whole successfully received an information message.

## Guest post - How Reliable are Multicast Communications? Talking to the Crowd and Being Heard

# Where Are We?

Often it is critical for network operators to be able to answer the
question *How good is good enough?* or in other words, *What is the
absolute minimum level of service reliability that can be provided?*. This
issue may appear trivial but it is not. Especially, when it comes to the
case of broadcast or multicast communications.

In the last twenty years, a significant amount of research has been carried out in the field of coding and reliability theory. There are multiple reasons why this happened but they can be more or less summarised as follows: (i) Reliable broadcast communications unleashed new business opportunities (e.g., next-generation infotainment market) and, (ii) Reliable multicast communications made possible to dispatch communications over professional cellular networks used by civilian or military authorities.

Nowadays, network coding (NC) is universally regarded as an efficient way of improving the reliability of point-to-multipoint (PtM) communications (namely, broadcast and multicast communications). In a multicast network operated under NC, the source node “dilutes” the content of a source message into several encoded packets. More formally, the source node encodes an information message of $K$ packets by combining the packets using random coefficients belonging to a finite field (ideally, it can be imagined as a special set defined containing ${0, \ldots, q}$). An encoded packet is therefore associated with a vector of coding coefficients. Each user needs to collect $K$ linearly independent vectors of coding coefficients to be able to decode the source message.

A key performance indicator (KPI) of such network can be the probability
that all users collect $K$ linearly independent vectors of coding
coefficients, which will be referred to as *probability of successful
delivery*. To come up with a mathematically exact expression for our KPI
proved to be extremely challenging. In fact, traditional models assume
large field sizes, which is far from being a realistic assumption. Field
sizes can be very limited in practice and often equal to two.

# It is “Just” a Matter of Counting…

To understand why it is so hard to estimate the probability of successful delivery of a PtM service, let us consider the simplest multicast network of all. A system consisting of one source node and one user. In this case, the expression of the probability of successful delivery after $N$ encoded packet transmissions is known and can be expressed as follows:

[\begin{equation} P(\epsilon) = \sum_{m=K}^{N} \binom{N}{m} (1-\epsilon)^{m} \epsilon^{N-m} \mathbb{P}(m,K)\label{eq:P_ptp} \end{equation}]

where $\epsilon$ is the user’s packet error rate (PER) and $\mathbb{P}(m,K)$ is the probability that an $m \times K$ matrix of elements generated uniformly at random over a finite field of $q$ elements is full rank, which is defined as

[\begin{equation} \mathbb{P}(m,K) = \prod_{i=0}^{K-1}(1-q^{i-m}).\label{eq:P_single_mat} \end{equation}]

Despite some preliminary attempts of extending \eqref{eq:P_ptp} to the case of a multicast network formed by two users [2], an exact formulation of $P(\epsilon)$ for a network of $L$ users is still unknown.

## Breaking with Tradition

Consider now the general case of an $L$-user multicast network. If the field size $q$ is sufficiently large, each user is able to recover the message with a high probability once it receives at least $K$ packets. In this case, the users will recover the message independently from each other and the probability of successful delivery can be approximated as follows:

[\begin{equation} P_L(\boldsymbol{\epsilon}) \cong \prod_{j=1}^{L}P(\epsilon_j),\label{eq:P_M_approx} \end{equation}]

where $P(\epsilon_j)$ is the probability of successful delivery of a source message over a point-to-point link with a PER $\epsilon_j$ corresponding to the $j$-th user, as calculated by \eqref{eq:P_ptp}.

Sadly, for practical field size, the accuracy of \eqref{eq:P_M_approx} is expected to decrease as the number of users grows. Essentially, \eqref{eq:P_M_approx} is not fit for link-budgeting multicast services that the network operator is expected to be liable for.

*OK, but why is that?* To answer this question, consider the following
example. The transmission of $N$ coded packets over $L$ lossy links can be
modelled as $N$ independent trials. In each trial, the packet can be
received by a single user, by a selection of at least two users or by none
of the users. Therefore, the collections of encoded packets received by the
users are *statistically correlated*. Dealing with this correlation is
usually computationally intractable.

## A Very Mathematical Hope

*Can we do something better than \eqref{eq:P_M_approx}? Yes, we can!* In
particular, in [1], we discuss the following intuition: *The packets
received simultaneously by all the users are responsible for the majority
of the correlation effect*. In [1, Theorem 3.1], this intuition led
us to formulate a new tight lower-bounded for the probability of successful
delivery in an $L$-user multicast network.

# Close Enough, Good Enough, Safe Enough

We investigated the performance of the bound as per [1, Theorem 3.1], by simulating a multicast network. Simulation results in terms of probability of successful delivery were obtained using the Kodo C++ network coding library and the Monte Carlo method. A homogeneous scenario is assumed, in which each user experiences the same PER $\epsilon$.

$(a)~\epsilon=0.01$

$(b)~\epsilon=0.1$

Figure 1: Probability of successful delivery for a binary code as a function of $N$ for $K=5$ and $L\in \left\lbrace 2,6,10\right\rbrace$.

Fig. 1 shows the probability of successful delivery $P_L(\epsilon)$ to $L\in \left\lbrace 2,6,10\right\rbrace$ users as a function of the number of coded transmissions $N$. The number of source packets is fixed to $K=5$ and two PER values common to all users are considered: $\epsilon=0.01$ and $0.1$. It can be observed that the proposed bound as per [1, Theorem 3.1] provides better approximation than bound \eqref{eq:P_M_approx}. Specifically, bound as per [1, Theorem 3.1] is particularly accurate for $\epsilon=0.01$, where it matches the simulated results for $L=2$ and closely follows them when $L\in \left\lbrace 6,10\right\rbrace$.

The tightness of the proposed bound in this scenario is explained by a large number of packets likely to be received simultaneously by all users. Indeed, when $\epsilon$ is small, the probability that a single transmitted packet is received by all users, which is equal to $(1-\epsilon)^L$, is large. This leads to a high correlation between the users coding matrices. This is in contrast with the traditional bound \eqref{eq:P_M_approx}, which does not take into account commonly received packets. As a result, bound \eqref{eq:P_M_approx} is particularly loose when $\epsilon$ is small. At the same time, as $\epsilon$ or $L$ increase, the probability that a packet will be received by all users decreases, so the accuracy of the proposed bound decreases too.

$(a)~\epsilon=0.01$

$(b)~\epsilon=0.1$

Figure 2: Probability of successful delivery for a binary code as a function of $N$ for $L=6$ and $K\in \left\lbrace 5,10,15,20\right\rbrace$.

Fig. 2 illustrates the results for the same scenario as in Fig. 1, but this time for different numbers of source packets $K\in \left\lbrace 5,10,15,20\right\rbrace$ and a fixed number of users $L=6$. In line with the previous results, the bound proposed in [1, Theorem 3.1] closely follows the simulated performance at $\epsilon=0.01$. The new bound is significantly more accurate than bound \eqref{eq:P_M_approx} at $\epsilon=0.01$. It can be observed that for this PER, the accuracy of the proposed bound somewhat decreases as $K$ grows. The reason is that longer source messages require more coded transmissions, which leads to a higher number of packet erasures for a given PER. As a result, the correlation between the users coding matrices reduces for larger $K$, which means a smaller number of packets received simultaneously by all users. For the same reason, it can be observed from Fig. 2 that the gap between the simulated results and bound \eqref{eq:P_M_approx} becomes smaller as $K$ grows. Fig. 2b demonstrates that both bounds are close to the simulated results for the worst-case scenario in terms of PER, especially when $K=20$.

To summarise, bound proposed in [1, Theorem 3.1] is tighter than the existing bound \eqref{eq:P_M_approx} traditionally used in the literature. The difference between the bounds is especially profound in realistic channel conditions when users have small values of PER.

# References

[1] E. Tsimbalo, A. Tassi, and R. J. Piechocki, “Reliability of Multicast un-
der Random Linear Network Coding,” *submitted to IEEE Transactions on
Communications.* [Online] https://arxiv.org/abs/1709.05477

[2] E. Tsimbalo, A. Tassi, and R. J. Piechocki, “Novel Performance Analy-
sis of Network Coded Communications in Single-Relay Networks,” *Proc. of
IEEE GLOBECOM 2016, Dec. 2016.* [Online] http://ieeexplore.ieee.
org/document/7842031/