An increasing number of services and products are made viable by what is essentially slow computers with connectivity provided by slow wireless networks e.g. as in IoT systems. Typically, in such systems deployment costs and battery-life are important and often communicating with such devices is expensive e.g. due to a high bandwidth cost or the drain on the device’s battery. Fleet management, equipment monitoring, metering etc. are areas where some of these costs can be substantial. One specific problem in such a system is file (or object) delivery to multiple deployed units, e.g. firmware/software updates which are occurring more and more frequently, or distribution of some data that needs to be available locally on the unit.
The problem of distributing a file or object using wireless broadcast is largely impacted by the way today’s digital communication systems operate. When data is transmitted from a sender it is divided into pieces and sent as packets, if the data is too large to fit into one packet it is spread over multiple packets. Consequently all these packets need to be received successfully for a receiver to be able to reassemble the original data.
In a system where there is only a single receiver and a good feedback channel such that the receiver can send information back to the sender, the receiver can simply request any piece that it did not receive. Such a system is called a unicast system and is how your Wi-Fi at home or your mobile phone network primarily operate. However, in a system with multiple receivers and a weak or missing feedback channel the following challenges arise.
- How to efficiently collect feedback from all receivers? - if the number of receivers is very large the system might spend more resources on feedback compared to transferring data.
- How can missing packets be efficiently repaired over a shared channel? If the receivers lost different pieces only some will benefit from a retransmitted piece.
How to Collect ‘Em All?
Fortunately, an approach that can address both of these challenges exists, namely erasure codes. Simplified, an erasure code allows for the creation of jokers over a set of original pieces, these jokers can then be used to repair any missing piece from the original set. If you are not familiar with erasure codes and their use in wireless networks, we recommend that you first read this blog post and watch the short videos it contains.
So for the challenge of efficiently repairing missing pieces an erasure code is clearly a useful tool, simply generate a joker and use that to repair different pieces at different receivers. But how can it also help us to efficiently collect feedback? Since the jokers are useful for repairing missing pieces from a set of original pieces, a receiver simply needs to communicate that it is missing pieces from that set, not which exact pieces are missing.
You may have experienced the underlying problem at hand while attempting to collect a complete set of turtles stickers, Pokémon or baseball cards - the more cards you already have the more unlikely it is that you will get a new one. This problem is called the Coupon Collector’s Problem and can be analyzed in a number of ways read more.
Let’s consider a few system architectures and how their design choices impact the problems of providing feedback and consequently repairing missing pieces, while utilizing the available network resources efficiently.
Broadcast with Dedicated Repair Channel
One approach is to transmit the data over a main broadcast channel and let all receivers request any missing packets over a secondary repair channel.
The main drawback is that a secondary two-way repair channel is needed. Additionally, the bandwidth of the repair channel is not well utilized since each receiver is served individually and consequently the necessary repair bandwidth is proportional to the number of receivers.
Broadcast with Feedback
If the receivers can send feedback over the channel on which they receive data, the requirement of a secondary channel is removed. However, this also means that the sender will serve all repair requests over a shared channel. If lost packets are simply sent again, only receivers that lost those packets will benefit. If instead an erasure code is used, any receiver that still misses a packet will benefit, since packets are never retransmitted - instead new combinations of packets are sent.
The main drawback of this approach is that a two-way channel is required - but many broadcast channels are one-way or the return direction is slow. Additionally, it can be difficult to efficiently utilize a broadcast channel for feedback - since only the sender is the intended target for the feedback.
Broadcast without Feedback (Blind Sending)
If the feedback channel is removed altogether, we have the simplest solution from a topology point of view. In such a system the data is essentially transmitted in a loop, for long enough to all receivers to receive the data. It is possible to simply send the packets one at a time repeating all packets a number of times, but this is highly inefficient especially as the size of the file and the number of packets grow - see the Coupon Collector’s Problem from before. A more practical approach addresses this challenge by using an erasure code.
The main drawback of an erasure code is that the amount of computation grows with the size of data over which the erasure code operates. Consequently, when the file or object becomes too large, it becomes necessary to split it, and we are then back to the Coupon Collector’s Problem - fortunately with a lot less coupons.
Better, Faster, Stronger
If we want to do better we can use a better erasure code, that allows us to code over a bigger piece of data without doing too much computation. One approach to achieving this is to improve the code design, e.g. by optimizing the underlying math, using simpler math or structure the codes in ways that lend them to efficient implementation. Another approach is that of concatenated codes where multiple codes are used in conjunction e.g. to mix the pieces from an erasure code operating over a file that have been split.
If you are interested in this problem and how to solve it, we look forward to sharing our solution.