Fragment Forwarding in Lossy Networks

This paper evaluates four forwarding strategies for fragmented datagrams in the IoT on top of the common CSMA/CA MAC implementation for IEEE 802.15.4: hop-wise reassembly, a minimal approach to direct forwarding of fragments, classic end-to-end fragmentation, and direct forwarding utilizing selective fragment recovery. Additionally, we evaluate congestion control mechanisms of increasing sophistication for selective fragment recovery. Direct fragment forwarding and selective fragment recovery are challenged by the lack of forwarding information in subsequent fragments in 6LoWPAN and thus require additional state at the nodes. We compare the four approaches in extensive experiments evaluating reliability, end-to-end latency, and memory consumption. Our findings indicate that direct fragment forwarding should be deployed with care, since higher packet transmission rates on the link layer can significantly reduce its reliability, which in turn can even increase end-to-end latency because of heavily increased link layer retransmissions. Selective fragment recovery can compensate for this disadvantage but struggles with the same problem underneath, which constrains its full potential. Congestion control for selective fragment recovery should use small congestion windows that can grow, combined with fragment pacing. With fewer fragments per datagram, pacing is less of a concern, but the congestion window still requires an upper bound.


I. INTRODUCTION
The advent of the Internet of Things (IoT) increased the deployment of resource-constrained, heterogeneous, and wireless devices that join the wider Internet. In turn, this also significantly widened the range of heterogeneous access networks, which introduces a wide variety of maximum packet sizes on the link layer, as shown in Figure 1. On the network layer, nodes predominantly speak IPv6 [6], which mandates a Maximum Transmission Unit (MTU) of at least 1280 bytes. Hence, fragmentation is necessary in order to communicate over these link layer technologies.
One of the most popular link layer technologies, IEEE 802.15.4 [1], only supports a very limited number of bytes per frame in its base specification. For efficiency, the information required to forward a packet cannot be encoded in every fragment but is only present in the first one in its IPv6 adaptation layer 6LoWPAN [7]. This is in contrast to end-to-end fragmentation in higher layers such as the IP protocols (see Figure 2a).
Figure 1 compares the maximum packet sizes: Internet (1500 bytes), IEEE 802.15.4 [1] (127 bytes), ITU-T G.9903 [2] (400 bytes), Bluetooth LE [3] (≥ 1280 bytes), LoRaWAN [4] (59-250 bytes), SigFox [5] (96 bytes), and NB-IoT [5] (1600 bytes).
IEEE 802.15.4 networks are able to form meshes. As such, forwarding of packets is needed. There are two concepts for forwarding fragmented datagrams in 6LoWPAN. First, reassembly is performed at every hop (hop-wise reassembly), followed by re-fragmentation when the datagram is forwarded on another constrained link (see Figure 2b). As the forwarding information is only stored in the first fragment, this is the simplest solution. Second, individual fragments are forwarded (fragment forwarding) by recording the forwarding information from the first fragment on all participating nodes. This recorded information can then be used to forward all subsequent fragments to the next hop [12, Section 2.5.2], [13] (see Figure 2c).
Since losing a single fragment requires resending the whole datagram in an upper layer, fragment forwarding can be extended by selective fragment recovery, which allows the reassembling end-point to cumulatively acknowledge received fragments to the fragmenting end-point (see Figure 2d). In general, selective fragment recovery can also be used with hop-wise reassembly, but one has to take into account the increased latency due to the acknowledgment mechanism when doing so.
Both 6LoWPAN forwarding approaches, hop-wise reassembly and fragment forwarding, have advantages and disadvantages. While direct forwarding can lead to lower latency, it also sends more packets on average over time, leading to a higher load on the medium. Selective fragment recovery is poised to increase the number of packets even further due to the added acknowledgment messages. Hop-wise reassembly, on the other hand, is part of common network stacks, which can be a benefit on more constrained nodes, where program memory is scarce [14]. Reliable fragment delivery, however, can only be guaranteed by the more coordinated MAC modes TSCH and DSME. If reliability cannot be guaranteed by the MAC layer, all fragment forwarding approaches threaten to do more harm than good, as with every fragment the chance for packet loss multiplies. This is in stark contrast to the widely deployed solution of the very thin MAC layer provided in many OS implementations.
In this paper, we comparatively assess the performance and resource consumption of end-to-end fragmentation, hop-wise reassembly, and direct fragment forwarding over the very thin IEEE 802.15.4 MAC layer, as well as fragment forwarding that allows for selective fragment recovery. As part of this work, we also provide an independent implementation of both simple fragment forwarding and its selective fragment recovery variant, which we showcase to allow deeper insights into our evaluation results.
We also compare four common congestion control mechanisms with selective fragment recovery: (i) a simple mechanism as proposed in the appendices of RFC 8931, (ii) a TCP-Reno-like approach that only takes loss into account for congestion control, (iii) a TCP-Reno-like approach with Alternative Backoff with ECN (ABE) that introduces special behavior for Explicit Congestion Notification (ECN), and finally (iv) a QUIC-like approach that adds packet pacing on top of that. To evaluate the different approaches, we developed CongURE, a framework for Congestion Control Utilizing Reusable Elements. CongURE allows for drop-in replacement of the congestion control mechanism in a protocol.
Our findings reveal the drawbacks of using the widely deployed very thin MAC layer described above with 6LoWPAN fragment forwarding. While hop-wise reassembly manages to transmit at least a fraction of the datagrams even with a higher number of fragments, direct fragment forwarding techniques quickly drop to a packet delivery ratio of 0%. Selective fragment recovery helps to mitigate those drawbacks, but only when it comes to packet delivery ratio and with low window sizes. Classic congestion control mechanisms based on Additive Increase, Multiplicative Decrease (AIMD), slow start, congestion avoidance, and recovery help to find a balance between latency and a high packet delivery ratio with SFR over the very thin MAC layer when combined with RTT-based pacing.
In summary, our work on a comprehensive picture of fragment forwarding in lossy networks makes the following contributions:
1) Comparative evaluation of four fragment forwarding approaches (end-to-end fragmentation, hop-wise reassembly, direct fragment forwarding, and selective fragment recovery) on top of a very thin MAC layer.
2) The analysis of congestion control in the context of selective fragment recovery. To this end, we design the integration of three common congestion control mechanisms (TCP Reno, TCP ABE, and QUIC congestion control) in SFR, and implement these mechanisms.
3) Comparative evaluation of congestion control mechanisms in SFR on top of a very thin MAC layer.
4) CongURE, a lightweight congestion control framework for low-end IoT devices, which allows for the drop-in replacement of congestion control mechanisms independently of a specific protocol.
The remainder of this paper is structured as follows. In Section II, we provide background on 6LoWPAN fragmentation and forwarding as well as congestion control options for SFR. In Section III, we outline our implementations of the different fragment forwarding options and the CongURE framework. We present the results of our experiments on fragment forwarding and congestion control in Section IV and Section V, respectively. In Section VI, we summarize related work. We discuss our overall findings in Section VII, and close with a conclusion and an outlook in Section VIII. Table 1 summarizes the key abbreviations used throughout this paper.

II. BACKGROUND AND PROBLEM STATEMENT
The IETF specified the 6LoWPAN protocol [7] to allow for transmissions of IPv6 packets over IEEE 802.15.4 [1] networks, a widely used link layer technology in the IoT. While IPv6 requires a Maximum Transmission Unit (MTU) of at least 1280 bytes [6], IEEE 802.15.4 is only able to handle link layer frames of up to 127 bytes, including the link layer header. With link layer security enabled, this leaves 81 bytes for the network layer [15]. Considering 48 bytes for the IPv6 and UDP headers, this leaves only 33 bytes for application data per frame. To enable IPv6 communication in such a restrictive environment, 6LoWPAN provides both header compression [16], [17] and datagram fragmentation [7]. Header compression is applied to a datagram before it is sent, even before it is fragmented. Two types of header compression are supported: (i) the classic approach of field elision based on a bit mask as specified in [16] and (ii) Generic Header Compression as described in [17], based on LZ77-style compression [18]. (i) is expected to be supported by all 6LoWPAN nodes; (ii) is not widely deployed yet and requires a capability exchange between nodes. For both compression schemes, however, the magnitude of their impact on performance is not as large as for fragmentation: header compression adds a slight CPU overhead to reduce the need for fragmentation, while fragmentation multiplies the performance issues of a single frame with every fragment a datagram is separated into. As such, fragmentation will be the main focus of this paper; we will, however, briefly go into the details of header compression in Section II-A.
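The byte budget above can be restated as a small back-of-the-envelope script; the 46-byte link layer overhead is not stated explicitly in the text but derived here from the other numbers:

```python
# Payload budget for IEEE 802.15.4 with link layer security enabled,
# following the numbers given in the text.
FRAME_SIZE = 127      # IEEE 802.15.4 maximum frame size (bytes)
NETWORK_BUDGET = 81   # bytes left for the network layer with security [15]
IPV6_HEADER = 40      # uncompressed IPv6 header (bytes)
UDP_HEADER = 8        # UDP header (bytes)

# Link layer header plus security overhead, derived from the two values above.
link_layer_overhead = FRAME_SIZE - NETWORK_BUDGET  # 46 bytes
# Bytes remaining for application data in a single frame.
app_payload = NETWORK_BUDGET - IPV6_HEADER - UDP_HEADER  # 33 bytes
```

Header compression attacks exactly these 48 bytes of IPv6 and UDP headers, which is why it can reduce the need for fragmentation in the first place.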
For completeness we note that the concept of 6LoWPAN (or more generally 6Lo) is not limited to IEEE 802.15.4, but can also be used with other link layer technologies such as PLC [2] or Bluetooth Low Energy [3] (see Figure 1). There is also Static Context Header Compression and Fragmentation (SCHC) [19] for even more restrictive link layer technologies such as LoRaWAN [20], SigFox [21], and NB-IoT [22]. The fragmentation modes of SCHC, while incompatible with the fragmentation approaches of 6LoWPAN, are based on the same principles. As such, the same conclusions apply, as we will discuss later in this paper.

A. 6LOWPAN HEADER COMPRESSION
In this paper, we will focus on fragmentation. However, to understand how fragmentation decisions are made, and also why we cannot just carry a compressed header in every fragment, a basic understanding of header compression is needed. Of the two approaches, classic 6LoWPAN header compression [16] and Generic 6LoWPAN header compression [23], we will give a short overview of the first, as our 6LoWPAN implementation currently does not support the latter. As little difference is to be expected between the header compression approaches in the results for the fragmentation approaches, Generic 6LoWPAN header compression was not considered in our evaluation and only classic 6LoWPAN header compression was used.
Classic 6LoWPAN header compression has two modes: IPv6 Header Compression (IPHC) (see Figure 3a) and Next Header Compression (NHC) (see Figure 3b). Both elide or replace header fields based on lower layer information and use bit-fields to signify which fields are elided or replaced.
IPHC is applied at every hop in a network. Both the 4-bit version field and the 16-bit length field of the IPv6 header are elided in any case: the version field always has the value 6 and is thus redundant, and the length field can be inferred from the lower layers, either from the 6LoWPAN fragmentation headers or from the link layer. The 8-bit traffic class field and the 20-bit flow label field of the IPv6 header can be elided: the TF flags encode which of those fields or their sub-fields are elided. Special values of the 8-bit hop limit field of the IPv6 header are mapped to the 2-bit HL field: 00 means the hop limit is carried inline, 01 means the hop limit is implied to be 1, 10 means the hop limit is implied to be 64, and 11 means the hop limit is implied to be 255. In all but the first case, the hop limit field is elided.
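The HL mapping can be summarized in a small sketch; `decode_hop_limit` is our illustrative name, and `None` stands for "carried inline":

```python
# Decoder for the 2-bit HL field of the IPHC encoding as described in
# the text: 00 -> inline (returned as None), 01 -> 1, 10 -> 64, 11 -> 255.
def decode_hop_limit(hl_bits: int):
    mapping = {0b00: None, 0b01: 1, 0b10: 64, 0b11: 255}
    return mapping[hl_bits]
```

The three implied values cover the common cases of link-local traffic (1) and typical initial hop limits (64, 255), so the full 8-bit field rarely needs to travel inline.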
IPHC allows for both stateless and stateful address compression for the 128-bit source and destination address fields of the IPv6 header. The combination of the SAC (source address compression), SAM (source address mode), M (multicast compression), DAC (destination address compression), and DAM (destination address mode) flags signifies the respective mode or whether the address is carried fully inline. Please refer to the listing in [16] for more details. In short, however, when just looking at unicast communication, stateless address compression is used for link-local addresses (eliding the fe80::/64 prefix) and stateful address compression can be used for all other addresses. The prefix elided by stateful compression is encoded, for both source and destination address, in an optional 4-bit context identifier (CID) extension. If the identifiers for both source and destination address are 0, or if both addresses use stateless address compression, that extension can be elided, signified by clearing the CID flag in the IPHC encoding. How the CID-to-prefix mapping is shared between the nodes is out of scope of the IPHC specification [16]. One way is provided by the Neighbor Discovery optimization for 6LoWPAN [24].
The suffixes or interface identifiers (IIDs) of addresses can be elided when they are based on the link layer address, as specified in [7] and if one of the nodes involved in the communication is the node with that link layer address. There are no special signifier bits for IID elision.
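The stateless cases above can be illustrated with a small sketch. This is not the wire format, and all names are ours; we assume the interface identifier (IID) is derived from a 64-bit EUI-64 link layer address by flipping the universal/local bit, as specified in RFC 4291:

```python
# Stateless compression of link-local unicast addresses as described in
# the text: the fe80::/64 prefix is elided, and the IID is elided too
# when it matches the IID derived from the link layer address.
LINK_LOCAL_PREFIX = bytes([0xFE, 0x80] + [0] * 6)

def iid_from_eui64(eui64: bytes) -> bytes:
    # RFC 4291: flip the universal/local bit of the EUI-64
    return bytes([eui64[0] ^ 0x02]) + eui64[1:]

def compress_link_local(addr: bytes, eui64: bytes) -> bytes:
    """Return the address bytes that still need to be carried inline."""
    assert len(addr) == 16
    if not addr.startswith(LINK_LOCAL_PREFIX):
        return addr              # not link-local: carried fully inline here
    iid = addr[8:]
    if iid == iid_from_eui64(eui64):
        return b""               # prefix and IID elided entirely
    return iid                   # prefix elided, 64-bit IID inline
```

In the best case a 128-bit address thus costs zero inline bytes, which is exactly why the forwarding information is so much cheaper in the first fragment than it would be in every fragment.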
NHC to date supports IPv6 extension headers and UDP headers. If NHC is used, the NH bit in the IPHC encoding is set. Otherwise, the 8-bit next header field is carried inline and the remaining headers are uncompressed.
IPv6 extension header compression allows for the removal of padding within the extension headers, and thus internal fragmentation of the headers, by re-defining the length field in units of bytes instead of units of 8 bytes as specified in [6]. If an extension header exceeds a length of 255 bytes and thus overflows the 8-bit length field, NHC cannot be used for that extension header. The type of extension header is signified by the EID (Extension Header ID) flags in the encoding. NHC also allows for encapsulated IPHC encodings and thus compressed IPv6-in-IPv6 tunneling. The 8-bit next header field of an extension header can also be elided if the following header is compressed by NHC as well.
UDP header compression allows for the elision of the checksum if and only if the upper layer provides other means of integrity checks of the payload. If the checksum can be elided, the C flag in the UDP compression header is set to 1. Moreover, port compression is supplied for special port ranges. These port ranges are encoded with the 2-bit P flag field. For more details about those port range encodings, please refer to [16].

B. WHY NOT IPV6 END-TO-END FRAGMENTATION (E2E)?
IPv6 fragmentation requires the lower layer to support a minimum MTU of 1280 bytes. As described in Section II, however, many IoT link layer technologies, namely IEEE 802.15.4 with 127 bytes or ITU-T G.9903 with 400 bytes, only offer MTUs much smaller than 1280 bytes. Because of that, classic IPv6 fragmentation alone is not suitable for these link layer technologies. Due to the same limitations, we also need to save as many bytes of packet space as possible. After all, even with header compression the forwarding information can contain up to 32 bytes, or 2 × 128 bits, of data in IPv6, in the form of the two IPv6 addresses in the header, when compression of those addresses is not possible (see Section II-A). 32 bytes are more than a fifth of the 127-byte frame size of IEEE 802.15.4. This justifies putting the forwarding information only in the first fragment. IPv6 fragmentation also requires a 32-bit identifier to associate fragments with a complete datagram end-to-end [6] (see Figure 5). A link-wise identifier of shorter length can help to further save space in a constrained link layer frame.
Furthermore, some extension headers have to be carried in every fragment [6] and, as we outlined in Section II-A, the capabilities of next header compression for extension headers are very limited.
However, as we will see in Sections II-C and II-D, other fragment forwarding approaches require overhead in the form of new data structures. As such, we will use a modified version of IPv6 fragmentation that allows for such small fragments in our analysis to investigate the impact of those data structures on performance. For this, we simply set the MTU of the interface so that a compressed IPv6 fragment fits the link-layer PDU at all hops in the network and configure the network stack not to use the 6LoWPAN module. We will call this approach end-to-end fragmentation (E2E, see Figure 8a) in the remainder of this paper.

C. BASIC FRAGMENTATION AND REASSEMBLY IN 6LOWPAN
In 6LoWPAN, datagram fragmentation implements the following common approach, as it is also applied by, e.g., IPv6 fragmentation (see Section II-B): Before sending a datagram to the underlying link layer, the network layer checks whether the data exceeds the maximum payload length (commonly referred to as SDU, Service Data Unit) of the link layer. If the data size complies with the SDU, a single datagram is sent without any modification. If the data size does not comply with the SDU, the datagram is divided into multiple fragments such that the content of each fragment matches the SDU, see Figure 6. Each fragment includes a fragmentation header containing information to reassemble the datagram [7]: The fragmentation header of the first fragment contains a 16-bit datagram tag to identify the fragment on the link and the (uncompressed) datagram size in bytes as an 11-bit number, see Figure 4a. The datagram size can be encoded as an 11-bit number because the MTU for IPv6 over 6LoWPAN is capped at 1280 bytes, which is less than 2^11 − 1 = 2047 but greater than 2^10 − 1 = 1023. This way the header is kept as small as possible: the 5-bit dispatch to identify the header type, the 11-bit datagram size, and the 16-bit datagram tag form a header of only 32 bits or 4 bytes. All subsequent fragments carry, in addition to the header fields of the first fragment, an offset to this fragment in units of 8 bytes, see Figure 4b. Consequently, the payload of every fragment except the last must have a length that is a multiple of 8.
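The header layout just described can be sketched as follows, using the RFC 4944 dispatch values (0b11000 for the first fragment, 0b11100 for subsequent ones); the helper names are ours:

```python
import struct

FRAG1_DISPATCH = 0b11000  # first fragment (FRAG1)
FRAGN_DISPATCH = 0b11100  # subsequent fragments (FRAGN)

def frag1_header(datagram_size: int, tag: int) -> bytes:
    """5-bit dispatch + 11-bit datagram size, then 16-bit tag: 4 bytes."""
    assert datagram_size <= 0x7FF  # 11-bit field; MTU is capped at 1280
    return struct.pack("!HH", (FRAG1_DISPATCH << 11) | datagram_size, tag)

def fragn_header(datagram_size: int, tag: int, offset: int) -> bytes:
    """Like FRAG1, plus an 8-bit offset in units of 8 bytes: 5 bytes."""
    assert offset % 8 == 0  # offsets are expressed in units of 8 bytes
    return struct.pack("!HHB", (FRAGN_DISPATCH << 11) | datagram_size,
                       tag, offset // 8)
```

Packing size and dispatch into one 16-bit word is what keeps the first fragment header at exactly 4 bytes and subsequent headers at 5 bytes.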
The receiver identifies multiple fragments that belong to the same datagram by comparing two values: the link layer source address and the datagram tag. The receiver's network stack then stores all fragments of an incoming datagram in the reassembly buffer for up to 10 seconds. These identifiers assigning fragments to a datagram D we will refer to as (a, g_a(D)), with a being the link layer source address and g_a(D) being the datagram tag for D on the link to a (also cf. Figure 4).
A brief back-of-the-envelope calculation shows that a node needs to allocate at least 1291 bytes of memory per reassembly buffer entry to reassemble a fragmented datagram:
• at most 8 bytes for the link layer source address, plus 1 byte to store its length, as IEEE 802.15.4 supports both 64-bit EUI-64 and 16-bit short addresses as addressing formats,
• 2 bytes for the datagram tag, and
• 1280 bytes for the maximum expected size of an IPv6 datagram.
1291 bytes are a significant memory requirement on constrained devices, which typically offer memory in the range of several kilobytes [14]. Especially in a multihop network, a common deployment scenario in the IoT, it becomes challenging to provide enough resources to store a sufficient number of reassembly buffer entries. In Section III-A, we show how to save memory in a concrete implementation.
It is worth noting that RFC 4944 [7] specifies the link layer destination address as well as the datagram size as additional identifying parameters. In RFC 8930 [13], however, it is argued that only the parameters named above are necessary for identification. As this paper compares classic hop-wise reassembly from RFC 4944 [7] to the approaches put forward in RFC 8930 [13] and RFC 8931 [25], the latter building on RFC 8930 [13], we decided to also use the reduced (a, g_a(D)) identifier for classic hop-wise reassembly.

D. HOP-WISE REASSEMBLY (HWR) VS. DIRECT FRAGMENT FORWARDING (FF)
The destination address in the IPv6 header guides forwarding. In 6LoWPAN fragmentation, however, the IPv6 header is only present in the first fragment. To enable intermediate nodes in a multihop network to forward fragments without this context information, two solutions are proposed: hop-wise reassembly and direct fragment forwarding.
The naive approach to handle fragmented datagrams in a multihop network is hop-wise reassembly (HWR) [7], [6]. In HWR, each intermediate hop between source and destination assembles and re-fragments the original datagram completely. This leads to three drawbacks. First, each intermediate hop needs to provide enough memory resources to store all fragments in the reassembly buffer (see Figure 8b). Second, the memory requirements are unbalanced between nodes in the network. Considering highly connected nodes (see node e in Figure 7), these nodes need to cope with the reassembly load of all their downstream nodes. Third, datagram delivery time is bound by the time needed to receive all fragments of the datagram. Papadopoulos et al. [26] underscored these problems in more detail.
Fragment forwarding (FF) [13] tackles the drawbacks of HWR by leveraging a virtual reassembly buffer (VRB) [27], see Figure 8c. In contrast to a reassembly buffer, a VRB only stores references that link the subsequent fragments to the first fragment such that intermediate nodes can determine the next hop. In detail, the VRB is applied as follows. Each entry consists of the source address and the incoming datagram tag (a, g_a(D)) (cf. Section II-C), the next-hop link layer address b, and the outgoing datagram tag g_b(D).
This has two implications. First, an intermediate node can ensure that datagram tags are unique between a node and its neighbors. Second, all fragments belonging to the same datagram will travel the same path.
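The VRB bookkeeping described above can be sketched minimally as follows; the class and method names are ours, not from [13] or [27]:

```python
# Minimal sketch of a virtual reassembly buffer: each entry maps the
# incoming (source address, datagram tag) pair to the next hop and the
# outgoing tag, so subsequent fragments can be forwarded without the
# IPv6 header that only the first fragment carries.
class VirtualReassemblyBuffer:
    def __init__(self):
        self.entries = {}  # (src_addr, tag_in) -> (next_hop, tag_out)

    def add(self, src_addr, tag_in, next_hop, tag_out):
        # Created when the first fragment (carrying forwarding info) arrives.
        self.entries[(src_addr, tag_in)] = (next_hop, tag_out)

    def forward(self, src_addr, tag_in):
        # Subsequent fragments only carry (src_addr, tag_in);
        # returns None if no entry exists (first fragment never seen).
        return self.entries.get((src_addr, tag_in))
```

Because every fragment of a datagram resolves through the same entry, all fragments take the same path, which is exactly the second implication noted above.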

E. SELECTIVE FRAGMENT RECOVERY (SFR)
For both HWR and FF, losing fragments is costly, as the complete datagram needs to be retransmitted by an upper layer. To that end, Selective Fragment Recovery (SFR) was introduced to 6LoWPAN within the IETF [25]. SFR utilizes the same mechanisms as FF, but introduces new header formats, the recoverable fragment (RFRAG) header and the RFRAG acknowledgment, and is thus a completely new protocol. In addition to datagram tag and datagram offset, those headers include a 5-bit sequence number, see Figure 4c, which allows for lightweight cumulative acknowledgments (ACKs) in the form of a 32-bit bitmap, see Figure 4d. Those acknowledgments can be requested by the fragmenting end-point using the X flag in the RFRAG header. Based on a configurable window size for setting this flag and the additionally provided Explicit Congestion Notification (ECN) flag E, optional congestion control can be provided for SFR. As SFR only uses the source address and the tag to identify the datagram, a reverse lookup in the VRB by next-hop address and outgoing tag can be used to return the ACKs to the datagram source, see Figure 8d. Another crucial difference to HWR and FF is that datagram size and offset share the same field: while in the first fragment that field denotes the size of the datagram, in all subsequent fragments it denotes the offset. As such, the reassembling end-point only knows the true size of a datagram once the first fragment has arrived.
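The relation between the 5-bit sequence number and the 32-bit ACK bitmap can be illustrated with a small sketch; the bit ordering (most significant bit mapping to sequence number 0) is our assumption for illustration:

```python
# Build the 32-bit RFRAG-ACK bitmap from the set of received 5-bit
# sequence numbers: one bit per possible sequence number (2^5 = 32).
def ack_bitmap(received_seqnos) -> int:
    bitmap = 0
    for seq in received_seqnos:
        assert 0 <= seq < 32  # 5-bit sequence number
        bitmap |= 1 << (31 - seq)
    return bitmap
```

A 5-bit sequence number space is exactly covered by a 32-bit bitmap, which keeps a single cumulative ACK small enough for a constrained link layer frame.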
To recover fragments, the fragmenting end-point arms a timeout, the Acknowledgment Request (ARQ) timeout, whenever it sends out a fragment marked with the ACK request flag. If an ACK is not received within that timeout, or if a received ACK with the same datagram tag as the ACK-requesting fragment marks a previously sent fragment as not received in its 32-bit bitmap, the fragmenting end-point may resend the fragment. If a pre-configured number of resends fails, the fragmenting end-point can either try to resend the whole datagram a pre-configured number of times or give up on sending the datagram altogether. The maximum number of fragments in flight before requesting an ACK is called the window size and can either be configured statically or controlled by an optional congestion control mechanism. To further reduce congestion within the network, [25] specifies a configurable inter-frame gap (IFG) that defines the time an SFR-capable node has to wait between sending fragments or ACKs.
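The fragmenting end-point's resend decision can be sketched as a simple bitmap check; as before, the function name is ours and we assume the most significant bit of the bitmap corresponds to sequence number 0:

```python
# Given the sequence numbers sent so far and a received 32-bit ACK
# bitmap, return the fragments marked as not received, i.e., the
# candidates for selective retransmission.
def fragments_to_resend(sent_seqnos, bitmap: int):
    return sorted(seq for seq in sent_seqnos
                  if not bitmap & (1 << (31 - seq)))
```

Only these fragments are resent, which is the core saving of SFR over retransmitting the whole datagram from an upper layer.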

F. CONGESTION CONTROL FOR 6LOWPAN FRAGMENTATION
Since the ACKs of SFR are cumulative, multiple fragments can be sent at once. However, considering the resource restrictions of the constrained devices on the path, the number of fragments in flight should be limited by a congestion control (CC) mechanism. Moreover, as we will show, the single-antenna radio used by such devices can itself be a point of congestion for directly forwarded fragments. As such, CC should be an integral part of SFR, to find a balance between not exhausting the resources en route and reducing the latency penalties caused by having to wait for an ACK after a number of fragments has been sent.
SFR supports an Explicit Congestion Notification (ECN) mechanism. RFC 8931, however, only defines that a reassembling end-point that receives any RFRAG with the ECN flag E set must also set the E flag in the ACK that acknowledges that fragment (cf. Figures 4c and 4d). As congestion and loss are not clearly distinguishable in LLNs, considering ECN comes to the forefront of CC in LLNs. Furthermore, the appendices of RFC 8931 [25] make some considerations on how actual CC can be provided for SFR, but the RFC states that more experimentation is needed. This paper provides this experimentation.
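To make the loss-versus-ECN distinction concrete, here is a minimal sketch of the multiplicative decrease used by TCP Reno (halving on loss) compared with Alternative Backoff with ECN (ABE), whose recommended reduction factor on ECN marks is 0.8 per RFC 8511; clamping to a minimum window of one fragment is our assumption:

```python
BETA_LOSS = 0.5  # classic Reno multiplicative decrease on loss
BETA_ECN = 0.8   # ABE's recommended, gentler decrease on ECN marks

def reduce_cwnd(cwnd: int, ecn_signal: bool) -> int:
    """Shrink the congestion window (in fragments) on a congestion event."""
    beta = BETA_ECN if ecn_signal else BETA_LOSS
    return max(1, int(cwnd * beta))
```

The rationale is that an ECN mark signals congestion before any data was actually dropped, so the sender can afford to back off less aggressively than after a loss.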
We consider four CC mechanisms specified by the IETF to analyze different aspects: (i) the base mechanism described in the appendices of RFC 8931, (ii) TCP Reno as described in RFC 5681 [28], to use the CC mechanism mostly used in the wider Internet, (iii) an extension of the latter using Alternative Backoff with ECN (ABE) [29] for the congestion window reduction, and (iv) the congestion control mechanism of QUIC [30], to analyze the impact of its adaptive pacing mechanism.
Beyond CSMA/CA, IEEE 802.15.4 also specifies the more coordinated MAC modes TSCH and DSME [10], [11]. However, many 6LoWPAN implementations opt to only implement very simple MAC primitives provided by the respective radio. For DSME, the reason is that no official IPv6-over-DSME specification exists yet. But this is also due to the most recent IEEE 802.15.4 specifications not being accessible to most developers. As such, operating systems such as Contiki, RIOT, or Zephyr opt to only provide a very thin MAC layer based around CSMA/CA, link layer retransmissions, and acknowledgments.

III. SYSTEM DESIGN AND IMPLEMENTATION IN RIOT
A thorough experimental evaluation of protocols requires sound software implementations. For the sake of comparison, the protocols under investigation should be analyzed on the same system. Unfortunately, there is no software basis available which assembles all required components for constrained devices. In this paper, therefore, we extend RIOT [31], a common IoT operating system. By selecting an open source platform and making our software publicly available we enable reproducible research [32], [33]. Based on our extensions, we gain detailed insights into system and network performance. In the remainder of this section, we present design, implementation, and configuration choices to better understand the subsequent evaluation.

A. SYSTEM DETAILS ON 6LOWPAN
RIOT provides a stable 6LoWPAN implementation as part of its default network stack, GNRC [31]. Instead of statically allocating packet space for each reassembly buffer, it uses the preconfigurable packet allocation arena of GNRC, called gnrc_pktbuf, to dynamically allocate packet buffer space of varying length within it. This allows for high resource efficiency and flexibility. By storing the major part of the IPv6 datagram (1280 bytes) only in the packet buffer, the 6LoWPAN stack requires 22 bytes (plus some additional bytes for management), instead of allocating the complete 1291 bytes (cf., Section II-C).
To provide low delays and high throughput, the fragmentation is done asynchronously. For this purpose, the reference to the datagram that needs to be fragmented is stored in a fragmentation buffer. The data of the datagram resides in gnrc_pktbuf. In addition to the datagram, the fragmentation buffer also contains meta-information needed for fragmentation, including the original datagram size and its tag.

B. FRAGMENT FORWARDING
We extend 6LoWPAN in GNRC to support direct fragment forwarding. One crucial implementation choice relates to the creation of the first fragment. The first fragment may include the compression header [16], which may change size during network traversal as compression contexts such as link-layer addresses change. Because of that, the compression may be less or more effective depending on header updates made by intermediate forwarders. In the worst case, the packet becomes less compressed, leading to additional fragmentation. To tackle this problem, we apply a well-known approach by keeping the first fragment as minimal as possible [27], i.e., the original sender includes only the fragment and compression headers and pushes the payload to the subsequent fragments. It is worth noting that this approach does not increase the overall number of fragments compared to a naive approach that minimizes the size of the last fragment. In fact, it reduces the likely creation of additional fragments.
We support this mechanism not only on the original sender but also on intermediate forwarders for the case that the original sender did not provide enough space for the expanding compression header, see Figure 10. This is possible, as all subsequent fragments also contain an offset, which indicates fragmentation relating to the first fragment. Furthermore, it simplifies the implementation greatly, which in turn saves ROM. Since the fragmentation buffer is used for this, its default size of 1 needs to be increased so that the node is able to handle multiple datagrams-forwarded datagrams and datagrams sent by the node itself-at the same time.
To keep the implementation simple, we only forward fragments when the first fragment is received in order, otherwise we reassemble the packet completely. This can be considered a fall-back to hop-wise reassembly. We are able to do this in contrast to just dropping the fragments as proposed in RFC 8930 [13] since in RIOT (i) the reassembly buffer is not as expensive for static memory as assumed in RFC 8930 due to the dynamic allocation arena for packets and (ii) later incoming datagrams are preferred over incomplete reassembly buffer entries when the reassembly buffer is full.

C. SELECTIVE FRAGMENT RECOVERY
Likewise, we extend 6LoWPAN in GNRC to support selective fragment recovery (SFR). Due to the different dispatches of SFR, our implementation can run alongside classic 6LoWPAN fragmentation, i.e., also alongside simple fragment forwarding. To that end, our SFR implementation reuses the fragmentation buffer and reassembly buffer of classic 6LoWPAN (cf. Section III-A) and the virtual reassembly buffer of our direct fragment forwarding implementation (cf. Section III-B), with slight adaptations to handle the new features, such as tallying up the sequence numbers of fragments in the reassembly buffer to be able to generate the cumulative acknowledgments.
In contrast to our fragment forwarding implementation, we do not restrict the first fragment to the compression header alone, as the specification of SFR already accounts for the compression header changing in size. If the VRB is full, however, we again fall back to reassembly, making the node on which this error case occurs the reassembling endpoint for the given communication.

D. CONGURE: CONGESTION CONTROL FRAMEWORK
To evaluate different congestion control mechanisms for SFR in a modular way, we designed the CongURE (Congestion Control Utilizing Reusable Elements) framework as part of the 2021.04 release of RIOT. CongURE is designed to be unit agnostic so that it can serve several use cases, such as TCP or QUIC, where the window size is in bytes [34], [28], [30], or SFR, where the window size is in fragments [25].
Two operations are provided to fetch the current congestion state:
• cwnd() : congure_wnd_size_t to get the current congestion window size in user-defined units.
• inter_msg_interval(msg_size : unsigned) : int to get the currently calculated inter-message interval in milliseconds for pacing. The size of the next message msg_size can go into the calculation of the interval. If pacing is not supported by the congestion control mechanism, the operation returns -1.
Six operations are provided to report various congestion events:
• report_msg_sent(msg_size : unsigned) to report a message as sent. The message size msg_size is provided in user-defined units. This method increases the internal count of in-flight messages.
• report_msg_discarded(msg_size : unsigned) to report a message as discarded for any reason other than timeout or loss. The message size msg_size is provided in user-defined units. This method decreases the internal count of in-flight messages.
• report_msgs_timeout(msgs : congure_msg_t[]) to report that a collection of messages msgs timed out. A collection of messages is used, as many congestion control mechanisms report timed-out messages in bulk.
• report_msgs_lost(msgs : congure_msg_t[]) to report that a collection of messages msgs is lost. In case the congestion control mechanism does not distinguish between loss and timeout, report_msgs_lost() is an alias of report_msgs_timeout().
• report_msg_acked(msg : congure_msg_t, ack : congure_ack_t) to report an acknowledgment ack for a previously sent message msg.
• report_ecn_ce(time : time_t) to report an ECN congestion experienced (CE) signal for a message sent at time.
To deploy CongURE with SFR, a simple but optional adapter was written that provides a CongURE object for every fragmentation buffer entry. The reporting operations of the CongURE object are then called for each corresponding congestion event.

E. MAC LAYER
In its default configuration, GNRC only provides a very slim MAC layer that benefits from radio drivers that support CSMA/CA, link layer retransmissions, and acknowledgment handling by default. Special care has to be taken on hardware platforms that use a "blocking wait on send" whenever the device is in a busy state. When deploying fragment forwarding, this may cause race conditions within the internal state machine of the device [35] because of the faster alternation between simultaneous sending and receiving events. To solve this problem, we provide a simple mechanism to queue packets whenever the device signals that it is in a busy state. As soon as the device becomes available again (and not later than 5 ms), the MAC layer tries to send the packet from the top of the queue again.

IV. COMPARISON OF FRAGMENT FORWARDING METHODS
In this section, we compare four fragment forwarding methods. Our goal is to carefully explore the behavior of the competing fragmentation schemes. Along this line we show that using fragment forwarding over the very thin MAC layer that many IoT systems deploy does more harm than good. The key MAC layer components (CSMA/CA, link layer retransmissions, and acknowledgments) are not sufficient to cope with interference in fragmentation scenarios.
Our experiments are conducted in a real-world testbed using class-2 IoT nodes [14] and 802.15.4 radio communication. One important aspect of the experiment design is the underlying network topology, which we consider by selecting specific nodes from the testbed. We want to assure that (i) the network is widespread enough and not too crowded, but also that (ii) it contains multiple bottlenecks as described in Section II-D to stress hop-wise reassembly.
The Lille site features a challenging multihop network. Nodes are not only distributed in a grid in a dedicated room but also located in multiple offices spread over different floors. The site therefore provides a realistic scenario for different types of heterogeneous deployment. In our experiments we focus on a static network topology. To ensure the static setting, we disabled any routing protocol.
To select nodes for our experiment, we first measure basic properties of the testbed. By correlating the geographic distance and the packet delivery ratio (PDR) between two nodes, we found that two neighboring nodes should be within 6.5 m of each other. This ensures that the PDR is at least 97.5%, which we argue is acceptable. Lower PDRs do not contribute to a better understanding of the problem space in this paper. The network is then constructed by a breadth-first search over all available nodes of the testbed site, starting at the sink s. We select node 57 as the sink, as it is located centrally between the more crowded nodes in the dedicated room and the sparser nodes in the office space at the Lille site. This ensures that a balanced set of both network deployment scenarios is included. To prevent a bias towards specific nodes, our network construction algorithm works as follows.
1) Collect all neighbors within a range between 2.2 m and 6.5 m as potential node candidates in set N. This selection expands the network as much as possible under our PDR requirement.
2) Get a randomized, uniformly distributed sample M of 1 to 3 members of N; s always selects 2 neighbors.
3) Add M to the network, and continue for each member of M until 49 nodes are found.
The selection of 1 to 3 downstream neighbors per node assures the inclusion of reassembly bottlenecks in the network, as described in Section II-D.
After constructing the network, we used the same set of nodes in all of our experiments to ensure comparability. The resulting logical and geographical topologies are visualized in Figure 11. Multiple paths have the same length. The longest path consists of 6 hops. Communication Setup. We configured all routes based on the breadth-first search described above. Except for the sink and its neighbors, we configured all other nodes as data senders to ensure the need for forwarding.
All source nodes start sending UDP packets, using the same payload, to the sink at a uniformly distributed interval between 5 s and 15 s. The experiment ends after each source has sent 50 packets. For each fragment forwarding approach we evaluated as many UDP payload lengths as possible to see the impact of different packet sizes. As all fragment forwarding strategies we pick for our evaluation except SFR must carry payloads divisible by 8 in all but the last fragment, we do not need to cover all payload sizes. To further decrease the overall run-time of our experiments, we decided on an increment of 16 for our range of payload sizes between 16 and 1024 bytes (inclusive).
To evaluate the performance, our experiments measure system complexity in terms of memory usage, reliability, specifically the PDR, and the latency between the UDP sockets of source and sink.
Software Parameterization. RIOT offers a variety of compile-time configuration parameters to adapt to use cases. In most of the experiments, we can use default configurations. For the following reasons, however, we have to change some default values: (i) The default configurations assume rather small networks. This conflicts with efficient forwarding in large-scale mesh networks, such as the testbed. (ii) We originally wanted to compare our results of the fragment forwarding performance with related work that analyzed some aspects in simulation [36]. We document the changes of default values for the 2021.04 release of RIOT in the Appendix.
We also adapted the configuration parameters for selective fragment recovery to the needs of our experiments. In particular, we set the number of fragment retransmissions to four instead of two. For the direct comparison with the other approaches, we deactivated congestion control and configured window sizes of 1 and 5, respectively, to account for different window sizes while negating potential side effects of congestion control on latency.
We will specify further parameters in the evaluation of the specific metric when they differ from the default configuration. To evaluate end-to-end transport of the forwarding information in every fragment (E2E), we use a modified version of IPv6 fragmentation (see Section II-B) with 6LoWPAN header compression (see Section II-A). To be comparable to the 6LoWPAN fragmentation approaches, we configure the IPv6 MTU to a non-standard-compliant value of the link layer PDU + 22 bytes. Those 22 bytes are gained by the elision of fields due to 6LoWPAN header compression [16]; see Table 2 for a detailed tally.
The default size of the virtual reassembly buffer in GNRC is 16 entries. Since only direct forwarding and selective fragment recovery use it, we do not need to adapt its size. However, we have to increase the size of the common reassembly buffer at the sink. Without this adaptation, the reliability decreases significantly, even for the smallest number of fragments.

B. MEMORY CONSUMPTION
Overall memory consumption. Table 3 shows both ROM and RAM usage of the 6LoWPAN layer at the source node for HWR, FF, and SFR. E2E is not included, as it is implemented in the IPv6 layer and thus not comparable. As the fragmentation buffer size depends on the window size in SFR, we fixed the window size for that binary to 1 fragment. When compiling the software we use arm-none-eabi-gcc v9.3.1 with -Os optimization (size-optimal) for ARM Cortex-M3, all debug information stripped, and the compile-time parameters we outline in Section IV-A. We use the size tool to extract the relevant module information. To make memory measurements comparable, we set the reassembly buffer size for HWR to the same value as the VRB size (16) for FF and SFR. The anticipated memory advantage of FF does indeed exist, even with the GNRC strategy of not allocating 1280 bytes of IPv6 MTU for every reassembly buffer entry but using the central packet buffer instead (cf., Section III).
FF adds a small amount of RAM to the asynchronous GNRC fragmentation buffer to keep the meta-data our implementation requires for refragmentation (see Section III-B). More ROM is also needed for the possible refragmentation of the first fragment. The majority of the ≈ 500 bytes of additional ROM in the reassembly buffer for FF in 6LoWPAN is explained by the overhead required to distinguish whether packets need to be handled by creating a VRB entry or put into the regular reassembly buffer.
SFR of course uses more ROM and RAM (both ≈ 2 kilobytes more) than FF, as the recovery mechanism adds extra complexity and requires new data structures. Packet buffer usage. Figure 12 presents our analysis of the mean utilization of the 6144-byte packet buffer during the overall runtime of each experiment, in percent on the y-axis, for each evaluated UDP payload size on the x-axis. To show the extremes, for SFR we deliberately used only the run with window size 1. FF and HWR have similar usage, with a slight advantage of ≈ 1% for FF. E2E uses less space in the packet buffer than FF and HWR, as the overhead of 6LoWPAN fragmentation is not required. Due to the window size of 1, most nodes see single fragments instead of full datagrams with SFR. As such, its packet buffer usage is the lowest of all approaches at higher fragmentation.
The high packet buffer usage for FF is mostly caused by the fallback to regular reassembly as we describe in more detail in Section IV-C. A clear correlation between this fallback and packet buffer usage was described in [37].

C. RELIABILITY AND LATENCY
Figures 13 and 14 display our results from measuring reliability and latency. Figure 13 shows the reliability as the packet delivery ratio (PDR), with the UDP payload length in bytes on the x-axis and the PDR in percent on the y-axis. To provide a clearer picture, we divided the data sets into three sub-figures: Figure 13a shows the approaches that do not allow for recovery, Figure 13b four different configurations of SFR with window size 1, and Figure 13c those same configurations with window size 5. The window sizes were chosen to compare two extremes: window size 1 to measure the effects of waiting for an ACK after every fragment, and window size 5, for which the PDR is still good enough while more fragments are sent before the sender waits for an ACK. For window sizes higher than 6, we were able to confirm that the PDR dropped drastically in our setup (not shown).
Strikingly, FF and E2E exhibit poor reliability. For FF this is in contrast to previous results, where simulation and a coordinated TSCH MAC layer were used [36]. Even for a small number of fragments, both achieve less than a quarter of the PDR of HWR. Values then quickly approach zero with an increasing number of fragments. HWR, though also performing poorly, manages to deliver at least some packets to the more distant nodes. The PDR of SFR, on the other hand, is comparable to HWR. For window size 1 it is even slightly better than HWR at higher payload lengths, even if only by 1-2%, due to the recovery mechanisms of SFR.
Figure 14 depicts the latency as a 3-dimensional CDF, with the latency in seconds on the x-axis, the CDF on the y-axis, and the source-to-sink distance in hops on the z-axis. The UDP payload lengths are binned by the respective number of fragments each protocol requires for that payload length in a dedicated plot. Due to its larger latency, the x-axis for SFR is scaled differently and shows 0 to 35 seconds. The maximum of 0.7 seconds in the other plots is marked in the SFR plots for easier comparability.
The latencies we measured for FF (see Figure 14b) are also significantly higher than in the previous simulation work. HWR (see Figure 14a) is expected to operate more slowly because each node needs to reassemble the entire datagram prior to forwarding it to the next hop. E2E (see Figure 14c) performs similarly to HWR with low fragmentation, but as its reliability quickly drops to 0% with higher fragmentation, the latency becomes infinite for higher payloads. SFR (see Figures 14d and 14e) has the highest latency of all protocols, due to its recovery mechanisms and the induced inter-frame gap. In the given plots we only show the results for an inter-frame gap of 0.1 ms and an ARQ timeout of 1.2 s. When increasing inter-frame gap and ARQ timeout, the latency grows. With a window size of 1 fragment (see Figure 14d) we see significantly higher latencies at higher fragmentation than with a window size of 5 fragments (see Figure 14e). This is because the sender does not have to wait for an ACK after each fragment, but only after every fifth fragment, thus fitting more fragments into a smaller timeframe.
In our experiments, we see significantly more link layer retransmissions per node with FF than with HWR [37], and the same holds for E2E and SFR (not shown). This is caused by the much faster send and receive triggers on the device due to immediate fragment forwarding, which increases collisions and packet loss on the single-antenna radio. Moreover, this strains the single buffer of a device, which far more often needs to discard unacknowledged incoming packets while it is busy with either sending or receiving a different packet. This invokes link layer retransmissions and eventually contributes to packet loss. An example of these occurrences is illustrated in Figure 15. Based on local measurements using a logic analyzer on a sister device that deploys the same radio (AT86RF233 [38]), we are able to confirm that the device can remain busy for up to 4 ms. Additional measurements on the same device type show that even the shortest fragmented datagram with HWR (88 bytes of UDP payload length) requires ≈8 ms from entering the reassembly buffer to leaving the fragmentation buffer on refragmentation. This leaves both the device and the medium more relaxed with HWR. In previous evaluations [37], [39], it was also shown that packets are lost with FF and SFR when the respective reassembly buffers are full.

V. EVALUATION OF CONGESTION CONTROL WITH SFR
Our evaluation of congestion control with SFR was conducted in a similar but much smaller setup than the one we used for the comparison of the fragment forwarding methods in Section IV. The forwarding bottlenecks we described in Section II-D are also points of congestion, so the basics of our network topology choices still stand.
Our goal is to explore a suitable congestion control mechanism for SFR on top of a MAC layer consisting only of CSMA/CA, link layer retransmissions, and acknowledgments by looking into common mechanisms used in the wider Internet.

A. SETUP
Experiment Testbed. For our congestion control evaluation for SFR we picked a much smaller network than the one used in Section IV to have better control over how and where congestion occurs. Here we used 8 nodes at the Grenoble site, which are the same type of node as the ones at the Lille site described in Section IV-A. We selected a T-shaped set of nodes (see Figure 16) to assure that the nodes are spaced at least 1 m from one another and that the forwarder f provides a bottleneck for hop-wise reassembly. Communication Setup. We configured all routes statically to form the T-shape described above. Except for the sink and its neighbors, we configured all other nodes as data senders to ensure the need for forwarding.
For our evaluation we are only interested in the effects that different numbers of fragments have on our results. To cause as much congestion on the resources of the nodes as possible, all source nodes send UDP packets, using the same payload, to the sink at a uniformly distributed interval between 250 ms and 750 ms. The experiment ends after each source has sent 1200 packets. Software Parameterization. We mostly use the same software parameterization as in Section IV-A, with the difference that congestion control is now activated. To have a good baseline for all congestion control mechanisms in the comparison, we set the initial window size for SFR to 2. Due to the higher frequency of packets, we increase the size of the packet buffer arena of RIOT's default network stack GNRC from 6 to 40 kilobytes. This assures that loss does not happen because packets cannot be allocated. While a full packet buffer could be interpreted as a point of congestion, it skews the results especially with larger UDP payloads, where a small packet buffer is filled by only 2-3 reassembling datagrams at the sink.

B. MEMORY CONSUMPTION AND SYSTEM ANALYSIS
In Figure 17, we compare the build sizes of the modules relevant to SFR with CC in our implementation: 6LoWPAN SFR itself, the 6LoWPAN fragmentation buffer (containing the state variables for CC), and the CongURE module. To see the cost of abstraction due to the CongURE module, we unrolled each CongURE implementation into the adapter code of the SFR implementation. The binary was generated using arm-none-eabi-gcc v9.3.1 with -Os optimization (size-optimal) for ARM Cortex-M3, all debug information stripped, and the compile-time parameters we outline in Section IV-A. We use the size tool to extract the relevant module information. All implementations use about 2.5 kbytes of RAM and 4-5 kbytes of ROM, depending on complexity. RAM mostly increases based on the state variables. ROM expectedly increases by a few hundred bytes with each level of complexity in CC.
Abstraction using the CongURE framework only adds a few bytes of RAM compared to the unrolled version with at most 62 bytes in the QUIC implementation. In ROM the abstraction can add up to about 500 bytes, mostly due to both the indirection introduced and the way CongURE initializes constants in a reusable way.

C. RELIABILITY
To analyze the impact of congestion control, we only looked at the PDR, as the latency varies quite widely depending on the congestion in the network. Figure 18 shows the PDR by number of fragments. Note that we intentionally left out 1 fragment on the x-axis, as fragmentation is not triggered for a single frame and the far higher PDR would make the remaining results less visible. For reference, see Figures 13b and 13c. Note, however, that the actual PDRs are not comparable due to the smaller network and the higher sending rate.
While for 2 fragments the performance of all CC mechanisms is comparable at around 2%, they vary quite drastically at higher fragmentation: SFR App. C outperforms both TCP variants especially at very low fragmentation, but from 6 fragments on it is often comparable to or below the two TCP variants. An interesting observation is the saw-tooth-like shape of its PDR plot, with even numbers of fragments being outperformed by the next higher odd number of fragments. This is due to the congestion window defaulting to 2 and only being able to shrink to 1 on timeout with this mechanism: even numbers of fragments oftentimes need a timeout to shrink the window for the transmission to succeed. For odd numbers of fragments, on the other hand, this is not necessary for at least the last fragment, when amortizing the transmission sequence.

D. CONGESTION EVENTS
We need to look very selectively at the dataset to analyze the influence of the congestion events reported via the CongURE API on the congestion window and inter-frame gap (IFG). Figure 19 shows two selected transmissions for each CC mechanism, one succeeding and one timing out, for the transfer of 12 fragments. We selected the transmissions based on the following criteria, in that order: (i) having at least two timeouts in the timeout plots to better show their influence on congestion window and inter-frame gap, (ii) IFG changes, if any exist, (iii) ECN events occurring, if any occurred, and (iv) similarity of the course of events. We selected s5 (see Figure 11) as the node under observation, as both s5 and s6 show the most ECN events.
For SFR App. C we see the expected decrease of the congestion window on timeout in Figure 19a, while on ACK, it stays at 2 in Figure 19e. As there is no support for pacing, the IFG stays at the default of 170 ms.
For TCP Reno and TCP ABE (see Figures 19b, 19c, 19f, and 19g), we often see spikes in the congestion window, which are explained by the window being increased by ACK'd fragments and then immediately decreased by fragments marked lost within the same ACK. However, ACKs lead to a rapid increase in the window size, which is not easily mitigated even by the multiplicative decrease due to loss, timeout, or ECN. As such, the congestion windows stay rather high, risking a lot of loss, be it due to congestion or the intrinsic loss of LLNs. Furthermore, we see that both start with a window size of 4 already. This is because RFC 5681 [28] defines the initial window based on the maximum segment size (set to 1 fragment for SFR), rather than using a constant initial window size as the other two approaches do. This makes both TCP variants rather optimistic, but not very suitable for our scenario. However, given the erratic increase of the congestion window we see with an initial window size of 4, it is doubtful that adapting the mechanism for 6LoWPAN by statically setting the window to our default of 2 would make any difference. As with SFR App. C, there is no support for pacing, so the IFG stays at the default of 170 ms.
For QUIC we saw in Section V-C that pacing allows for a better PDR. Sadly, for 12 fragments we rarely see the IFG change, as oftentimes we only see one timeout, which deletes the fragmentation buffer, while the other resends are already covered by negative ACKs for the fragments. We were able to observe the timeout event a few times, for which we present the transmission in Figure 19d. In the success case (see Figure 19h), we see similar behavior as for TCP ABE, with the special ECN behavior giving a slight advantage over TCP Reno.

VI. RELATED WORK
In prior work [37], we compare HWR and FF. In this paper, we extend the scope and contribute the comparison of E2E fragmentation and SFR as well as the evaluation of different congestion control mechanisms in the context of fragmentation. This comprehensive analysis allows us to guide discussions why the use of direct forwarding approaches should be discouraged in scenarios that deploy very thin MAC layers.
Kent et al. [40], [41] discussed potential harms already at the beginning of the Internet and thus paved the way for proper protocol design.
Papadopoulos et al. [42] discussed the ongoing work within the IETF on 6LoWPAN fragment forwarding and selective fragment recovery already. In contrast to their work, our work provides an in-depth analysis of these approaches.
Other approaches that use concepts similar to FF mainly focus on datagram prioritization [43], [44]. Similar to SFR, Chowdhury et al. [45] proposed a standard-compliant NACK-based approach for selective fragment recovery. Since those NACKs, however, are associated with id(i), this mechanism only allows for hop-wise recovery and does not cover the whole end-to-end path when using FF.
Tanaka et al. [36] used the 6TiSCH simulator [46] to analyze the performance of FF compared to HWR. The authors show that FF is a promising option in IEEE 802.15.4e (TSCH) and that, as such, SFR could also yield better results.
Awwad et al. [47] also compared FF to HWR in a simulator and conducted experiments in a testbed. They used a topology consisting of 4 nodes in a line. This setup ignores challenging bottlenecks, which occur in real deployments (see Section II-D). Furthermore, they only compared their proprietary solution of fragment forwarding with HWR in the testbed evaluation. In contrast to this, we evaluate standard compliant protocols in a complex testbed setup.
In our work, we did not consider the frame delivery mode for link-layer meshes of 6LoWPAN [7]-commonly known as mesh-under [12, Section 1.2]-because it is known that such a solution falls behind HWR [45].
Hummen et al. [48] analyzed the security implications of 6LoWPAN fragmentation. They considered both hop-wise reassembly and direct fragment forwarding, but also the mesh-under mode. In their paper they present two possible attack vectors utilizing 6LoWPAN fragmentation: a fragment duplication attack and a buffer reservation attack. The buffer reservation attack only affects nodes performing hop-wise reassembly and the reassembling end-points. The attacker spams the victim with first fragments with changing datagram tags, so that the reassembly buffer is quickly exhausted. All fragment forwarding schemes are, however, susceptible to a fragment duplication attack, in which the attacker sends bogus subsequent fragments that claim to belong to a different fragment chain, effectively invalidating the data within the reassembled datagram. As a solution, the authors propose an extension to 6LoWPAN fragmentation utilizing content chaining against fragment duplication attacks and a split reassembly buffer approach against buffer reservation attacks.
Combining fragmentation, selective acknowledgments, and congestion control is an approach novel to 6LoWPAN selective fragment recovery. As such, not much research has considered this combination yet.

VII. DISCUSSION
Is fragment forwarding a viable option without a coordinated MAC layer? Our testbed experiments clearly indicate that direct fragment forwarding is outperformed by HWR if direct fragment forwarding is based on the widely deployed, very thin CSMA/CA MAC layer. Our results systematically confirm prior assumptions [13] in practice. Radios that are busy sending a fragment are not able to listen for the next fragment, leading to high losses. In contrast to this, HWR allows for a datagram to be received in full first, causing the radio to be in receive mode first and only to switch to send mode when the node is fragmenting the datagram again. In [36] the authors show that even with direct fragment forwarding an advantage can be gained when using a coordinated MAC layer such as IEEE 802.15.4e TSCH. The only advantage of FF over HWR on CSMA/CA that we could clearly identify is its reduced RAM consumption.
SFR reduces the bottleneck of HWR while providing a similar packet delivery ratio. Using lower window sizes may improve the results even further. However, a higher latency is to be expected due to the recovery mechanism.
In practice, whether fragment forwarding is applicable and if so on which MAC protocol depends on the deployment scenarios and provider use cases. Our work helps to assess the potential deployment space.
Is end-to-end fragmentation a viable solution? In our E2E setup using a modified version of IPv6 fragmentation, we show that the performance disadvantages cannot be reduced by carrying the forwarding information in every fragment. The only advantage we gain is a smaller memory footprint, as the VRB and the whole 6LoWPAN fragmentation layer in general are not required, at the cost of using bytes of the link layer PDU for the forwarding information. At least for classic IEEE 802.15.4, where the link layer PDU is very restricted, this cost is non-negligible. Larger SDUs, such as the 400 bytes we find in the PLC protocol ITU-T G.9903, might not have to deal with these restrictions.
It should be noted, however, that we used IPv6 fragmentation in a very slim manner in our experiments. Other extension headers such as routing headers or hop-by-hop options might also need to be carried in every fragment. While the 6LoWPAN Routing Header specified in RFC 8138 [49] may be able to compress some of those, the remaining headers can quickly exceed an MTU that is smaller than the required 1280 bytes specified in [6].
What kind of congestion control should be used with SFR? Providing the optional congestion control for SFR has the potential to provide a good compromise between packet delivery ratio and latency. We showed that a highly sophisticated congestion control mechanism utilizing pacing, such as the one used by QUIC, can yield benefits for SFR. Of course, some adaptations tailored to LLNs are required: the window size should only be increased very conservatively and the initial window size should be kept as minimal as possible.
Strikingly, special behavior based on ECN has little impact on the overall performance: both TCP variants perform very similarly, despite TCP ABE using an exponential backoff for the window decrease on ECN. This is surprising, as the usual metric, packet loss, is not very suitable for lossy and low-power networks, where loss is inherent to the technology and not necessarily an implicit signifier of congestion. Explicit congestion signifiers such as ECN would have been expected to paint a better picture of the actual congestion in the network. However, the very nature of LLNs mitigates this advantage, as the ECN in SFR is marked on the fragments by the forwarders on congestion, but delivered to the fragmenting end-point in the ACK. Both these messages can easily get lost after the ECN flag was set in the headers. In fact, we observed the ECN flag being set on the forwarders in SFR App. C, but no corresponding ECN event at the fragmenting end-point.
How does fragmentation work with other link layer technologies? Bluetooth Low Energy comes with fragmentation and reassembly in its Logical Link Control and Adaptation Protocol (L2CAP), which is part of its link layer. It is comparable to a hop-wise SFR. This means that, in theory, we face similar bottleneck problems as with HWR, but also the latency issues of SFR. However, the hop-wise reassembly happens much closer to the device and within the coordinated MAC layer of Bluetooth Low Energy. As such, buffer space and air time can be allocated in a manner much more tailored to the specific datagram. Since acknowledgments are already part of the link layer, as in IEEE 802.15.4, whereas in SFR they are a completely separate message type on top of the link layer, the latency issues we saw in this paper might be much smaller. Due to its coordinated MAC layer, all remaining latency penalties due to ACKs might also be negligible when compared, e.g., to a similar approach in IEEE 802.15.4e TSCH (6TiSCH) or DSME [8].
Other low-latency modes of IEEE 802.15.4e, such as LLDN or LECIM, only allow for star topologies, so they do not fit our use case for fragment forwarding in any case.
It should also be noted that for low-power wide area networks (LPWANs) such as LoRaWAN, Sigfox, or NB-IoT, we face much more extreme constraints than with most of the other low-power and lossy network technologies discussed in RFC 8376 [5]. As such, the IETF specified a different adaptation layer protocol suite, Static Context Header Compression and Fragmentation (SCHC) [19], which is not compatible with 6LoWPAN. SCHC introduces different fragmentation schemes, tailored to the underlying link layer technologies, called SCHC Fragmentation/Reassembly (SCHC F/R). SCHC uses fragment forwarding, but tailors it closely to the infrastructure of the underlying LPWAN, which can make the forwarding much more reliable. Three SCHC F/R modes are defined in [19], to be used with different link layer technologies: (i) No-ACK mode, which is comparable to FF in this paper; (ii) ACK-Always mode, which is comparable to SFR in this paper; and (iii) ACK-on-Error mode, a NACK-based approach comparable to the one proposed in [45]. Later SCHC extensions might add more modes.
However, as noted, all of the proposed modes are comparable to approaches evaluated in this paper, so the same conclusions apply to the respective SCHC modes. This is supported by [19], which states that the selection of the mode is dictated by the underlying link layer.

VIII. CONCLUSION AND OUTLOOK
In this paper, we compared four different fragment forwarding schemes (end-to-end fragmentation, hop-wise reassembly, direct fragment forwarding, and selective fragment recovery) in large real-world experiments, as well as congestion control mechanisms for selective fragment recovery.
We showed that with a thin, CSMA/CA-based MAC layer, hop-wise reassembly can be the better choice to achieve proper reliability and latency. Careful analysis reveals why: with direct fragment forwarding, the radio of a node cannot handle the quick alternation between receiving and sending fragments, while hop-wise reassembly relaxes this pattern considerably.
Strictly coordinated MAC layer schemes such as Time-Slotted Channel Hopping (TSCH) or the Deterministic and Synchronous Multi-channel Extension (DSME) of IEEE 802.15.4e have the potential to improve reliability without sacrificing data rates. Prior work [36] focused on simulating TSCH in small topologies of ten nodes, but showed promising results for that MAC layer using direct fragment forwarding. Integrating those access schemes into existing 6LoWPAN implementations to analyze whether they can cope with large-scale sender scenarios will be part of our future work.