How to Detect Connection Loss on a LoRaWAN® Device, and Why it Matters
When writing a LoRaWAN® end device’s firmware, one feature that often gets overlooked is connection loss detection and remediation. There are several reasons why an end device could land in this stranded state where it does not receive any downlink messages from the network, including:
- The device transitioned quickly between two gateways with channel maps that don’t overlap
- The device’s RX1 and RX2 reception parameters fell out of sync with the network server
- The device roamed to a different network provider’s region (if there is no roaming agreement between those two providers)
While these occurrences aren’t common, their importance is elevated when you consider that some LoRaWAN devices are designed to last a decade or more. The burden of handling these states lies 100% with the end device’s application firmware. Obviously, a network is unable to send a downstream message instructing the device to rejoin the network if that device is unable to receive downstream messages.
The LoRaWAN specification provides a few methods to detect connectivity loss, but the specification is silent on exactly when a device should consider itself stranded, and what it should do in response. These decisions are left up to the application layer. There are a few different methods of detecting a connection loss:
1. Confirmed Transmissions
The easiest-to-grasp method for detecting a connection loss is to watch for confirmed upstream messages that the network does not acknowledge. Section 18.4 of the LoRaWAN 1.0.2 specification includes a table with examples of when an end device should lower its data rate after not receiving an acknowledgment. In short, the end device lowers its data rate one step each time two consecutive transmissions go unacknowledged.
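The back-off rule described above can be sketched as a small function. This is a minimal Python model of the logic rather than firmware code; the function name, the `min_dr` parameter, and the starting data rate of 3 are assumptions chosen for illustration.

```python
def confirmed_retry_data_rate(initial_dr: int, attempt: int, min_dr: int = 0) -> int:
    """Return the data rate to use for a given transmission attempt.

    attempt is 1 for the first transmission, 2 for the first retransmission,
    and so on. The data rate drops one step for every two unacknowledged
    transmissions and never goes below min_dr.
    """
    if attempt < 1:
        raise ValueError("attempt numbering starts at 1")
    steps_down = (attempt - 1) // 2
    return max(initial_dr - steps_down, min_dr)


# Hypothetical device starting at data rate 3, with eight transmissions in total.
schedule = [confirmed_retry_data_rate(3, n) for n in range(1, 9)]
print(schedule)  # [3, 3, 2, 2, 1, 1, 0, 0]
```

A device that starts at data rate 3 therefore reaches the minimum data rate by its seventh transmission.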
Assuming the end device does not receive an acknowledgment after seven retransmissions, MachineQ doesn’t recommend immediately considering the connection lost. Perhaps the end device is in range of only a single gateway, and that gateway was rebooted or its backhaul went down temporarily. A longer timeframe should be used to determine connection loss, since joining the network itself can take a few minutes and require multiple transmissions.
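One way to apply a longer timeframe is to track the time since the last downlink of any kind, instead of reacting to a single failed confirmed uplink. The sketch below is a Python model of that idea. The class name, its methods, and the 24-hour timeout are hypothetical, and the timeout in particular is an illustrative value, not a MachineQ recommendation.

```python
class DownlinkWatchdog:
    """Declare the device stranded only after a long period with no downlink."""

    def __init__(self, timeout_s: int, now_s: int) -> None:
        self.timeout_s = timeout_s
        self.last_downlink_s = now_s

    def on_downlink(self, now_s: int) -> None:
        # Any acknowledgment or other downlink proves the link still works.
        self.last_downlink_s = now_s

    def is_stranded(self, now_s: int) -> bool:
        return now_s - self.last_downlink_s >= self.timeout_s


# Illustrative timeout of 24 hours (an assumption for this example only).
watchdog = DownlinkWatchdog(timeout_s=24 * 3600, now_s=0)
print(watchdog.is_stranded(now_s=3600))       # one hour of silence
print(watchdog.is_stranded(now_s=24 * 3600))  # a full day of silence
```

With this approach, a gateway reboot that silences the network for a few minutes does not trigger a rejoin, while a day without any downlink does.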
2. Unconfirmed Transmission
Devices transmitting upstream using unconfirmed messages do not normally receive any type of response to their transmission. This transmission mode saves the most downstream capacity and can even lower upstream packet loss when using a half-duplex gateway that cannot receive any messages when it is transmitting downstream.
Devices using unconfirmed messages may periodically receive a downstream MAC message. When this happens, the device knows it is still connected to the network, even though the downstream message did not specifically acknowledge any upstream message.
However, after many upstream transmissions without receiving any downstream commands, the LoRaWAN 1.0.2 specification (section 4.3.1.1) provides an interesting way to determine connectivity loss. After 64 transmissions (by default), the ADRACKReq bit is set in the frame’s header. This instructs the network server to send a downstream acknowledgment, but not necessarily immediately. If the network server knows that the gateway(s) that received a frame from the end device do not currently have downstream capacity, the network server can choose not to respond to the first request.
This also prevents the end device from expecting an acknowledgment, so it doesn’t start dropping its data rate nor retransmit like a normal confirmed uplink transmission. Instead, the end device continues to enable this ADRACKReq bit in all upstream transmissions for 32 more transmissions (by default). At this point, if it still hasn’t received a downstream message, the end device will lower its data rate, unless it is already at the minimum.
The specification has the device repeat this step-down each time another 32 uplinks (by default) pass without a downlink. After this process, the end device is at the minimum data rate and has requested an acknowledgment on many uplinks. If it still has not received a downlink, it can consider itself disconnected from the network.
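The counting described above can be modeled in a few lines. This is a Python sketch of the logic, not a stack implementation. The class and method names are hypothetical, the defaults of 64 and 32 come from the text above, and the exact uplink on which the bit first appears (an off-by-one detail) may differ between LoRaWAN stacks.

```python
ADR_ACK_LIMIT = 64  # uplinks without a downlink before ADRACKReq is set
ADR_ACK_DELAY = 32  # further uplinks to wait before each data rate step-down


class AdrAckTracker:
    """Model of ADRACKReq handling for unconfirmed uplinks."""

    def __init__(self, data_rate: int, min_data_rate: int = 0) -> None:
        self.data_rate = data_rate
        self.min_data_rate = min_data_rate
        self.adr_ack_cnt = 0
        self.stranded = False

    @property
    def adr_ack_req(self) -> bool:
        # True when the next uplink should carry the ADRACKReq bit.
        return self.adr_ack_cnt >= ADR_ACK_LIMIT

    def on_downlink(self) -> None:
        # Any downlink proves connectivity and resets the counter.
        self.adr_ack_cnt = 0
        self.stranded = False

    def on_unanswered_uplink(self) -> None:
        # Call after an uplink whose receive windows stayed empty.
        self.adr_ack_cnt += 1
        waited = self.adr_ack_cnt - ADR_ACK_LIMIT
        if waited > 0 and waited % ADR_ACK_DELAY == 0:
            if self.data_rate > self.min_data_rate:
                self.data_rate -= 1
            else:
                # Already at the minimum data rate and still no downlink.
                self.stranded = True


# Hypothetical device at data rate 3 that never hears a downlink.
tracker = AdrAckTracker(data_rate=3)
uplinks = 0
while not tracker.stranded:
    tracker.on_unanswered_uplink()
    uplinks += 1
print(uplinks)  # 192, that is 64 + 4 * 32
```

In this model, a device starting at data rate 3 declares itself stranded after 192 unanswered uplinks. Whether that count is fast enough depends on how often the device transmits, which is why the decision is left to the application layer.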
3. Link Check Request/Answer
This is a MAC command (LinkCheckReq) that the network server answers with a link check answer (LinkCheckAns). The answer includes the link margin (in dB) and the number of gateways that received the upstream request.
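As an illustration, the two fields of the answer can be decoded as follows. This Python sketch assumes the LoRaWAN 1.0.x byte layout, where the command identifier 0x02 is followed by one margin byte and one gateway-count byte. The function name is hypothetical, and the layout should be checked against the specification version your stack implements.

```python
LINK_CHECK_CID = 0x02  # command identifier shared by the request and the answer


def parse_link_check_ans(mac_command: bytes) -> tuple[int, int]:
    """Decode a link check answer into (margin_db, gateway_count)."""
    if len(mac_command) != 3 or mac_command[0] != LINK_CHECK_CID:
        raise ValueError("not a link check answer")
    margin_db = mac_command[1]      # demodulation margin in dB
    gateway_count = mac_command[2]  # gateways that received the request
    return margin_db, gateway_count


# Hypothetical answer: 20 dB of margin, heard by 3 gateways.
margin_db, gateway_count = parse_link_check_ans(bytes([0x02, 20, 3]))
print(margin_db, gateway_count)  # 20 3
```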
This method is not recommended because a link check request command takes an additional byte in the upstream frame. While this may not matter most of the time, at data rate 0 (spreading factor 10), devices have only 11 bytes of payload available in the US band. Additionally, if the link check request is sent only every 10 messages, for example, then the connection state is checked only on the frames that contain it. By comparison, the ADRACKReq bit is set in every upstream message (after a delay) and requires no additional payload.
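The payload cost is simple arithmetic, shown here as a short Python sketch. The function name is hypothetical, and the example assumes the 11-byte limit quoted above and a link check request that occupies one byte of the frame.

```python
def app_payload_budget(max_payload_bytes: int, mac_command_bytes: int) -> int:
    """Application bytes left after MAC commands are added to the uplink."""
    return max(max_payload_bytes - mac_command_bytes, 0)


# US band at data rate 0: 11 bytes available, 1 byte taken by the request.
print(app_payload_budget(11, 1))  # 10
```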
End devices could include a link check request in every single upstream frame, but at that point it would make more sense to just switch to confirmed upstream transmissions and again not be burdened by the additional payload requirement.