Overcoming ROS2 Foxy Challenges
Using ROS2 Foxy Fitzroy over Wireguard and LTE connections
In our project on building a watering robot for Schloss Pillnitz, we need to connect the robot and the gardener over long distances, where WiFi is not sufficient. We therefore opted for an LTE connection in combination with a Wireguard VPN.
The setup is very stable and the connection is reliable. However, we quickly found out that ROS2 Foxy does not work out of the box in this setup: nodes on the gardener PC and on the robot do not discover each other and hence cannot exchange information. Since ROS2 connectivity between these computers is crucial for our application, we analyzed the problem and tried different approaches to resolve the issue.
The problem
With the commonly used middleware implementations (FastRTPS or CycloneDDS), ROS2 is based on DDS for data transport. The DDS discovery mechanism (i.e. how participants find each other in the network) relies on multicast UDP packets by default. However, Wireguard itself does not forward multicast UDP, and hence the devices cannot find each other over the Wireguard network. Moreover, we do not have root access to the Wireguard gateway server and hence cannot adjust any settings on the gateway.
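A quick way to see whether multicast UDP gets through between two machines at all is the `ros2 multicast` command that ships with the ROS2 command line tools. The sketch below only wraps it in a small helper; the function name `multicast_check` is made up for this example, and the check uses the tool's own multicast group, so it indicates whether multicast passes in general rather than inspecting the DDS discovery traffic itself.

```shell
# multicast_check ROLE
#   ROLE "receive": run first, on one machine; waits for a multicast packet.
#   ROLE "send":    run afterwards, on the other machine; sends one packet.
# The function name is hypothetical; `ros2 multicast` is the actual tool.
multicast_check() {
    case "$1" in
        receive|send)
            ros2 multicast "$1"
            ;;
        *)
            echo "usage: multicast_check receive|send" >&2
            return 2
            ;;
    esac
}
```

If the receiving side never reports a packet while the sender runs, multicast is most likely not being forwarded between the two machines, which matches what we observed over Wireguard.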
The solutions
There are two general ways to overcome this issue: first, configure DDS to use unicast instead of multicast UDP for discovery; second, make multicast UDP work across Wireguard. We looked at four concrete options:
1. Using the initial peers list
2. Using the FastDDS discovery server
3. Using something ready-made, like Husarnet
4. Using SSH tunnels for forwarding multicast traffic
Options 1 and 2 both adjust ROS2 to use unicast for discovery. However, we could not get either of them to work reliably. With the initial peers list, it becomes tough to integrate new devices ad hoc into the same network, because each one has to be added to the initial peers as well. Option 2 sounded really promising and seemed to be exactly what we needed. The tutorial was well written and it worked reliably with the simple talker/listener examples. However, when we tested the method in our setup using the nav2 navigation stack with tons of topics and services, the discovery server did not work reliably. Several restarts of the server were required to get discovery to work, and discovery took very long if it worked at all. Moreover, for ros2 topic list and similar introspection commands to work, an additional configuration file is needed, which increases the complexity of the deployment. Therefore we decided that the system was not ready for our use case. A newer version of ROS2/FastRTPS may have resolved these issues, but we are tied to ROS2 Foxy for various reasons.
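For reference, the client side of Option 2 boils down to an environment variable. The following is a minimal sketch, assuming the discovery server runs on the machine with the address 10.0.1.21 and listens on port 11811 (the port used in the Foxy tutorial); the helper name `use_discovery_server` is made up for this example.

```shell
# On the machine acting as discovery server, in a separate terminal:
#   fastdds discovery -i 0

# use_discovery_server HOST [PORT]
# Points all ROS2 nodes started from this shell at the discovery server.
# Hypothetical helper; HOST and PORT are assumptions for this sketch.
use_discovery_server() {
    export RMW_IMPLEMENTATION=rmw_fastrtps_cpp
    export ROS_DISCOVERY_SERVER="$1:${2:-11811}"
}

use_discovery_server 10.0.1.21
echo "$ROS_DISCOVERY_SERVER"
```

Nodes launched afterwards from the same shell, e.g. the demo talker and listener, then contact the server instead of relying on multicast.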
Using SSH Tunnels for forwarding Multicast traffic
Given that the ROS2/FastRTPS-native methods (i.e. Options 1 and 2) did not work for us, we decided to extend our Wireguard setup to carry multicast UDP. I researched a bit and came across the ready-made solution Husarnet (Option 3 in the list above). However, following their tutorial did not work: I could not get the simple talker/listener nodes to find each other. In addition, Husarnet required root access and systemd service installations, and eventually changed my hostname and /etc/hosts permanently and without notice, so I ditched it quickly.
Next, I looked into SSH tunnel interfaces, and they were exactly what we needed. There are certainly alternatives such as OpenVPN or smcroute; however, the SSH solution was the quickest to set up and it worked out of the box. So, here's what we eventually did:
- Allow tunneling on one device: add PermitTunnel yes to /etc/ssh/sshd_config.
- Create TUN devices on both clients:
    sudo ip tuntap add dev tun0 mode tun user $(logname)
    sudo ip addr add dev tun0 10.0.1.xx/24
    sudo ip link set tun0 up
- Start the tunnel from device 10.0.1.22 to the other (or vice versa):
    ssh -w 0:0 user@10.0.1.21
Now, any traffic is forwarded from the local address 10.0.1.aa to the remote address 10.0.1.bb, including UDP multicast traffic. Accordingly, ROS2 nodes on both devices can discover each other. We tested this setup with the nav2 stack and our robots, and it worked reliably, even with many topics and services.
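The steps above can be collected into a small script. This is a sketch rather than the exact script we run: the helper names `run`, `setup_tun` and `open_tunnel` as well as the variables `DRY_RUN` and `TUN_USER` are made up for this example, and by default it only prints the commands instead of executing them.

```shell
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to really run them (needs sudo, and PermitTunnel yes
# in the sshd_config of the remote machine).
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# setup_tun ADDRESS: create tun0, assign ADDRESS/24 and bring it up.
# Run this on both machines, each with its own address.
setup_tun() {
    run sudo ip tuntap add dev tun0 mode tun user "${TUN_USER:-$(id -un)}"
    run sudo ip addr add dev tun0 "$1/24"
    run sudo ip link set tun0 up
}

# open_tunnel USER@HOST: connect the local tun0 to the remote tun0.
open_tunnel() {
    run ssh -w 0:0 "$1"
}

setup_tun 10.0.1.22
open_tunnel user@10.0.1.21
```

Printing first makes it easy to review what will be changed on a machine before handing the script sudo rights. The owner of the TUN device defaults to the current user here, whereas the manual steps above use $(logname).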
Interestingly, once the nodes have discovered each other, the tunnel can be closed and the nodes still keep communicating with each other. This suggests that the DDS discovery protocol negotiates the cheapest route, i.e. the point-to-point route directly over Wireguard, without sending the actual payload through the tunnel.
We are aware that this solution does not scale well, i.e. one needs pairwise connections and TUN devices between all clients. For our simple setup, where we have just 3 devices connected over Wireguard, it works reasonably well.
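To get a feeling for how the number of tunnels grows, the pairs can simply be enumerated. The sketch below is only an illustration: the helper name `list_pairs` and the third address 10.0.1.23 are made up for this example.

```shell
# list_pairs HOST...: print every unordered pair of hosts, one per line.
# Each printed pair corresponds to one SSH tunnel that has to be set up.
list_pairs() {
    while [ "$#" -gt 1 ]; do
        first="$1"
        shift
        for other in "$@"; do
            echo "$first $other"
        done
    done
}

list_pairs 10.0.1.21 10.0.1.22 10.0.1.23
```

For three devices this prints three pairs; in general n devices need n(n-1)/2 tunnels, so four devices already need six and ten devices need 45.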