How to get asynchronous publish working on ROS2 Galactic with FastRTPS
I'm observing that the publisher blocks when sending over a slow link. This behavior is unlike any other pub/sub middleware I've used before (including ROS1). Just running a subscriber on a remote node causes the publisher's publish rate to drop:
[1676142719.758977019] [test_publisher]: size: 1000000 rate: 40.00 bw: 319.995 Mbits/s
[1676142720.783972503] [test_publisher]: size: 1000000 rate: 40.00 bw: 320.002 Mbits/s
[1676142721.817557893] [test_publisher]: size: 1000000 rate: 26.12 bw: 208.979 Mbits/s
[1676142722.849145421] [test_publisher]: size: 1000000 rate: 12.60 bw: 100.823 Mbits/s
[1676142723.903613207] [test_publisher]: size: 1000000 rate: 14.22 bw: 113.796 Mbits/s
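As a sanity check on the log above, the bw column matches message size times rate, with 1 Mbit taken as 10^6 bits. The helper below is only an illustrative sketch; the name payload_mbits_per_s is hypothetical and not taken from the test code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Payload bandwidth in Mbit/s for messages of `size_bytes` published at `rate_hz`.
// Assumes 1 Mbit = 1e6 bits, which is consistent with the log lines above.
double payload_mbits_per_s(std::size_t size_bytes, double rate_hz) {
  return static_cast<double>(size_bytes) * 8.0 * rate_hz / 1e6;
}
```

With 1,000,000-byte messages, 40 Hz gives 320 Mbit/s, and the drop to roughly 12.6 Hz corresponds to about 100 Mbit/s actually making it through publish().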
Yes, this is over a WiFi link, but the WiFi is working just fine. It is simply out of bandwidth. In this case I would expect the publisher to drop messages, not block in publish(). Blocking is really bad if you want to e.g. transmit camera images that the robot is also using for state estimation.
After first seeing this in ROS2 Galactic with the default Cyclone DDS, I switched over to FastRTPS in asynchronous mode following these instructions. No matter what I try, the publisher always blocks when there is insufficient network bandwidth. I played around with QoS policies and kernel buffer memory settings, but to no avail.
Here is the code for creating the publisher. Looks pretty standard:
rclcpp::QoS qos(1);
pub_ = create_publisher<StringMsg>("test_string", qos);
The environment variables were set as follows:
export RMW_IMPLEMENTATION=rmw_fastrtps_cpp
export FASTRTPS_DEFAULT_PROFILES_FILE=`pwd`/SyncAsync.xml
export RMW_FASTRTPS_USE_QOS_FROM_XML=1
export RMW_FASTRTPS_PUBLICATION_MODE=ASYNCHRONOUS
and this is the xml config file:
<?xml version="1.0" encoding="UTF-8" ?>
<profiles xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
    <!-- default publisher profile -->
    <publisher profile_name="default_publisher" is_default_profile="true">
        <historyMemoryPolicy>DYNAMIC</historyMemoryPolicy>
        <qos>
            <publishMode>
                <kind>ASYNCHRONOUS</kind>
            </publishMode>
        </qos>
    </publisher>
</profiles>
The code can be found in a small GitHub repo here.
[Edit] With new keywords for searching provided by one of the answers, I've discovered that this has been noticed by others already: https://github.com/ros2/rmw_fastrtps/... [/Edit]
I can't find any references right now, but isn't async publishing the default for at least FastRTPS? AFAIK, Cyclone doesn't support it.
re: even with async you see 'blockages': according to this, default QoS would be: keep last, queue depth (history) of 10, reliable, volatile. Even if the RMW has an internal queue it uses to serialise messages to, QoS might still be causing the behaviour you describe.
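To make the "drop instead of block" expectation concrete, here is a toy keep-last buffer in plain C++. It is only a sketch of the semantics being discussed, not how Fast DDS, Cyclone, or rclcpp implement their history. The class name KeepLastBuffer and its methods are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <optional>
#include <utility>

// Toy keep-last buffer: push() never blocks. When the buffer is full,
// the oldest sample is discarded to make room for the new one.
template <typename T>
class KeepLastBuffer {
public:
  // A depth of 0 is treated as 1 so that the newest sample is always kept.
  explicit KeepLastBuffer(std::size_t depth) : depth_(depth == 0 ? 1 : depth) {}

  // Stores the sample; returns true if an older sample had to be dropped.
  bool push(T sample) {
    bool dropped = false;
    if (samples_.size() == depth_) {
      samples_.pop_front();
      ++dropped_count_;
      dropped = true;
    }
    samples_.push_back(std::move(sample));
    return dropped;
  }

  // Removes and returns the oldest stored sample, or nullopt if empty.
  std::optional<T> pop() {
    if (samples_.empty()) {
      return std::nullopt;
    }
    T sample = std::move(samples_.front());
    samples_.pop_front();
    return sample;
  }

  std::size_t size() const { return samples_.size(); }
  std::size_t dropped() const { return dropped_count_; }

private:
  std::size_t depth_;
  std::deque<T> samples_;
  std::size_t dropped_count_ = 0;
};
```

With a depth of 1 this keeps only the latest sample, which is what one would typically want for camera images: a slow consumer sees stale frames replaced rather than the producer stalling.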
Have you tried configuring a sensor QoS? That could result in dropped messages (especially with large payloads), but should not block the sender.
Edit: according to ros2/rmw_fastrtps/README.md@galactic:
so the config .xml would not change anything. [..]
[..] Note that it has changed. The Humble default is SYNCHRONOUS.
And an observation: pedantic perhaps, but DDS QoS is rather complex. It's perfectly possible for this to happen. I believe the default in many DDS implementations is fully synchronous, and behaviour with lossy links is entirely dependent on QoS parameters.
This seems to describe what you observe:
[..]
[..] And the (main) author of Cyclone provides an insightful overview of the behaviour of a (hypothetical) dds_publish(..) in this comment.
I tried replacing the qos line above with:
rclcpp::SensorDataQoS qos;
and set RMW_FASTRTPS_PUBLICATION_MODE=ASYNCHRONOUS and for good measure also provided an XML file, but the call is still blocking. I'll next try the non_blocking_send flag as suggested in the answer.