Hi Nicolas,
It's possible to avoid that memcpy, but probably not useful.
Consider what happens in detail when you publish your image message to a separate node:
- You memcpy the data into a sensor_msgs/Image message.
- When you publish the sensor_msgs/Image, ROS serializes its fields into a buffer for inter-process transport: another memcpy.
- The subscriber node populates another sensor_msgs/Image by deserializing the received buffer. If the nodes are on the same machine (subscribed over loopback), I believe the kernel optimizes this pretty well, but it's another memcpy or two.
It's possible to reduce the first two steps to a single memcpy by defining your own custom image type that simply points to the DSIF->getRImage() data, and registering it with roscpp using message traits. Essentially, your custom type pretends to be sensor_msgs/Image (by registering the same MD5 sum) and serializes to exactly the same over-the-wire format as sensor_msgs/Image, so any subscribing nodes can't tell the difference. I can point you to examples of this if you want.
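To make the "single memcpy" idea concrete, here is a minimal standalone sketch. It does not use roscpp at all: in a real node you would hook this logic into roscpp's message traits and serializer for your custom type. ImageView, putU32, putString and serializeImageView are hypothetical names, and the field order is my understanding of the sensor_msgs/Image wire layout (little-endian integers, uint32 length prefix on strings and arrays), so check it against the message definition before relying on it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical non-owning view of a driver-owned image buffer.
// Field names mirror sensor_msgs/Image; nothing here depends on roscpp.
struct ImageView {
  uint32_t seq = 0;
  uint32_t stamp_sec = 0;
  uint32_t stamp_nsec = 0;
  std::string frame_id;
  uint32_t height = 0;
  uint32_t width = 0;
  std::string encoding;
  uint8_t is_bigendian = 0;
  uint32_t step = 0;              // bytes per row
  const uint8_t* data = nullptr;  // points at driver memory, not owned
};

// Append a uint32 in little-endian byte order.
inline void putU32(std::vector<uint8_t>& out, uint32_t v) {
  for (int i = 0; i < 4; ++i) {
    out.push_back(static_cast<uint8_t>((v >> (8 * i)) & 0xFF));
  }
}

// Append a string as a uint32 length followed by its bytes.
inline void putString(std::vector<uint8_t>& out, const std::string& s) {
  putU32(out, static_cast<uint32_t>(s.size()));
  out.insert(out.end(), s.begin(), s.end());
}

// Write the view in the assumed sensor_msgs/Image field order.
// The pixel payload is copied exactly once, straight from the driver buffer.
inline std::vector<uint8_t> serializeImageView(const ImageView& img) {
  const std::size_t data_len = static_cast<std::size_t>(img.step) * img.height;
  assert(img.data != nullptr || data_len == 0);

  std::vector<uint8_t> out;
  out.reserve(64 + img.frame_id.size() + img.encoding.size() + data_len);

  // Header: seq, stamp (secs, nsecs), frame_id
  putU32(out, img.seq);
  putU32(out, img.stamp_sec);
  putU32(out, img.stamp_nsec);
  putString(out, img.frame_id);

  // Image fields
  putU32(out, img.height);
  putU32(out, img.width);
  putString(out, img.encoding);
  out.push_back(img.is_bigendian);
  putU32(out, img.step);

  // data: uint32 length prefix, then the payload (the only pixel copy)
  putU32(out, static_cast<uint32_t>(data_len));
  if (data_len > 0) {
    out.insert(out.end(), img.data, img.data + data_len);
  }
  return out;
}
```

The point of the design is that ImageView never owns pixels, so the intermediate copy into a sensor_msgs/Image disappears; the price is that the driver buffer must stay valid until serialization finishes.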
However, after doing all that we're still talking about IPC and 2-3 memcpys at best. Another issue is that you can't directly use image_transport to provide compressed topics anymore, because image_transport currently understands only sensor_msgs/Image.
A different approach is to write your camera driver and processing nodes as nodelets, and load them all in the same process. In that case you skip the serialization bottleneck (steps 2 and 3) entirely. The driver publishes a shared_ptr<sensor_msgs::Image>, and that gets passed directly to the in-process subscriber nodelets. You still pay the cost of the initial memcpy, but that's it.
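The pointer-passing idea can be illustrated without ROS. The sketch below is only a stand-in for what the nodelet machinery does in-process: Image, InProcessTopic and makeImage are hypothetical names, not roscpp or nodelet APIs. It shows one copy from the driver buffer into the message, after which every subscriber callback sees the same buffer through a shared pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for sensor_msgs::Image (only the fields needed here).
struct Image {
  uint32_t height = 0;
  uint32_t width = 0;
  uint32_t step = 0;  // bytes per row
  std::vector<uint8_t> data;
};
using ImageConstPtr = std::shared_ptr<const Image>;

// Hypothetical in-process "topic": hands the same shared pointer to each
// subscriber, with no serialization and no further copies of the pixels.
class InProcessTopic {
 public:
  using Callback = std::function<void(const ImageConstPtr&)>;

  void subscribe(Callback cb) { callbacks_.push_back(std::move(cb)); }

  void publish(const ImageConstPtr& msg) const {
    assert(msg != nullptr);
    for (const auto& cb : callbacks_) {
      cb(msg);
    }
  }

 private:
  std::vector<Callback> callbacks_;
};

// Driver side: the single remaining copy, from the driver buffer into the message.
inline ImageConstPtr makeImage(const uint8_t* src, uint32_t height,
                               uint32_t width, uint32_t step) {
  assert(src != nullptr);
  auto msg = std::make_shared<Image>();
  msg->height = height;
  msg->width = width;
  msg->step = step;
  msg->data.assign(src, src + static_cast<std::size_t>(height) * step);
  return msg;
}
```

Because subscribers receive a pointer to a const message, none of them can modify the pixels the others are reading, which is what makes sharing a single buffer safe.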
There's a tutorial on porting nodes to nodelets that needs a little love, but gives you the idea. For a complete example of a nodelet-ized camera driver, see Diamondback camera1394.
So my advice is: live with the memcpy, and consider turning your driver into a nodelet if (and only if) message serialization is causing performance problems.
Ideally they would have a nodelet receiving the sensor_msgs::Image in a callback, so the serialization would be skipped. However, the initial copy from the driver is still wasted cycles. They ought to be able to create a custom allocator.