ARM architecture with armhf port: move_base bus error, alignment trap/exception
Hi Everyone,
I just ran into a very nasty bug when running move_base on an armhf (ArmHardFloatPort) ARM CPU (armv7l). I am running the latest Fuerte code on Ubuntu 12.04.
The only things I am running are roscore, move_base, and map_server. The problem happens when I send move_base a blank map. move_base prints "Still waiting on map..." and then, when I send it a map through the map_server, it immediately crashes with the following error: "Program received signal SIGBUS, Bus error".
The back-trace from GDB of move_base is as follows:
[ INFO] [1358269351.739542577]: Still waiting on map...
[ INFO] [1358269352.739512061]: Still waiting on map...
Program received signal SIGBUS, Bus error.
0xb6ee27fa in ros::SubscriptionCallbackHelperT<boost::shared_ptr<nav_msgs::OccupancyGrid_<std::allocator<void> > const> const&, void>::deserialize(ros::SubscriptionCallbackHelperDeserializeParams const&) () from /home/name/ros/navigation/costmap_2d/lib/libcostmap_2d.so
(gdb) bt
#0 0xb6ee27fa in ros::SubscriptionCallbackHelperT<boost::shared_ptr<nav_msgs::OccupancyGrid_<std::allocator<void> > const> const&, void>::deserialize(ros::SubscriptionCallbackHelperDeserializeParams const&) () from /home/name/ros/navigation/costmap_2d/lib/libcostmap_2d.so
#1 0xb6dc2d68 in ros::MessageDeserializer::deserialize() () from /opt/ros/fuerte/lib/libroscpp.so
#2 0xb6dbe2f4 in ros::SubscriptionQueue::call() () from /opt/ros/fuerte/lib/libroscpp.so
#3 0xb6d6b4e6 in ros::CallbackQueue::callOneCB(ros::CallbackQueue::TLS*) () from /opt/ros/fuerte/lib/libroscpp.so
#4 0xb6d6b0ac in ros::CallbackQueue::callAvailable(ros::WallDuration) () from /opt/ros/fuerte/lib/libroscpp.so
#5 0xb6da4a5a in ros::spinOnce() () from /opt/ros/fuerte/lib/libroscpp.so
#6 0xb6ebbbda in costmap_2d::Costmap2DROS::Costmap2DROS(std::string, tf::TransformListener&) () from /home/name/ros/navigation/costmap_2d/lib/libcostmap_2d.so
#7 0x00000000 in ?? ()
(gdb)
The kernel log (dmesg) shows the following error:
Alignment trap: not handling instruction ed967b05 at [<b6ee27f6>]
Unhandled fault: alignment exception (0x011) at 0xb14c202f
The problem seems to be directly related to the fact that it is an armhf system. x86 systems seem to tolerate unaligned memory accesses a lot better than ARM.
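For anyone unfamiliar with this class of fault: the address in the dmesg output (0xb14c202f) is odd, and the "not handling instruction" line suggests the kernel's fixup code does not emulate that particular instruction. One common way to end up here is reading a multi-byte value straight out of a byte buffer through a cast pointer. The sketch below shows the alignment-safe alternative using memcpy. The helper names are hypothetical and not part of roscpp, and whether this is exactly what the OccupancyGrid deserializer does is an assumption, not something the backtrace proves.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not a roscpp function): read a double from a byte
// buffer at any offset. memcpy makes no alignment assumption about its
// source, so this is well defined even when buf is not 8-byte aligned.
double read_double_unaligned(const std::uint8_t* buf)
{
  assert(buf != nullptr);
  double value;
  std::memcpy(&value, buf, sizeof(value));
  return value;
}

// Hypothetical counterpart: write a double into a byte buffer at any offset.
void write_double_unaligned(std::uint8_t* buf, double value)
{
  assert(buf != nullptr);
  std::memcpy(buf, &value, sizeof(value));
}

// The risky pattern, for comparison. It is undefined behaviour when buf is
// misaligned, and on ARM it can raise SIGBUS:
//   double value = *reinterpret_cast<const double*>(buf);
```

A serialized message packs its fields back to back, so a float64 field can start at any byte offset inside the receive buffer. That is why the cast version may work for one message layout and fault for another.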
According to the "/proc/cpu/alignment" file, the mode is set to "2", which means fixup. If I write a "3" to this file, the kernel also warns me about the alignment problems. Doing this and running other ROS nodes, I can see that it is fixing most of the alignment bugs that ROS has in it, but it is unable to fix this one for some reason.
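The number written to /proc/cpu/alignment is a bitmask. Assuming the bit layout described in the Linux kernel's ARM alignment notes (bit 0 = warn, bit 1 = fixup, bit 2 = send SIGBUS), the values "2" and "3" mentioned above decode as in this small sketch. The function name describe_alignment_mode is hypothetical, chosen only for illustration.

```cpp
#include <string>

// Hypothetical helper: turn an ARM alignment mode number into a readable
// description. Bit meanings are assumed from the kernel's ARM alignment
// documentation: 1 = warn, 2 = fixup, 4 = signal (SIGBUS).
std::string describe_alignment_mode(unsigned mode)
{
  std::string out;
  if (mode & 1u) out += "warn ";
  if (mode & 2u) out += "fixup ";
  if (mode & 4u) out += "signal ";
  if (out.empty()) return "ignored";
  out.erase(out.size() - 1);  // drop the trailing space
  return out;
}
```

With this layout, mode 2 is fixup only and mode 3 is fixup plus a warning in the kernel log, which matches the behaviour described above.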
Does anyone have any suggestions on how to fix this bug?
Thank you
UPDATE
To get past this problem, I had to edit the CMakeLists.txt file in the costmap_2d package. I added "set(ROS_BUILD_TYPE debug)" and recompiled it. move_base now gets to the point of asking for the "tf" from /base_link to /map.
Unfortunately, after I sent a static transform with the static_transform_publisher, it crashed with a similar but different bus error.
Here is the GDB back trace of the new error:
[ WARN] [1358354683.016454689]: Waiting on transform from /base_link to /map to become available before running costmap, tf error:
[ WARN] [1358354688.094640723]: Waiting on transform from /base_link to /map to become available before running costmap, tf error:
Program received signal ...
I just found this posted on Ros-users a couple days ago. I have not tested it but it might help: http://ros-users.122217.n3.nabble.com/ros-on-armhf-beagleboard-pandaboard-raspberryPi-gumstix-error-alignment-fault-kernel-exception-possid-td4019706.htm
Seems like these guys had similar problems to this one, but no real answer: http://answers.ros.org/question/12135/bus-error-on-armv7l/ OR http://answers.ros.org/question/34976/illegal-instruction-in-serialize-on-gumstix/ OR http://ros-users.122217.n3.nabble.com/Ros-on-Arm-Hanging-td934867.html
There seems to be a patch that was put in, and should be released on the next spin of ros_comm. Here is the link: https://github.com/ros/roscpp_core/pull/8
Have you tried that patch? I haven't seen much feedback on it, I hope that means it works well ;)