Is transforming (tf) a point cloud faster than transforming the points one by one?
I have about 600 points with the same time stamp and frame id. I am transforming them one by one, multiplying each point by a transform obtained beforehand with lookupTransform:
    tf::StampedTransform t;  // filled once by lookupTransform(), outside the loop
    for (size_t i = 0; i < points.size(); ++i)
    {
        btVector3 p = points[i];
        btVector3 q = t * p;
        points[i] = q;
    }
This is taking quite some time, which is a problem because these points come from the Velodyne, so I have to do it again and again for millions of them.
I am wondering if there would be an advantage in throwing them all in a point cloud and transforming the whole cloud at once.
I saw that tf::TransformListener::transformPointCloud uses boost::numeric::ublas in the background: it constructs one big matrix with all the points and multiplies it by the transform. See http://www.ros.org/doc/api/tf/html/c++/transform__listener_8cpp_source.html#l00288
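To make the "one big matrix" idea concrete, here is a rough sketch in plain C++ of what I understand that approach to be. It is not the tf code and does not use ublas: Mat4 and multiplyHomogeneous are hypothetical names of mine, and the 4 x n layout (rows of x, y, z and ones) is my assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical type: a row-major 4x4 homogeneous transform.
typedef std::array<float, 16> Mat4;

// points is a row-major 4 x n matrix:
// row 0 = all x, row 1 = all y, row 2 = all z, row 3 = all ones.
// Returns m * points, again as a row-major 4 x n matrix.
std::vector<float> multiplyHomogeneous(const Mat4& m,
                                       const std::vector<float>& points,
                                       std::size_t n)
{
    assert(points.size() == 4 * n);
    std::vector<float> out(4 * n, 0.0f);
    for (std::size_t row = 0; row < 4; ++row) {
        for (std::size_t k = 0; k < 4; ++k) {
            const float a = m[row * 4 + k];
            // The inner loop walks both buffers contiguously.
            for (std::size_t col = 0; col < n; ++col) {
                out[row * n + col] += a * points[k * n + col];
            }
        }
    }
    return out;
}
```

Note that this form also spends work on the bottom row of the transform (0, 0, 0, 1), which only reproduces the row of ones.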
I also saw that joq's code for his velodyne driver is using pcl_ros to transform the cloud, which ends up calling pcl::transformPointCloud that can be seen here: http://docs.pointclouds.org/trunk/transforms_8hpp_source.html#l00042
In the end, you can't avoid multiplying each point by the transform matrix, so it's only a question of computational efficiency. Which would be more efficient, then: pcl_ros or ublas, and why?
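For reference, this is the minimum per-point work I have in mind, as a plain C++ sketch with no tf, PCL or ublas involved. RigidTransform and transformPointsInPlace are hypothetical names, and the packed x,y,z float buffer is my assumption about how the points could be stored. Each point costs nine multiplications and nine additions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a rigid transform (not the tf or Bullet type):
// a row-major 3x3 rotation plus a translation.
struct RigidTransform {
    float r[9];
    float t[3];
};

// Transforms a packed x,y,z,x,y,z,... buffer in place.
void transformPointsInPlace(const RigidTransform& m, std::vector<float>& xyz)
{
    assert(xyz.size() % 3 == 0);
    // Read the twelve coefficients once so the loop body is plain arithmetic.
    const float r00 = m.r[0], r01 = m.r[1], r02 = m.r[2];
    const float r10 = m.r[3], r11 = m.r[4], r12 = m.r[5];
    const float r20 = m.r[6], r21 = m.r[7], r22 = m.r[8];
    const float tx = m.t[0], ty = m.t[1], tz = m.t[2];
    for (std::size_t i = 0; i < xyz.size(); i += 3) {
        const float x = xyz[i], y = xyz[i + 1], z = xyz[i + 2];
        xyz[i]     = r00 * x + r01 * y + r02 * z + tx;
        xyz[i + 1] = r10 * x + r11 * y + r12 * z + ty;
        xyz[i + 2] = r20 * x + r21 * y + r22 * z + tz;
    }
}
```

Whether a library call beats a loop like this presumably depends on memory layout and on how well the compiler vectorizes it, which is why I am asking.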
Of course, I will find out if I benchmark all of those solutions, but I am hoping that someone will save me that pain...
Also I have been thinking about GPUs these days. Is there a GPU implementation of this? Would that make sense? I guess it would for large enough point clouds...