Point cloud to depth image: optimization?
Hi everyone,
I'm generating a depth image from a point cloud by means of a pinhole camera model. For each point in the point cloud I calculate the u,v coordinates in the target image and the depth value. That's slow and generates a relatively high CPU load. Do you have an idea how I can optimize this process?
Reducing the resolution of the target depth image is not an option as I need it in its full glory. It might be possible to offload the process to a GPU, but I don't know how, or whether it would help at all, since it would require sending the full point cloud to the GPU and then transferring the image back. I'm not sure about the additional overhead.
Do you see any possible way to optimize the process on the CPU without reducing the input and output data density?
The node generating the depth image is already a nodelet written in C++.
Thanks
If you show us your code we may be able to point out some low/medium-level optimisations. Fundamentally, though, there's no way around doing the u,v calculation for each point. If a significant number of the points are behind the camera, filtering them out may help too.
I'm still optimizing and beautifying the code. I haven't had the time yet to bring it into shape so that I can publish it, but I will as soon as my schedule allows. Until then, I've managed to optimize it a little by skipping points and down/upscaling the image. Not beautiful, but it helped.
Fair enough. I can't say much without looking at your code, but memory allocation and access patterns are probably key to optimising an algorithm like this. Also, are you using OpenCV to perform the projection, or your own code?
Yeah, probably... Yes, I'm using OpenCV.
If you're using cv2.projectPoints, are you using a single call to project the whole point cloud at once, or a call for each point? The performance difference is significant.
Currently I'm iterating over all points one by one, calculating u,v image coordinates for each of them. Does cv2.projectPoints consider depth/distance from the camera? I need a depth image as the outcome.
You may be able to use tools like batch processing in TBB (assuming you're working on a reasonably capable machine) to parallelize the loop so you don't have to go through a GPU. TBB decides the chunk sizes for each thread and how many threads to spin up. That will probably give you around a 5x speedup.
cv2.projectPoints just calculates the u,v coordinates, but you can do another pass to calculate the depth easily enough. I noticed a several-fold speedup when I processed a point cloud in a single call to this function. Give this a try for starters and see what difference it makes.