Projecting 3D points into pixel using image_geometry::Pinholecameramodel

asked 2022-01-09 17:00:30 -0600

KhalidOwlWalid
67 ●8 ●13 ●19

updated 2022-01-13 04:50:24 -0600

Currently, I have this set of problems where I am trying to simulate bounding boxes in the simulation, and the general idea that I have drawn out is basically:

Subscribe to ground truth which helps me to detect the object I am trying to find its bounding boxes inside gazebo
Currently, the position of the object as subscribed from the ground truth is with respect to the base_link. So, I want to find the transform between the base_link and the camera_link, which I could simply use a lookupTransform to find its translation.
Then, I would have to do some maths so that the position of the object is now with respect to the camera_link
From here, I am able to determine the object's 3d position with respect to the camera (camera is now the origin)
I could then project the 3d coordinates into pixel simply by using image_geometry::PinholeCameraModel::project3dToPixel

However, the problem that I am facing is that the projection from 3d coordinates into pixel coordinate is way off from the expected position (as in it is not in the frame of my camera altho the cone is right there). I cant seem to find where I did it wrongly. Here is a really simple code for me to test in my simulation

import rclpy
import time
import cv2

from cv_bridge import CvBridge, CvBridgeError
from rclpy.qos import qos_profile_sensor_data

from rclpy.node import Node
from sensor_msgs.msg import Image, CameraInfo
from image_geometry import PinholeCameraModel

class BoundingBox(Node):

     def __init__(self):
        super().__init__('boundary_box_node')

        self.model = PinholeCameraModel()

        # self.cone_position = (6.029119415283203, 2.77589750289917, 0.76)
        self.cone_position = (12.79, 3.012697458267212, 0.76)
        # self.cone_position = (0,0,1)
        self.need_info = True
        self.cam_info = self.create_subscription(CameraInfo, "zed/right/camera_info", self.info_callback, 
                              qos_profile_sensor_data)


        # while self.need_info:
        #      time.sleep(0.01)

        self.get_logger().info(f'Got info from right!')

        self.bridge = CvBridge()
        self.sub_image = self.create_subscription(Image, 'zed/right/image_rect_color', self.image_callback, 
                                  qos_profile_sensor_data)
        # self.pub_image = self.create_publisher('/image_out', Image, 10)

    def info_callback(self, info):

        if self.need_info:
            self.get_logger().info(f'Inside info callback')
            self.model.fromCameraInfo(info)
            self.need_info = False

    def image_callback(self, image):

        translated_cone_pixel = []

        u, v = self.model.project3dToPixel(self.cone_position)

        trannslated_cone_pixel.append((int(u), int(v)))

        try:
            cv_image = self.bridge.imgmsg_to_cv2(image, desired_encoding="bgr8")
            cv2.circle(cv_image, translated_cone_pixel[0], 10, (255,0,0))
           #cv2.rectangle(cv_image, corners[0],corners[1],(0,255,0),3)

            cv2.imshow('test', cv_image)
            cv2.waitKey(1)

        except CvBridgeError as e:
            print(e)


def main(args=None):
    rclpy.init(args=args)

    bounding_box = BoundingBox()

    try:
        while rclpy.ok():
            rclpy.spin_once(bounding_box)

    except KeyboardInterrupt:
        pass

    bounding_box.destroy_node()
    rclpy.shutdown()


if __name__ == '__main__':
     main()

In this code, I have extracted the cone's position myself and transform it manually, and I was just testing out only for the 3d projection part!

Note: For confidential reason, I had to omit my model in the first image, and change it to a cube.

The position of the cube inside gazebo is (-12.3647,10.2,0) and the cone ... (more)

edit retag flag offensive close merge delete

Comments

The values of self.cone_position seem rather suspect.. (12.79, 3.012697458267212, 0.76) means the point is 12.8 meters to the east and 3 meters to the south of the camera_optical_frame, is what I understand - in which case the point is way outside the view frustum of the camera.

JunTuck ( 2022-01-10 06:55:41 -0600 )edit

Hi, I have editted my question and included some diagrams to make it easier as to why I obtain those coordinates.

KhalidOwlWalid ( 2022-01-10 07:50:40 -0600 )edit

Thanks for the diagrams, they're very helpful.

In step 2, you wrote " So, I want to find the transform between the base_link and the camera_link ...". I believe the correct frame is camera_color_optical_frame, not camera_link. These are not the same.

In any case, you can obtain the correct frame by looking the CameraInfo message you subscribed to. Do $rostopic echo zed/right/camera_info; what is the frame_id of the header?

JunTuck ( 2022-01-11 14:13:36 -0600 )edit

Here is a part of the echo:

    header:
  stamp:
    sec: 23
    nanosec: 176000000
  frame_id: zed_left_camera_optical_frame
height: 720
width: 1280
distortion_model: plumb_bob

So the frame id is zed_left_camera_optical_frame

Thank you for pointing that out. The translation is still the same and testing it out still gives me a really odd pixel coordinate value.

KhalidOwlWalid ( 2022-01-12 14:46:40 -0600 )edit

I've edited and added a simple theory that I had in mind. Would be grateul if you could give some insights

KhalidOwlWalid ( 2022-01-12 15:08:03 -0600 )edit

So zed_left_camera_optical_frame is the correct frame, but I believe your math is still wrong. I'm guessing self.cone_position = (-3.012697458267212, 0.76, 12.79) gives you the correct result.

There is a simple sanity check. Can you run RVIZ and display TF? Specifically, observe zed_camera_link and zed_left_camera_optical_frame.

JunTuck ( 2022-01-12 19:58:17 -0600 )edit

Yes, you are absolutely correct about the cone_position. May I ask why is that specifically? I've edited my question to include the tf diagram

KhalidOwlWalid ( 2022-01-13 04:46:22 -0600 )edit

Ahh, I see. is it because the axis use for opencv is slightly different hence the change in axis as shown here camera info pipeline`

KhalidOwlWalid ( 2022-01-13 04:54:42 -0600 )edit

see more comments

answered 2022-01-12 18:27:37 -0600

KhalidOwlWalid
67 ●8 ●13 ●19

updated 2022-01-13 04:44:23 -0600

Actually, I manage to solve this issue. For some reason,

u, v = self.model.project3dToPixel(self.cone_position)

actually takes in (y,z,x) format instead of (x,y,z).

Now, I am able to get the right position of the pixel coordinates. I think I might have made a mistake when calculating the cone's coordinate (though I doubt it since visualizing it in gazebo confirms that I made the right calculation.

edit flag offensive delete link

Comments

No, this is incorrect. The function absolutely takes (x,y,z) as per its docstring.

JunTuck ( 2022-01-12 19:47:39 -0600 )edit

I now realises that the axis in gazebo and axes provided here is different hence the "odd" coordinate I am using CameraInfo

KhalidOwlWalid ( 2022-01-13 07:40:11 -0600 )edit

add a comment

Projecting 3D points into pixel using image_geometry::Pinholecameramodel

Comments

1 Answer

Comments

Question Tools

Stats

Related questions

Projecting 3D points into pixel using image_geometry::Pinholecameramodel edit

Comments

1 Answer

Comments

Question Tools

Stats

Related questions

Projecting 3D points into pixel using image_geometry::Pinholecameramodel