In tod_detecting (I think we tell about this transformation) after running recognizer->match you get vector of guesses. Each guess contains PoseRT (rvec, tvec) field (aligned_pose). It is transformation from some abstract coordinate system (defined by fiducial.yaml configuration file: in this system coordinates of corners of fiducial markers must be equals described ones in config) to coordinate system of camera for test image.
If you want to find projection of this guess to the test image, you should take any observation of this object from training base(recommend to choose observation with ImageId equals imageIndex from guess->imageIndices). Next you should transform 3d points related with the observation to the abstract coord. system using inverse features3d->camera->pose transformation (for example with using Tools::invert and project3dPoints). Then you should transform obtained 3d points with guess->aligned_pose and project them to the test image (for this purpose you should know intrinsics parameters of the camera, now we assume, that test and train images was photographed with the same camera; we use projectPoints function for this purpose).
Let consider a case of solvePnP or solvePnPRansac:
void solvePnP(const Mat& objectPoints, const Mat& imagePoints, const Mat& cameraMatrix, const Mat& distCoeffs, Mat& rvec, Mat& tvec, bool useExtrinsicGuess=false)
The function returns rvec and tvec. This transformation converts objectPoints to the coordinate system of the camera such a way, that re-projection error between imagePoints and projected points after transformation would be a minimal. In other words we try to find rvec-tvec pair to minimize distance between projection (u, v) of transformed point (x,y,z) calculated on (X,Y,Z) point from objectPoints and corresponding point (x, y) from imagePoints. ki and pi are from distCoeffs, fx, fy, cx, cy are from cameraMatrix.
See attached image: