Hello everyone.

We have decided to phase out the Mixed Reality Forums over the next few months in favor of other ways to connect with us.

The first way we want to connect with you is our mixed reality developer program, which you can sign up for at https://aka.ms/IWantMR.

The plan between now and the beginning of May is to clean up old, unanswered questions that are no longer relevant. The forums will remain open and usable.

On May 1st we will be locking the forums to new posts and replies. They will remain available for another three months for the purposes of searching them, and then they will be closed altogether on August 1st.

So, where does that leave our awesome community to ask questions? Well, there are a few places we want to engage with you. For technical questions, please use Stack Overflow, and tag your questions using either hololens or windows-mixed-reality. If you want to join in discussions, please do so in the HoloDevelopers Slack, which you can join by going to https://aka.ms/holodevelopers. And always feel free to hit us up on Twitter @MxdRealityDev.

Locatable Camera - Augmented Reality Rendering

Hi everybody,

I am trying to do augmented reality on HoloLens using my own marker detector, just as described in the Locatable Camera documentation https://developer.microsoft.com/en-us/windows/holographic/locatable_camera (in the paragraph Tag / Pattern / Poster / Object Tracking).

I am failing to draw precise spot holograms at the corners of my marker (for now it is just a chessboard).

Sequentially: I take a photo, from which I get the projection and camera-to-world matrices using the PhotoCaptureFrame as:

photoCaptureFrame.TryGetCameraToWorldMatrix(out cameraToWorldMatrix);
photoCaptureFrame.TryGetProjectionMatrix(out projectionMatrix);
With OpenCV's solvePnP I can easily retrieve the rotation and translation from the chessboard to the camera; the problem is that the projection matrix from HoloLens is quite strange:
1. The focal lengths and center coordinates are not in pixels but normalized to the range -1..1
2. The y axis is flipped
3. The (3,3) element is -1; what is that?

So, before calling solvePnP, I adjust the pixel coordinates of the chessboard and invert the sign of ccx and ccy (this is to take into account that -1 in position (3,3)), like this:

vector<Point2d> ptI_trick;
for (size_t i = 0; i < ptI.size(); ++i)
{
    float xp = ptI.at(i).x;
    float yp = ptI.at(i).y;
    // Normalize pixel coordinates to [0, 1], flipping y
    Point2d image_pos_to_zero = Point2d(xp / w, 1.f - yp / h);
    // Map to the [-1, 1] range used by the HoloLens projection matrix
    Point2d image_pos_projected = Point2d(image_pos_to_zero.x * 2.f - 1.f, image_pos_to_zero.y * 2.f - 1.f);
    ptI_trick.push_back(image_pos_projected);
}
// Invert the sign of ccx and ccy to account for the -1 at (3,3)
for (size_t i = 0; i < projectionMatrix.rows - 1; ++i)
{
    projectionMatrix.at<double>(i, 2) = -1. * projectionMatrix.at<double>(i, 2);
}

With rotation and translation from chessboard to camera, I get the chessboard points in camera coordinates.

The camera's x axis points right and its y axis points up, so its z axis points backward; therefore I expect the z coordinates to be negative. But they are not… maybe it is because of that y flip above. Anyway, I think it is fixable by changing the sign of the third element of the translation vector; this way, the chessboard's z coordinates in the camera system are negative (correct).

Then I apply the camera-to-world transformation (and correctly get positive z coordinates) and plot the four corners in Unity with this:

Sphere1.transform.position = new Vector3(-0.08335f, 0.2276f, 0.9028f); // x, y, z
Sphere2.transform.position = new Vector3(-0.01532f, 0.2321f, 0.9186f); // x, y, z
Sphere3.transform.position = new Vector3(-0.08395f, 0.1321f, 0.9325f); // x, y, z
Sphere4.transform.position = new Vector3(-0.01591f, 0.1366f, 0.9484f); // x, y, z
The result is this:

Unfortunately, the spheres are not aligned at the desired positions (red points). Does anyone have any idea about this? Do you think the problem is in the estimation of the coordinates or in the rendering in Unity?
The misalignment is quite strong, much bigger than 10 pixels; I say this because I cannot believe this error falls within the scope of this statement:

Distortion Error
[…] however the undistortion function in the image processor may still leave an error of up to 10 pixels when using the projection matrix in the frame metadata. In many use cases, this error will not matter, but if you are aligning holograms to real world posters/markers, for example, and you notice a <10px offset (roughly 11mm for holograms positioned 2 meters away) this distortion error could be the cause.

Thank you!

Answers

  • I think I can answer some of this.

    First, make sure you're setting the focus point/plane to the area you want to optimize the alignment of.

    Anyway, I'm currently in the process of doing projective texture mapping of the camera image onto the spatial mesh, but because of the distortion error (which, despite the documentation's claim of being <10px, is noticeable at the levels you're seeing as well), I'm trying to refine the camera model by using OpenCV to find some distortion coefficients.

    To do this, I have to take the provided HoloLens projection matrix, in the form of:
    a 0 b 0
    0 c d 0
    0 0 -1 0
    0 0 -1 0

    ... and figure out what the matrix should be in the OpenCV format of:
    fx 0 cx
    0 fy cy
    0 0 1

    From working backwards through the "Painting the world" shader example (https://developer.microsoft.com/en-us/windows/holographic/locatable_camera#painting_the_world_using_a_camera_shader), I determined what the values of fx, fy, cx, and cy should be to result in the same transformation, given that we know a, b, c, and d.

    cx = half_width + half_width * b
    fx = a * half_width
    cy = half_height + half_height * d
    fy = c * half_height

    (where half_width and half_height are half of the image resolution in pixels)

    Note: The result of using this projection matrix in the typical OpenCV way will still require one more step at the end. Assuming you get a value (x,y) that is the normalized [0 to 1] coordinates of a location on the camera image, you have to finally change it to (1-x, 1-y). The reason this final transformation isn't incorporated into the definition of fx/fy/cx/cy above is that it would result in a negative value for fx and fy, which OpenCV doesn't like.

    My point with this is that I believe the offset you are seeing is in fact the result of the projection error. I recommend checking whether the offset of your markers relative to the real world swims around as the camera moves left/right/up/down. If so, that may indicate radial distortion. I've found some marginal success by using the HoloLens projection matrix, converting it into an OpenCV-friendly format, then using it as a fixed value to solve for distortion coefficients.

  • Hello @xhensiladoda
    I have the same problem.
    How did you solve it?
