Running complex compute shaders on HoloLens

dimatomp
edited September 2017 in Questions And Answers

Hello everyone,

I've got a compute shader that works on PC (both in the Unity Editor and in a Windows Store build). The binary blob is 20608 bytes, 756 instructions.
The problem is that the shader does not run on HoloLens, and I cannot figure out which difference between the HoloLens GPU and desktop GPUs prevents it from running.
It cannot be a shader size limitation, because I have another shader with a binary size of 30580 bytes that runs successfully on the device.
Can you suggest any way to troubleshoot this kind of problem? Does HoloLens lack support for certain shader bytecode instructions?

Thanks in advance. I have attached the Unity shader code.

Best Answers

  • dimatomp
    Answer ✓

    Finally, I got it working! In case someone else faces this problem, here are a few suggestions:
    1. It appears that HoloLens may fail to run a shader containing a 'loop' bytecode instruction. This instruction appears when a 'for' or 'while' loop in the HLSL source cannot be unrolled. For example, in the following piece of code, 'scanLineCount' may be a variable instead of a constant, in which case the compiler cannot determine how many times the body of the 'for' loop will be executed:

    for (uint i = 0; i < scanLineCount - 1; i++)
        for (uint j = i + 1; j < scanLineCount; j++)
            if (scanLine[i].Y > scanLine[j].Y)
            {
                OpenCloseEvent tmp = scanLine[i];
                scanLine[i] = scanLine[j];
                scanLine[j] = tmp;
            }
    

    The easiest fix is to insert 'unroll' attributes before each affected loop as follows:

    [unroll(5)]
    for (uint i = 0; i < scanLineCount - 1; i++)
        [unroll(5)]
        for (uint j = i + 1; j < scanLineCount; j++)
            if (scanLine[i].Y > scanLine[j].Y)
            {
                OpenCloseEvent tmp = scanLine[i];
                scanLine[i] = scanLine[j];
                scanLine[j] = tmp;
            }
    

    where 5 is the maximum number of iterations to unroll, so it should be no smaller than the largest count the loop can reach.
    2. This may also be caused by nested conditionals, as @trzy has mentioned. One special case of a nested conditional looks like this:

    void foo(bool cond, int data)
    {
        if (cond)
            return;
        // ... here goes some complicated job ...
    }
    

    The compiler will treat this construct as follows:

    void foo(bool cond, int data)
    {
        if (!cond)
        {
             /* ... the previous complicated job, 
             now with nested conditionals ... */
        }
    }
    

    And that's when you may run into the issue. Maybe some low-end GPUs just don't like long 'if' bodies, so they are best avoided.

    I have also attached the fixed version of my shader.
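
As a rough illustration of the first suggestion, here is a minimal sketch of the same sort with constant loop bounds and the runtime count moved into the comparison. It is written in plain C so it can be run on the CPU, and is not the attached shader. `MAX_SCAN_LINES`, `sort_scan_line`, and the one-field `OpenCloseEvent` are assumptions made up for this example.

```c
/* Hypothetical compile-time cap, playing the role of the 5 in [unroll(5)]. */
#define MAX_SCAN_LINES 5

/* Simplified stand-in for the thread's OpenCloseEvent; only the Y key is modeled. */
typedef struct { int Y; } OpenCloseEvent;

/* Sorts the first scanLineCount entries by Y and leaves the rest untouched.
   Both loops have constant trip counts, so a compiler can fully unroll them;
   the runtime count only appears inside the guard condition. */
void sort_scan_line(OpenCloseEvent scanLine[MAX_SCAN_LINES], unsigned scanLineCount)
{
    for (unsigned i = 0; i < MAX_SCAN_LINES - 1; i++)
        for (unsigned j = i + 1; j < MAX_SCAN_LINES; j++)
            /* i < j always holds, so j < scanLineCount keeps both indices in the live range. */
            if (j < scanLineCount && scanLine[i].Y > scanLine[j].Y)
            {
                OpenCloseEvent tmp = scanLine[i];
                scanLine[i] = scanLine[j];
                scanLine[j] = tmp;
            }
}
```

Calling `sort_scan_line` on the Y values {4, 1, 3, 9, 0} with a count of 3 sorts only the first three entries. A side effect of this form is that a count of 0 is harmless, whereas `scanLineCount - 1` in the original loop wraps around for an unsigned zero.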

  • dimatomp
    Answer ✓

    Finally, I've got it working! In case someone else faces this problem, here are a few suggestions:
    1) The problem may be caused by a 'loop' instruction in the shader blob, which means that either some of your 'for' loops did not get unrolled or you have a 'while' loop in the HLSL source. The simplest fix is to insert an 'unroll' attribute before each problematic loop. I don't know whether HoloLens limits 'loop' instruction usage in any way.
    2) Another issue may be nested conditionals, as @trzy has pointed out. One special case of a nested conditional looks like this:

    void foo(bool cond, int data)
    {
        if (!cond)
            return;
        // ... some complex job goes here ...
    }
    

    The compiler will treat the construct above as follows:

    void foo(bool cond, int data)
    {
        if (cond) 
        {
            // ... the same complex job, now wrapped into 'if' ...
        }
    }
    

    And that's when you may run into the problem. Maybe some low-end GPUs just don't like long 'if' bodies, so they are best avoided.

    I have also attached the fixed version of my shader.
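
One possible way to keep an 'if' body short, sketched here in plain C so it can be run on the CPU, is to do the work unconditionally and select the result afterwards. This is a different technique from the fix described above, and it only applies when the work has no side effects. `complex_job`, `foo_branchy`, `foo_flat`, and the `fallback` parameter are hypothetical names introduced for this example.

```c
/* Hypothetical stand-in for the "complex job": any side-effect-free computation. */
int complex_job(int data)
{
    return data * data + 1;
}

/* Early-return form from the thread: everything after the return
   effectively ends up inside one long conditional body. */
int foo_branchy(int cond, int data, int fallback)
{
    if (!cond)
        return fallback;
    return complex_job(data);
}

/* Flattened form: the job always runs and a single select picks the result,
   so no long body sits inside a branch. */
int foo_flat(int cond, int data, int fallback)
{
    int result = complex_job(data);
    return cond ? result : fallback;
}
```

Both functions return the same value for every input. The trade-off is that the flattened form pays for the job even when the result is discarded. In HLSL, a ternary on scalar values typically compiles to a select rather than a branch, but it is worth checking the disassembly to confirm.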

Answers

  • The recommendation from Microsoft is just to use the shaders in the HoloToolkit (https://github.com/kaorun55/HoloLens-Samples/tree/master/Unity/PlaneFindingDemo/Assets/HoloToolkit/Utilities/Shaders)

    We went the other way and tried to modify one of the HoloToolkit shaders to improve our performance, but it is very difficult. There is no documentation on which values are supported and which defines you can or can't use.

  • @Peter_NZ said:
    The recommendation from Microsoft is just to use the shaders in the HoloToolkit (https://github.com/kaorun55/HoloLens-Samples/tree/master/Unity/PlaneFindingDemo/Assets/HoloToolkit/Utilities/Shaders)

    Thanks for the advice, but I'm talking about compute shaders (https://docs.unity3d.com/Manual/ComputeShaders.html). I am trying to move certain parts of my general-purpose computation to GPU and perform them in parallel. HoloLens' GPU is DirectX 11 compatible and supports this functionality to some extent (see http://www.emergingexperiences.com/blog-entries/2016/6/24/hololens-gpu-experiments).

  • @trzy Great, thanks for the idea! I'll try it soon.

  • It turns out that simply unrolling the conditions does not help in my case. I've already split the shader into two parts and unrolled all loops manually in the first one. Still trying...

  • trzy ✭✭✭

    Have you been able to rule out complexity as a culprit? Is there any way you can simplify the shader -- even if it becomes essentially non-functional -- to a point that it just starts to run again?

  • hey dimatomp,
    Could you post an example .cs that exercises this shader? I was able to run the shader on my HoloLens with an edit that I made in order to prove to myself that the shader was running. I'm calling BorderedTriangle_ClipShiftSpatialize_Submit from VerticalInterval_BodyTriangles_ClipShiftSpatialize because I haven't been able to synthesize data that triggers BorderedTriangle_ClipShiftSpatialize_Submit through the VerticalInterval_BodyTriangles_Shader path.

  • trzy ✭✭✭

    Nice! Glad you got it sorted and have posted this for future reference. I do recall that in fragment shaders, if "discard" was used too often, the shader wouldn't run. Frustratingly, the shader compilers don't warn you or error out, and just silently proceed.
