feat(win/video): add support for recombined YUV444 encoding #2760

Closed (wants to merge 1 commit)

Conversation


@ns6089 commented Jun 26, 2024

Description

The continuation of #2533. It's possible to emulate YUV 4:4:4 on GPUs that don't support it natively by doubling the YUV 4:2:0 pixel count and running custom recombination shaders on both the encoding and decoding sides, as Microsoft did in MS-RDPEGFX.
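For intuition, the sample counts line up exactly; a quick illustrative sketch of the arithmetic (not code from this PR):

    // Per-frame sample counts: a W x H YUV 4:4:4 frame carries exactly as many
    // samples as a W x 2H YUV 4:2:0 frame, which is why doubling one dimension
    // of the 4:2:0 surface is enough to carry the full 4:4:4 signal.
    constexpr long long samples_444(long long w, long long h) {
        return 3 * w * h;                      // Y + U + V, all full resolution
    }
    constexpr long long samples_420(long long w, long long h) {
        return w * h + 2 * (w / 2) * (h / 2);  // Y full, U and V quarter resolution
    }
    static_assert(samples_444(2560, 1440) == samples_420(2560, 2 * 1440), "same sample budget");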

Prototype stage. Requires changes on Moonlight's side: I currently have a custom libplacebo mpv shader implemented for the plvk backend; in the future it should be possible to add Direct3D11 and OpenGL shaders.

https://github.com/ns6089/Sunshine/compare/yuv444..yuv444in420

moonlight-common-c pull request: TBD
moonlight-qt pull request: TBD, testing branch https://github.com/ns6089/moonlight-qt/tree/yuv444in420

What works and what doesn't

  1. First prototype: left halves of the U_src and V_src planes go into Y_out. Good DCT, bad motion compensation.
  2. Second prototype: U_src goes into Y_out; V_src is spread across U_out and V_out in a pattern that is spatially consistent with Y_out. Good motion compensation, relatively fat DCT on U_out and V_out due to high frequencies.
  3. Third prototype, dropped. Maybe the DCT can be slightly improved by running 1/4 of V through an averaging low-pass filter.

To Do

  • decide what to do with resolutions not divisible by 2 (one illustrative option is sketched after this list)
  • decide in which part of the protocol the dimension doubling takes place, e.g. whether the client requests the doubled dimension or it is done implicitly
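For the first item above, one conceivable option (purely illustrative, nothing in this branch commits to it) would be to pad the capture to even dimensions before packing and crop back on the client:

    // Purely illustrative: pad the source to even dimensions before packing,
    // then crop back on the client. Names here are hypothetical, not from the PR.
    struct extent_t {
        int width;
        int height;
    };

    constexpr extent_t pad_to_even(extent_t e) {
        return { e.width + (e.width & 1), e.height + (e.height & 1) };
    }
    // pad_to_even({1366, 768}) -> {1366, 768}  (already even, unchanged)
    // pad_to_even({1367, 769}) -> {1368, 770}  (rounded up, crop on the client)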

Screenshot

before (screenshot)
after (screenshot)

Issues Fixed or Closed

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Dependency update (updates to dependencies)
  • Documentation update (changes to documentation)
  • Repository update (changes to repository files, e.g. .github/...)

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated the in code docstring/documentation-blocks for new or existing methods/components

Branch Updates

LizardByte requires that branches be up-to-date before merging. This means that after any PR is merged, this branch must be updated before it can be merged. You must also enable "Allow edits from maintainers."

  • I want maintainers to keep my branch updated


mirh commented Jun 27, 2024

Awesomely crazy.
Could there be anything worth doing with an emulated 4:2:2 stream then? Like, I don't know, slightly lower recombination overhead, or lower bandwidth requirements?

Or perhaps not hitting encoding limits at higher resolutions. Like, is my understanding correct that this pixel doubling would not allow for 1440p on (say) older VCE versions that max out at 4K?


ns6089 commented Jun 27, 2024

I don't think anyone but Intel supports 4:2:2. As for the 4K limit, 1440p might still work depending on how exactly said limit is implemented; the overall pixel count stays within the 4K range.
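For concreteness, a quick illustration of both possible interpretations of such a cap (the numbers are examples, not read from any particular driver):

    // Illustrative only: whether doubled-height 1440p fits under a "4K" encoder
    // limit depends on how that limit is enforced. 4096x2160 is used as an
    // example cap, not a claim about any specific GPU generation.
    #include <cstdio>

    int main() {
        const long long w = 2560, h = 1440 * 2;      // doubled-height 1440p
        const long long cap_w = 4096, cap_h = 2160;  // example "4K" cap

        const bool fits_pixel_budget  = w * h <= cap_w * cap_h;        // 7,372,800 <= 8,847,360 -> true
        const bool fits_per_dimension = (w <= cap_w) && (h <= cap_h);  // 2880 > 2160 -> false

        std::printf("pixel budget: %s, per-dimension: %s\n",
                    fits_pixel_budget ? "fits" : "too big",
                    fits_per_dimension ? "fits" : "too big");
    }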


(screenshot: amd)


mirh commented Jun 27, 2024

I don't think anyone but Intel supports 4:2:2.

To be honest, I was thinking more of TVs than computers here. It's a mixed bag even there, but it's still not so rare.
But now that you mention PCs, decoding is much lighter on the CPU than encoding, so I don't think that would usually be a deal breaker. Or, failing that, couldn't the client-side recombination just pretend to be 4:4:4 then? Or would whatever empty padding you add ruin the image more than the results you could get with just plain 4:2:0?

As for the 4K limit, 1440p might still work depending on how exactly said limit is implemented; the overall pixel count stays within the 4K range.

You mean if the limit is actually implemented as 4096x2160 (usual on old AMD) vs 4096x4096 (usual on old NVIDIA)?
Or can you really call it a day as long as the supported total pixel count, whatever the "shape", is 7,372,800 (2560x1440x2) or more?


ns6089 commented Jun 27, 2024

You can't encode 4:2:2 on NVIDIA GPUs, and implementing a path exclusively for Intel would be too expensive.

Or can you really call it a day as long as the supported total pixel count, whatever the "shape", is 7,372,800 (2560x1440x2) or more?

I'm already calling it a day 😎
Doubling one dimension makes it possible to minimize discontinuities in motion estimation, in contrast to tiling. The current half-naive implementation, for example, has a single vertical motion-estimation "seam" in the U and V planes.

    //     Y       U     V
    // +-------+ +---+ +---+
    // |       | |   | |   |
    // |   Y   | |UR | |VR |
    // |       | |   | |   |
    // +---+---+ +---+ +---+
    // |   |   |
    // |UL |VL |
    // |   |   |
    // +---+---+


mirh commented Jun 27, 2024

You can't encode 4:2:2 on NVIDIA GPUs, and implementing a path exclusively for Intel would be too expensive.

You can't encode 4:4:4 on AMD GPUs either, and yet that's exactly what this PR is about, isn't it?


ns6089 commented Jun 27, 2024

Personally, I don't see the point in supporting recombination into 4:2:2.
It will still have visible artifacts while having computational overhead close to 4:4:4 and requiring a significant amount of additional development time. And this development time will be multiplied by the number of distinct clients.


mirh commented Jun 28, 2024

I mean, sure, of course this is already miraculous.
I was just trying to think outside the box (4:2:2 is still subpar, but with it even the worst-case scenario starts to be bearable).

If anything, I guess the improvement isn't that clear-cut, because unlike with a direct cable connection there are already compression artifacts anyway. So if 4:4:4 couldn't fit in some doubled 4:2:0 4K scenario, just lowering the resolution could also be a possible (and easily immediate) alternative?

@ns6089 force-pushed the yuv444in420 branch 4 times, most recently from fc48f22 to 3a1115d on June 30, 2024 08:50
@ns6089 force-pushed the yuv444in420 branch 2 times, most recently from 2cc6a6a to a4ffe24 on August 1, 2024 07:52
@ns6089 changed the title from "Support recombined YUV 4:4:4 encoding (Prototype, Windows-only for now)" to "feat(win/video): add support for recombined YUV444 encoding" on Aug 22, 2024

ns6089 commented Aug 25, 2024

The code in this pull request is Not a Contribution under the LizardByte Individual Contributor License Agreement.
The code in this pull request is shared under the GNU General Public License, Version 3.

The feature itself is complete on the Sunshine side.
