# Jello

A WIP video client for Jellyfin.
## (Planned) Features
- Integrate with Jellyfin
- HDR video playback
- Audio Track selection
- Chapter selection
## Libraries and frameworks used
- iced -> primary GUI toolkit
- gstreamer -> primary video + audio decoding library
- wgpu -> rendering the video from gstreamer in iced
## HDR
I'll try to document all my findings about HDR here.

I'm making this project mainly to learn about video, color spaces and GPU programming, so very obviously I'm bound to make mistakes in either the code or my fundamental understanding of a concept. Please don't take anything in this text as absolute.
```rust
let window = ...; // use winit to get a window handle, check the example in this repo
let instance = wgpu::Instance::default();
let surface = instance.create_surface(window).unwrap();
let adapter = instance
    .request_adapter(&wgpu::RequestAdapterOptions {
        power_preference: wgpu::PowerPreference::default(),
        compatible_surface: Some(&surface),
        force_fallback_adapter: false,
    })
    .await
    .context("Failed to request wgpu adapter")?;
let caps = surface.get_capabilities(&adapter);
println!("{:#?}", caps.formats);
```
This should print out all the texture formats the surface supports on your current hardware. Among these, the formats that support HDR (afaik) are:
```rust
wgpu::TextureFormat::Rgba16Float
wgpu::TextureFormat::Rgba32Float
wgpu::TextureFormat::Rgb10a2Unorm
wgpu::TextureFormat::Rgb10a2Uint // (unsure)
```
My display supports Rgb10a2Unorm so I'll be going forward with that texture format.
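As a concrete example, here is a minimal sketch (not necessarily how this repo ends up doing it) of picking Rgb10a2Unorm out of the reported capabilities, falling back to the first supported format:

```rust
// Prefer Rgb10a2Unorm when the surface supports it, otherwise fall back
// to the first format the surface reports.
let format = if caps.formats.contains(&wgpu::TextureFormat::Rgb10a2Unorm) {
    wgpu::TextureFormat::Rgb10a2Unorm
} else {
    caps.formats[0]
};
```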
Rgb10a2Unorm is still the same size as Rgba8Unorm (32 bits per pixel), but the bits are distributed differently in each of them:

```
Rgb10a2Unorm:
  R, G, B => 10 bits each (2^10 = 1024, [0..=1023])
  A       =>  2 bits      (2^2  = 4,    [0..=3])

Rgba8Unorm:
  R, G, B, A => 8 bits each (2^8 = 256, [0..=255])
```
For displaying videos the alpha component is not really used (I don't know of any video that uses it), so we can re-allocate the 6 bits freed from the alpha channel and give 2 extra bits to each of the R, G and B components. In the shader the components get uniformly normalized from the [0..=1023] integer range to [0.0..=1.0] floats, so we can compute with them properly.
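To make that concrete, the normalization a Unorm format implies looks like this (the GPU does this for us when the texture is read; the Rust version below is purely illustrative):

```rust
// What "Unorm" means for a 10-bit component: the raw integer value is
// divided by the maximum representable value, yielding a float in [0.0, 1.0].
fn unorm10_to_f32(raw: u32) -> f32 {
    (raw & 0x3FF) as f32 / 1023.0
}
```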
Videos, however, are generally not stored in this format, or in any RGB format, because RGB is not as efficient for (lossy) compression as YUV formats. Right now I don't want to deal with YUV formats, so I'll use GStreamer caps to convert the video into the Rgb10a2 format.
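A rough sketch of that with the GStreamer Rust bindings might look like the following. The appsink variable and the pipeline around it are assumptions; RGB10A2_LE is the closest GStreamer format name I know of, and whether its bit order matches wgpu's Rgb10a2Unorm exactly is something to verify:

```rust
use gstreamer as gst;
use gstreamer_app as gst_app;

// Hypothetical setup: an appsink at the end of the pipeline with a
// videoconvert element somewhere before it to do the actual conversion.
fn request_rgb10a2(appsink: &gst_app::AppSink) {
    let caps = gst::Caps::builder("video/x-raw")
        .field("format", "RGB10A2_LE")
        .build();
    appsink.set_caps(Some(&caps));
}
```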
## Pixel formats and Planes
Dated: Sun Jan 4 09:09:16 AM IST 2026
| value | count | quantile | percentage | frequency |
|---|---|---|---|---|
| yuv420p | 1815 | 0.5067 | 50.67% | ************************************************** |
| yuv420p10le | 1572 | 0.4389 | 43.89% | ******************************************* |
| yuvj420p | 171 | 0.0477 | 4.77% | **** |
| rgba | 14 | 0.0039 | 0.39% | |
| yuvj444p | 10 | 0.0028 | 0.28% | |
These are the pixel formats across all the videos in my media collection.
### RGBA

Pretty self evident: 8 bits for each of the R, G, B and A channels.
Hopefully it shouldn't be too hard to make a function (or possibly a LUT) that takes RGBA data and maps it to Rgb10a2Unorm; see the sketch after the diagram below.
```mermaid
packet
title RGBA
+8: "R"
+8: "G"
+8: "B"
+8: "A"
```
### YUV

All YUV formats, including the 10 and 16 bit variants, follow a common naming scheme:

- Y -> luminance; U, V -> chrominance
- p -> planar; sp -> semi-planar
- j -> full range
Planar formats store each channel in its own contiguous array, one after another. In semi-planar formats the Y channel is separate while the U and V channels are interleaved in a single plane. A sketch of the planar layout is below.
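To make the planar layout concrete, here is a small sketch assuming 8-bit yuv420p with no row padding (real frames have per-plane strides, which GStreamer exposes through its VideoInfo API):

```rust
// yuv420p: a full-resolution Y plane followed by U and V planes that are
// subsampled 2x both horizontally and vertically.
fn yuv420p_plane_offsets(width: usize, height: usize) -> (usize, usize, usize) {
    let y_size = width * height;
    let chroma_size = (width / 2) * (height / 2);
    let y_offset = 0;
    let u_offset = y_size;               // U plane follows Y
    let v_offset = y_size + chroma_size; // V plane follows U
    (y_offset, u_offset, v_offset)
}
```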