ChanServ changed the topic of #wayland to: https://wayland.freedesktop.org | Discussion about the Wayland protocol and its implementations, plus libinput
<zamundaaa[m]>
tokyovigilante: there's unfortunately no proper way to actually get that information right now, you have to dig around in sysfs
<zamundaaa[m]>
cat /sys/kernel/debug/dri/1/crtc-0/amdgpu_current_bpc works for amdgpu
<tokyovigilante>
thanks, amd on amd - cat /sys/kernel/debug/dri/0/crtc-2/amdgpu_current_bpc
<tokyovigilante>
Current: 10
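The two paths quoted above differ only in the DRI minor and the CRTC index, so a short script can scan all of them. This is a minimal sketch, assuming the file holds a line of the form `Current: 10` as in the output above, that debugfs is mounted at /sys/kernel/debug, and that the script runs with enough privilege to read it. The glob pattern and the function names are illustrative, not part of any amdgpu tooling.

```python
import glob
import re

# Assumed layout, generalized from the two paths quoted in the log.
BPC_GLOB = "/sys/kernel/debug/dri/*/crtc-*/amdgpu_current_bpc"


def parse_current_bpc(text):
    """Return the integer after 'Current:', or None if no such line exists."""
    match = re.search(r"^Current:\s*(\d+)", text, re.MULTILINE)
    return int(match.group(1)) if match else None


def scan_current_bpc(pattern=BPC_GLOB):
    """Map each readable debugfs file to its reported bits per component."""
    results = {}
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as f:
                results[path] = parse_current_bpc(f.read())
        except OSError:
            # debugfs is normally root-only; skip files we cannot read.
            continue
    return results


if __name__ == "__main__":
    for path, bpc in scan_current_bpc().items():
        print(f"{path}: {bpc}")
```

CRTCs that are not driving a display may report nothing useful, so a `None` value here should be read as "no answer" rather than as an error.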
<DemiMarie>
kennylevinsen: robclark also deals with virtio-GPU (both Venus and native context), and it is not possible to virtualize syncobjs right now
<robclark>
to be clear, syncobjs only exist btwn topmost layer of kernel (and in this case, guest kernel) and userspace, so they are not really an appropriate thing to virtualize... which means that dma_fence is a lot more relevant for wl passthru
<DemiMarie>
robclark: syncobj protocol requires that one be able to reconstruct a syncobj from the guest fences
<DemiMarie>
is that doable?
<robclark>
yeah, you can turn fences back into syncobjs on the host side, it is just a bunch of extra ioctls
<robclark>
so not really a blocker, but just kinda dumb
<DemiMarie>
the big advantage of syncobjs is that you can have a syncobj with zero fences
<DemiMarie>
that lets you express “this might not render in any finite amount of time”
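The relationship being argued over here can be pictured with a toy model: a fence is a one-shot completion signal, and a syncobj is a container that may or may not hold one yet. The sketch below is purely conceptual and assumes nothing about the kernel beyond what the discussion states. `Fence`, `Syncobj`, `fence_to_syncobj` and `syncobj_to_fence` are made-up names standing in for the real ioctl-based conversions.

```python
class Fence:
    """Stand-in for a dma_fence: created unsignaled, signaled once."""

    def __init__(self):
        self.signaled = False

    def signal(self):
        self.signaled = True


class Syncobj:
    """Stand-in for a DRM syncobj: a container holding zero or one fence."""

    def __init__(self, fence=None):
        self.fence = fence

    def replace_fence(self, fence):
        self.fence = fence


def fence_to_syncobj(fence):
    # Models the "turn fences back into syncobjs" step: wrap an existing fence.
    return Syncobj(fence)


def syncobj_to_fence(syncobj):
    # Models the opposite conversion, which cannot work on an empty syncobj.
    if syncobj.fence is None:
        raise LookupError("syncobj has no fence yet")
    return syncobj.fence
```

In this model an empty `Syncobj()` can be handed around freely, which is the "zero fences" property, while anything that needs an actual fence has to wait until `replace_fence` has been called.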
<robclark>
it is purely a sw construct, it doesn't have any bearing on how things work at hw level.. so it isn't really useful for virtualized hw either
<DemiMarie>
robclark: exposing the protocol to guest userspace is useful
<robclark>
maybe.. but I reserve the right to be skeptical until shown evidence ;-)
<robclark>
until then it seems like a premature optimization
<DemiMarie>
robclark: using syncobj vs fences might be a premature optimization, but also one that has been entrenched too long to change
<DemiMarie>
using explicit vs implicit sync is not a premature optimization
<DemiMarie>
syncobj is how one gets explicit sync on Wayland
<robclark>
maybe the ship has already sailed, sure, but it seems like a bad decision.. and orthogonal to implicit vs explicit (the latter I agree is a good move)
<DemiMarie>
I am almost certain that it will be far less effort to implement syncobj in virtio-GPU than to try to get sync files back in Wayland
<DemiMarie>
robclark: do you do anything involving heavy compute tasks?
<robclark>
I don't see any way to have syncobj be a first class virtgpu thing
<DemiMarie>
robclark: it doesn’t need to be first class in the protocol, merely emulated without implicit sync
<DemiMarie>
whenever the guest adds a fence to its syncobj, and the fence came from the host, add the fence to a host syncobj
<robclark>
that you can do with various conversion back/forth.. it just doesn't buy you anything other than extra ioctls
<DemiMarie>
etc
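The mirroring rule described above (a host-originated fence added to a guest syncobj is also added to a host syncobj) can be sketched as follows. This is a toy model under stated assumptions, not virtio-GPU code: the `origin` tag, the `MirroredSyncobj` class and its methods are hypothetical, and the cases the discussion leaves at "etc" are deliberately not filled in.

```python
class Fence:
    """Toy fence tagged with where it was created: 'host' or 'guest'."""

    def __init__(self, origin):
        self.origin = origin


class Syncobj:
    """Toy syncobj that simply records the fences added to it."""

    def __init__(self):
        self.fences = []


class MirroredSyncobj:
    """A guest syncobj paired with a host-side syncobj."""

    def __init__(self):
        self.guest = Syncobj()
        self.host = Syncobj()

    def add_fence(self, fence):
        # The guest always sees the fence.
        self.guest.fences.append(fence)
        # Only host-originated fences are mirrored, as stated in the log.
        # Other origins are left unmirrored because the discussion does
        # not spell that case out.
        if fence.origin == "host":
            self.host.fences.append(fence)
```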
<DemiMarie>
robclark: it buys support for the syncobj protocol
<robclark>
sure.. doesn't mean the protocol was the right design
<DemiMarie>
robclark: whether or not it was the right design, it is still worth supporting
<DemiMarie>
that said, the biggest advantage of syncobj is that it frees you from having to have everything finish in a finite time
<robclark>
yeah, vtest vdrm already supports that by converting back/forth
<DemiMarie>
You can generate a syncobj for a long running compute workload
<robclark>
_kinda_
<robclark>
by the time guest kernel can push to host kernel there needs to be a fence
<DemiMarie>
nope
<robclark>
so syncobj is just a bit of syntactic sugar
<DemiMarie>
not really
<robclark>
oh defn really
<robclark>
you need to have a fence
<DemiMarie>
that can already be signalled
<robclark>
sure, but?
<DemiMarie>
you can have the guest wait for its long running compute to end and then create an already-signalled fence to put in the syncobj
<DemiMarie>
in the future, this will allow using userspace fences in the syncobj
<DemiMarie>
the protocol will already be ready for them
<DemiMarie>
dma-fence having to complete in a finite time is a giant reason why Linux graphics can't have nice things
<robclark>
you can't submit rendering to (host or guest) kernel w/ an in-syncobj that isn't already backed by a fence that will signal in finite time
<DemiMarie>
The reason you need dma-fence for graphics is that the WSI requires it. syncobj doesn't need that, because you can (backwards-compatibly) add a new syncobj API that takes a userspace fence
<DemiMarie>
robclark: syncobj is the first step towards userspace fences
<DemiMarie>
you will never get there with implicit sync or with sync files
<DemiMarie>
those cannot support them because the WSI requires the finite time guarantee
<DemiMarie>
with syncobj, the WSI no longer has a finite time guarantee
<DemiMarie>
that means one can go to full explicit sync without having to change compositors or applications, only Mesa and the kernel
<robclark>
maybe.. I'm not convinced yet.. and from a kernel PoV we still need that finite guarantee.. so like I said by the time it comes to guest kernel we need a fence.. syncobj lets you push that latency a bit but it all stays above guest kernel
<DemiMarie>
robclark: in this world, there are no non-signaled fences
<DemiMarie>
all dma-fences are created already signalled
<robclark>
and it is all hypothetical afaict
<DemiMarie>
you still want to implement syncobj so that virtio-GPU explicit sync is not exclusive to CrOS/Android
<DemiMarie>
robclark: it's where people have been wanting to go for years
<robclark>
not really anything to do with cros/android
<robclark>
"wanting to go" -> /me says show me the profiling
<robclark>
until then it sounds like solving an invented problem
<DemiMarie>
nobody outside of that will support zwp_linux_explicit_synchronization_unstable_v1
<DemiMarie>
robclark: it isn’t just about performance
<robclark>
my point is the end result of baking syncobj into protocol is that it isn't really any better than zwp_linux_explicit_synchronization_unstable_v1 but it has more dependencies and complexities and extra ioctls in the implementation
<robclark>
we can make it work.. it just isn't any better than the simpler solution
<DemiMarie>
the time to complain about that has passed
<DemiMarie>
It opens up a whole slew of previously impossible features. Features that Windows and game consoles already have.
<robclark>
sure, maybe.. if I was aware of it earlier I would have complained earlier
<robclark>
that statement, I think I am not clear about... maybe it pipelines a bit of ipc latency (in the non-vm case) but it has only been people claiming it matters rather than showing with evidence why it matters
<DemiMarie>
I suspect the reason was for future extensibility
<robclark>
maybe.. but the proof seems weak and not clear that there aren't other options for extending wl proto
<robclark>
anyways, maybe a moot point now
<robclark>
I just like seeing more evidence based design
<DemiMarie>
robclark: you are in the fortunate position where rolling out compositor changes is easy
<DemiMarie>
I suspect that the reason for the design was so that no compositor changes are needed for future graphics stack advances
<robclark>
I'll stand by "I just like seeing more evidence based design".. nothing I've said is based on CrOS specifics, just that by the time you get to kernel (host or guest) you need to have fences, and no one has shown me traces that the ipc difference matters.. maybe someday it will but that seems to be hypothetical
<DemiMarie>
To me at least, it's less about IPC cost and more about being able to interoperate with long running compute workloads
<robclark>
not sure that I see that
<robclark>
remember from kernel perspective it still needs to be a fence
<DemiMarie>
there are no out-fences for these workloads
<DemiMarie>
there is the memory management fence but preemption handles that
<robclark>
beyond that, it is just a matter of whether you play the shell game in userspace or kernel
<DemiMarie>
it needs to be a fence now
<robclark>
it needs to be a fence forever for mm/residency
<DemiMarie>
that is dealt with by preemption for compute
<DemiMarie>
as in preempt the whole context
<robclark>
first class on-demand faulting for gpu maybe changes things but doesn't seem very efficient given how gpus work... you wouldn't want that to be the normal case.. and again, evidence pls
<DemiMarie>
to allow long-running compute and graphics in the same context with implicit sync, or with sync files, requires an extremely complex workaround that Simona Vetter, Christian König, and Faith Ekstrand spent years coming up with
<robclark>
I'm not big on hypotheticals
<DemiMarie>
with syncobj, it is much simpler, because one can happily send a syncobj to the compositor that might not have any fences in it for weeks
<robclark>
yeah, nothing that couldn't be solved in userspace
<DemiMarie>
I will also add that Windows and all modern game consoles use a pure explicit sync approach with no analog of dma_fence that I am aware of
<robclark>
anyways, I'm salty because syncobj just added CVEs I had to solve, and complexity.. with no gain ;-)
<DemiMarie>
I suspect they just pin everything
<DemiMarie>
which CVEs?
<DemiMarie>
If you pin everything in RAM, rather than allowing GPU memory to be pageable, then things get much simpler.
<robclark>
probably you can find it with git log on drm_syncobj.c otherwise I'll find it later
<robclark>
sure
<robclark>
not saying it is a good idea, but it does make things simpler
<DemiMarie>
how important is being able to page out GPU textures?
<DemiMarie>
I’m guessing that the answer for CrOS is “very”
<robclark>
for folks trying to run a web browser on 4GB devices.. very.. what I've seen is maybe 20-30% of gpu memory needs to be resident at any time
<robclark>
sure, give everyone 16gb and maybe it matters less
<DemiMarie>
robclark: if you mean drivers/gpu/drm/drm_syncobj.c, that file has not been updated since 2024
<DemiMarie>
robclark: I suspect the other drivers are optimized for games, where paging is (I presume) just too slow
<robclark>
fullscreen games and ui are very different use cases
<DemiMarie>
yup
<robclark>
yeah, look for commits from me on drm_syncobj.c, it was maybe 2-3 yrs back
<DemiMarie>
I only see 8570c27932e1 ("drm/syncobj: Add deadline support for syncobj waits"), which doesn’t look like a vulnerability fix.
<robclark>
maybe it was in i915 or somewhere else related.. I can dig it up later
<robclark>
or look for fixes tags that reference addition of syncobj support
<DemiMarie>
Ah, okay
<DemiMarie>
I see why you are salty about syncobj
<DemiMarie>
now it's obvious
<robclark>
hopefully they weren't fixes that got lost along the way... I doubt that, I would have followed them till getting into CrOS kernels
<DemiMarie>
for what it's worth, I'm salty too, though about something different (no type-1 hypervisor with a GPU driver in it, and so much of the actual driver being in closed-source firmware)
<robclark>
re fw, care about where the pgtables are controlled (ok, a bit more complicated than that, but controlling memory access is the first thing)
<DemiMarie>
good point
<DemiMarie>
is gpu fw more like CPU microcode?
<robclark>
fw is a broad range.. different gpu archs split the division of work differently.. I wouldn't call it like microcode.. but I would focus on who controls access to memory
<robclark>
I can't think of an easy blanket statement
<robclark>
well, blanket statement is who controls modifying the pgtables (hopefully kernel) and who controls updating pgtable address (usually fw to accommodate ctx switches but with some oversight from kernel)
<zzxyb[m]>
How can I distinguish which ICC configuration corresponds to which display, through EDID information?
<MrCooper>
DemiMarie: the syncobj protocol doesn't really change anything about "dma-fences must signal in finite time" (which stems from kernel core memory management, not from WSI), it just means the client can wait indefinitely before submitting the GPU work (at which point a dma-fence is created and the clock starts ticking)
<MrCooper>
even with user-space fences, some kind of dma-fence representation is required for memory management
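The distinction drawn here, between waiting before submission and waiting on the fence after it, can be pictured with a small state model. It is only a sketch: `AcquirePoint`, `phase` and `fence_latency` are invented names, and the timestamps are plain numbers rather than anything the kernel tracks.

```python
class AcquirePoint:
    """Toy acquire point: no fence exists until the client submits GPU work."""

    def __init__(self):
        self.fence_created_at = None  # None means nothing submitted yet
        self.signaled_at = None

    def submit(self, now):
        # The dma-fence comes into existence only at submission time.
        self.fence_created_at = now

    def signal(self, now):
        if self.fence_created_at is None:
            raise RuntimeError("cannot signal before submission")
        self.signaled_at = now


def phase(point):
    if point.fence_created_at is None:
        return "unsubmitted"  # the client may stay here indefinitely
    if point.signaled_at is None:
        return "pending"      # a fence exists, so this must end in finite time
    return "signaled"


def fence_latency(point):
    """Time the fence itself was pending; pre-submission waiting is excluded."""
    return point.signaled_at - point.fence_created_at
```

Only the "pending" phase is covered by the finite-time rule in this model. The "unsubmitted" phase is where the syncobj protocol lets the client take as long as it likes.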
<kennylevinsen>
zzxyb[m]: do you mean how to map an ICC profile to the corresponding output? Normally you'd apply it through some output configuration, and the compositor might map configuration based on EDID model, serial, connector name, etc.
<kennylevinsen>
the profile itself only contains some text descriptions that might not necessarily match anything
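A compositor-side lookup along the lines described above could be as simple as the following sketch. Everything in it is hypothetical: the dictionary layout, the `make` field, the profile paths and the function name are invented for illustration, and real compositors each have their own output configuration format.

```python
def pick_icc_profile(output, profiles_by_edid, profiles_by_connector):
    """Return the profile path configured for an output, or None.

    EDID identity is preferred because it follows the monitor; the
    connector name is only a fallback because it follows the port.
    """
    edid_key = (output.get("make"), output.get("model"), output.get("serial"))
    if None not in edid_key and edid_key in profiles_by_edid:
        return profiles_by_edid[edid_key]
    return profiles_by_connector.get(output.get("connector"))


# Hypothetical configuration data.
PROFILES_BY_EDID = {
    ("ACME", "Panel 27", "SN123"): "/etc/color/acme-panel-27.icc",
}
PROFILES_BY_CONNECTOR = {
    "HDMI-A-1": "/etc/color/generic-hdmi.icc",
}
```

Note that nothing is read from the ICC file itself to make the match, which is consistent with the point that its embedded descriptions may not correspond to any output.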
<MrCooper>
is it illegal to call gbm_device_destroy before destroying the corresponding EGL context?
<daniels>
yeah
<daniels>
same as any other display type - you can't free the Xlib Display or the struct wl_display whilst the EGLDisplay is alive
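The ordering rule amounts to tearing things down in the reverse of the order they were set up. The sketch below models that with `contextlib.ExitStack`, which runs its callbacks last-in, first-out. It does not call GBM or EGL: the strings are stand-ins named after the real C functions, and `run_with_teardown` is an invented helper.

```python
from contextlib import ExitStack


def run_with_teardown(log):
    """Record setup and teardown steps; ExitStack unwinds in reverse order."""
    with ExitStack() as stack:
        log.append("open(fd)")
        stack.callback(log.append, "close(fd)")

        log.append("gbm_create_device")
        stack.callback(log.append, "gbm_device_destroy")

        # The EGLDisplay is created on top of the gbm device, so its
        # teardown is registered last and therefore runs first.
        log.append("eglInitialize")
        stack.callback(log.append, "eglTerminate")

        log.append("render")
    return log
```

Calling `run_with_teardown([])` records `eglTerminate` before `gbm_device_destroy`, matching the constraint that the native display must outlive the EGLDisplay.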