<lucaceresoli>
mripard: no particular question, I was not sure lumag's R-by is enough for applying. Is it?
sguddati has quit [Ping timeout: 480 seconds]
<mripard>
lucaceresoli: it is
LeviYun has joined #dri-devel
Guest18410 has quit []
normalpan has quit []
normalpan has joined #dri-devel
<lucaceresoli>
mripard: awesome, thanks!
normalpan has quit []
jsa1 has quit [Ping timeout: 480 seconds]
<jani>
mlankhorst: I realize I just pushed a series that's bound to conflict with that series. I didn't really think anything of it at the time. but if you merge that to drm-misc now, the conflict might be pretty bad at drm-tip rebuild.
normalpan has joined #dri-devel
<jani>
mlankhorst: ISTR vsyrjala had some comments about disabling tiling, I haven't really followed the whole thing, so might be good to get an ack from vsyrjala as well
<mlankhorst>
It only disables tiling for pre-DPT platforms now
<jani>
mlankhorst: other than that, once the above are sorted out, ack for merging via whichever tree makes most sense
Daanct12 has quit [Quit: WeeChat 4.6.3]
<mlankhorst>
I think that was the main issue with tiling
epoch101 has joined #dri-devel
<jani>
like I said, I haven't followed the discussion, I'm pretty clueless here :/
davispuh has joined #dri-devel
normalpan has quit []
<sima>
hm MrCooper on vacation?
<sima>
just stumbled over it, but os_same_file_description() in mesa should try the F_DUPFD_QUERY fcntl() (added in 6.10 with c62b758bae6af) as the first thing, since the kcmp syscall might not be available while the fcntl always is
<sima>
on new enough kernels at least
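A minimal sketch of the ordering suggested above: try fcntl(F_DUPFD_QUERY) first and only then fall back to kcmp. The helper name same_file_description() and the -1 "cannot tell" return value are assumptions made for illustration and are not Mesa's actual os_same_file_description() code; the fallback #define is only there for libc headers that predate Linux 6.10.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Value from the Linux UAPI header (F_LINUX_SPECIFIC_BASE + 3); only
 * defined here in case the installed headers predate Linux 6.10. */
#ifndef F_DUPFD_QUERY
#define F_DUPFD_QUERY 1027
#endif

/* KCMP_FILE from <linux/kcmp.h>, repeated to keep the sketch self-contained. */
#define SKETCH_KCMP_FILE 0

/* Hypothetical helper: returns 1 if fd1 and fd2 share one open file
 * description, 0 if they do not, and -1 if neither mechanism can answer. */
static int
same_file_description(int fd1, int fd2)
{
   if (fd1 == fd2)
      return 1;

   /* Preferred path: a plain fcntl() that yields 1 (same) or 0 (different). */
   int ret = fcntl(fd1, F_DUPFD_QUERY, fd2);
   if (ret >= 0)
      return ret;

#ifdef SYS_kcmp
   /* Older kernels reject the command with EINVAL. Fall back to kcmp(),
    * which may itself be compiled out or filtered by seccomp. */
   if (errno == EINVAL) {
      pid_t pid = getpid();
      long r = syscall(SYS_kcmp, pid, pid, SKETCH_KCMP_FILE, fd1, fd2);
      if (r >= 0)
         return r == 0;
   }
#endif

   return -1;
}
```

A caller would treat -1 as "unknown" and pick its own conservative default.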
pepp_ has quit []
pepp has joined #dri-devel
<pepp>
sima: oh interesting. Might be a better fallback than epoll
<sima>
ok I got it now, I got lost in how link lookup works
<sima>
so yeah it's magic insofar as it can jump through special files and get at their kernel-internal dentry and vfs_mount through file->f_path
<sima>
but crucially, it then re-opens that file through the dentry->d_inode->i_fop->fop_open implementation
<sima>
roughly
<sima>
and dma_buf doesn't have that
<emersion>
yeah, makes sense
<sima>
so yeah that one isn't dup, but it's also not a thing for all the special fd in the gpu world like dma_buf, sync_file, drm_syncobj
<emersion>
indeed
<sima>
but otoh pidfd_getfd does work like dup()
<sima>
also apparently really old linux did handle it like a dup
<sima>
the magic procfs open I mean
JRepin has quit []
JRepin has joined #dri-devel
<sima>
emersion, ok wild goose chase finished, unless you take special action you get the default no_open inode->i_fop implementation, which stops the magic proc open
<sima>
that happens in inode_init_always_gfp()
<sima>
reading vfs code is wild
<emersion>
ha
<emersion>
thanks for tracking this down :P
asrivats has joined #dri-devel
<sima>
emersion, anyway I think the F_DUPFD_QUERY thing is good and I'd say preferred on linux
<emersion>
ack
<sima>
ofc guaranteed that I'll regret this statement in 10 years :-)
Duke`` has joined #dri-devel
krumelmonster has quit [Ping timeout: 480 seconds]
<karolherbst>
zmike: ship it
<zmike>
cool just waiting on Intel then
<tnt>
Did anything change in the way mesa links or uses LLVM in the last few months? Or in the way it does so when using llvm19 vs 20?
krumelmonster has joined #dri-devel
idr has joined #dri-devel
JRepin has quit []
JRepin has joined #dri-devel
dsimic is now known as Guest18418
dsimic has joined #dri-devel
Guest18418 has quit [Ping timeout: 480 seconds]
kts has joined #dri-devel
JRepin has quit []
anholt has quit [Ping timeout: 480 seconds]
<karolherbst>
tnt: not specifically, why?
JRepin has joined #dri-devel
<tnt>
karolherbst: In the past intel-compute-runtime could work just fine in a GL application and the LLVM used by mesa wouldn't clash with the one used by the intel CL stack.
<tnt>
But now they seem to clash ...
<karolherbst>
mesa reworked how gallium drivers are loaded, maybe that caused something to get messed up?
<karolherbst>
But anyway.. having multiple llvm versions in one application is kinda a known disaster
<tnt>
Yeah, I know. AFAIR mesa wouldn't even load LLVM at all but my memory might be fuzzy on that.
<tnt>
If I could, I'd have everything using the same LLVM but the intel CL stack is using LLVM15 -_- ...
vliaskov_ has quit [Remote host closed the connection]
<tnt>
Now libGLX_mesa.so links to libLLVM.so.20.1 and I don't remember that being the case before.
djbw has joined #dri-devel
zzyiwei has quit [Quit: Lost terminal]
LeviYun has quit [Ping timeout: 480 seconds]
LeviYun has joined #dri-devel
<karolherbst>
mhhh
<karolherbst>
right.. because libgallium-25.2.0-devel.so does
<karolherbst>
I think the only difference is that instead of it getting pulled in via dlopen it's now linked directly?
Jeremy_Rand_Talos has quit [Remote host closed the connection]
Jeremy_Rand_Talos has joined #dri-devel
tzimmermann has quit [Quit: Leaving]
<tnt>
Yeah, that's very possible.
<mareko>
libLLVM should be loaded indirectly via libgallium, not via libGLX_mesa
<karolherbst>
I'm wondering... can we leave LLVM symbols unresolved and dlopen llvm at runtime in drivers needing it?
<karolherbst>
mareko: libGLX_mesa links to libgallium now
<mareko>
yes but it doesn't have to link to libLLVM
<karolherbst>
libgallium contains the gallium drivers
<mareko>
and libgallium links with LLVM
<karolherbst>
yeah, so libGLX_mesa needs to load llvm
<mareko>
so why do loaders link to LLVM too
<mareko>
there is a bunch of libraries that loaders link to that they probably shouldn't, like libelf
<karolherbst>
I think it might make sense to make the loader not link against libgallium and do dlopen instead or something...
<karolherbst>
though could be quite a bit of work
<karolherbst>
we kinda want our own symbol namespace tho...
<mareko>
it's possible that ldd prints the dependencies of all loaded libraries, and that the loaders don't in fact link with those libraries; ldd just prints them to show a complete list
<karolherbst>
they do with 25.1
<mareko>
we have our own private symbol namespace since the removal of dlopen(RTLD_GLOBAL)
<karolherbst>
anyway.. libegl links to libgallium_dri which contains all the drivers, and that's kinda a new thing
<mareko>
and that's fine
<karolherbst>
yeah I'd thought so as well
<karolherbst>
maybe it's something that intel is doing that makes this annoying, or maybe something in glvnd causing issues...
<mareko>
something in libGLX_mesa is using LLVM that's independent of libgallium
<karolherbst>
I don't think it does
<karolherbst>
at least not seeing anything obvious
fab has joined #dri-devel
<mareko>
yeah, libGLX doesn't seem to use LLVM at all, so nm probably shows LLVM functions because of libgallium, which gives the misleading impression that libGLX uses LLVM
<karolherbst>
yeah, it's an indirect dependency, but LLVM does things on init time
<karolherbst>
and that sometimes causes weird issues if you have multiple LLVMs
<karolherbst>
it would be kinda nice to postpone loading LLVM though, because today every GL application ends up loading LLVM because of llvmpipe even if it's not used
<mareko>
I think it shouldn't cause weird issues if the symbol namespace is truly private
<karolherbst>
soo.. maybe we should just dlopen LLVM and just solve it for real
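A rough sketch of the "dlopen it on first use" idea floated here, assuming a small wrapper that opens the library only when a symbol is first requested. The struct and function names are hypothetical and nothing like this is confirmed to exist in Mesa; the log only raises it as an option.

```c
#include <dlfcn.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lazily loaded library; all names are illustrative only. */
struct lazy_lib {
   const char *soname;  /* library to open on first use */
   void *handle;        /* NULL until a load has succeeded */
   bool tried;          /* true once a load has been attempted */
};

/* Open the library on the first call; later calls return the cached result.
 * Not thread-safe as written; real code would guard this with call_once()
 * or a mutex. */
static void *
lazy_lib_get(struct lazy_lib *lib)
{
   if (!lib->tried) {
      lib->tried = true;
      lib->handle = dlopen(lib->soname, RTLD_NOW | RTLD_LOCAL);
   }
   return lib->handle;
}

/* Resolve a symbol, loading the library first if needed. Returns NULL when
 * the library or the symbol is missing. */
static void *
lazy_lib_sym(struct lazy_lib *lib, const char *name)
{
   void *handle = lazy_lib_get(lib);
   return handle ? dlsym(handle, name) : NULL;
}
```

With this shape, a process that never asks for a symbol never maps the library, which is the property wanted for GL applications that never touch llvmpipe or the LLVM draw path.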
<karolherbst>
yeah...
<karolherbst>
I think glvnd might do something weird...
<karolherbst>
tnt: the intel runtime doesn't link to libEGL or so, correct?
<karolherbst>
though I can't see how that would even matter...
<karolherbst>
where is the crash anyway?
JRepin has quit []
JRepin has joined #dri-devel
<zmike>
it's not just llvmpipe that loads it
<zmike>
the draw module is used by core mesa
anholt has joined #dri-devel
<jenatali>
Sometimes
<jenatali>
Or rather, it's there all the time but actually used rarely
<zmike>
yes
<zmike>
very rarely
nerdopolis has joined #dri-devel
<mareko>
the TGSI interpreter might not be the worst option for draw if you have GS support and enable the GS path for the GL_SELECT mode
<jenatali>
TGSI interpreter being the alternative to the llvm version of the draw module?
<mareko>
yes
<jenatali>
That's what we ship, since Windows LLVM dynamic linking doesn't really work
<jenatali>
And static linking is just too much bloat
<mareko>
there are a few old drivers that need draw to use LLVM because the hw lacks hw vertex processing
<jenatali>
I should look at the GS path, I think that's new since we started shipping Mesa and we probably didn't flip it on, but should
croissant_ has joined #dri-devel
<mareko>
hardware_gl_select is enabled automatically if you have a few other caps
<daniels>
genuinely quite alarmed to discover that i915g has actually had a double-digit number of fixes since amber branched
bolson_ has joined #dri-devel
croissant has quit [Ping timeout: 480 seconds]
bolson has quit [Ping timeout: 480 seconds]
<tnt>
karolherbst: No, it doesn't link to libEGL directly. The issue is when I make an application that does both GL and CL (no interop, just using both) then you get errors like "Attribute list does not match Module context" and a bunch of other runtime LLVM failures/errors.
fab has quit [Ping timeout: 480 seconds]
fab_ has joined #dri-devel
fab_ is now known as Guest18421
kts has quit [Quit: Konversation terminated!]
epoch101 has quit []
<karolherbst>
right.. the usual LLVM conflict thing
<karolherbst>
tnt: khronos ICD loader or ocl-icd?
<karolherbst>
if the former, try ocl-icd instead
<karolherbst>
the khronos loader does "dlopen (libraryName, RTLD_NOW)" where ocl-icd is doing RTLD_LAZY|RTLD_LOCAL
<karolherbst>
though the official one is RTLD_NOW|RTLD_LOCAL as local is implicit
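For reference, the two flag combinations being compared can be sketched as below. The helper name sketch_open_private() is hypothetical and is not taken from either ICD loader; as the follow-up in the log shows, switching loaders did not change the behaviour in this case.

```c
#include <dlfcn.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: open a vendor library without exporting its symbols
 * to the rest of the process. `lazy` picks between the two binding modes
 * mentioned above. */
static void *
sketch_open_private(const char *library_name, bool lazy)
{
   /* RTLD_LOCAL is spelled out for clarity; glibc applies it by default
    * when RTLD_GLOBAL is absent, so RTLD_NOW alone behaves the same there. */
   int flags = (lazy ? RTLD_LAZY : RTLD_NOW) | RTLD_LOCAL;

   return dlopen(library_name, flags);
}
```

RTLD_NOW resolves every undefined symbol at load time, while RTLD_LAZY defers function binding until first call; neither choice makes the library's symbols available to objects loaded later.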
coldfeet has joined #dri-devel
<mareko>
we could link LLVM statically into libgallium
<mareko>
it's only inconvenient to developers because of long link times
<tnt>
karolherbst: Doesn't change anything.
<karolherbst>
I'm doing static llvm locally already, meson doesn't like it and always relinks or something
<karolherbst>
quite the pain
<karolherbst>
but yeah...
<tnt>
ATM I can actually work around the problem by disabling llvm altogether ...
<karolherbst>
maybe we should just go full static on LLVM...
<karolherbst>
tnt: .... yeah well...
<karolherbst>
going full static solves a couple of other problems as well
<karolherbst>
like the opencl-c-base.h header issue is then also solved, because we'd just ship that file bundled in as well (for OpenCL support)
<karolherbst>
it's just...
<karolherbst>
my rusticl build is like 500MB :)
<karolherbst>
libgallium-25.2.0-devel.so is 350MB
<karolherbst>
maybe should check with a release build
<karolherbst>
libgallium-25.2.0-devel.so is 144MB
<karolherbst>
libRusticlOpenCL.so.1.0.0 is 205MB
<karolherbst>
libvulkan_lvp.so is 110MB
<karolherbst>
ehh 95MB
<karolherbst>
forgot to strip that one
<karolherbst>
currently libgallium on fedora is 44MB
<karolherbst>
rusticl is 36MB
<karolherbst>
lavapipe is 10MB
<karolherbst>
soo.. we are talking about ~350MB more space used
<jenatali>
karolherbst: That's why I ship GL without LLVM
<karolherbst>
yeah....
diego has left #dri-devel [#dri-devel]
feaneron has joined #dri-devel
dviola has joined #dri-devel
parthi has joined #dri-devel
parthiban has quit [Read error: No route to host]
<tnt>
Just to confirm, issue did appear going from 25.0.7 to 25.1.3 so I guess that's where those changes were made.
<mareko>
ACO is 7% slower than LLVM in Furmark and we don't know why
warpme has joined #dri-devel
rasterman- has joined #dri-devel
<anholt>
mareko: do you have it drilled down to specific shaders? I've been working on a tool that's doing that with trace replays for me on tu.
<anholt>
though furmark probably doesn't have that much going on
rasterman- has quit []
rasterman has quit [Remote host closed the connection]
rasterman has joined #dri-devel
<mareko>
yes I have the exact shader, but I also have a hw trace telling me what happens in the SIMD every clock cycle, and I haven't been able to see why it's slower
epoch101 has joined #dri-devel
<mareko>
I made sure the command buffers between LLVM and ACO are completely identical
haaninjo has joined #dri-devel
epoch101 has quit [Ping timeout: 480 seconds]
iive has joined #dri-devel
epoch101 has joined #dri-devel
epoch101_ has joined #dri-devel
Guest18421 has quit [Ping timeout: 480 seconds]
epoch101 has quit [Ping timeout: 480 seconds]
JRepin has quit []
JRepin has joined #dri-devel
rasterman has quit [Quit: Gettin' stinky!]
anholt has quit [Quit: Leaving]
asrivats has quit []
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
asrivats has joined #dri-devel
valpackett has quit [Ping timeout: 480 seconds]
valpackett has joined #dri-devel
asrivats has quit []
<karolherbst>
tnt: mind checking with LD_DEBUG=libs if the order of loaded libs changes significantly?
LeviYun has quit [Read error: Connection reset by peer]
LeviYun has joined #dri-devel
valpackett has quit [Ping timeout: 480 seconds]
anholt has joined #dri-devel
asrivats has joined #dri-devel
<olivial>
what's the point of all the 'assert(desc); if (!desc) { ... }' in u_format.h?
<olivial>
is the if block not unreachable?
<tnt>
karolherbst: I'm not so sure 25.0.7 -> 25.1.3 is actually the trigger, it seems random whether it works. And I think that yeah, lib load order matters. Doing LD_PRELOAD of the OpenCL lib makes it work.
<tnt>
karolherbst: And now it seems to work without it ... I swear it's almost like once I started an application once with the LD_PRELOAD, then it will work for that application without it in subsequent attempts.
<karolherbst>
...
<tnt>
Well if someone else complains, then maybe it's worth looking into but for me now it seems to have magically fixed itself and if it re-appears, I know how to work around it for now ... Who knows maybe they'll finally update to newer LLVM before it appears again.
<tnt>
Sorry for the noise :/
<zmike>
olivial: release builds
<olivial>
ah, didn't realize mesa was dropping asserts on release builds. Thanks!
warpme has quit []
<anholt>
olivial: it's a general thing in C build systems.
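A small illustration of why the pattern is not dead code: in standard C, assert() expands to nothing once NDEBUG is defined, so the if block becomes the only guard. The struct and function below are hypothetical stand-ins and not the real u_format.h helpers; NDEBUG is defined by hand here to mimic what a release build would typically pass on the command line.

```c
#define NDEBUG            /* mimics a release build's -DNDEBUG */
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a format description; not Mesa's real struct. */
struct sketch_format_desc {
   unsigned block_bits;
};

/* Same shape as the helpers under discussion: assert first, then guard. */
static unsigned
sketch_get_blocksize_bits(const struct sketch_format_desc *desc)
{
   assert(desc);        /* debug builds: abort here with file and line */
   if (!desc)           /* release builds: the assert above is compiled out */
      return 0;

   return desc->block_bits;
}
```

Without the NDEBUG define, passing NULL would abort at the assert; with it, the function quietly returns 0.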
JRepin has quit []
JRepin has joined #dri-devel
olivial has quit [Ping timeout: 480 seconds]
mvlad has quit [Remote host closed the connection]
olivial has joined #dri-devel
sima has quit [Ping timeout: 480 seconds]
coldfeet has quit [Quit: Lost terminal]
gnarchie has quit []
gnarchie has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
<tnt>
karolherbst: Ah ... ~/.cache/neo_compiler_cache/ ... so yeah I'm not crazy ... after starting an app once with LD_PRELOAD the programs are compiled so LLVM isn't used and so it works without LD_PRELOAD. But if I wipe the cache, it bugs again.
<karolherbst>
mhhh
<tnt>
I'm headed to bed now, but at least I'm glad I got an explanation for the inconsistent behavior I was seeing.
chaos_princess has quit [Quit: chaos_princess]
Nasina has quit [Read error: Connection reset by peer]
chaos_princess has joined #dri-devel
Nasina has joined #dri-devel
pcercuei has quit [Quit: dodo]
Nasina has quit [Read error: Connection reset by peer]
feaneron has quit [Ping timeout: 480 seconds]
Nasina has joined #dri-devel
fossdd has quit [Ping timeout: 480 seconds]
haaninjo has quit [Quit: Ex-Chat]
Nasina has quit [Read error: Connection reset by peer]
Nasina has joined #dri-devel
epoch101_ has quit [Ping timeout: 480 seconds]
guludo has quit [Ping timeout: 480 seconds]
feaneron has joined #dri-devel
cef has quit [Ping timeout: 480 seconds]
JRepin has quit []
JRepin has joined #dri-devel
cef has joined #dri-devel
luc has quit [Read error: Connection reset by peer]