ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
anholt has quit [Remote host closed the connection]
anholt has joined #dri-devel
epoch101 has quit [Ping timeout: 480 seconds]
djbw_ has quit [Ping timeout: 480 seconds]
nashpa has joined #dri-devel
dliviu has quit [Ping timeout: 480 seconds]
The_Company has joined #dri-devel
Company has quit [Ping timeout: 480 seconds]
The_Company has quit []
Company has joined #dri-devel
jfalempe has quit [Read error: Connection reset by peer]
sarnex has quit []
bolson has quit [Ping timeout: 480 seconds]
sarnex has joined #dri-devel
nerdopolis has quit [Ping timeout: 480 seconds]
YuGiOhJCJ has joined #dri-devel
Jeremy_Rand_Talos__ has quit [Remote host closed the connection]
sally has quit []
Jeremy_Rand_Talos__ has joined #dri-devel
sally has joined #dri-devel
<olivial> just unassigning marge from an MR is sufficient to cancel, right?
<olivial> ah, cancelling the CI pipeline worked
glennk has joined #dri-devel
Duke`` has joined #dri-devel
coldfeet has joined #dri-devel
kzd has quit [Ping timeout: 480 seconds]
YuGiOhJCJ has quit [Remote host closed the connection]
YuGiOhJCJ has joined #dri-devel
fab has joined #dri-devel
coldfeet has quit [Remote host closed the connection]
TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM has joined #dri-devel
itoral has joined #dri-devel
tzimmermann has joined #dri-devel
jernej has quit [Remote host closed the connection]
jernej has joined #dri-devel
Lyude has quit [Quit: Bouncer restarting]
Duke`` has quit [Ping timeout: 480 seconds]
caitcatdev has quit [Ping timeout: 480 seconds]
Sid127 has quit [Ping timeout: 480 seconds]
kts has joined #dri-devel
warpme has joined #dri-devel
quantum5 has joined #dri-devel
Lyude has joined #dri-devel
sguddati has joined #dri-devel
Peuc has joined #dri-devel
Mangix has joined #dri-devel
Peuc_ has quit [Ping timeout: 480 seconds]
frieder has joined #dri-devel
kts has quit [Ping timeout: 480 seconds]
roboporo has quit [Ping timeout: 480 seconds]
crabbedhaloablut has quit []
crabbedhaloablut has joined #dri-devel
fab has quit [Quit: fab]
moony has quit []
ciadperle^ has joined #dri-devel
moony has joined #dri-devel
sguddati1 has joined #dri-devel
sguddati has quit [Ping timeout: 480 seconds]
warpme has quit []
roboporo has joined #dri-devel
Sid127 has joined #dri-devel
caitcatdev has joined #dri-devel
phasta has joined #dri-devel
vliaskov_ has joined #dri-devel
warpme has joined #dri-devel
lynxeye has joined #dri-devel
<phasta> tursulin: drm_sched unit tests are now also being executed by RH's quality assurance on CKI. Good work. https://datawarehouse.cki-project.org/kcidb/tests/redhat:koji-134819423-ppc64le-kernel_upt_6
sghuge has quit [Remote host closed the connection]
sghuge has joined #dri-devel
vliaskov__ has joined #dri-devel
fab has joined #dri-devel
vliaskov_ has quit [Ping timeout: 480 seconds]
Company has quit [Quit: Leaving]
sguddati has joined #dri-devel
sguddati1 has quit [Ping timeout: 480 seconds]
sima has joined #dri-devel
jkrzyszt has joined #dri-devel
root____ has joined #dri-devel
root____ has left #dri-devel [#dri-devel]
haaninjo has joined #dri-devel
rasterman has joined #dri-devel
siak has joined #dri-devel
siak has quit []
siak has joined #dri-devel
apinheiro has joined #dri-devel
<mwalle> hi, how is devm_drm_bridge_alloc() supposed to work if the bridge is part of an encoder struct which is in turn allocated (and initialized) by drmm_simple_encoder_alloc()?
<mwalle> lucaceresoli: see drivers/gpu/drm/tidss/tidss_encoder.c
kts has joined #dri-devel
rsalvaterra_ has joined #dri-devel
rsalvaterra_ is now known as rsalvaterra
haaninjo has quit [Quit: Ex-Chat]
MrCooper_ has joined #dri-devel
MrCooper has quit [Ping timeout: 480 seconds]
rsalvaterra_ has joined #dri-devel
rsalvaterra_ is now known as rsalvaterra
<sima> rodrigovivi, imre said that your rerere commit 5dd2d660323d78890f92809be3413a77f8e41f07 has apparently a wrong interim conflict resolution for "drm/dp: Change AUX DPCD probe address from LANE0_1_STATUS to TRAINING_PATTERN_SET" in -fixes vs -next, and imre's in 7f2bb7f564c4c is the right one
<sima> can you pls try to sort this out with imre?
<sima> airlied, ^^ also heads-up so we make sure we don't accidentally land this, or send a bogus example conflict resolution to linus in the main merge window pr
<sima> imre, did you see a mail from sfr about the conflict in linux-next fly by on dri-devel?
<sima> you should get cc'ed if you've authored/committed one of the involved commits
rasterman has quit [Remote host closed the connection]
rasterman has joined #dri-devel
<lucaceresoli> mwalle: this topic has been discussed between jani and mripard w.r.t. panels for devm_drm_panel_alloc(), but for bridges it's the same
<lucaceresoli> mwalle: TL;DR: the bridge will have to be allocated dynamically (yes, that's a bit of annoyance for drivers which currently embed it, but not quite avoidable)
<imre> sima, rodrigovivi, yes rodrigo asked me if that resolution was ok and I acked it, so my fault. The correct resolution is 'ret = drm_dp_dpcd_probe(aux, DP_TRAINING_PATTERN_SET);' in the result not 'ret = drm_dp_dpcd_probe(aux, DP_LANE0_1_STATUS);'. Sorry for that.
<lucaceresoli> mwalle: and you can either have a wrapper struct that embeds the bridge, and devm_drm_bridge_alloc() that struct, if it makes sense
sguddati has quit [Ping timeout: 480 seconds]
<sima> imre, rodrigovivi ah ok, then revert of that drm-rerere commit and retrying with dim rebuild-tip should be enough
<lucaceresoli> mwalle: or you can call the low-level function __devm_drm_bridge_alloc() as done in https://lore.kernel.org/all/13d15c1414e65ffb21944d66e2820befdab54e98.1749199013.git.jani.nikula@intel.com/
<imre> sima, rodrigovivi, I suppose reverting 5dd2d660323d from drm-rerere and perhaps also doing a 'dim rebuild-tip' would fix this.
<sima> yeah that should usually do the trick
<imre> sima, ok
<sima> it's even documented as the procedure
<imre> I'll answer now to sft as well
<imre> sfr
<sima> oh, do you have the link for that one for here?
<mwalle> lucaceresoli: thanks for the pointers, i'll have a look later. right now i'm getting a refcnt overflow warning with the latest next (as is expected, i'd guess, if the bridge isn't initialized)
<imre> sima, didn't answer yet, but his email is https://lore.kernel.org/all/20250716141832.5542b414@canb.auug.org.au
<imre> it's the correct resolution, so no need for me to answer
jfalempe has joined #dri-devel
siak_ has joined #dri-devel
sguddati has joined #dri-devel
<mwalle> lucaceresoli: I'd probably need a wrapper to get a reference to the private struct of the driver (within the bridge_funcs), right? I.e. struct tidss_encoder_bridge { struct drm_bridge bridge; struct tidss_encoder *encoder; }. Then go from drm_bridge to tidss_encoder_bridge and use the pointer to get the original private struct
siak has quit [Ping timeout: 480 seconds]
MrCooper_ is now known as MrCooper
<sima> imre, ah yeah that's just standard adjacent line changes stuff, standard fare for linus to sort out
<sima> just need to get drm-tip fixed
<lucaceresoli> mwalle: indeed the refcnt warning is expected with current -next, because kmalloc or any other "classic" allocation process won't initialize the refcnt, thus it will start from 0 hence the warning
<lucaceresoli> mwalle: and yes, your code snippet looks like a good solution
<glehmann> eric_engestrom: ugh, marge pushed the MR after she was unassigned: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36115
<eric_engestrom> yeah I just saw that, cancelled it
<eric_engestrom> it's because it picked the MR, then got unassigned while it was rebasing it, and then pushed it
<eric_engestrom> bad timing
MrCooper_ has joined #dri-devel
MrCooper is now known as Guest22083
MrCooper_ is now known as MrCooper
Guest22083 has quit [Ping timeout: 480 seconds]
warpme is now known as Guest22084
sguddati has quit [Ping timeout: 480 seconds]
K900 has quit [Remote host closed the connection]
kts has quit [Ping timeout: 480 seconds]
K900 has joined #dri-devel
sguddati has joined #dri-devel
sguddati has quit [Ping timeout: 480 seconds]
sguddati has joined #dri-devel
rasterman has quit [Remote host closed the connection]
rasterman has joined #dri-devel
sguddati has quit [Ping timeout: 480 seconds]
MrCooper_ has joined #dri-devel
karolherbst0 has joined #dri-devel
karolherbst has quit [Read error: Connection reset by peer]
MrCooper has quit [Ping timeout: 480 seconds]
karolherbst0 has quit []
karolherbst has joined #dri-devel
coldfeet has joined #dri-devel
sguddati has joined #dri-devel
Guest22084 has quit []
warpme has joined #dri-devel
coldfeet has quit [Quit: Lost terminal]
itoral has quit [Remote host closed the connection]
flto_ has joined #dri-devel
flto has quit [Ping timeout: 480 seconds]
<alyssa> do we have a "reversed" version of nir_dominance_lca?
<alyssa> query returning "first block that dominates both input blocks"
epoch101 has joined #dri-devel
<alyssa> to solve the problem of "what's the highest place in the program we can insert an instruction with given sources"
<alyssa> hmm nir_opt_sink must do that..
<alyssa> hmm it uses nir_dominance_lca, maybe I'm confused
YuGiOhJCJ has quit []
* alyssa does local version first
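A minimal sketch of the query being discussed, on a toy dominator tree rather than real NIR. The block names, the `IDOM` table, and `highest_insert_block()` are hypothetical illustrations; only the idea that the deepest common dominator is the lowest common ancestor in the dominator tree comes from the discussion. The last helper is one possible reading of "highest place we can insert an instruction with given sources": if every source must dominate the instruction, the blocks defining the sources lie on one chain of the tree and the deepest of them is the earliest legal block.

```python
# Toy dominator tree: each block maps to its immediate dominator (idom).
# The entry block is the root and has no idom.
IDOM = {
    "entry": None,
    "if": "entry",
    "then": "if",
    "else": "if",
    "merge": "if",
    "exit": "merge",
}

def dom_depth(block):
    """Number of idom steps from block up to the entry block."""
    depth = 0
    while IDOM[block] is not None:
        block = IDOM[block]
        depth += 1
    return depth

def dominates(a, b):
    """True if a dominates b (every block dominates itself)."""
    while b is not None:
        if a == b:
            return True
        b = IDOM[b]
    return False

def dominance_lca(a, b):
    """Deepest block that dominates both a and b."""
    da, db = dom_depth(a), dom_depth(b)
    while da > db:
        a, da = IDOM[a], da - 1
    while db > da:
        b, db = IDOM[b], db - 1
    while a != b:
        a, b = IDOM[a], IDOM[b]
    return a

def highest_insert_block(source_blocks):
    """Earliest block dominated by every block that defines a source.

    Hypothetical helper: assumes the source blocks form a chain in the
    dominator tree, which holds whenever some legal insertion point exists.
    """
    deepest = max(source_blocks, key=dom_depth)
    if not all(dominates(b, deepest) for b in source_blocks):
        raise ValueError("no block is dominated by all sources")
    return deepest

print(dominance_lca("then", "else"))                    # if
print(highest_insert_block(["entry", "if", "merge"]))   # merge
```

The two queries point in opposite directions: the LCA is the natural answer when sinking an instruction towards its uses, while the second helper answers the hoisting question.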
warpme has quit []
siak_ has quit [Ping timeout: 480 seconds]
mvlad has joined #dri-devel
guludo has joined #dri-devel
feaneron has joined #dri-devel
nerdopolis has joined #dri-devel
sguddati has quit [Remote host closed the connection]
siak has joined #dri-devel
feaneron has quit [Ping timeout: 480 seconds]
feaneron has joined #dri-devel
rsalvaterra_ has joined #dri-devel
rsalvaterra_ is now known as rsalvaterra
sgerhold has quit [Quit: :/]
minecrell has quit [Quit: :/]
MrCooper_ is now known as MrCooper
<MrCooper> zmike: just bisected GALLIUM_HUD not working anymore to "gallium: de-pointerize pipe_surface"
<zmike> uhh
<zmike> you're welcome?
* zmike panics
sgerhold has joined #dri-devel
<rodrigovivi> airlied sima, on drm_netlink for ras, what do you envision as a standard user space consumer?
Matombo has quit [Remote host closed the connection]
Matombo has joined #dri-devel
SquareWinter68_ has joined #dri-devel
SquareWinter68 has quit [Ping timeout: 480 seconds]
djbw has joined #dri-devel
fab has quit [Quit: fab]
mvlad is now known as Guest22102
Guest22102 has quit [Read error: Connection reset by peer]
mvlad has joined #dri-devel
kzd has joined #dri-devel
yshui_ has quit [Remote host closed the connection]
yshui has joined #dri-devel
<gfxstrand> dcbaker: How do you feel about adding a src/python?
<gfxstrand> And is there a good way we could make that land in the import path of every script in the tree?
rsalvaterra has quit [Ping timeout: 480 seconds]
<gfxstrand> Like, I would love it if we had a src/python that just showed up as a mesabuild module so you just do `import mesabuild` at the top of your script and you get stuff
rsalvaterra has joined #dri-devel
bolson has joined #dri-devel
rsalvaterra_ has joined #dri-devel
rsalvaterra_ is now known as rsalvaterra
minecrell has joined #dri-devel
siak has quit []
phasta has quit [Quit: WeeChat 4.6.2]
<eric_engestrom> gfxstrand: there's the sys.path.insert() thing, it's ugly but it's reliable
<gfxstrand> Yeah
<eric_engestrom> (grep for that in the tree for plenty of examples)
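A rough sketch of the `sys.path.insert()` approach mentioned above. The helper name `add_shared_module_dir()`, the `src/python` layout, and the `mesabuild` module are assumptions taken from the proposal in this conversation, not something that exists in the tree.

```python
import sys
from pathlib import Path

def add_shared_module_dir(script_path, levels_up, subdir="src/python"):
    """Put <repo root>/<subdir> at the front of the import path.

    script_path: path of the calling script (usually __file__).
    levels_up:   how many directories separate the script from the repo root.
    """
    root = Path(script_path).resolve().parents[levels_up]
    shared = str(root / subdir)
    if shared not in sys.path:
        # Index 0 so the shared directory wins over same-named modules elsewhere.
        sys.path.insert(0, shared)
    return shared

# Hypothetical use at the top of a generator living in src/<driver>/gen.py:
#   add_shared_module_dir(__file__, 2)
#   import mesabuild
```

The downside is visible in the usage comment: every script has to know how deep it sits in the tree, which is part of what makes the approach reliable but ugly.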
kts has joined #dri-devel
<gfxstrand> Yeah, I found a few
<eric_engestrom> gfxstrand: what kind of thing are you looking to put in there?
fab has joined #dri-devel
sarnex has quit []
Duke`` has joined #dri-devel
Company has joined #dri-devel
<karolherbst> alyssa: 276501755364d72b55de810e728981e78c6ee0e0 is regressing some CL stuff on radeonsi
<karolherbst> maybe some weirdo spirv handling missing...
<glehmann> is it the splitting or the fusing that breaks it?
<karolherbst> wish I knew
<karolherbst> okay so disabling those 4 opts fixes it...
<sima> rodrigovivi, tried to not think about that, maybe airlied has an idea
<gfxstrand> eric_engestrom: We've got some utils in nouveau for rust generators
<sima> or perhaps agd5f or someone else from amd thought about it
<alyssa> karolherbst: CL should be setting `exact` everywhere
<alyssa> not my bg
<alyssa> bug
<eric_engestrom> gfxstrand: ack; I'd be curious to see the MR when you post it :)
<karolherbst> at least on the fma...
<karolherbst> but yeah.. in CL the fma can't be split.. guess I'll write a patch
<gfxstrand> eric_engestrom: I gave up and I'm doing something dumb now
<agd5f> sima, rodrigovivi I vaguely remember looking at it. I think Hawking and Lijo provided some comments at the time. Our RAS stack doesn't currently make use of it.
<sima> agd5f, it's more what should the minimal open userspace for it look like, as in how much yolo
dsimic is now known as Guest22109
dsimic has joined #dri-devel
Guest22109 has quit [Ping timeout: 480 seconds]
<karolherbst> alyssa: exact doesn't help?
<karolherbst> like I need the fma to stay a fma for like forever
<gfxstrand> exact should prevent it from being split
<gfxstrand> exact means "don't do any transform on this that isn't bit-for-bit the same output"
<gfxstrand> So splitting fma is definitely out
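To illustrate why splitting an fma is not bit-for-bit neutral, here is a small sketch in plain double precision. The fused result is emulated with exact rational arithmetic followed by a single rounding; the function names are made up for this example and this is not how NIR or any driver evaluates `ffma`.

```python
from fractions import Fraction

def fma_single_rounding(a, b, c):
    # Exact rational arithmetic, then one rounding to the nearest double.
    return float(Fraction(a) * Fraction(b) + Fraction(c))

def mul_then_add(a, b, c):
    # Two roundings: one after the multiply, one after the add.
    return a * b + c

a = 1.0 + 2.0 ** -27          # a*a = 1 + 2^-26 + 2^-54 exactly
c = -(1.0 + 2.0 ** -26)       # cancels everything except the 2^-54 term

fused = fma_single_rounding(a, a, c)
split = mul_then_add(a, a, c)

print(fused == 2.0 ** -54)    # True: the low bit survives the single rounding
print(split == 0.0)           # True: the multiply already rounded it away
```

The two results differ in every bit, which is exactly the kind of change the `exact` flag is meant to forbid.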
<karolherbst> yeah... maybe it's something else going on, but it's kinda weird..
<karolherbst> mhhhhh
<karolherbst> nope it's defo those...
<karolherbst> but it's only an issue with radeonsi
<alyssa> ok but the patterns you linked are gated on the exact bit not being set so
<alyssa> can you send me the NIR_DEBUG=print output please?
<karolherbst> something nukes the exact flags...
<karolherbst> or dunno.. mhh
<alyssa> can you send me the NIR_DEBUG=print output please?
<alyssa> karolherbst: vtn is failing to set the exact bit on fadd/fmul instructions
<karolherbst> it's legal to merge those into ffma
<alyssa> ..right, I wrote that patch didn't I.
<karolherbst> but..
<karolherbst> all the ffma! get cf_dceed
<karolherbst> so...
<karolherbst> no idea what's going on...
frieder has quit [Remote host closed the connection]
<karolherbst> maybe just something very unfortunate
<karolherbst> maybe it's just libclc being wrong
<dcbaker> gfxstrand: I've wanted to do that for a long time but never gotten to it.
<alyssa> karolherbst: can you comment out those four lines, then send me the NIR_DEBUG=print output for that too?
<dcbaker> The options other than what eric_engestrom mentioned are: 1) use the `PYTHONPATH` environment variable, 2) use a small python wrapper script instead of `prog_python` that does the path insertion automatically, and then does the python equivalent of `exec "$@"`
<dcbaker> I've kinda wanted to do that approach because I have this clever idea of letting that script check your python imports and generate a depfile
<eric_engestrom> ooh, depfile for python would be neat
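A very rough sketch of what such a wrapper could look like, assuming the target script is run in-process with `runpy` rather than truly exec'ed. The function name `run_with_depfile()` and all of its parameters are hypothetical; this is not the script from the branch mentioned later in the log.

```python
import runpy
import sys
from pathlib import Path

def run_with_depfile(script, args, extra_path, depfile, target):
    """Run `script` with `extra_path` importable, then write a Make-style depfile.

    Only modules loaded from below `extra_path` are recorded as dependencies.
    Paths containing spaces are not escaped in this sketch.
    """
    extra = Path(extra_path).resolve()
    sys.path.insert(0, str(extra))
    old_argv = sys.argv
    sys.argv = [str(script), *args]
    before = set(sys.modules)
    try:
        runpy.run_path(str(script), run_name="__main__")
    finally:
        sys.argv = old_argv
        if str(extra) in sys.path:
            sys.path.remove(str(extra))

    deps = []
    for name in sorted(set(sys.modules) - before):
        path = getattr(sys.modules.get(name), "__file__", None)
        if path and Path(path).resolve().is_relative_to(extra):
            deps.append(str(Path(path).resolve()))

    Path(depfile).write_text(f"{target}: {' '.join(deps)}\n")
    return deps
```

Diffing `sys.modules` only catches modules imported for the first time during the run, so a long-lived wrapper process would under-report; a one-process-per-script runner would not have that problem.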
<alyssa> karolherbst: oh that's all kinds of screwed up
<alyssa> i see the problem, gimme a minute
<karolherbst> but I should send a patch to set exact on all fmas anyway
<karolherbst> though the SPIR-V might already flag them...
<karolherbst> well CL spir-v env spec says "Correctly rounded"
<karolherbst> there is mad if you don't care
flto has joined #dri-devel
<alyssa> karolherbst: The problem, I think, is that libclc explicitly uses mad
<alyssa> which does not have the exact bit set
<karolherbst> yeah, but it's fine to do either with that
<alyssa> right but I think it expects it to be consistent about which one you do. maybe?
<karolherbst> mhhhhhh
<karolherbst> good question
<karolherbst> I do decide inside vtn_opencl
<karolherbst> maybe I just mark the result as exact as well then...
flto_ has quit [Ping timeout: 480 seconds]
<karolherbst> let me try that
<alyssa> what?
<alyssa> sure, but that's not necessarily good enough
<alyssa> because CLC calls its own mad function directly
<alyssa> but I do agree that's probably sane
<karolherbst> mhhh... right...
<alyssa> mad()'s description seems to be "you can pick either one", not "this is some kind of fast-math mode"
<karolherbst> yeah
<alyssa> and the lack of exact is fast-math circus
<alyssa> BUT
<alyssa> that patch won't do anything, because that ffma is already exact (-:
<karolherbst> yeah, it's great
<alyssa> the problem isn't mad(), it's libclc's internal mad
<karolherbst> `#pragma OPENCL FP_CONTRACT ON` impressive
<alyssa> OHHH
<alyssa> frick
<alyssa> wait no i misread the code
<alyssa> nvm
<alyssa> average alyssa interaction
<karolherbst> anyway, my patch on mad seems to help 🙃 or I'm going crazy
<alyssa> then we have a bug elsewhere
<karolherbst> yeah... it does fix it
<alyssa> because exact should be set on that builder
<alyssa> unless this is some nonsense where the libclc shader itself is special
<karolherbst> doubtful
<karolherbst> normally the translator sets the contraction mode stuff properly
<karolherbst> ohhh
<karolherbst> oh no
<karolherbst> no no no
<karolherbst> on the nir side the only difference is "ffma!" and "ffma" now with my patch
<karolherbst> so I guess it's needed for the AMD backend
<alyssa> now that i can believe.
<karolherbst> I'll test the patch and if that solves all the other issues, we'll just set exact on fma and mad
<alyssa> um, no, the story's not over here
<alyssa> why is b.exact not *already* set?
<alyssa> and if it's not - presumably from a FP_CONTRACT ON in libclc - why do we need to override that? libclc bug? vtn bug?
<karolherbst> in the spirv?
<karolherbst> I think the translator might not bother for the clc builtins to set it on the spirv level
<karolherbst> I should check the spirv...
<karolherbst> which uhm.. is always fun
<karolherbst> "%22064 = OpExtInst %float %1 mad %22061 %float_n0_836411297 %float_1_10496962" well..
<alyssa> can you post the spirv?
<karolherbst> the entire thing?
<karolherbst> it's like 2.7MiB
<alyssa> I would like to understand why b.exact is not set
<karolherbst> if it helps, I don't see any ContractionOff
<alyssa> so..
<alyssa> so why is b.exact not set?
<karolherbst> why should it be set for everything?
<alyssa> it's OpenCL, that's the default.
<alyssa> ...apparently it is not
<karolherbst> yeah, but the spir-v should tell us, because how would we know what the frontend expects
<alyssa> // The DEFAULT value is ON.
<alyssa> #pragma OPENCL FP_CONTRACT on-off-switch
<alyssa> you have got to be kidding me
<alyssa> this feels like a libclc bug.
tzimmermann has quit [Quit: Leaving]
<karolherbst> not unlikely
<karolherbst> cos requires <= 4 ulp, but with that change we go around 5
<karolherbst> most of the code was written to be "good enough" for whatever hardware was targeted
<karolherbst> (AMD)
<alyssa> I strongly suspect the real bug here then is the libclc code explicitly asking for mad's when it should be explicitly asking for ffma's or something
<alyssa> but also I don't care we can merge your patch I want to go back to reassociating fmuls which will break CL again (:
<karolherbst> :D
<karolherbst> sounds good
<karolherbst> but anyway, on fedora the libclc spirv is at /usr/lib64/clc/spirv64-mesa3d-.spv
<karolherbst> I kinda hope we can move to the LLVM SPIR-V target at some point and deal with all sorts of breakage :)
kts has quit [Ping timeout: 480 seconds]
<dcbaker> gfxstrand, eric_engestrom: I threw together a really quick runner script that is probably full of corner cases, but it does work and allows loading modules from `src/python`, it's the `wip/2025-07/src-python` branch on my gitlab
kts has joined #dri-devel
sarnex has joined #dri-devel
OftenTimeConsuming is now known as Guest22117
OftenTimeConsuming has joined #dri-devel
Guest22117 has quit [Remote host closed the connection]
OftenTimeConsuming has quit [Remote host closed the connection]
OftenTimeConsuming has joined #dri-devel
smaeul_ has joined #dri-devel
smaeul has quit [Ping timeout: 480 seconds]
smaeul_ has quit []
smaeul has joined #dri-devel
chewitt has quit [Quit: Zzz..]
jkrzyszt has quit [Quit: Konversation terminated!]
swfrd_ has joined #dri-devel
sravn has quit []
mvlad has quit [Remote host closed the connection]
kts has quit [Remote host closed the connection]
rasterman has quit [Quit: Gettin' stinky!]
kts has joined #dri-devel
sravn has joined #dri-devel
kts has quit [Remote host closed the connection]
kts has joined #dri-devel
kts has quit []
sarnex has quit []
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
<glehmann> alyssa: do you have a branch with the insert change you tried
<alyssa> glehmann: the cursor one?
<glehmann> yes
<alyssa> let me dig thru reflog
<glehmann> there are some shaders where the new pass does really badly, like farcry5/0195cf650255e8c2/vs
<glehmann> badly == double register pressure
<alyssa> do you have a branch with radv wired up?
<alyssa> trade you :P
Sid127 has quit [Quit: ZNC - https://znc.in]
caitcatdev has quit []
flto has quit [Quit: Leaving]
<alyssa> glehmann: nir/opt-association-failed-attempt pushed
<alyssa> not tested but should be ok
flto has joined #dri-devel
<alyssa> (well it's build tested, and because my Mesa build includes a bunch of chunky AGX binaries, that smoke tests the pass hah)
<glehmann> alyssa: https://gitlab.freedesktop.org/DadSchoorse/mesa/-/tree/radv-reassoc2, you can drop the last three commits, they are just further things I tried for some cmat shaders
<alyssa> fwiw I'm not convinced this pass will run to a fixed point
<glehmann> it doesn't, that's why there is a loop limit 🙃
<alyssa> :clown:
<glehmann> running it a few more times only has benefits for radv
<alyssa> interesting
<alyssa> I wonder why it's not converging
<alyssa> in one iter I mean
<alyssa> I guess CSE'ing stuff makes other chains shorter and lets us reassociate more or something
<glehmann> yeah that was the case in my cmat shader
<alyssa> ah
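The "loop limit" idea can be sketched as a generic driver that reruns passes until none of them reports progress or a cap is hit. Everything here is a toy: `run_to_fixed_point()`, the list-of-integers "IR", and `collapse_pair()` are invented for illustration and have nothing to do with the actual reassociation pass.

```python
def run_to_fixed_point(passes, ir, max_iterations=4):
    """Rerun all passes until nothing changes or the iteration cap is reached.

    Returns (ir, iterations_used, converged).
    """
    for iteration in range(1, max_iterations + 1):
        progress = False
        for p in passes:
            ir, changed = p(ir)
            progress |= changed
        if not progress:
            return ir, iteration, True
    return ir, max_iterations, False

def collapse_pair(ir):
    """Toy pass: replace the first adjacent equal pair (x, x) with x + 1.

    Each rewrite can create a new pair, so one run is not enough in general.
    """
    for i in range(len(ir) - 1):
        if ir[i] == ir[i + 1]:
            return ir[:i] + [ir[i] + 1] + ir[i + 2:], True
    return ir, False

print(run_to_fixed_point([collapse_pair], [0, 0, 1, 2], max_iterations=10))
# ([3], 4, True)
print(run_to_fixed_point([collapse_pair], [0, 0, 1, 2], max_iterations=2))
# ([2, 2], 2, False)
```

The toy pass mirrors the situation described above in spirit only: each rewrite exposes a new opportunity, so convergence takes several iterations and a cap trades remaining gains for bounded compile time.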
sarnex has joined #dri-devel
<alyssa> my suspicion is that the benefits on AGX have a lot to do with making good use of preambles
<alyssa> which is good news for ir3
<alyssa> but means we need different heuristics for other ISAs
* alyssa running radv under drm-shim now
<glehmann> maybe instead of trying to fix this in the reassoc pass, we should write a basic scheduler that attempts to reduce register pressure
<alyssa> 2 things can be true :)
<glehmann> aco is pretty dumb because the input register pressure is the best possible result you get, our scheduler only makes it worse
<alyssa> and yeah the AGX backend schedules for pressure
<alyssa> which might also explain why my results are so much better
<alyssa> (the AGX backend reg pressure scheduler is really dumb but it helps so who cares)
sarnex has quit []
swfrd_ has quit [Remote host closed the connection]
<glehmann> something really conservative is still better than nothing
<alyssa> the AGX thing is conservative in the sense that it is guaranteed to only help pressure
<glehmann> maybe we should even do this in NIR
<alyssa> but probably kills ILP in the process
<alyssa> I don't have a cycle model of AGX so :(
<glehmann> I think aco's backend schedulers would likely do a good enough job at recovering ILP
<alyssa> fair
<glehmann> especially for GCN, where ALU latency isn't a thing
<alyssa> ^ the really dumb thing
<alyssa> (i originally wrote that at collabora for bifrost, it's just as dumb/effective there too)
linkmauve has left #dri-devel [Error from remote client]
swfrd_ has joined #dri-devel
<alyssa> when I did this for bifrost, apparently it made 60%/70% of my spills/fills go away, lol
<glehmann> is there a good reason to do it in the backend?
<karolherbst> zmike: did you start to expose intensity formats in zink recently?
<zmike> yes?
<zmike> or at least they're native-ish now
<alyssa> I mean.. it's more accurate w.r.t accounting correctly for abs/neg/sat, for example
<alyssa> (and anything aco_optimizer.cpp fuses, etc)
<karolherbst> zmike: okay.. don't seem to work with rusticl at least
<alyssa> we could probably do a NIR one but I'd probably use it in addition to the backend one and not as a replacement
<glehmann> fair
<zmike> karolherbst: how are you using it
<zmike> also wtf cl has intensity formats?
<karolherbst> probably the wrong way, why?
<alyssa> zmike: yeah it's crazy
<glehmann> (I just really hate writing/maintaining aco passes)
<karolherbst> luminance is also a thing
<karolherbst> luminance works tho
<zmike> wild
<alyssa> glehmann: tbh i think that says more about aco than backends in general....
<karolherbst> maybe I have to set up the swizzle on the image/sampler views correctly or something? anyway, works with other drivers
<zmike> are you using an image sampler?
<zmike> or buffer
<karolherbst> image
linkmauve has joined #dri-devel
<zmike> then you probably have to set the swizzles correctly
<zmike> I think it's just RRRR though
<karolherbst> yeah.. that's doable.. I just trust the uhm.. helper to do the right thing
<karolherbst> u_sampler_view_default_template
<karolherbst> which doesn't set RRRR for intensity it seems
<zmike> I don't think that actually does anything special
<zmike> you should probably copy the mesa/st handling
<karolherbst> probably
<karolherbst> anyway, that should be a simple fix
<karolherbst> I just never bothered with swizzles, because I don't do swizzled images yet
<zmike> it won't work for buffer usage though
<karolherbst> that's not supported with write images anyway, right?
<zmike> it should be
<karolherbst> pipe_image_view doesn't have a swizzle
<zmike> oh
<zmike> huh
<karolherbst> yeah...
<karolherbst> anyway.. I can just RRRR for intensity :D
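As a small illustration of what an RRRR swizzle does to a sampled texel, here is a toy model. The function `apply_swizzle()` and the assumption that the single stored channel comes back in R with (0, 0, 1) filling the rest are illustrative only; this is not the gallium sampler-view API.

```python
def apply_swizzle(texel, swizzle):
    """Rearrange an (R, G, B, A) texel according to a 4-character swizzle.

    Each character selects a source channel, or the constants 0 and 1.
    """
    source = {
        "R": texel[0], "G": texel[1], "B": texel[2], "A": texel[3],
        "0": 0.0, "1": 1.0,
    }
    return tuple(source[s] for s in swizzle)

raw = (0.25, 0.0, 0.0, 1.0)          # single-channel data fetched as R001

print(apply_swizzle(raw, "RGBA"))    # (0.25, 0.0, 0.0, 1.0)   identity swizzle
print(apply_swizzle(raw, "RRRR"))    # (0.25, 0.25, 0.25, 0.25) intensity
print(apply_swizzle(raw, "RRR1"))    # (0.25, 0.25, 0.25, 1.0)  luminance
```

With an identity swizzle an intensity image would read back as (I, 0, 0, 1) in this model, so a default template that leaves the swizzle alone cannot produce the replicated value.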
<zmike> pretty sure zink could do it, but idk
<karolherbst> I haven't checked if write intensity images are broken tho
<karolherbst> well aren't supported anyway
<glehmann> alyssa: yeah I know, aco's IR design isn't really something I would recommend
<alyssa> glehmann: ok, so some preliminary notes from poking at radv:
<alyssa> * the "skip global cse if we can preamble more stuff" is a loss if you don't have preambles, surprised pikachu
<alyssa> * the divergence-aware ranking is a toss up if you don't have preambles, I guess because you only have integer SALU
<alyssa> that doesn't account for everything, though.
<glehmann> I'm actually not sure if the divergence data is close to up to date, also gfx11.5+ has float ALU too
<alyssa> hmm ok
<glehmann> float SALU, I mean
<alyssa> running divergence at the start of the pass doesn't magically solve it tho, i tried
<alyssa> my script has gpuid hardcoded to NAVI10
<alyssa> what GPU_ID should I use for gfx11.5+?
<glehmann> gfx1201
* alyssa reruns
epoch101 has quit [Ping timeout: 480 seconds]
sarnex has joined #dri-devel
lynxeye has quit [Quit: Leaving.]
fab has quit [Quit: fab]
karolherbst has quit [Ping timeout: 480 seconds]
Kayden has quit [Quit: swap kernels]
Kayden has joined #dri-devel
zzyiwei has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
gnarchie has quit []
gnarchie has joined #dri-devel
bolson has quit [Ping timeout: 480 seconds]
gnarchie has quit []
gnarchie has joined #dri-devel
gnarchie has quit []
sarbes has joined #dri-devel
haaninjo has joined #dri-devel
<zzyiwei> robclark: Hello, Sir! one more piece to assist common AHB support: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36151
gnarchie has joined #dri-devel
apinheiro has quit [Quit: Leaving]
<sarbes> Hi. I would like to implement "normalize(vec3(xyz))" for Utgard/Lima, since there is direct HW support. Unfortunately, it seems that normalize gets lowered in NIR, so it is not entirely straight forward.
<sarbes> It seems to me that I could go three different ways. 1) Undo the lowering in lima_nir_algebraic.py. 2) Introduce a normalize() op in NIR, with optional lowering. 3) Undo the lowering in a C pass.
<sarbes> 1) Seems to require some tinkering with the search code, as swizzles are not supported. Without some hacking, I'm not able to match the NIR pattern.
<sarbes> 2) I don't think that introducing such a "legacy" op would be accepted.
<sarbes> 3) Seems to be the best solution overall, but I would like to get some confirmation.
<alyssa> sarbes: I'm not seeing why you can't use an algebraic rule for that?
<alyssa> oh because of the broadcast behaviour... bah
<alyssa> vec4 hw was a mistake
<sarbes> Yeah. The pattern is something like "('fmul', 'a', ('frsq', ('fdot3', 'a', 'a')))"
<sarbes> But I would need "('fmul', 'a', ('frsq.xxx', ('fdot3', 'a', 'a')))"
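The mismatch can be shown with a toy tuple matcher. This is only a sketch: `match()` is a made-up helper, not the real nir_search, and the swizzle is modelled here as a separate `'swizzle.xxx'` node purely for illustration, whereas in NIR swizzles live on ALU sources. Commutativity of `fmul` is ignored.

```python
def match(pattern, expr, binds):
    """Match a nested-tuple pattern against a nested-tuple expression.

    Strings in operand position are variables: the first use binds a value,
    later uses must see the same value. The first tuple element is the opcode.
    """
    if isinstance(pattern, str):
        if pattern in binds:
            return binds[pattern] == expr
        binds[pattern] = expr
        return True
    if not isinstance(expr, tuple) or len(pattern) != len(expr):
        return False
    if pattern[0] != expr[0]:
        return False
    return all(match(p, e, binds) for p, e in zip(pattern[1:], expr[1:]))

# Lowered normalize(v): the scalar frsq result is broadcast to three components.
lowered = ('fmul', 'v', ('swizzle.xxx', ('frsq', ('fdot3', 'v', 'v'))))

plain = ('fmul', 'a', ('frsq', ('fdot3', 'a', 'a')))
with_swizzle = ('fmul', 'a', ('swizzle.xxx', ('frsq', ('fdot3', 'a', 'a'))))

print(match(plain, lowered, {}))          # False: the broadcast is in the way
print(match(with_swizzle, lowered, {}))   # True
```

In this model the plain pattern fails only because nothing in it can name the broadcast, which is the part a C pass can inspect directly.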
<alyssa> yeah I see what you mean
<alyssa> this is for PP?
<sarbes> Yeah.
sima has quit [Ping timeout: 480 seconds]
<alyssa> I guess #3 is my preference if it's all the same to you
<alyssa> I wouldn't mind plumbing the op through as long as it's not invasive to other drivers, tho
<alyssa> modifying nir_search for this is a hard no
<sarbes> I've been eying this MR since it was submitted, but there is no resolution for now.
<sarbes> If #3 is the preferred route to go, so be it.
<pac85> Are we doing lisp now :p
swfrd_ has quit []
guludo has quit [Ping timeout: 480 seconds]
gnarchie has quit []
gnarchie has joined #dri-devel
Sachiel has quit [Quit: WeeChat 4.5.2]
Sachiel has joined #dri-devel
haaninjo has quit [Quit: Ex-Chat]
V has joined #dri-devel
V has quit []
Daanct12 has joined #dri-devel
Danct12 has quit [Ping timeout: 480 seconds]
feaneron has quit [Ping timeout: 480 seconds]
<sarbes> Curiously, the op is processed by the varying unit. Same as perspective division (which I want to wire up too).
<sarbes> Anyway, thanks for the input.
<alyssa> sarbes: midgard can do perspective division when loading varyings, so that'd be why
<alyssa> i assume utgard-pp could normalize too
<alyssa> I mean if you already have a floating point divide, why not?
<sarbes> Just saying. :)
<sarbes> It does make sense to me to have it there.
OftenTimeConsuming has quit [Remote host closed the connection]
TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM has joined #dri-devel
OftenTimeConsuming has joined #dri-devel