ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
lsntvt__ has quit [Ping timeout: 480 seconds]
Nasina has quit [Read error: Connection reset by peer]
croissant_ has joined #dri-devel
iive has quit [Quit: They came for me...]
croissant has quit [Ping timeout: 480 seconds]
Nasina has joined #dri-devel
TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM has joined #dri-devel
chewitt has joined #dri-devel
davispuh has quit [Ping timeout: 480 seconds]
The_Company has joined #dri-devel
The_Company has quit []
Nasina has quit [Read error: Connection reset by peer]
The_Company has joined #dri-devel
The_Company has quit [Remote host closed the connection]
The_Company has joined #dri-devel
Company has quit [Ping timeout: 480 seconds]
The_Company has quit []
Company has joined #dri-devel
yuq825 has quit [Remote host closed the connection]
yuq825 has joined #dri-devel
sigmaris has quit [Quit: ZNC - https://znc.in]
sigmaris has joined #dri-devel
Nasina has joined #dri-devel
Nasina has quit [Read error: Connection reset by peer]
Nasina has joined #dri-devel
croissant_ has quit []
Nasina has quit [Remote host closed the connection]
kzd has quit [Quit: kzd]
Nasina has joined #dri-devel
alarumbe has quit []
nerdopolis has quit [Ping timeout: 480 seconds]
Nasina has quit [Read error: Connection reset by peer]
kzd has joined #dri-devel
Nasina has joined #dri-devel
Duke`` has joined #dri-devel
kzd has quit [Ping timeout: 480 seconds]
Duke`` has quit []
Duke`` has joined #dri-devel
glennk has joined #dri-devel
fab has joined #dri-devel
Nasina has quit [Read error: Connection reset by peer]
YuGiOhJCJ has quit []
azerov has quit [Quit: Gateway shutdown]
azerov has joined #dri-devel
itoral has joined #dri-devel
Nasina has joined #dri-devel
Nasina has quit [Read error: Connection reset by peer]
chewitt has quit [Ping timeout: 480 seconds]
Nasina has joined #dri-devel
Nasina has quit [Read error: Connection reset by peer]
Duke`` has quit [Ping timeout: 480 seconds]
Nasina has joined #dri-devel
Nasina has quit [Read error: Connection reset by peer]
Nasina has joined #dri-devel
sima has joined #dri-devel
dolphin has joined #dri-devel
<dolphin> airlied, sima: dim is not picking up anything into drm-intel-fixes so no pull request this week
warpme has joined #dri-devel
OftenTimeConsuming has quit [Remote host closed the connection]
vliaskov_ has joined #dri-devel
OftenTimeConsuming has joined #dri-devel
fab has quit [Quit: fab]
warpme has quit []
<sima> ack
simon-perretta-img has joined #dri-devel
tzimmermann has joined #dri-devel
JRepin has joined #dri-devel
JRepin has quit [Remote host closed the connection]
pcercuei has joined #dri-devel
JRepin has joined #dri-devel
phasta has joined #dri-devel
frankbinns has quit [Ping timeout: 480 seconds]
croissant has joined #dri-devel
sghuge has quit [Remote host closed the connection]
sghuge has joined #dri-devel
JRepin has quit [Remote host closed the connection]
chewitt has joined #dri-devel
JRepin has joined #dri-devel
vliaskov__ has joined #dri-devel
fab has joined #dri-devel
coldfeet has joined #dri-devel
warpme has joined #dri-devel
vliaskov_ has quit [Ping timeout: 480 seconds]
JRepin has quit []
JRepin has joined #dri-devel
lsntvt__ has joined #dri-devel
lsntvt_ has joined #dri-devel
coldfeet has quit [Ping timeout: 480 seconds]
bolson has quit [Ping timeout: 480 seconds]
lsntvt__ has quit [Ping timeout: 480 seconds]
idr has quit [Ping timeout: 480 seconds]
coldfeet has joined #dri-devel
JRepin has quit []
JRepin has joined #dri-devel
frankbinns has joined #dri-devel
mripard has joined #dri-devel
lynxeye has joined #dri-devel
mehdi-djait3397165695212282475 has joined #dri-devel
lsntvt__ has joined #dri-devel
jkrzyszt_ has joined #dri-devel
lsntvt_ has quit [Ping timeout: 480 seconds]
apinheiro has joined #dri-devel
rasterman has joined #dri-devel
frankbinns has quit [Ping timeout: 480 seconds]
JRepin has quit []
JRepin has joined #dri-devel
<dj-death> jnoorman: do you want to land the reviewed bits from !34344 ?
JRepin has quit []
JRepin has joined #dri-devel
<karolherbst> do we have per driver documentation about queue/context priorities? Like something that describes what guarantees a low/high priority context gives to the user. I'm sure it's all per driver if at all, just wondering if there is anything I can point out to.
coldfeet has quit [Ping timeout: 480 seconds]
coldfeet has joined #dri-devel
paulk-ter has quit []
paulk has joined #dri-devel
kts has joined #dri-devel
guludo has joined #dri-devel
jkrzyszt_ has quit []
haaninjo has joined #dri-devel
jkrzyszt_ has joined #dri-devel
<pac85> Like, for gallium?
<pac85> Or uapi level?
<karolherbst> "developer documentation" rather
alarumbe has joined #dri-devel
<karolherbst> application developer I mean
<karolherbst> like "if you use a high prio context through GLX/EGL you get those guarantees on this driver: ..."
<pac85> I see yeah
JRepin has quit []
JRepin has joined #dri-devel
<pac85> I think it falls mainly in 3 camps: GPUs that have different hw queues with different priorities (eg. Amd), GPUs that do preemption (msm, amd, Mali, agx) and GPUs that have none of those (in which case it only affects scheduling)
<karolherbst> right
<karolherbst> but I'm interested in specific documentation
<karolherbst> I map the CL stuff to PIPE_CONTEXT_HIGH_PRIORITY and PIPE_CONTEXT_LOW_PRIORITY, and just wondering if there is any driver specific docs on it
<karolherbst> rude for an ext to require you to provide docs...
mvlad has joined #dri-devel
<pac85> I see. Personally I documented it for msm but it is very low level and probably not what you are looking for https://gitlab.freedesktop.org/drm/msm/-/blob/msm-next/Documentation/gpu/msm-preemption.rst
<pac85> As in, if you read it you'll know all you need but you'll also know stuff you don't care about as an app developer :p so not sure if it counts
<karolherbst> yeah.. I'm interested in something more high level
<karolherbst> maybe "it matches EGL/GLX" semantics would fly, but I wouldn't be surprised if those are similarly vague
coldfeet has quit [Ping timeout: 480 seconds]
<pac85> Even vulkan is vague
<karolherbst> I mean this is highly platform specific, so it makes sense to point towards vendor documentation
coldfeet has joined #dri-devel
coldfeet has quit [Remote host closed the connection]
Nasina has quit [Read error: Connection reset by peer]
coldfeet has joined #dri-devel
Nasina has joined #dri-devel
lion328 has quit [Quit: Leaving]
rsalvaterra has quit []
lion328 has joined #dri-devel
rsalvaterra has joined #dri-devel
Mercury[m] has joined #dri-devel
<jnoorman> dj-death: I think I'd prefer at least one more r-b on "nir: add BASE index to load/store_ssbo" so would you mind if I give a bit more time for people to review?
warpme has quit []
JRepin has quit []
JRepin has joined #dri-devel
<dj-death> jnoorman: okay
frankbinns has joined #dri-devel
kts has quit [Ping timeout: 480 seconds]
Nasina has quit [Read error: Connection reset by peer]
guludo has quit [Ping timeout: 480 seconds]
Nasina has joined #dri-devel
itoral has quit [Quit: Leaving]
epoch101_ has joined #dri-devel
epoch101 has quit [Ping timeout: 480 seconds]
Nasina has quit [Ping timeout: 480 seconds]
Nasina has joined #dri-devel
fab has quit [Ping timeout: 480 seconds]
fab has joined #dri-devel
hikiko_ has joined #dri-devel
Nasina has quit [Ping timeout: 480 seconds]
hikiko has quit [Ping timeout: 480 seconds]
Caterpillar has quit [Remote host closed the connection]
nerdopolis has joined #dri-devel
Caterpillar has joined #dri-devel
jsa1 has joined #dri-devel
fab has quit [Quit: fab]
fab has joined #dri-devel
warpme has joined #dri-devel
coldfeet has quit [Quit: Lost terminal]
tzimmermann has quit [Quit: Leaving]
davispuh has joined #dri-devel
jsa1 has quit [Remote host closed the connection]
pixelcluster has quit [Ping timeout: 480 seconds]
pixelcluster has joined #dri-devel
guludo has joined #dri-devel
jsa1 has joined #dri-devel
jsa1 has quit [Remote host closed the connection]
orbea1 has joined #dri-devel
orbea has quit [Ping timeout: 480 seconds]
asrivats_ has joined #dri-devel
mehdi-djait3397165695212282475 has quit []
mehdi-djait3397165695212282475 has joined #dri-devel
asrivats_ has quit [Remote host closed the connection]
ao2_collabora has quit [Quit: The Lounge - https://thelounge.chat]
dbrouwer has quit []
italove8 has quit []
kusma has quit []
ndufresne has quit [Quit: The Lounge - https://thelounge.chat]
rpavlik has quit [Quit: The Lounge - https://thelounge.chat]
sre has quit [Quit: The Lounge - https://thelounge.chat]
tintou has quit [Quit: The Lounge - https://thelounge.chat]
rpavlik has joined #dri-devel
tintou has joined #dri-devel
fab has quit [Read error: No route to host]
fab has joined #dri-devel
orbea1 has quit []
orbea has joined #dri-devel
lipidserum has joined #dri-devel
hikiko has joined #dri-devel
haaninjo has quit [Ping timeout: 480 seconds]
hikiko_ has quit [Ping timeout: 480 seconds]
mvlad has quit [Ping timeout: 480 seconds]
haaninjo has joined #dri-devel
jsa1 has joined #dri-devel
dolphin has quit [Quit: Leaving]
fab has quit [Ping timeout: 480 seconds]
idr has joined #dri-devel
simon-perretta-img has quit [Ping timeout: 480 seconds]
kts has joined #dri-devel
frankbinns1 has joined #dri-devel
frankbinns has quit [Ping timeout: 480 seconds]
kzd has joined #dri-devel
asrivats_ has joined #dri-devel
simon-perretta-img has joined #dri-devel
fab has joined #dri-devel
phasta has quit [Ping timeout: 480 seconds]
italove86 has quit []
tintou6 has quit []
kts has quit [Quit: Konversation terminated!]
frankbinns1 is now known as frankbinns
jsa1 has quit [Remote host closed the connection]
Duke`` has joined #dri-devel
<robclark> karolherbst: any idea about this? It is "fixed" by commenting out cl_khr_fp16
<robclark> maybe a llvm spirv translator bug?
<karolherbst> mhhhh
<karolherbst> do you have the nir?
<karolherbst> there are a few bugs in the libclc as well which could cause this..
<robclark> looks like it didn't get as far as producing the nir.. is there a way to dump the spirv?
<karolherbst> CLC_DEBUG=dump_spirv
<robclark> heh, ok.. any tip on mapping "38072 bytes into the SPIR-V binary" to location in the spirv?
<robclark> output might be too big to pastebin
<karolherbst> yeah, just wanted to say...
<robclark> well, I guess opcode=SpvOpImageSampleExplicitLod gives a hint
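[editor's note] One way to map a byte offset such as "38072 bytes into the SPIR-V binary" back to an instruction is to walk the word stream. The sketch below relies only on the SPIR-V binary layout: a 5-word header, then instructions whose first word holds the word count in the high 16 bits and the opcode in the low 16 bits. The helper name spirv_opcode_at is hypothetical, and it assumes the reported offset is counted in bytes from the start of the module and that the words are already in host byte order.

```c
#include <stddef.h>
#include <stdint.h>

#define SPIRV_MAGIC 0x07230203u
#define SPIRV_HEADER_WORDS 5

/* Hypothetical helper: return the opcode of the instruction that contains
 * byte_offset, or -1 if the offset falls in the header, past the end, or
 * the stream is malformed. */
static int spirv_opcode_at(const uint32_t *words, size_t num_words,
                           size_t byte_offset)
{
    size_t target = byte_offset / 4;   /* SPIR-V is a stream of 32-bit words */
    size_t i = SPIRV_HEADER_WORDS;     /* first instruction follows the header */

    if (num_words < SPIRV_HEADER_WORDS || words[0] != SPIRV_MAGIC)
        return -1;

    while (i < num_words) {
        uint32_t count  = words[i] >> 16;      /* instruction length in words */
        uint32_t opcode = words[i] & 0xffffu;  /* e.g. 88 = OpImageSampleExplicitLod */

        if (count == 0 || count > num_words - i)
            return -1;                         /* malformed instruction */
        if (target >= i && target < i + count)
            return (int)opcode;
        i += count;
    }
    return -1;
}
```

Comparing the returned opcode with the disassembly from CLC_DEBUG=dump_spirv narrows the search to instructions of that kind.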
dsimic is now known as Guest17959
dsimic has joined #dri-devel
<karolherbst> it's a bit weird tho..
<karolherbst> ohh...
<karolherbst> the CL CTS might not test read_imageh...
fab has quit [Quit: fab]
<karolherbst> mhh there are tests actually
tzimmermann has joined #dri-devel
asrivats_ has quit [Ping timeout: 480 seconds]
fab has joined #dri-devel
Guest17959 has quit [Ping timeout: 480 seconds]
<robclark> I think this is it.. maybe?
<robclark> %1397 = OpLoad %v2uint %_compoundliteral Aligned 8
<robclark> %TempSampledImage = OpSampledImage %1336 %1392 %1339
<robclark> %call15 = OpImageSampleExplicitLod %v4half %TempSampledImage %1397 Lod %float_0
fab is now known as Guest17960
<karolherbst> sure.. but what would convert the dest to 32 bit on the nir side
<karolherbst> maybe some lowering going wrong...
<karolherbst> mhh wait, this is before lowering
<karolherbst> ohh wait
<karolherbst> it's the components that is different?
<robclark> well nir def thinks it is 4 components w/ bitsize 32.. I might be looking at the wrong one
* robclark doesn't completely understand
<karolherbst> `type->type` and `*def`
<karolherbst> inside vtn_push_nir_ssa
<karolherbst> so it does expect a f16vec4
<karolherbst> wondering what def is
<karolherbst> ohh wait.. I read the last line wrong
<karolherbst> yeah you are right, it expects 32 bit
<karolherbst> you can do something simple there.. let me figure it out
<robclark> thx
<karolherbst> robclark: "p nir_print_shader(b->shader, stdout)" inside the debugger inside the vtn_push_nir_ssa frame
<robclark> heh, that segfaulted
<karolherbst> mhhh
<karolherbst> sad
<karolherbst> ohh...
<karolherbst> the 32 is hardcoded
<robclark> :facepalm:
<karolherbst> "nir_def_init(&instr->instr, &instr->def, nir_tex_instr_dest_size(instr), 32);"
<karolherbst> so now what to do about that...
<karolherbst> not in the mood of fixing every driver
<karolherbst> maybe just insert a "f2f16"
<robclark> ok, let me take a look
<karolherbst> not sure we require 32 bit dests for tex operations
<karolherbst> but also annoying that I haven't hit it with the CTS...
frankbinns has quit [Ping timeout: 480 seconds]
<robclark> \o/
lipidserum has quit [Remote host closed the connection]
<robclark> I think that looks reasonable.. it works at least
<karolherbst> yeah.. the image CL CTS tests don't test fp16 support 🙃
<karolherbst> robclark: right.. the only concern is, that it might run into issues with drivers
<robclark> /o\
<karolherbst> like if everybody assumes 32 bits, then... it's not great
<karolherbst> _however_
<karolherbst> we can just lower inside nir_lower_cl_images for now
<karolherbst> I also added lowering for CL_DEPTH images, because in CL they are single component dests
<karolherbst> and every driver expects 4 components
<robclark> hmm, maybe ir3 is the only one that uses nir_opt_16bit_tex_image
<karolherbst> ohh.. intel as well
<karolherbst> and radeonsi
<karolherbst> mhhhh
<karolherbst> what does that pass do?
<karolherbst> try to use 16 bit instead of 32?
<robclark> yeah
<karolherbst> I see..
<robclark> folds narrowing into tex op
<karolherbst> so I see two options
warpme has quit []
<karolherbst> 1. always lower to fp32 in rusticl and let driver optimize with nir_opt_16bit_tex_image
<karolherbst> 2. add a pipe_cap and rusticl lowers for drivers not supporting 16 bit tex ops
<karolherbst> or well.. 3. lower to fp32 and make drivers call a new lowering pass
<robclark> maybe 1 is better, since drivers that want it already use the nir opt pass
<robclark> or just ask alyssa .. since I guess she knows the other compilers using rusticl that aren't ir3/amd/intel
<karolherbst> at least 1. doesn't change the status quo
<karolherbst> so I think it's way easier to do
Caterpillar has quit [Quit: Konversation terminated!]
<karolherbst> if you want you can write the fix and add something to "nir_lower_cl_images", but that pass needs some cleaning up because it's starting to become a mess
Caterpillar has joined #dri-devel
<glehmann> didn't marek just add a glsl_16bit_load_dst pipe cap for this?
<robclark> so panfrost sets that to true.. (as does zink)
<robclark> so maybe my existing fix is fine
<robclark> agx does too
<robclark> i915 and r300 don't support it :-P
<robclark> oh, and I guess radeonsi pre GFX9? Not sure how old that is
<karolherbst> glehmann: mhh I wish those caps were better documented :)
<karolherbst> GFX9 is pretty new
<karolherbst> well.. "pretty"
<karolherbst> it's the last gen before RDNA?
<karolherbst> something like that
<glehmann> anything gcn is old
<karolherbst> isn't GFX9 like 7 years old?
lipidserum has joined #dri-devel
<robclark> I guess the more relevant question is whether rusticl is a going concern on GFX9.. I guess/assume it is?
<karolherbst> yeah, should just work (tm)
<robclark> ok, let me see if I can come up with something
<karolherbst> the question is rather, what does that cap do...
<karolherbst> seems to force high precision in a few places
<karolherbst> mhhh
<karolherbst> anyway yeah.. rusticl probably will have to do lowering inside "nir_lower_cl_images" then if it's false
<karolherbst> not the worst lowering
<robclark> why not just promote the dest type to 32b and add f2f16/etc?
<robclark> why would that need something in nir_lower_cl_images?
<glehmann> how does cl define int16 shader load for 32bit images? UB, trunc, or clamp?
dbrouwer has joined #dri-devel
dofingert has joined #dri-devel
tzimmermann has quit [Quit: Leaving]
hikiko_ has joined #dri-devel
<robclark> I assume in this case the src img is f16
<robclark> karolherbst: also, is cl_khr_fp16 exposed on GFX8 and earlier? If not then my fix would be fine
<karolherbst> yes, on GFX8
<robclark> :-(
<karolherbst> GFX8 has slow fp16, but it does have fp16
hikiko has quit [Ping timeout: 480 seconds]
<karolherbst> "slow" as in "as fast as fp32"
<robclark> hmm, that seems mostly pointless
<karolherbst> saves on memory bandwidth
<karolherbst> though
<karolherbst> loading from/storing to fp16 data is always supported
<karolherbst> but it can also remove some pointless conversions
<karolherbst> anyway.. "prop stack supports it", so users asked
<karolherbst> I mean.. the lowering will be like 10 loc or something
<karolherbst> not the worst
cishl^ has quit [Remote host closed the connection]
Jeremy_Rand_Talos_ has quit [Remote host closed the connection]
Jeremy_Rand_Talos_ has joined #dri-devel
<mareko> gfx8 fp16 is as fast as fp32 but consumes less power
<glehmann> sub dword register addressing really sucks on gfx8 though, so in practice it's not even 1:1 alu usage with fp32 (ignoring conversions)
<mareko> I think it was designed and expected to use only low bits of VGPRs as a power optimization
<karolherbst> the other question is, if changing it in vtn, what to do about vulkan drivers
<glehmann> I prefer the gfx11 design :)
<karolherbst> or maybe it's fine for everybody
<glehmann> karolherbst: Vulkan only allows 32bit tex instructions
<karolherbst> fun
<jenatali> D3D also only allows 32 bit sampling results
<karolherbst> okay, so you would need lowering inside nir_lower_cl_images then anyway
<jenatali> Yeah
frankbinns has joined #dri-devel
<robclark> karolherbst: why does it need to be lowered in nir_lower_cl_images instead of nir_opt_16bit_tex_image?
<karolherbst> if vtn always uses fp32 then drivers can also use nir_opt_16bit_tex_image
<karolherbst> it's just that if vtn emits fp16 tex ops, then we need some lowering somewhere
<robclark> I am doing the munging into 32b in vtn
<karolherbst> okay
<karolherbst> should be easier to handle then
<robclark> is there a convenient helper for "give me the right nir alu opt for dst type"?
<karolherbst> "nir alu opt"?
<glehmann> nir_type_conversion_op
<karolherbst> ohhh you meant "op"
<robclark> ahh, thx glehmann
<glehmann> fwiw, doing the lowering in vtn and optimizing back is not possible under all conditions
<glehmann> it's going to depend on float rounding mode, and whether your hw clamps 32bit integers
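[editor's note] The integer half of glehmann's concern can be made concrete: narrowing a 32-bit value to 16 bits by truncation and by clamping only agree while the value fits in 16 bits. The sketch below is plain C with hypothetical helper names. It does not model any particular hardware or NIR pass; it only shows why "widen in vtn, fold the narrowing back into the tex op later" may change results if the hardware clamps where the shader-side conversion truncates.

```c
#include <stdint.h>

/* Narrow by truncation: keep only the low 16 bits. */
static uint16_t narrow_trunc_u16(uint32_t v)
{
    return (uint16_t)(v & 0xffffu);
}

/* Narrow by clamping (saturation): out-of-range values stick at the max. */
static uint16_t narrow_clamp_u16(uint32_t v)
{
    return v > 0xffffu ? (uint16_t)0xffffu : (uint16_t)v;
}
```

For v = 70000 the first helper yields 4464 and the second 65535, while for any v up to 65535 both return v unchanged.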
Surkow|laptop has quit [Ping timeout: 480 seconds]
<robclark> I _could_ make this dependent on pipe cap.. but since we are widening an already 16b tex to 32b I think we can use undef rounding mode
tobiasjakobi has joined #dri-devel
<glehmann> undef rounding mode in this context just means use the global one in the shader, so if that's set it's not completely undefined
tobiasjakobi has quit []
lcn has quit []
lcn has joined #dri-devel
mriesch has quit [Quit: No Ping reply in 180 seconds.]
mriesch has joined #dri-devel
asrivats_ has joined #dri-devel
valpackett has quit [Ping timeout: 480 seconds]
Surkow|laptop has joined #dri-devel
bolson has joined #dri-devel
bbrezill1 has joined #dri-devel
hikiko has joined #dri-devel
epoch101_ has quit []
chewitt has quit [Read error: No route to host]
bolson has quit []
chewitt has joined #dri-devel
bbrezillon has quit [Ping timeout: 480 seconds]
Company has quit [Quit: Leaving]
hikiko_ has quit [Ping timeout: 480 seconds]
chewitt has quit []
lsntvt__ has quit [Ping timeout: 480 seconds]
unerlige has joined #dri-devel
unerlige has quit []
dofingert has quit [Read error: Connection reset by peer]
jkrzyszt_ has quit []
<karolherbst> robclark: mhhhh what CL query is that based on?
<karolherbst> ohh wait
<karolherbst> it's already a workaround for a broken driver
<karolherbst> pain
coldfeet has joined #dri-devel
<karolherbst> robclark: add a test to the CL CTS for that 🙃
<karolherbst> but yeah...
mehdi-djait3397165695212282475 has quit []
mehdi-djait3397165695212282475 has joined #dri-devel
<karolherbst> this is about CL_DEVICE_IMAGE_MAX_BUFFER_SIZE right?
<robclark> about the pitch alignment
<karolherbst> ehh CL_DEVICE_IMAGE_PITCH_ALIGNMENT
<robclark> looks like closed driver expects units of pixels instead of bytes
<karolherbst> "The row pitch alignment size in pixels for 2D images created from a buffer."
<robclark> so rusticl is correct here
<karolherbst> yeah..
<robclark> the cl api is a bit awkward because it mixes units of pixels and bytes
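[editor's note] The unit mix-up is easy to see with numbers. The spec text quoted above gives CL_DEVICE_IMAGE_PITCH_ALIGNMENT in pixels, while a row pitch is normally handled in bytes. The sketch below computes a row pitch both ways; the function names and the example values are illustrative assumptions, not code from rusticl, TensorFlow, or any driver.

```c
#include <assert.h>
#include <stddef.h>

/* Round value up to the next multiple of align (align must be non-zero). */
static size_t align_up(size_t value, size_t align)
{
    assert(align > 0);
    return (value + align - 1) / align * align;
}

/* Spec reading: the alignment counts pixels, so align the width in pixels
 * first and convert to bytes afterwards. */
static size_t row_pitch_pixel_units(size_t width_px, size_t bytes_per_px,
                                    size_t align_px)
{
    return align_up(width_px, align_px) * bytes_per_px;
}

/* Misreading: treat the same number as a byte alignment. */
static size_t row_pitch_byte_units(size_t width_px, size_t bytes_per_px,
                                   size_t align_bytes)
{
    return align_up(width_px * bytes_per_px, align_bytes);
}
```

With a 100 pixel wide RGBA8 image (4 bytes per pixel) and a reported alignment of 64, the pixel reading gives 512 bytes and the byte reading gives 448 bytes, so an application and a driver that disagree on the unit will disagree on whether a pitch is valid.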
epoch101 has joined #dri-devel
<karolherbst> yeah....
ecaoctr^ has joined #dri-devel
<karolherbst> the main reason I didn't want to enable drivers in rusticl from the start was that I'm aware that others would just work around a broken rusticl instead of reporting issues 🙃
<karolherbst> I was considering driconf for rusticl a few times to work around broken applications tho...
mehdi-djait3397165695212282475 has quit []
<robclark> we _might_ need to driconf, but I'd prefer fixing tensorflow... the question is which of the pile of gpu_info.IsAdreno() checks are related to hw (which would apply to us too) vs sw
<karolherbst> beats me :)
<robclark> yeah, no worries, not your problem ;-)
<karolherbst> robclark: you probably want to check for the platform there as well tho
<robclark> right
lipidserum has quit [Remote host closed the connection]
lipidserum has joined #dri-devel
lipidserum has quit [Remote host closed the connection]
lipidserum has joined #dri-devel
lipidserum has quit [Remote host closed the connection]
lipidserum has joined #dri-devel
lipidserum has quit [Read error: Connection reset by peer]
lipidserum has joined #dri-devel
kzd has quit [Quit: kzd]
epoch101_ has joined #dri-devel
kzd has joined #dri-devel
epoch101 has quit [Ping timeout: 480 seconds]
lsntvt__ has joined #dri-devel
epoch101 has joined #dri-devel
epoch101_ has quit [Ping timeout: 480 seconds]
lipidserum has quit [Read error: Connection reset by peer]
JRepin has quit []
lipidserum has joined #dri-devel
unerlige has joined #dri-devel
<karolherbst> I think I have a GPU memory leak somewhere :')
rasterman has quit [Quit: Gettin' stinky!]
JRepin has joined #dri-devel
TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM has joined #dri-devel
alarumbe has quit []
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
Guest17960 has quit []
iive has joined #dri-devel
<zmike> can I get an ack on https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21738 ? I think this function has actually been subtly broken all this time but the obfuscation hid it
asrivats_ has quit [Ping timeout: 480 seconds]
epoch101_ has joined #dri-devel
valpackett has joined #dri-devel
epoch101 has quit [Ping timeout: 480 seconds]
hikiko_ has joined #dri-devel
YuGiOhJCJ has joined #dri-devel
hikiko has quit [Ping timeout: 480 seconds]
kasper93 has quit [Remote host closed the connection]
psykose has quit [Remote host closed the connection]
psykose has joined #dri-devel
coldfeet has quit [Quit: Lost terminal]
<idr> zmike: LGTM.
<zmike> ty
epoch101_ has quit []
<robclark> karolherbst: can CL_DEVICE_MAX_WORK_GROUP_SIZE (or the kernel equiv) be greater than the MAX_WORK_ITEMS_SIZES in any dimension? This isn't clear to me from the spec
<robclark> tensorflow seems to expect not.. but idk if it is wrong here
<karolherbst> "MAX_WORK_ITEMS_SIZES"?
<robclark> CL_DEVICE_MAX_WORK_ITEM_SIZES
<karolherbst> I don't see why CL_DEVICE_MAX_WORK_GROUP_SIZE couldn't be greater than CL_DEVICE_MAX_WORK_ITEM_SIZES
<robclark> ok.. that was my expectation
<karolherbst> applications shouldn't rely on those anyway
<karolherbst> but rather use the values from clGetKernelWorkGroupInfo
<robclark> it is using the kernel value.. but the # of threads is greater than max_work_item_sizes[0]..
<robclark> it is trying to calc a localsize and ends up picking something with too large an x dimension
<karolherbst> mhhh
<karolherbst> would have to see the code, but MAX_WORK_ITEM_SIZES simply specify the total amount of threads in a work group, and work_group_size specify the limit per dimension
<karolherbst> ehh wait
<karolherbst> the terms are so confusing
<karolherbst> MAX_WORK_ITEM_SIZES is per dim 🙃
<karolherbst> anyway
<karolherbst> I don't see a restriction of any sort, and the code might just be making assumptions
<karolherbst> _though_ I know that the CTS was also broken in this sense
<robclark> yeah, the situation where the max threads is greater than a single dimension, ie. you couldn't max out if dims==1
<robclark> but I can fix it in tensorflow
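[editor's note] A local-size heuristic has to respect two independent limits: the total work-group size (from clGetKernelWorkGroupInfo) and the per-dimension CL_DEVICE_MAX_WORK_ITEM_SIZES. The sketch below clamps a requested local size against both. It is a hypothetical helper written for illustration, not TensorFlow's or rusticl's logic, and it ignores any requirement that the local size divide the global size.

```c
#include <assert.h>
#include <stddef.h>

/* Clamp a requested local size so that each dimension stays within
 * max_item_sizes[d] and the product stays within max_wg_size. */
static void clamp_local_size(size_t local[3], size_t max_wg_size,
                             const size_t max_item_sizes[3])
{
    size_t budget = max_wg_size;   /* work-items still available */

    assert(max_wg_size >= 1);
    for (int d = 0; d < 3; d++) {
        size_t limit = max_item_sizes[d] < budget ? max_item_sizes[d] : budget;

        if (local[d] > limit)
            local[d] = limit;
        if (local[d] == 0)
            local[d] = 1;          /* a dimension always holds one item */
        budget /= local[d];
    }
}
```

With a kernel limit of 1024 and per-dimension limits of 512, a request for {1024, 1, 1} comes out as {512, 1, 1}: the case discussed here, where a single dimension cannot absorb the whole work-group.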
<karolherbst> on v3d I ran into very funky issues in the CTS
<karolherbst> max_grid_Size of 65535 isn't fun :D
<karolherbst> what was the hardware with a subgroup size of 128 again...
<karolherbst> that was funky
kasper93 has joined #dri-devel
<jenatali> Adreno has a subgroup of 128
sima has quit [Ping timeout: 480 seconds]
Duke`` has quit [Ping timeout: 480 seconds]
asrivats_ has joined #dri-devel
JLP_ has joined #dri-devel
JLP has quit [Ping timeout: 480 seconds]
epoch101 has joined #dri-devel
Calandracas has quit [Remote host closed the connection]
epoch101 has quit []
Calandracas has joined #dri-devel
guludo has quit [Ping timeout: 480 seconds]
feaneron has joined #dri-devel
pcercuei has quit [Quit: dodo]
<karolherbst> robclark: if you have a bit of time: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35448 and "integer_ops/test_integer_ops extended_bit_ops_extract extended_bit_ops_insert extended_bit_ops_reverse"
<karolherbst> there is lowering for everything inside mesa
Calandracas_ has joined #dri-devel
Calandracas has quit [Read error: Connection reset by peer]
asrivats_ has quit [Ping timeout: 480 seconds]
haaninjo has quit [Quit: Ex-Chat]
lipidserum has quit [Ping timeout: 480 seconds]
apinheiro has quit [Quit: Leaving]
odrling has quit [Remote host closed the connection]
odrling has joined #dri-devel
lynxeye has quit [Quit: Leaving.]
anholt has quit [Remote host closed the connection]
anholt has joined #dri-devel
nerdopolis has quit [Ping timeout: 480 seconds]
ndufresne has joined #dri-devel
vliaskov__ has quit [Ping timeout: 480 seconds]
epoch101 has joined #dri-devel
<robclark> karolherbst: hmm, I'm getting an ibitfield_extract in the backend (which you could probably repro w/ drm_shim)
<karolherbst> robclark nir_compiler_options has flags to lower it
<karolherbst> lower_bitfield_extract{,8,16}
<karolherbst> and a 64 bit one as well
<karolherbst> and "lower_bitfield_insert" or "lower_bitfield_reverse" if needed, might also need adjustment of your lower_bit_size cb depending on things
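[editor's note] For readers unfamiliar with what lowering a bitfield extract amounts to, the sketch below expresses unsigned and signed extraction with shifts and masks in plain C. It illustrates the arithmetic only and assumes GLSL-style bitfieldExtract semantics with offset + bits <= 32; it is not the code Mesa's lowering emits, and the function names merely mirror the opcode names mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Extract `bits` bits starting at bit `offset`, zero-extended. */
static uint32_t ubitfield_extract(uint32_t base, unsigned offset, unsigned bits)
{
    assert(offset < 32 && bits <= 32 - offset);
    if (bits == 0)
        return 0;
    uint32_t mask = bits == 32 ? 0xffffffffu : ((1u << bits) - 1u);
    return (base >> offset) & mask;
}

/* Same extraction, but sign-extended from the top bit of the field. */
static int32_t ibitfield_extract(int32_t base, unsigned offset, unsigned bits)
{
    if (bits == 0)
        return 0;
    uint32_t field = ubitfield_extract((uint32_t)base, offset, bits);
    uint32_t sign  = 1u << (bits - 1);
    /* xor/subtract sign extension; the final cast assumes the usual
     * two's complement wrap-around conversion to int32_t. */
    return (int32_t)((field ^ sign) - sign);
}
```

A backend without a native instruction for this can be fed the shift/mask form instead, which is the kind of rewrite the lower_bitfield_extract options request.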
<karolherbst> but yeah.. I could try to write a patch blind
<karolherbst> with that it all passes? Do you want to push a change or should I just copy it?
<robclark> hmm, maybe we shouldn't lower bitfield_reverse except for 8b
<karolherbst> yeah.. that should be done inside the lower_bit_size cb then