#lima on 2025-06-24 — irc logs at oftc.catirclogs.org

2024-07-16 04:51 ChanServ changed the topic of #lima to: Development channel for open source lima driver for ARM Mali4** GPUs - Kernel driver has landed in mainline, userspace driver is part of mesa - Logs at https://oftc.irclog.whitequark.org/lima/

02:18 hexdump01 has joined #lima

02:20 hexdump0815 has quit [Ping timeout: 480 seconds]

04:26 Daanct12 has joined #lima

07:18 yuq825 has quit [Remote host closed the connection]

07:19 yuq825 has joined #lima

13:42 Daanct12 has quit [Quit: WeeChat 4.6.3]

15:31 dsimic is now known as Guest18684

15:31 dsimic has joined #lima

15:33 Guest18684 has quit [Ping timeout: 480 seconds]

20:30 <sarbes> I've rewritten my scalar constant multiplier MR to go the NIR route, but I got worse numbers on Shader DB. It looks like the ffma op prevents some optimization here.

20:34 <anarsoul> sounds like your first approach to lower it in ppir is better :)

20:36 <anarsoul> But ffma support in ppir is a hack. It's just a consequence of lacking a corresponding optimization pass in ppir

20:37 <sarbes> Yeah... I'll try to hack around the hack :)

20:38 <sarbes> I'll break down the ffma if I can sneak a multiplier in.

20:39 <anarsoul> or just handle constant multipliers in ppir?

20:40 <anarsoul> ffma lets us save a register when passing the mul result to the add

20:40 <anarsoul> so breaking down ffma may result in increased register pressure
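
[Editor's note] The register-pressure point above can be illustrated with a toy model. This is only a sketch on a made-up linear IR, not Mesa or ppir code: the instruction tuples, the value names, and the `max_live` helper are all assumptions for illustration. It counts how many values are live at once for a fused `ffma` versus a separate `fmul` + `fadd`, in the case where the multiply operands are still needed afterwards.

```python
# Toy linear IR: each instruction is (dest, op, sources).
# Hypothetical model for illustration only; not ppir data structures.

def max_live(instrs, live_out):
    """Return the peak number of simultaneously live values.

    Walks the instruction list backwards, the usual way to compute
    liveness for straight-line code.
    """
    live = set(live_out)
    peak = len(live)
    for dest, _op, srcs in reversed(instrs):
        live.discard(dest)   # dest is not live before its definition
        live.update(srcs)    # sources must be live before the instruction
        peak = max(peak, len(live))
    return peak

# r = a * b + c as one fused op: the product never needs its own register.
fused = [("r", "ffma", ("a", "b", "c"))]

# Same computation split in two: the product lives in temporary t.
split = [("t", "fmul", ("a", "b")),
         ("r", "fadd", ("t", "c"))]

# Assume a and b are still used later, so they stay live after r is computed.
live_after = {"r", "a", "b"}

print(max_live(fused, live_after))  # peak for the fused form
print(max_live(split, live_after))  # peak for the split form
```

In this model the split form peaks one value higher, because `t` has to coexist with `a`, `b` and `c` between the two instructions. If `a` and `b` were dead after the multiply, both forms would peak at the same count, which matches the "may" in the remark above.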

20:44 <sarbes> ...or I'll add an ffma opcode which can accept two scalars. But this sounds even more hacky.

20:44 <sarbes> But it is doable.

21:20 <sarbes> On further thought, no.

21:23 <sarbes> But it shouldn't be too hard to rearrange ffmas by hand, I think.

21:59 <anarsoul> sarbes: there is nothing wrong with folding constant multipliers in ppir

21:59 <anarsoul> it is a Utgard-specific feature, so I don't think it would be useful for NIR in general
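
[Editor's note] As a rough picture of what "folding constant multipliers" in a backend IR could look like, here is a toy pass. The log does not say what form the Utgard feature takes, so everything here is an assumption for illustration: the `Instr` class, the `scale` output-multiplier field, the `ALLOWED_SCALES` set, and the `fold_const_mul` function are hypothetical and are not taken from ppir or from hardware documentation.

```python
from dataclasses import dataclass

# Hypothetical set of factors an instruction could apply to its result.
ALLOWED_SCALES = {2.0, 4.0, 8.0, 0.5, 0.25, 0.125}

@dataclass
class Instr:
    dest: str
    op: str
    srcs: tuple          # value names (str) or float constants
    scale: float = 1.0   # hypothetical output multiplier

def fold_const_mul(instrs):
    """Fold `d = fmul(x, K)` into the instruction that produces x.

    Only folds when K is an allowed factor, the constant is the second
    operand, x has exactly one use, and the producer has no scale yet.
    Note: mutates the producer instruction in place.
    """
    # Count how often each named value is read.
    uses = {}
    for ins in instrs:
        for s in ins.srcs:
            if isinstance(s, str):
                uses[s] = uses.get(s, 0) + 1

    producer = {}  # value name -> instruction that defines it
    out = []
    for ins in instrs:
        if ins.op == "fmul" and len(ins.srcs) == 2:
            x, k = ins.srcs
            p = producer.get(x) if isinstance(x, str) else None
            if (p is not None and isinstance(k, float)
                    and k in ALLOWED_SCALES
                    and p.scale == 1.0 and uses[x] == 1):
                # The producer now writes the scaled result directly.
                del producer[x]
                p.scale = k
                p.dest = ins.dest
                producer[p.dest] = p
                continue  # the fmul is dropped
        out.append(ins)
        producer[ins.dest] = ins
    return out

prog = [Instr("t", "fadd", ("a", "b")),
        Instr("r", "fmul", ("t", 2.0))]
print(fold_const_mul(prog))
```

Under these assumptions the two-instruction program collapses into a single `fadd` that writes `r` with `scale=2.0`. A factor outside the allowed set, or a value with more than one use, is left alone. Doing this in the backend IR rather than in NIR keeps the hardware-specific knowledge (which factors are legal) in one place, which is the argument made above.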