#freedesktop on 2025-05-20 — irc logs at oftc.catirclogs.org

2024-07-16 04:52 ChanServ changed the topic of #freedesktop to: https://www.freedesktop.org infrastructure and online services || for questions about freedesktop.org projects, please see each project's contact || for discussions about specifications, please use https://gitlab.freedesktop.org/xdg or xdg@lists.freedesktop.org

00:14 balrog has quit [Ping timeout: 480 seconds]

00:19 konstantin has quit [Remote host closed the connection]

00:20 konstantin has joined #freedesktop

00:33 scrumplex has joined #freedesktop

00:37 scrumplex_ has quit [Ping timeout: 480 seconds]

00:48 alarumbe has quit []

00:51 guludo has quit [Ping timeout: 480 seconds]

01:28 snetry has joined #freedesktop

01:34 sentry has quit [Ping timeout: 480 seconds]

01:37 sentry has joined #freedesktop

01:40 snetry has quit [Ping timeout: 480 seconds]

02:01 olivial has quit [Read error: Connection reset by peer]

02:02 olivial has joined #freedesktop

02:17 c137 has quit [Ping timeout: 480 seconds]

02:33 MrCooper_ has joined #freedesktop

02:36 infernixx has joined #freedesktop

02:37 MrCooper has quit [Ping timeout: 480 seconds]

02:38 infernix has quit [Ping timeout: 480 seconds]

02:40 balrog has joined #freedesktop

03:17 alpernebbi has quit [Ping timeout: 480 seconds]

03:28 alpernebbi has joined #freedesktop

03:30 MrCooper__ has joined #freedesktop

03:34 MrCooper_ has quit [Ping timeout: 480 seconds]

03:34 D-HUND has joined #freedesktop

03:37 debdog has quit [Ping timeout: 480 seconds]

03:51 GNUmoon has quit [Ping timeout: 480 seconds]

03:51 GNUmoon has joined #freedesktop

03:54 D-HUND is now known as debdog

04:09 MrCooper_ has joined #freedesktop

04:13 MrCooper__ has quit [Ping timeout: 480 seconds]

04:38 rudi_s is now known as Guest16345

04:38 rudi_s has joined #freedesktop

04:38 Guest16345 has quit [Remote host closed the connection]

04:39 ximion has quit [Remote host closed the connection]

05:09 olivial has quit [Read error: Connection reset by peer]

05:09 olivial has joined #freedesktop

05:18 jsa1 has joined #freedesktop

05:27 pjakobsson has quit []

05:40 swatish2 has joined #freedesktop

05:40 tzimmermann has joined #freedesktop

05:43 sima has joined #freedesktop

06:12 <eric_engestrom> bentiss, daniels, mupuf: the x86 shared runners went *poof*?

06:12 <eric_engestrom> https://gitlab.freedesktop.org/mesa/mesa/-/jobs/76698592 for instance, says "There are no active runners online"

06:14 <eric_engestrom> (with or without the priority tag, btw, it's not just nightly that's stuck)

06:29 Eighth_Doctor has quit []

06:29 tomeu has quit [Quit: Bridge terminating on SIGTERM]

06:29 dcbaker has quit []

06:29 jasuarez has quit [Quit: Bridge terminating on SIGTERM]

06:29 swick[m] has quit [Quit: Bridge terminating on SIGTERM]

06:29 kusma has quit [Quit: Bridge terminating on SIGTERM]

06:29 nirbheek_ has quit [Quit: Bridge terminating on SIGTERM]

06:29 sergi has quit [Quit: Bridge terminating on SIGTERM]

06:29 geobang[m]1 has quit [Quit: Bridge terminating on SIGTERM]

06:29 Trevinho has quit [Quit: Bridge terminating on SIGTERM]

06:29 alatiera[m] has quit [Quit: Bridge terminating on SIGTERM]

06:29 ErikReider[m] has quit [Quit: Bridge terminating on SIGTERM]

06:29 thaytan[m] has quit [Quit: Bridge terminating on SIGTERM]

06:29 DanLee[m] has quit []

06:29 dabrain34[m] has quit []

06:29 cadubentzen[m] has quit [Quit: Bridge terminating on SIGTERM]

06:29 MathieuBridon[m] has quit [Remote host closed the connection]

06:29 mimimyh[m] has quit [Read error: Connection reset by peer]

06:29 ndufresne[m] has quit [Write error: connection closed]

06:29 talion_809[m] has quit [Read error: Connection reset by peer]

06:29 sberz[m] has quit [Read error: Connection reset by peer]

06:29 little932[m] has quit [Write error: connection closed]

06:29 wontfix[m] has quit [Write error: connection closed]

06:29 jenatali has quit [Write error: connection closed]

06:29 ojuschugh1[m] has quit [Write error: connection closed]

06:29 mj-talbot[m] has quit [Write error: connection closed]

06:29 Salamancalasa[m] has quit [Remote host closed the connection]

06:29 colinmarc has quit [Write error: connection closed]

06:29 heftig has quit [Remote host closed the connection]

06:29 Ian[m] has quit [Read error: Connection reset by peer]

06:29 Hantz[m] has quit [Remote host closed the connection]

06:29 hellfire7734club[m] has quit [Remote host closed the connection]

06:29 gallo[m] has quit [Remote host closed the connection]

06:29 ich2022[m] has quit [Remote host closed the connection]

06:29 Berenguer1931[m] has quit [Write error: connection closed]

06:29 Auyer[m] has quit [Read error: Connection reset by peer]

06:29 valentine has quit [Remote host closed the connection]

06:29 bilboed has quit [Write error: connection closed]

06:31 <bentiss> eric_engestrom: looking into it

06:34 <bentiss> eric_engestrom: I see a storm of libcamera jobs

06:35 <bentiss> WTF???? https://gitlab.freedesktop.org/camera/libcamera/-/pipelines?page=7&scope=all -> seems like the libcamera folks enabled a patchwork bridge and requested CI for all of them

06:37 <bentiss> disabled shared runners on this project

06:38 <bentiss> pinchartl: ^^ FYI, I'm raising an issue

06:40 <bentiss> damn, issues are not enabled on this project

06:40 <bentiss> pinchartl: FWIW, I can see that you've got 32 pages of pipelines that need to be run -> https://gitlab.freedesktop.org/camera/libcamera/-/pipelines?page=33&scope=all this is not fair use of our resources

06:44 <bentiss> eric_engestrom: so the issue is that, because of the libcamera storm, the runner wasn't able to pick/request a single priority:low job and got marked as offline. Once the current jobs terminate, they'll be able to recontact gitlab and should be marked as ready

06:44 <bentiss> but first, we need to clear the queue :(

06:47 <bentiss> pinchartl: one way I would accept that is if your patchwork jobs were tagged with priority:low, so it's not impacting the rest of the instance. Please forward the info to Kieran

06:49 <bentiss> yay, found where I could put the issue :)

06:52 <eric_engestrom> bentiss: ack, thanks for investigating!

06:53 <eric_engestrom> I guess with that many pipelines in the libcamera queue, letting them run is not an option, so they need to all be cancelled?

06:54 <bentiss> yeah, but I'm not doing that for them

06:56 <svuorela> bentiss: if you can give me a simple list of things to do and the rights to do it, I can do the cancelling.

06:56 <bentiss> svuorela: I don't see how it would be faster for me to look into the docs than you TBH

06:57 <svuorela> bentiss: oh. I thought it was because it needed 33 pages of clicky clicky you didn't feel like doing it.

06:58 <bentiss> svuorela: that, but I think there should be either a graphql request that can be done or an API call which would definitely be faster than clicky clicky on all 495 pipelines still pending

06:59 <bentiss> (but if you feel like spending your day on that, feel free to click on all of them one by one :-P )
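
The "API call" route mentioned above could look roughly like the following sketch. It relies on the GitLab REST endpoints for listing pipelines (`GET /projects/:id/pipelines?status=pending`) and cancelling one (`POST /projects/:id/pipelines/:pipeline_id/cancel`). The function name `cancel_pending_pipelines` and the variables `GITLAB_URL`, `PROJECT` and `GITLAB_TOKEN` are placeholders invented for illustration, and the `grep`-based id extraction is a shortcut that assumes the listing contains no nested `"id"` keys. It handles one page of up to 100 pipelines per call and needs a token with sufficient rights on the project.

```shell
#!/usr/bin/env bash
# Sketch only: cancel one page of pending pipelines through the GitLab REST API.
# GITLAB_URL, PROJECT (numeric id or URL-encoded path) and GITLAB_TOKEN are
# placeholders that the caller must set; none of them come from the log.
cancel_pending_pipelines() {
    local base="$GITLAB_URL/api/v4/projects/$PROJECT/pipelines"
    local json ids id count=0

    # List up to 100 pending pipelines for the project.
    json=$(curl --silent --fail --header "PRIVATE-TOKEN: $GITLAB_TOKEN" \
        "$base?status=pending&per_page=100") || return 1

    # Crude extraction of every "id":<number> field; assumes pipeline objects
    # carry no nested "id" key. A real JSON parser would be safer.
    ids=$(printf '%s\n' "$json" | grep -o '"id":[0-9][0-9]*' | cut -d: -f2)

    for id in $ids; do
        curl --silent --fail --request POST \
            --header "PRIVATE-TOKEN: $GITLAB_TOKEN" \
            "$base/$id/cancel" >/dev/null || return 1
        count=$((count + 1))
    done

    # Report how many cancel requests were sent.
    echo "$count"
}
```

Calling it repeatedly until it prints `0` would walk through the whole backlog, since cancelled pipelines no longer match `status=pending`.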

06:59 <svuorela> project = Project.find_by_full_path('<project_path>')

06:59 <svuorela> Ci::Pipeline.where(project_id: project.id).where(status: 'pending').each {|p| p.cancel if p.stuck?}

06:59 <bentiss> svuorela: do you have gitlab rails console access?

07:00 <svuorela> but just remove the "if p.stuck?"

07:00 <svuorela> I don't

07:00 <svuorela> I'm just a normal user

07:00 <bentiss> svuorela: are you even part of libcamera?

07:00 <svuorela> I'm not

07:01 <svuorela> I was just volunteering to get shared CI resources unstuck

07:01 <bentiss> svuorela: then I really appreciate the effort, but I think someone needs to clean his/her own mess

07:01 <svuorela> I'm just stuck waiting behind it

07:01 <bentiss> svuorela: they are, I disabled shared runners on this project, meaning that the normal queue restarted. But we are 8 hours behind, so it can take a while before we get back to normal

07:02 <svuorela> aha.

07:02 <svuorela> great.

07:03 <svuorela> I will just sit back and wait then.

07:03 <bentiss> for now, there are a lot of libinput jobs taking off, and that's expected because whot worked over the night (his day in .au)

07:03 <bentiss> svuorela: sounds like a plan

07:03 <svuorela> (and queue more jobs up ...)

07:03 <bentiss> heh

07:07 andy-turner has joined #freedesktop

07:08 karolherbst has quit [Quit: Ping timeout (120 seconds)]

07:08 sentry has quit [Quit: left OFTC]

07:08 karolherbst has joined #freedesktop

07:11 sentry has joined #freedesktop

07:12 tonitch has quit []

07:13 tonitch has joined #freedesktop

07:25 mvlad has joined #freedesktop

07:39 mripard_ has joined #freedesktop

07:40 mripard has quit [Ping timeout: 480 seconds]

08:02 <pinchartl> bentiss: I don't know what happened, I'm not aware of any change on our side. I'll figure it out

08:02 <pinchartl> sorry for the impact :-(

08:02 dcbaker has joined #freedesktop

08:03 <bentiss> pinchartl: that's fine. Just don't re-enable the shared runners without cleaning up the current pipelines

08:03 <pinchartl> absolutely

08:03 <bentiss> FWIW, backlog seems to be 1h and 30 minutes of waiting now

08:04 <pinchartl> I'm talking with Kieran at the moment

08:04 <bentiss> good :)

08:04 <pinchartl> we'll make sure not only to clean up, but to make sure it will never happen again

08:05 <bentiss> looks like at least there were some timeouts, and the jobs are now all marked as failed

08:05 <bentiss> someone got a shitload of failed pipeline gitlab emails :)

08:06 <whot> is that an imperial or metric shitload?

08:07 <bentiss> :)

08:08 <bentiss> whot: that reminds me that I should really get libinput to make marge pipelines pick up priority:high tags instead of none

08:09 <kbingham> ayeeee

08:10 <kbingham> bentiss, sorry - that script has been running for a long time ... not sure what broke ... it's only supposed to send 'new' patches ... not the entire patchwork history :-(

08:12 <pinchartl> kbingham: let's figure out what happened first, and then put measures in place to make sure it won't happen again

08:12 <bentiss> kbingham: looks like the cache got reset -> patchwork/5182 was triggered 8 hours ago when it already ran on page 50 https://gitlab.freedesktop.org/camera/libcamera/-/pipelines?page=50&scope=all

08:12 bochecha[m] has joined #freedesktop

08:13 <bentiss> pinchartl, kbingham: one way of preventing the DDoS is to ensure your patchwork jobs are making use of the tag `priority:low`, this way, they'll get picked up when there is room

08:14 <bentiss> but given that we now somehow vet every user, it would be nice if the patchwork bridge had some guards to not run pipelines from random users as well

08:15 <bentiss> FWIW, I think the queue is now cleared \o/

08:16 <pinchartl> bentiss: I'd prefer having to trigger the patchwork integration manually really

08:16 <bentiss> pinchartl: if you have a process where a developer manually triggers the pipeline, that's even better :)

08:17 bochecha[m] is now known as MathieuBridon[m]

08:17 <bentiss> could be automated if one of the maintainers gives a rev-by or something like that that patchwork recognizes

08:19 <kbingham> can confirm I have a metric shitload of failure emails .. :S

08:19 <bentiss> pinchartl, kbingham: I was a little bit annoyed a couple of hours ago, but don't take this personally, we all make mistakes, and the situation is now back under control. So take your time, and if you need shared runners, you can probably re-enable them, or I can spin up dedicated VMs for you in the meantime

08:20 <kbingham> bentiss, What I really need to do here (as well, instead?) is set up a gitlab runner so libcamera build resources are done here without consuming compute. I have a build pc here - just don't know how to link it in yet.

08:20 <kbingham> Manual triggering is what I started with and it was a real pain.

08:21 <pinchartl> kbingham: I'm sure bentiss would love if we brought our shared runners :-)

08:21 <pinchartl> s/shared/own/

08:21 MrCooper_ is now known as MrCooper

08:21 <kbingham> 2 or 3 times a day I just ran the script ... and after a while it was pointless me pretending to be 'cron' - so I set up a systemd timer on that about 2 months ago I think.

08:21 <kbingham> So it's been fine until *some recent event* ...

08:22 <bentiss> kbingham: unless I messed up my time zones, this seems to have happened at midnight UTC, so some cron job, reboot or something happened

08:24 <bentiss> for shared runners, you can easily add some to your group (usually you just install gitlab-runner or run it under a podman/docker container). If you want you can also add one to the entire instance, but then you'll need an admin to give you a token

08:25 <pinchartl> bentiss: is there an easy way to route pipelines to specific runners based on their priority (or another option we can set when pushing) ?

08:26 <bentiss> kbingham: and for manually running the script, I completely understand. But you probably need your script to check if the branch already exists before pushing, that should prevent the DDoS

08:27 <bentiss> pinchartl: add the tag `priority:low` or `priority:low-aarch64` (there are variants for kvm as well)

08:27 <kbingham> bentiss, The problem is the branches /didn't/ exist - so it was successfully supplying everything that had never been run.

08:27 <kbingham> https://paste.debian.net/1375652/ (for the curious about my script)

08:27 <kbingham> git -C libcamera/ branch -a | \

08:27 <kbingham> grep remotes/gl.fdo/patchwork | \

08:27 <kbingham> sed 's#.*gl.fdo/patchwork/##g' | sort -hr \

08:27 <kbingham> | uniq | head -n1

08:28 <pinchartl> bentiss: I meant configuring the pipeline to route jobs to our runners for low-priority jobs and to the shared runners for normal jobs for instance (so we can start testing our own runners without blocking everything)

08:28 <kbingham> That was supposed to identify the 'newest' series already tested and *only* test series newer than that ... but something broke - and it seemed to go back to the beginning :(

08:30 <bentiss> pinchartl: if you register your runners with a specific tag like `patchwork` (and probably allow them to run untagged jobs as well), every patchwork job will only be run on your runners, while normal ones will either be picked by the shared ones or your own if they have capacity

08:39 <bentiss> kbingham: something was really off: https://gitlab.freedesktop.org/camera/libcamera/-/commit/0069b9ceb1e03d5887ac614e35d79602b003ff27/pipelines shows 432 pipelines created for that single commit

08:39 <kbingham> 2+ hours of log entries of the script attempting to apply patches from patchwork - create a branch and push it to gitlab :-(

08:40 <pinchartl> kbingham: you had nothing planned for today, right ? :-)

08:41 <kbingham> bentiss, Those will be all the patches that were already merged - so I can 'improve' the script so it detects if there was nothing applied ...

08:41 <kbingham> but yes - this was awful :_( - testing on prod is never great ...

08:41 <bentiss> kbingham: yeah I finally figured it out as well, but I guess that's not on your side to fix this, more on the gitlab side (upstream gitlab)

08:42 <kbingham> The script has already run 2 more times 'successfully' since 'the event' without regenerating - so the issue was a glitch ... but I'll still disable the job for now.

08:44 <bentiss> kbingham: I have some doubts on the last_tested() function, the fact that it strips the output with `head -n 1` means that if there is a weird branch appearing, then you probably lose the current index

08:45 <kbingham> bentiss, indeed.
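
The concern about `head -n1` can be illustrated with a minimal sketch, assuming branch names of the form `remotes/gl.fdo/patchwork/<number>` as in the pipeline quoted earlier. The function name `last_tested_strict` is hypothetical and not taken from kbingham's script. It reads `git branch -a` style output on stdin, keeps only purely numeric suffixes, and fails loudly when none is found instead of handing back an empty or non-numeric value.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the highest numeric patchwork series id found
# in `git branch -a` style output read from stdin.
# Assumes branches look like "remotes/gl.fdo/patchwork/<number>".
last_tested_strict() {
    local newest
    # Keep only lines whose suffix is all digits, then take the numeric maximum.
    newest=$(sed -n 's#^[* ]*remotes/gl\.fdo/patchwork/\([0-9][0-9]*\)$#\1#p' \
        | sort -n | tail -n 1)
    if [ -z "$newest" ]; then
        echo "last_tested_strict: no numeric patchwork branch found" >&2
        return 1
    fi
    printf '%s\n' "$newest"
}
```

A caller would use it as `git -C libcamera/ branch -a | last_tested_strict` and abort the run when it returns non-zero.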

08:47 <bentiss> also, is there any reason to `git-libcamera branch -D $BRANCH`? if the remote keeps the various patchwork/* branches, keeping them locally wouldn't add more space, and so you can then detect that the patch doesn't apply locally

08:47 <kbingham> Found when it happened in the logs but not a lot of insight ... ;-( https://paste.debian.net/1375656/ ..

08:47 <kbingham> https://paste.debian.net/1375657/ (without clipping)

08:48 <kbingham> I have to go to the dentist ... so I'll resume this investigation after ...

08:49 <kbingham> meanwhile: `systemctl --user stop libcamera-ci.timer`

08:49 <bentiss> heh, thanks.

08:50 * kbingham shudders : https://i.imgur.com/BNZsZfD.png ... I've got ... cleanup to do ...

08:52 swatish2 has quit [Ping timeout: 480 seconds]

09:11 <bentiss> kbingham: [when you get back from dentist] the problem also is that you do a `git-libcamera push gl.fdo -f $BRANCH` in your script, meaning that you do not trust the integrity of gitlab. I would remove the `-f`, and add a `|| true` (or put an error), which means that you'll ensure your script never messes with upstream branches

09:12 <bentiss> for the rare cases you need to force push, you can manually remove the branches on gitlab and on git.libcamera.org IMO
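
The two suggestions (check whether the branch already exists before pushing, and never force-push) might be combined as in this sketch. `push_new_branch` is a hypothetical name, plain `git` stands in for the `git-libcamera` wrapper, and the function is assumed to run inside the work tree. It relies on `git ls-remote --exit-code`, which exits with status 2 when no matching ref exists on the remote.

```shell
#!/usr/bin/env bash
# Hypothetical helper: push a branch only if the remote does not have it yet,
# and never with -f, so existing remote branches are left untouched.
push_new_branch() {
    local remote="$1" branch="$2" rc=0
    git ls-remote --exit-code "$remote" "refs/heads/$branch" >/dev/null 2>&1 || rc=$?
    case "$rc" in
        0) echo "skip: $branch already exists on $remote" >&2; return 0 ;;
        2) git push "$remote" "$branch" ;;   # plain push, no force
        *) echo "cannot query $remote (exit $rc)" >&2; return 1 ;;
    esac
}
```

With this shape, a rerun over an old range only produces "skip" messages rather than new pushes and pipelines.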

09:29 tomeu has joined #freedesktop

09:30 alatiera[m] has joined #freedesktop

09:30 bilboed has joined #freedesktop

09:31 swatish2 has joined #freedesktop

09:41 <kbingham> So ... the storm (aside from the ultimate blame being my script i.e. 'me' not validating parameters sufficiently) was that my dns broke - the script launched ... started running "May 19 21:28:40 Monstersaurus regular.sh[1095810]: Testing between 5184 and" (note the unchecked / failed target number) and then proceeded to run seq 0...5184 instead of seq 5184...5184 ;_( ... now while the dns was broken the script was happily churning

09:41 <kbingham> through and /failing/ to do any work - but at some point an hour later I fixed the DNS ... which then opened the flood gates and the background job started actually pushing jobs ...
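
The missing parameter validation described above could be guarded against along these lines. This is a sketch under stated assumptions: `validated_range` and the cap `MAX_SERIES_PER_RUN` are hypothetical and do not appear in the original script. The function refuses empty or non-numeric bounds, inverted ranges, and ranges larger than the cap, and only then prints the series numbers to process.

```shell
#!/usr/bin/env bash
# Hypothetical guard: validate a series range before iterating over it.
# MAX_SERIES_PER_RUN is an assumed safety cap, not part of the original script.
MAX_SERIES_PER_RUN=${MAX_SERIES_PER_RUN:-20}

validated_range() {
    local first="${1-}" last="${2-}"
    # Reject empty values, anything with a non-digit, and leading zeros.
    case "$first" in ''|*[!0-9]*|0?*) echo "invalid start: '$first'" >&2; return 1 ;; esac
    case "$last"  in ''|*[!0-9]*|0?*) echo "invalid end: '$last'" >&2; return 1 ;; esac
    if [ "$first" -gt "$last" ]; then
        echo "start $first is after end $last" >&2
        return 1
    fi
    if [ $((last - first + 1)) -gt "$MAX_SERIES_PER_RUN" ]; then
        echo "refusing to process more than $MAX_SERIES_PER_RUN series" >&2
        return 1
    fi
    seq "$first" "$last"
}
```

With the default cap, `validated_range 5184 ""` and `validated_range 0 5184` both fail before any patch is applied or any branch is pushed.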

09:53 olivial has quit [Read error: Connection reset by peer]

09:53 olivial has joined #freedesktop

10:06 <emersion> i'm trying to rename a project, but i'm running into this weird error:

10:06 <emersion> > Cannot rename project, the container registry path rename validation failed: Not Found

10:26 alarumbe has joined #freedesktop

11:00 swatish2 has quit [Ping timeout: 480 seconds]

11:03 swatish2 has joined #freedesktop

11:37 guludo has joined #freedesktop

11:43 swatish2 has quit [Remote host closed the connection]

12:02 snetry has joined #freedesktop

12:06 sentry has quit [Ping timeout: 480 seconds]

12:08 AbleBacon has quit [Remote host closed the connection]

12:19 swatish2 has joined #freedesktop

12:20 c137 has joined #freedesktop

12:55 andy-turner has quit []

12:58 swatish2 has quit [Ping timeout: 480 seconds]

13:14 ximion has joined #freedesktop

13:35 enyc has joined #freedesktop

13:46 sima has quit [Remote host closed the connection]

14:03 <bentiss> FWIW, gitaly-2 is running out of space. That means pushes to drm repos are not working properly :(

14:22 haaninjo has joined #freedesktop

14:25 swatish2 has joined #freedesktop

14:30 balrog has quit []

14:34 balrog has joined #freedesktop

14:37 dianders has joined #freedesktop

14:40 <bentiss> I've downloaded more disk for gitaly-2 (sorry, I had to do this joke), and we are back in business. Though we should probably have someone check on the usage of gitaly-2 and split the data into the other gitaly pods, or create a new pod, or consider that we just need to pay for more storage for this one

14:43 swatish2 has quit [Ping timeout: 480 seconds]

15:22 swatish2 has joined #freedesktop

15:26 daniels has quit [Read error: Network is unreachable]

15:26 jnoorman has quit [Remote host closed the connection]

15:26 dianders has quit [Remote host closed the connection]

15:26 jnoorman has joined #freedesktop

15:26 zmike has quit [Remote host closed the connection]

15:27 daniels has joined #freedesktop

15:27 dianders has joined #freedesktop

15:27 zmike has joined #freedesktop

15:27 jsa1 has quit [Ping timeout: 480 seconds]

15:31 c137 has quit [Ping timeout: 480 seconds]

15:34 cascardo has quit [Remote host closed the connection]

15:35 sentry has joined #freedesktop

15:36 ids1024 has joined #freedesktop

15:36 cascardo has joined #freedesktop

15:39 snetry has quit [Ping timeout: 480 seconds]

15:41 swatish2 has quit [Ping timeout: 480 seconds]

15:47 c137 has joined #freedesktop

15:55 sima has joined #freedesktop

15:56 sima is now known as Guest16401

15:56 sima has joined #freedesktop

16:01 jsa1 has joined #freedesktop

16:15 tzimmermann has quit [Quit: Leaving]

16:27 airlied_ has joined #freedesktop

16:29 airlied has quit [Ping timeout: 480 seconds]

17:05 Traneptora has joined #freedesktop

17:13 c137 has quit [Remote host closed the connection]

17:29 AbleBacon has joined #freedesktop

17:41 soreau has quit [Ping timeout: 480 seconds]

17:54 jsa1 has quit [Ping timeout: 480 seconds]

18:08 soreau has joined #freedesktop

18:11 FAQ_ has joined #freedesktop

18:15 alanc has quit [Remote host closed the connection]

18:15 alanc has joined #freedesktop

18:22 <zmike> can someone help me understand these errors? https://gitlab.freedesktop.org/mesa/mesa/-/jobs/76751030

18:22 <zmike> they don't seem to correspond to anything in the cited file

18:22 <zmike> oh nvm wrong branch

18:24 f_ is now known as Guest16412

19:04 olivial has quit [Read error: Connection reset by peer]

19:04 olivial has joined #freedesktop

19:22 misyl_ has quit [Remote host closed the connection]

19:23 mupuf has quit [Read error: Connection reset by peer]

19:23 mupuf has joined #freedesktop

19:23 agomez has joined #freedesktop

19:27 dj-death_ has joined #freedesktop

19:27 dj-death has quit [Read error: Connection reset by peer]

19:27 tanty has quit [Remote host closed the connection]

19:31 airlied_ is now known as airlied

19:38 misyl has joined #freedesktop

19:45 FAQ_ has quit []

19:57 snetry has joined #freedesktop

19:57 mvlad has quit [Remote host closed the connection]

20:02 sentry has quit [Ping timeout: 480 seconds]

20:15 agomez is now known as tanty

20:56 jsa1 has joined #freedesktop

21:03 dj-death_ has left #freedesktop [#freedesktop]

21:21 haaninjo has quit [Quit: Ex-Chat]

21:31 sima has quit [Ping timeout: 480 seconds]

21:31 Guest16401 has quit [Ping timeout: 480 seconds]

21:40 sentry has joined #freedesktop

21:43 snetry has quit [Ping timeout: 480 seconds]

22:04 guludo has quit [Quit: WeeChat 4.6.3]

22:04 jsa1 has quit [Ping timeout: 480 seconds]

23:36 <alanc> another day we're glad to control our git hosting: https://pivot-to-ai.com/2025/05/20/github-wants-to-spam-open-source-projects-with-ai-slop/

23:38 Consolatis_ has joined #freedesktop

23:38 Consolatis is now known as Guest16438

23:38 Consolatis_ is now known as Consolatis

23:41 <whot> we all know that saving time for the bug reporters is the most important thing for any open source project...

23:42 Guest16438 has quit [Ping timeout: 480 seconds]