<Anarchos>
I have this error when launching qemu : Signature in rsrc doesn't match constructor arg. (application/x-vnd.mmlr.QEMU, application/x-vnd.qemu-system-x86_64)
erysdren has quit [Quit: Konversation terminated!]
HaikuUser has joined #haiku
HaikuUser is now known as nephele_mobile
<nephele_mobile>
waddlesplash: I wanted to ask about _kern_send, is it possible that that syscall doesn't return? I have some problem with Renga where it (erroneously) sends an XMPP ping in the UI thread, and this will eventually hang the UI thread forever when the network is intermittently disconnected
<waddlesplash>
send() usually blocks, yeah
<nephele_mobile>
(this should of course not be in the UI thread, but if i move it as it is now the hang will just be somewhere else, so want to figure out why)
<waddlesplash>
if you don't want it to block you need to set O_NONBLOCK
<waddlesplash>
it will then return EAGAIN/B_WOULD_BLOCK to inform you that nothing can be sent
<waddlesplash>
and you will have to retry later
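A minimal sketch of the non-blocking pattern described above, assuming a plain POSIX socket descriptor. The helper names `set_nonblocking` and `try_send_once` are hypothetical and not taken from Renga or Haiku; the caller is expected to retry later (for example after `poll()` reports the socket writable) whenever the second helper returns 0.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Switch a descriptor to non-blocking mode. Returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL, 0);
	if (flags == -1)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: attempt one send() on a non-blocking socket.
 * Returns 1 and stores the byte count in *sent if data was queued,
 * 0 if the call would have blocked (nothing sent, retry later),
 * -1 on any other error (errno is left as set by send()). */
static int try_send_once(int fd, const void *buf, size_t len, size_t *sent)
{
	ssize_t n = send(fd, buf, len, 0);
	if (n >= 0) {
		*sent = (size_t)n;
		return 1;
	}
	if (errno == EAGAIN || errno == EWOULDBLOCK)
		return 0;
	return -1;
}
```

With this shape the UI thread never sits inside send(); a full send queue shows up as a 0 return instead of a hang.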
<nephele_mobile>
how long is it supposed to block?
<waddlesplash>
until it manages to send
<waddlesplash>
this is all standard POSIX stuff
<nephele_mobile>
Okay. But it never manages to, it seems
<nephele_mobile>
That is, at 3ish my NAT IP address changes, and afterwards Renga will be stuck forever in that syscall
<waddlesplash>
"If space is not available at the sending socket to hold the message to be transmitted, and the socket file descriptor does not have O_NONBLOCK set, send() shall block until space is available or a timeout occurs (see SO_SNDTIMEO in 2.10.16 Use of Options)."
<waddlesplash>
I think SO_SNDTIMEO defaults to "no timeout"
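For reference, a short sketch of how a send timeout could be set on a blocking socket with SO_SNDTIMEO. The helper name `set_send_timeout` is hypothetical; once the timeout is set, a blocked send() is expected to fail with EAGAIN/EWOULDBLOCK when it expires instead of waiting indefinitely.

```c
#include <sys/socket.h>
#include <sys/time.h>

/* Hypothetical helper: limit how long a blocking send() may wait.
 * Returns 0 on success, -1 on error (errno set by setsockopt()). */
static int set_send_timeout(int fd, long seconds)
{
	struct timeval tv;
	tv.tv_sec = seconds;
	tv.tv_usec = 0;
	return setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv));
}
```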
<waddlesplash>
if this is a TCP socket then it will only block if the send queue is full
<waddlesplash>
if a UDP socket it should never block, or maybe block only if the hardware send queue is full
<waddlesplash>
but the latter basically never happens
<nephele_mobile>
that seems very weird tbh. The manpage has many error codes for the network being down, why don't we return those?
<nephele_mobile>
i.e. why do you have to wait for a timeout if you know the message cannot be delivered?
<waddlesplash>
we do, if it's actually down
<waddlesplash>
nephele_mobile: if this is a TCP socket, and the TCP send queue is full, and the network is alive meaning we are sending packets, but we aren't getting back any ACKs from the other end, then this state will continue indefinitely unless some timeout is set somewhere
<nephele_mobile>
okay, why does this not work for this case of my NAT getting a new IP?
<waddlesplash>
if the network is actually disconnected then we will get errors
<waddlesplash>
good question, I don't know
<waddlesplash>
presumably if you got a new IP, your existing TCP connections should get reset
<nephele_mobile>
disconnected from where? my LAN IP address doesn't change, the gateway does not manage DHCP
<waddlesplash>
if the remote end of the connection sees a different IP from you, then when you try to send traffic on the existing connection, the server won't recognize it
<waddlesplash>
and it is supposed to send an RST (reset) in that case
<waddlesplash>
and then the local end will see that, and the TCP socket will switch to disconnected state
<waddlesplash>
and then the send() returns with errors
<nephele_mobile>
okay, that sounds like it should trigger for this case
<waddlesplash>
so, the first thing to check here is to grab a packet capture and see if the RST is coming through
<waddlesplash>
if it is, then we indeed have a Haiku bug
<waddlesplash>
if it's not, then this is all behaving as expected
<nephele_mobile>
how should the remote server see this though? it sounds like a race condition to me
<waddlesplash>
see what?
<nephele_mobile>
if send blocks, the router resets it
<waddlesplash>
the remote server will get packets from you with a different IP address
<nephele_mobile>
its NAT IP address
<waddlesplash>
you mean your IP address on LAN?
<waddlesplash>
or your IP address "to the world"?
<nephele_mobile>
The IP address on the NAT, i.e. "to the world", changes
<waddlesplash>
right, so the remote server sees a different IP
<waddlesplash>
and it doesn't know about any connections on that IP
<waddlesplash>
so the default behavior when seeing incoming TCP packets on unrecognized IPs, is to send RST
<nephele_mobile>
Hmm, okay. I see my confusion, this is only true for TCP?
<waddlesplash>
yes, other protocols have to deal with this separately
<nephele_mobile>
ah okay
<waddlesplash>
this is in fact the major reason "TCP multi-homing" is a protocol extension (and not widely used)
<waddlesplash>
because if the packets came from some other IP, what connection do they really belong to?
<waddlesplash>
TCP entirely manages state around a unique IP+port remote address
<waddlesplash>
(and a unique IP+port local one too)
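A toy model of why a changed public IP leads to a reset, written from the server's point of view. This is only an illustration of matching on the address/port 4-tuple, not how Haiku's or Linux's TCP stack is actually structured; `struct conn_key` and `find_connection` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* The 4-tuple that identifies one TCP connection. */
struct conn_key {
	uint32_t local_ip;
	uint32_t remote_ip;
	uint16_t local_port;
	uint16_t remote_port;
};

static int same_key(const struct conn_key *a, const struct conn_key *b)
{
	return a->local_ip == b->local_ip && a->remote_ip == b->remote_ip
		&& a->local_port == b->local_port && a->remote_port == b->remote_port;
}

/* Returns the index of the matching connection, or -1 if none matches.
 * In this model, -1 is the case where the receiver would answer with RST. */
static int find_connection(const struct conn_key *table, size_t count,
	const struct conn_key *key)
{
	size_t i;
	for (i = 0; i < count; i++) {
		if (same_key(&table[i], key))
			return (int)i;
	}
	return -1;
}
```

A segment arriving from the client's new public address carries a different `remote_ip`, so the lookup fails even though every other field is unchanged.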
<nephele_mobile>
Okay so to debug this, use some "easy" tcp client using send that just sends stuff in a loop, capture that (wireshark?) and reset the router, and see what the server does
<waddlesplash>
yes
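A rough sketch of such a test client, assuming a server that accepts arbitrary TCP data. `connect_tcp` and `send_loop` are hypothetical helper names. MSG_NOSIGNAL is used so that a failed send() is reported through errno instead of raising SIGPIPE; whether the failure appears at all after the NAT address changes is exactly what the packet capture is meant to show.

```c
#include <errno.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Connect to host:port over TCP. Returns a socket descriptor, or -1. */
static int connect_tcp(const char *host, const char *port)
{
	struct addrinfo hints, *res, *ai;
	int fd = -1;

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;
	hints.ai_socktype = SOCK_STREAM;
	if (getaddrinfo(host, port, &hints, &res) != 0)
		return -1;

	for (ai = res; ai != NULL; ai = ai->ai_next) {
		fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
		if (fd == -1)
			continue;
		if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
			break;
		close(fd);
		fd = -1;
	}
	freeaddrinfo(res);
	return fd;
}

/* Send up to `count` small messages, pausing between them.
 * Stops at the first failed send(); returns how many sends succeeded. */
static int send_loop(int fd, int count, unsigned delay_seconds)
{
	char msg[32];
	int i;

	for (i = 0; i < count; i++) {
		int len = snprintf(msg, sizeof(msg), "ping %d\n", i);
		if (send(fd, msg, (size_t)len, MSG_NOSIGNAL) == -1) {
			fprintf(stderr, "send #%d failed: %s\n", i, strerror(errno));
			break;
		}
		if (delay_seconds > 0)
			sleep(delay_seconds);
	}
	return i;
}
```

Running something like `send_loop(connect_tcp(host, port), 3600, 1)` while capturing traffic and then resetting the router should show either an error return (the RST arrived and was processed) or a call that never comes back.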
<waddlesplash>
if you don't get the RST on Haiku in the packet capture, double-check with Linux of course
<waddlesplash>
there are some rare conditions where a packet could get dropped/ignored before it winds up in the packet capture
<waddlesplash>
I don't really suspect those though, most likely either the RST doesn't get sent, or it does but our TCP implementation fails to process it correctly somehow
<nephele_mobile>
Well, i'd expect this to be "properly" implemented on the remote side, in my case the server is disroot, and i think they use linux
tuaris has joined #haiku
<nephele_mobile>
though, in the same vein, Vision sees that the connection dropped (no self ping anymore) but doesn't do anything about it, though i'm not sure if it is just coded to not do anything in that situation
<waddlesplash>
nephele_mobile: here's where we handle TCP reset:
<Habbie>
it currently feels.. not entirely true :D
<Habbie>
but also you're right
<Habbie>
12:14:18* @waddlesplash | I certainly haven't learned to debug DOS bootloaders lol
<Habbie>
s/DOS/BIOS/ i guess
<Habbie>
DOS is not currently involved
<waddlesplash>
ah, OK
<waddlesplash>
well, I have debugged our BIOS bootloader just a bit
<waddlesplash>
not very much
<waddlesplash>
anyway, as long as my assumptions hold true, today I get to do a "break kernel ABI, adjust all drivers, and implement 2 new features all at once" commit :D
<Habbie>
(this distinction is not important, although i have an open question for myself: why does stage2 cause 'invalid opcode' when trying to go to protected(?) mode when started from freedos)
<Habbie>
waddlesplash, woop
<waddlesplash>
I'd like to break some of the change up but there's really not a good way to do that
<waddlesplash>
hm
<waddlesplash>
actually I might be able to by disabling functionality that isn't used at the moment
<Habbie>
hmm. my hang (intentionally) is now at the -end- of platform_video_init
<waddlesplash>
then it would just be "break kernel ABI" and another commit to "use new API in all (FS) drivers"
jmairboeck has joined #haiku
<Habbie>
oh, the ABI you might be breaking is "just" for FS drivers?
<waddlesplash>
yes
Guest14842 has left #haiku [#haiku]
<Habbie>
and only on 32 bit x86
<waddlesplash>
I'm refactoring some obscure features of queries
_justin_kelly16668682448544652 has quit [Ping timeout: 480 seconds]
<waddlesplash>
so that Tracker can use them, and quit needing tens of thousands of node monitors
Guest14842 has joined #haiku
<Habbie>
is there a reason the project holds on to that 32 bit abi? i haven't kept track of what original binary only BeOS software people might still be trying to run
<Habbie>
ah yes, i spotted an earlier conversation about that :)
<waddlesplash>
well, there is a significant amount of code we inherited from BeOS era that we can recompile, and do
<waddlesplash>
and this API break could break that, it's not a pure ABI break
<waddlesplash>
if it was just kernel ABI I wouldn't care
<Habbie>
ack
<waddlesplash>
we maintain some stability there, but only between/within releases
<Habbie>
waddlesplash, remember the slow logging you gave me
<Habbie>
it is ON MY CHROMEBOOK SCREEN
<Habbie>
it also has filled the screen now and is not scrolling ;)
<Habbie>
- set_text_mode();
<Habbie>
+ write_char('I');
<Habbie>
+ // set_text_mode();
<Habbie>
this does a -lot-
<waddlesplash>
:DDD
<Habbie>
(not the write_char. that somehow does nothing)
<waddlesplash>
what was the fix?
<Habbie>
that
<Habbie>
if you're wondering 'why is that an I', I already printed H and A earlier ;)
<Habbie>
LILO style
<waddlesplash>
oh, just commenting out set_text_mode?
<Habbie>
yes
<zelectric>
pardon me, but I see "vision" quit messages and found a page for the vision irc client for haiku, and at the bottom left of the page there's a photo of someone with their tongue out.. it appears it hasn't been updated in a while, is that official or...?
<Habbie>
the same int10 set text mode call runs fine from freedos, so i don't yet know why
<Habbie>
so uhm. after a bunch of logging. what should be happening?
<waddlesplash>
are you SURE it's the same INT 10?
<Habbie>
define 'same'
<waddlesplash>
well
<waddlesplash>
it looks like we don't clear the registers
<waddlesplash>
besides eax
<waddlesplash>
so they'll have stack garbage
<waddlesplash>
zelectric: ... what page is that?
<zelectric>
waddlesplash, one minute...
<Habbie>
hmm. i did not explicitly clear anything when calling from freedos either, but debug.com might be a cleaner environment
<Habbie>
oh fuck me i can use the arrow keys in the boot loader thing
<zelectric>
waddlesplash, thanks :0
<Habbie>
'load kernel'
<Habbie>
ucode_load
<Habbie>
4 icons
<Habbie>
KDL
Xe has joined #haiku
<Habbie>
this is -awesome-
<waddlesplash>
NICE
<zelectric>
I can't find the page ATM but thanks for the official link
<Habbie>
'did not find any boot partitions'
<waddlesplash>
what are you booting from? USB?
<Habbie>
no, i did a dd skip=6 from anyboot.iso to gpt partition 19
<waddlesplash>
ah
<Habbie>
booting with set root=(hd0,gpt19); chainloader +1; boot, from GRUB
<waddlesplash>
if you can type at KDL prompt, just read the syslog
<waddlesplash>
"syslog" command at the prompt (it has a pager)
<Habbie>
i can type
Guest14842 has left #haiku [#haiku]
<waddlesplash>
should include the disk probing
<Habbie>
the pager is displaying a bit weirdly, but yes, i can read it
<waddlesplash>
Habbie: anyway this is far into the boot process and a completely different stage, we don't call the BIOS anymore except via x86emu at that point
<Habbie>
right
<waddlesplash>
and it can catch faults, so
<waddlesplash>
time to just figure out what we have to change and then commit a fix for the first bit :P
<Habbie>
i think i can see it finding 18 partitions, maybe 19
<Habbie>
but yeah
<waddlesplash>
if the Haiku partition is in there, then the kernel just failed to identify which one it is supposed to use
<Habbie>
this is clearly a point that says "this is going to work"
<waddlesplash>
yes :D
<Habbie>
waddlesplash, got any hints off the top of your head about what i might have missed there? a gpt partition type?
Guest14842 has joined #haiku
<waddlesplash>
how did you create the GPT partition?
<Habbie>
gparted i think. don't recall if i set a type
<waddlesplash>
then that's likely it
<Habbie>
cool
<Habbie>
going to just try that
mmu_man has quit [Ping timeout: 480 seconds]
<Habbie>
we are, indeed, booted far enough that the power button does not simply turn the device off
* Anarchos
is thinking of refactoring the code that handles RemoteDesktop's command-line options to use "getopt", since the current code does not work with default values.