jonas’lovetox, I don’t see any issue here, so unlikely: https://github.com/horazont/muchopper/issues
ЖокирDo any popular servers actually implement XEP-0368? If yes, could anyone point to any such servers?
jonas’Жокир, https://compliance.conversations.im/ any servers listed there as "compliant" should do, at least for c2s
jonas’I don’t know of any s2s implementation except maybe https://github.com/surevine/Metre , which isn’t quite a server.
GuusLooking at my server log, I'm noticing that I'm getting a lot of connection timeouts on s2s in bursts - presumably x minutes after a user that caused the federation to be set up sent its last presence update.
GuusI wonder if it'd be good to introduce a small factor of randomness to the timeout interval, to avoid these bursts.
flowGuus, maybe a controversial counter-question: that does sound like a timeout enforced on the application layer. if so, then why would you have an application layer timeout for s2s connections and not simply let the tcp connection time out
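[Editor's sketch of the jitter idea Guus describes. This is not taken from Openfire or Prosody; the function name `jittered_timeout` and the 10% default are assumptions made for illustration. Each connection gets its own idle timeout drawn from a small window around the configured value, so connections that went idle at the same moment do not all expire together.]

```python
import random


def jittered_timeout(base_seconds, jitter_fraction=0.1, rng=random):
    """Return base_seconds shifted randomly by up to +/- jitter_fraction.

    Hypothetical helper: jitter_fraction=0.1 means the result lies within
    10% of the configured timeout in either direction.
    """
    if not 0 <= jitter_fraction < 1:
        raise ValueError("jitter_fraction must be in [0, 1)")
    spread = base_seconds * jitter_fraction
    return base_seconds + rng.uniform(-spread, spread)


# Example: a nominal 15 minute idle timeout, computed once per connection.
timeout = jittered_timeout(15 * 60)
```

[With a 900 second base and 10% jitter, every connection times out somewhere between 810 and 990 seconds after its last activity.]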
jonas’flow, save resources.
jonas’the tcp connection will also not time out ever
flowis it worth it?
jonas’because both peers can see each other (in this scenario)
jonas’file descriptors are limited and when you notice you’re running out of them, it’s too late
jonas’being a bit proactive about preserving them is generally a good idea
flowok so kill idle connections based on the amount of available file descriptors, but not based on time
jonas’you don’t know the amount of available file descriptors
flow(or, to be precise, based on time only as a secondary criterion)
jonas’you know the limit, but you don’t know how many are open in your process
jonas’you can estimate, but you can be wrong in the bad direction.
jonas’(or in both directions, depending on how you estimate)
flowls /proc/$pid/fd/ | wc -l
jonas’that’s at least a rather expensive way to do it
jonas’but true, that works, on systems where procfs has that feature
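[Editor's sketch of the procfs approach flow suggests, done from inside the process itself. It assumes a system where /proc/self/fd exists (Linux, for example); the function name `fd_usage` is made up for illustration. The descriptor limit comes from `resource.getrlimit`, which is the part jonas’ says is known anyway.]

```python
import os
import resource


def fd_usage():
    """Return (open_fds, soft_limit) for the current process.

    open_fds is None on systems without /proc/self/fd.
    """
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    try:
        # Listing the directory temporarily opens one descriptor itself,
        # and that descriptor shows up in the listing, so subtract it.
        open_fds = len(os.listdir("/proc/self/fd")) - 1
    except OSError:
        open_fds = None
    return open_fds, soft_limit


used, limit = fd_usage()
print(f"open descriptors: {used}, soft limit: {limit}")
```

[This still costs several syscalls per check (open, one or more directory reads, close), which is the expense jonas’ points at, so a server would more likely run it periodically than on every new connection.]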
flowwhy is it expensive?
Guusflow I don't mind much either way - I'm just noticing that I get a lot of disconnects. I'm thinking Prosody does this? Openfire probably does so as well, though.
jonas’flow, that’s many syscalls
Guus(at the very least, it's configurable)
jonas’I can’t see immediately in man 5 procfs whether /proc/$pid/fd is a linux or a posix thing
flowjonas’, I wouldn't be surprised if there is a more efficient way to get that number
GuusI don't mind much closing idle connections (although it does feel like premature optimization a bit.)
flowespecially on linux
jonas’I think I looked into that already and found that it’s not possible
jonas’there’s surely a reason why sudo does a for i in 0..MAXFD do close($i); done
Guusas Openfire is a multi-platform solution, depending on any platform specific thingy is going to be a pain.
jonas’I think there are two timeouts, one ~15min (the linear curve down) and one ~30min (which also looks randomized, because of the slight exp-y behaviour at the end)
jonas’and yes, these are the connection stats of search.jabber.network, and the spikes you see are the hourly scan :)
Guus(maybe randomize your scan!)
flownow we only need to identify the implementations with the 15m and 30m (randomized) timeout
jonas’Guus, it’s already shuffled :)
flowand what is keeping the baseline of 1.5k connections
jonas’flow, compare the ratios with https://search.jabber.network/stats#software :)
jonas’assuming that many "unknowns" are in fact prosody MUCs, because prosody doesn’t report version on MUC by default IIRC
flowahh, so it is probably prosody which keeps the connections
jonas’flow, very likely
flowbut the amount of 15m and 30m timeout connections appears to be nearly equal
jonas’I experimented with loading mod_s2s_idle_timeout or whatsitcalled on s.j.n, but then I disabled it to reduce the codebase to the minimum for some unrelated testing
jonas’flow, I’ll have to dig deeper into it, it’s also possible that the two different falloff behaviours there are an artifact of how the scanner works
jonas’since there are two scanning components, and one finishes much quicker than the other; it’s possible that the quicker one is causing the additional tip of the initial spike, while the slower one is what causes the slow fall off at the end
jonas’since the quicker one also tends to touch more domains
jonas’oh yeah, that’s very plausible
jonas’that may also explain the exp falloff due to shuffling
jonas’if there’s really just a 15m or something timeout involved
flowjonas’, are you aware that 'German' appears twice in the room languages table?