Zash, so wait. if i reboot prosody everyone stays in the room? as long as they don’t send errors which i guess under normal circumstances they shouldn't
Kev
That's what's happening in M-Link too, in the MUC rework we've got underway.
Daniel
right. so i guess this would cover a lot of the cases except for when during downtime I try to (re)connect to a muc, get 'server not found' and then don’t know when it is fine to try again
Daniel
but it would at least solve the ghost room problem
jonas’
I’m not sure how well it works in practice
jonas’
but I guess that’s the point of it (that you don’t notice a graceful restart anymore)
Daniel
i think it might even break in the case where i have a reconnect (with the same resource) during down time. i will get a server not found back from my own server. thus my client thinks i'm not in the room. but after restart the room thinks i'm in it
Daniel
and then messages sent to me won’t get bounced (by my server) because my resource is the same
jonas’
but then you’re still confused on the client side (which you can fix up)
jonas’
with a rejoin probably, because you threw state away
Daniel
yes. so clients need extra logic so that a message (or presence) received from a room that seems to be offline triggers a ping/join
Daniel
or something
Daniel
which is ok i guess
Daniel
but needs to be handled (and probably documented)
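A minimal sketch of the client-side logic Daniel describes, assuming a client that tracks which rooms it believes it has joined. `MucClientState` and its method names are hypothetical and not taken from any real client; the returned strings only name the action the client would take.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class MucClientState:
    """Hypothetical tracker of the rooms this client believes it is in."""
    joined: Set[str] = field(default_factory=set)
    pending_rejoin: Set[str] = field(default_factory=set)

    def on_join_failed(self, room: str) -> None:
        # For example, the join bounced with remote-server-not-found.
        self.joined.discard(room)

    def on_join_succeeded(self, room: str) -> None:
        self.pending_rejoin.discard(room)
        self.joined.add(room)

    def on_room_stanza(self, room: str) -> str:
        """Decide what to do with a message or presence arriving from `room`."""
        if room in self.joined:
            return "deliver"
        if room in self.pending_rejoin:
            # A rejoin is already in flight; do not start another one.
            return "ignore"
        # The room thinks we are an occupant but we threw our state away.
        self.pending_rejoin.add(room)
        return "rejoin"
```

Whether the "rejoin" action is a full MUC join or a ping followed by a join is left open here, as it is in the discussion.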
Kev
Restarts are usually so quick that you don't notice them.
Kev
(Unless you're silently dropped from the room)
jonas’
Kev, depends on what is restarted ;)
jonas’
if it’s your own server, you’ll definitely notice. if the server is being fully rebooted, it can take minutes
Holger
Don't we already have enough "works most of the time" cases?
Daniel
in what cases would this break?
Link Mauve
Daniel, I’m not sure I understand your last example, in most cases if your client reuses a previous resource, it’ll have the MUC in bookmarks and join it again afterwards, right?
Link Mauve
In doing so, it sends a MUC join and the service then knows it has to send the full room state again, and can consider the previous full JID as out of the room.
Link Mauve
And when your new client doesn’t know it should join the MUC, it can send back an error to any groupchat message it receives from that room.
Link Mauve
Or am I missing something?
jonas’
Link Mauve, Daniel's scenario was that the MUC is currently rebooting while the client reconnects
jonas’
thus the rejoin gets bounced with remote-server-not-found
jonas’
and never reaches the MUC service
Link Mauve
Ah right.
Link Mauve
Yeah, then you have to try again with exponential back off, like in any current case of remote-server-not-found.
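A small sketch of the retry schedule Link Mauve mentions, assuming a client that retries the join after each remote-server-not-found bounce. The function name and the default base, cap and attempt count are illustrative assumptions, not values from any specification or client.

```python
import random


def backoff_delays(base=2.0, cap=300.0, attempts=6, jitter=0.0, rng=random.random):
    """Return the waits (in seconds) before each retry of a bounced join.

    The delay doubles on every attempt and is clamped to `cap`. `jitter` is
    the fraction of each delay that may be added at random so that many
    clients do not retry at the same instant.
    """
    delays = []
    for attempt in range(attempts):
        delay = min(cap, base * (2 ** attempt))
        delay += delay * jitter * rng()
        delays.append(delay)
    return delays
```

With `jitter=0.0` the schedule is deterministic, which makes it easy to check; a real client would likely want some jitter.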
jonas’
Link Mauve, but then the MUC service comes back and starts sending you type="groupchat" and presence
MattJ
Probably the MUC service should ping persisted occupants after a restart
MattJ
and by ping, probably I mean probe
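A rough sketch of what MattJ suggests, assuming the service has a list of occupant full JIDs restored from storage and some way to probe each one. `reconcile_occupants` and the `probe` callback are hypothetical; how a real service would send the probe and decide that an occupant is gone (error reply, timeout) is deliberately left out.

```python
from typing import Callable, Iterable, List, Tuple


def reconcile_occupants(
    restored: Iterable[str],
    probe: Callable[[str], bool],
) -> Tuple[List[str], List[str]]:
    """Split restored occupants into those still present and those to remove.

    `probe(full_jid)` is assumed to return True when the occupant answered
    and False when the probe bounced or timed out.
    """
    kept: List[str] = []
    removed: List[str] = []
    for full_jid in restored:
        if probe(full_jid):
            kept.append(full_jid)
        else:
            removed.append(full_jid)
    return kept, removed
```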
Zash
In Prosody, that's already what sorta happens since rooms are usually restored from storage by some event that results in a broadcast.
Zash
Daniel Yes, everyone stays in the room. Rooms can be saved to disk and removed from memory and then brought back at any time for a few different reasons, of which graceful shutdown is only one.
Zash
When all goes well, nobody notices.
Zash
The situation where the room thinks you're still there but the client doesn't think so only happens because of the move to long-term stable resources. If you get a new resource every time you connect, this takes care of itself eventually via kick-causing error bounces.
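To illustrate the "kick-causing error bounces" Zash refers to, here is a toy model, assuming a room that keys occupants by full JID and drops one whenever a stanza sent to it bounces. The `Room` class is hypothetical and much simpler than any real MUC implementation.

```python
class Room:
    """Toy room that tracks occupants by full JID (hypothetical model)."""

    def __init__(self):
        self.occupants = {}  # full JID -> nickname

    def join(self, full_jid, nick):
        self.occupants[full_jid] = nick

    def on_delivery_error(self, full_jid):
        """Drop the occupant whose stanza bounced; return its nick, or None."""
        return self.occupants.pop(full_jid, None)
```

With a random resource per connection, a stanza sent to the stale full JID bounces and the ghost is removed. With a stable resource the stanza is delivered to the new session instead, so no bounce ever happens and the ghost remains.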
Daniel
the move to long term stable resources only happened because otherwise we have no ability to kick the old one
Zash
I don't think that's true
Daniel
that this was the reason or that we have no way of kicking the old?
Zash
Removal of stale sessions can be done, dwd has written stuff about this before.
Zash
And they should get removed eventually anyways
Zash
Like, the server could ping existing sessions when a new session connects.
Daniel
fwiw if that's what it takes i'm fine with moving to random resources
Daniel
users will hate it
Zash
As always, there are tradeoffs
Holger
Ping existing sessions sounds ugly to me.
Holger
A problem in practice I see is the delay. Stanzas queued for the old session won't be resent before the ping times out.
Daniel
Zash, is this prosody 0.11 or current development?
Zash
Everyone being kicked from rooms all the time because "Disconnected: Replaced by new connection" is also ugly
Zash
Daniel Theoretical
Daniel
Zash, the storing muc state i mean
Daniel
not the pinging of resources
Link Mauve
Daniel, 0.11 this one.
Zash
Yes, 0.11
Zash
Rooms can be saved to disk on graceful shutdown, module unload (and reload) or when they are evicted from a LRU cache.
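The behaviour Zash describes can be pictured with a small LRU cache that persists a room when it is evicted or when the service shuts down. This is only an illustration under stated assumptions: it is not Prosody's code (Prosody is written in Lua), and `RoomCache`, its methods and the dict standing in for disk storage are all hypothetical.

```python
from collections import OrderedDict


class RoomCache:
    """Keeps at most `capacity` rooms in memory, saving evicted ones."""

    def __init__(self, capacity, storage):
        self.capacity = capacity
        self.storage = storage      # dict-like stand-in for on-disk storage
        self.live = OrderedDict()   # room JID -> state, least recent first

    def get(self, jid):
        if jid in self.live:
            self.live.move_to_end(jid)  # mark as most recently used
            return self.live[jid]
        # Restore from storage, or create an empty room if none was saved.
        state = self.storage.pop(jid, None)
        if state is None:
            state = {"occupants": []}
        self.live[jid] = state
        while len(self.live) > self.capacity:
            old_jid, old_state = self.live.popitem(last=False)
            self.storage[old_jid] = old_state  # save on eviction
        return state

    def shutdown(self):
        """Graceful shutdown: save every room that is still in memory."""
        while self.live:
            jid, state = self.live.popitem(last=False)
            self.storage[jid] = state
```

Because the occupant list is part of the saved state, a room restored later still contains everyone who was in it, which is why nobody notices when all goes well.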
Daniel
can or will?
Daniel
does this need to be configured?
Zash
Will. Enabled by default. I'd have to check docs or code to remember details of what can be configured.
Holger
You're not worried about the init system killing the graceful shutdown due to timeout on servers with large/many rooms?
Daniel
do you have any grasp on how well that works in practice? because i still have countless users telling me about ghost mucs. but of course it might be that they are all on ejabberd
Daniel
countless ~= 3
Zash
Holger: Dunno, should we be?
Daniel
but they are very annoying about it :-)
Zash
Loud minority?
Holger
Zash: That's the main reason that made me hesitate to implement the same thing.
Zash
I suspect that ejabberd's closing of idle s2s connections isn't helpful here.
Daniel
why? saving state doesn’t require sending something over s2s does it?
Zash
I mean about ghost rooms/users. The s2s connection gets closed and then fails to be reestablished for some reason, and then you get ghosts.
Link Mauve
Yeah, I’ve often been kicked out from (old) Ejabberd rooms without being notified, this doesn’t happen much lately.
Holger
If reconnecting fails I'd assume the old connection would've been lost as well.
Zash
It's more likely that an unavailable presence can be delivered over an established s2s connection than over one that has to be reestablished first.
Zash
Prosody in some configurations doesn't even manage to send anything when shutting down, making this worse.
Holger
Either way, personally I'd still prefer MUC Push over all these solutions that try to work around all these problems with MUC relying on presence.
Holger
The only real corner case I see with this is the first participant who'd like to write a groupchat message after MUC service restart.
Daniel
Holger, the question is: if muc push really becomes the go-to thing and all clients enable 1-2 push targets on every join, wouldn’t the load on the db be the same as persisting presence?
Holger
Fixing that might require some client-side hack, or waiting for MIX.
Zash
I did start on an experimental hack that would make MUC joining account based
Holger
Daniel: Yes it's just a more robust solution, in my book. As you can't do the presence thing for clients without persistent connection anyway.
Holger
(Except with the super-ugly hack of faking their presence state while they're disconnected.)