XMPP Service Operators

Licaon_Kter 16:15:01
`[warning] <0.1758.0>@ejabberd_s2s_out:handle_auth_failure/3:233 (tls|<0.1758.0>) Failed outbound s2s EXTERNAL authentication mydomain.tld -> uc.future.cmpdev.com (hidden_by_ejabberd): Authentication failed: Peer responded with error: certificate has expired (not-authorized)` Funny thing is that I don't have contacts with that domain, nor public MUCs they could join.
ernst.on.tour 16:20:07
Maybe one of your customers? I get what are possibly the same entries in my log when customers have contacts from a foreign domain in their roster
Licaon_Kter 16:23:44
oh riiiight, to the grep-machine!
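A rough sketch of what such a grep could look like in Python, assuming the log lines follow the shape of the single warning quoted above (other ejabberd versions may word it differently). The function name `failed_s2s_targets` is made up for illustration. It only counts which remote domains fail; it does not reveal which local user triggered the connection.

```python
import re
from collections import Counter

# Pattern modeled only on the one warning line quoted in this chat;
# treat it as an assumption, not as ejabberd's documented log format.
S2S_FAILURE = re.compile(
    r"Failed outbound s2s \S+ authentication "
    r"(?P<local>\S+) -> (?P<remote>\S+) \("
)

def failed_s2s_targets(lines):
    """Count failed outbound s2s attempts per (local, remote) domain pair."""
    counts = Counter()
    for line in lines:
        match = S2S_FAILURE.search(line)
        if match:
            counts[(match["local"], match["remote"])] += 1
    return counts
```

Feeding it the lines of a log file, for example `failed_s2s_targets(open(path))` with `path` pointing at your own log, gives a per-domain tally that can then be compared against rosters or MUC occupants.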
diane 18:25:09
I have an operator question: does anyone have documentation about failover configurations for prosody? Not necessarily clustered, but the building my prosody server is in had a multi-hour power outage, and the server didn't automatically power back up after the outage ended. It'd have been convenient if another instance had started up within a couple of minutes of the outage.
Licaon_Kter 18:38:41
diane: better ask in xmpp:prosody@conference.prosody.im?join
diane 18:39:44
Ah ok
diane 18:39:46
thanks
Licaon_Kter 18:48:18
> oh riiiight, to the grep-machine!
No results...
rozzin 19:40:45
Can you get that kind of failover just by having duplicate servers and setting the SRV records up with the correct priorities, or is it more complicated than that? I guess other hosts probably don't just let go of the lower-priority service when the higher-priority one comes back online..., and keeping modern features like MAM synchronized between the servers is a whole other layer of complexity....
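For reference, a minimal sketch of how a connecting party typically orders SRV targets: lowest priority value first, weighted random choice within one priority. This is a simplification of RFC 2782 (the `+ 1` on weights is a shortcut, not the RFC's exact algorithm), and `SrvRecord`, `connection_order` and the hostnames used with them are hypothetical.

```python
import random
from typing import NamedTuple

class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str

def connection_order(records, rng=random):
    """Return records in the order a client would try them.

    Lower priority values come first; within one priority, records are
    drawn by weighted random choice (simplified from RFC 2782).
    """
    ordered = []
    for priority in sorted({r.priority for r in records}):
        group = [r for r in records if r.priority == priority]
        while group:
            # +1 so that weight-0 records can still be picked (a shortcut).
            weights = [r.weight + 1 for r in group]
            pick = rng.choices(group, weights=weights, k=1)[0]
            group.remove(pick)
            ordered.append(pick)
    return ordered
```

Note that this ordering only applies when a new connection is being set up. It says nothing about connections that are already established to the lower-priority host, which is the concern raised above.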
moparisthebest 19:44:18
rozzin: not just modern features, if bob@your.domain is connected to server 1 and tom@your.domain to server 2 they can't communicate either
rozzin 19:46:25
moparisthebest: that's what I meant by the first point—that you could wind up with netsplits affecting both groups of c2s connections and groups of s2s connections.
rozzin 19:48:22
So..., maybe there basically is fundamentally no usable failover without clustering?
MattJ 19:49:49
But even clustered solutions don't automatically support this, as most clustering protocols require nodes to be on the same LAN
moparisthebest 19:52:02
You can spend billions on that last 9 of reliability and then still sometimes accidentally take down your entire network like Facebook
rozzin 19:52:09
Yeah, IIRC the ejabberd clustering is basically intended for multiple nodes in the same datacenter and "requires very low latency between nodes", and not meeting those latency requirements results in netsplits.
moparisthebest 19:52:31
Or you can just have a simple setup and accept a bit of downtime now and then
rozzin 19:56:29
I'd also love to find a way of doing something like "my friend and I each host a server for the same XMPP service at our houses in different towns for redundancy to guard against power/network outages". I've gone looking, but never found anything.
jonas’ 20:08:46
well, what you can do, but this is by no means easy:
- put all data in postgres or similar
- do active/hot standby replication from A (primary) to B (secondary)
- run both servers, have SRV priorities
- let B healthcheck A in some sane way (a simple (xmpp) ping is not sufficient as there could be an outage between A and B, while the rest of the world can still reach A) or have a manual failover
- unless B *knows* that A is down, B redirects all incoming connections using <see-other-host/> to A
- when B *knows* that A is down, the postgres instance needs to be promoted to active and then B can accept connections
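The decision B has to make in the last two steps could be sketched roughly as follows. This is only an illustration under assumptions: the idea of asking several independent vantage points and requiring a quorum is one possible reading of "healthcheck A in some sane way", and the names `Action` and `decide` are hypothetical. The actual probing, the <see-other-host/> redirect and the postgres promotion are left out.

```python
from enum import Enum

class Action(Enum):
    REDIRECT_TO_PRIMARY = "redirect"  # answer with <see-other-host/> pointing at A
    PROMOTE_AND_ACCEPT = "promote"    # promote the database on B, then accept connections

def decide(probe_results, quorum):
    """Decide what secondary B should do.

    probe_results: iterable of booleans, one per independent vantage
    point; True means that vantage point could reach A.
    quorum: how many vantage points must report A unreachable before B
    treats A as down (an assumed policy, not part of the original plan).
    """
    if quorum < 1:
        raise ValueError("quorum must be at least 1")
    unreachable = sum(1 for reachable in probe_results if not reachable)
    if unreachable >= quorum:
        return Action.PROMOTE_AND_ACCEPT
    # Without enough evidence that A is down, keep sending everyone to A.
    return Action.REDIRECT_TO_PRIMARY
```

With no probe results at all, the function falls back to redirecting, so a B that is itself cut off from its vantage points does not promote itself and cause a split.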
moparisthebest 20:10:24
Because it's a very hard problem that not even Facebook with its billions has managed to adequately solve
jonas’ 20:11:20
moparisthebest, I wouldn't say that. I think they solved it adequately
Licaon_Kter 20:12:46
jonas’: yeah, by having their ~prisoners~users not know of any alternative and just ignoring the downtime
moparisthebest 20:12:50
jonas’: Facebook recently took down everything so bad that they had to send people driving to data centers to recover
jonas’ 20:13:50
I know
jonas’ 20:14:00
shit happens
jonas’ 20:14:31
that's the fourth or sixth nine I guess.
jonas’ 20:14:35
more than most XMPP operators have ;)
jonas’ 20:14:43
even DC operators
TheCoffeMaker 20:27:27
I was thinking of something similar to rozzin: having two or more locations (mine and a few friends') clustering ejabberd over a VPN to let them share a network segment ... didn't have time to build the setup yet, and really don't know how it would perform
TheCoffeMaker 20:28:45
besides that ... if we are talking about self-hosting ... everyone needs to understand and embrace the outage ... not everything is under our control
TheCoffeMaker 20:28:56
nor even our budget 😂
Licaon_Kter 20:56:31
404.city do PM me so I can report a spammer
TheCoffeMaker 20:58:50
what I'm working on in these latitudes is promoting the rise of XMPP instances run by local tech-savvy communities (hackerspaces/hacklabs/*labs/*spaces) that provide accounts only to trusted people, i.e. I give accounts only to my family and friends and their family and friends. sorry if it bothers, just wanted to share it
Licaon_Kter 21:02:40
👍
MattJ 21:02:45
TheCoffeMaker, sounds very similar to what I'm promoting ;)
TheCoffeMaker 21:07:23
great! we are not the only ones!!!
TheCoffeMaker 21:55:07
for big instances with public registration, riseup.net has a good rule to handle spam (but it's a shame that they gave up on xmpp) ... they only give accounts via user invites, and if someone abuses the service the account that originated the invite will be closed too ... don't know how deep it can go.