XMPP Service Operators - 2021-11-21


  1. Licaon_Kter

    `[warning] <0.1758.0>@ejabberd_s2s_out:handle_auth_failure/3:233 (tls|<0.1758.0>) Failed outbound s2s EXTERNAL authentication mydomain.tld -> uc.future.cmpdev.com (hidden_by_ejabberd): Authentication failed: Peer responded with error: certificate has expired (not-authorized)` Funny thing is that I don't have contacts with that domain, nor public MUCs they could join.

  2. ernst.on.tour

    Maybe one of your customers? I get what look like the same entries in my log if customers have contacts from a foreign domain in their roster

  4. Licaon_Kter

    oh riiiight, to the grep-machine!
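A minimal sketch of the "grep-machine" step, assuming the log lines have the same shape as the warning quoted above. It only tallies which remote domains fail outbound s2s authentication; finding the local user whose roster contains such a domain depends on the server's storage backend and is not shown. The function name and regular expression are illustrative, not part of ejabberd.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the "local -> remote" part of the warning quoted above.
# The pattern is an assumption based on that single log line.
S2S_FAILURE = re.compile(
    r"Failed outbound s2s \S+ authentication (?P<local>\S+) -> (?P<remote>\S+)"
)


def count_failed_remotes(lines: Iterable[str]) -> Counter:
    """Count failed outbound s2s authentications per remote domain."""
    remotes: Counter = Counter()
    for line in lines:
        match = S2S_FAILURE.search(line)
        if match:
            remotes[match.group("remote")] += 1
    return remotes
```

Feeding it the lines of a log file (for example `count_failed_remotes(open(path))`, with `path` pointing at wherever the server writes its log) gives a quick list of domains to search for in rosters.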

  5. diane

    I had an operator question: does anyone have documentation about failover configurations for prosody? Not necessarily clustered, but the building my prosody server is in had a multi-hour power outage and the server didn't automatically power back up after the outage ended. It'd have been convenient if another instance had started up within a couple minutes of the outage.

  6. Licaon_Kter

    diane: better ask in xmpp:prosody@conference.prosody.im?join

  7. diane

    Ah ok

  8. diane

    thanks

  9. Licaon_Kter

    > oh riiiight, to the grep-machine!

    No results...

  10. rozzin

    Can you get that kind of failover just by having duplicate servers and setting the SRV records up with the correct priorities, or is it more complicated than that? I guess other hosts probably don't just let go of the lower-priority service when the higher-priority one comes back online..., and keeping modern features like MAM synchronized between the servers is a whole other layer of complexity....
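A rough sketch of the SRV-priority idea, to make the limitation concrete. It assumes a connecting client or server tries targets in ascending priority order and takes the first one that answers; weights among equal priorities are ignored here, which a real resolver would not do. `SrvRecord`, `pick_target` and the `is_reachable` callback are hypothetical names for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class SrvRecord:
    priority: int  # lower value is tried first
    weight: int    # ignored in this simplified sketch
    port: int
    target: str


def pick_target(
    records: Iterable[SrvRecord],
    is_reachable: Callable[[SrvRecord], bool],
) -> Optional[SrvRecord]:
    """Return the reachable record with the lowest priority value, if any."""
    for record in sorted(records, key=lambda r: r.priority):
        if is_reachable(record):
            return record
    return None
```

The choice is made only at connect time. A peer that connected to the backup while the primary was down has no reason to reconnect when the primary returns, which is how the two servers end up with separate sets of c2s and s2s connections.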

  11. moparisthebest

    rozzin: not just modern features, if bob@your.domain is connected to server 1 and tom@your.domain server 2 they can't communicate either

  12. rozzin

    moparisthebest: that's what I meant by the first point—that you could wind up with netsplits affecting both groups of c2s connections and groups of s2s connections.

  13. rozzin

    So..., maybe there basically is fundamentally no usable failover without clustering?

  14. MattJ

    But even clustered solutions don't automatically support this, as most clustering protocols require nodes to be on the same LAN

  15. moparisthebest

    You can spend billions on that last 9 of reliability and then still sometimes accidentally take down your entire network like Facebook

  16. rozzin

    Yeah, IIRC the ejabberd clustering is basically intended for multiple nodes in the same datacenter and "requires very low latency between nodes", and not meeting those latency requirements results in netsplits.

  17. moparisthebest

    Or you can just have a simple setup and accept a bit of downtime now and then

  18. rozzin

    I'd also love to find a way of doing something like "my friend and I each host a server for the same XMPP service at our houses in different towns for redundancy to guard against power/network outages". I've gone looking, but never found anything.

  19. jonas’

    well, what you can do, but this is by no means easy:

    - put all data in postgres or similar
    - do active/hot standby replication from A (primary) to B (secondary)
    - run both servers, have SRV priorities
    - let B healthcheck A in some sane way (a simple (xmpp) ping is not sufficient as there could be an outage between A and B, while the rest of the world can still reach A) or have a manual failover
    - unless B *knows* that A is down, B redirects all incoming connections using <see-other-host/> to A
    - when B *knows* that A is down, the postgres instance needs to be promoted to active and then B can accept connections
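The decision B has to make in that scheme could be sketched roughly like this. The message does not say what a "sane" health check is; as an assumption for illustration, B here treats A as down only when its own check fails and at least `quorum` independent outside vantage points also fail to reach A. All names are hypothetical, and promoting postgres or sending `<see-other-host/>` is left to whatever calls `decide`.

```python
from enum import Enum
from typing import Sequence


class Action(Enum):
    REDIRECT_TO_PRIMARY = "redirect"   # answer with <see-other-host/> pointing at A
    PROMOTE_AND_ACCEPT = "promote"     # promote the standby database, then serve


def primary_is_known_down(
    own_check_ok: bool,
    witness_checks_ok: Sequence[bool],
    quorum: int,
) -> bool:
    """B's own failed check is not enough: the A-B link alone may be broken."""
    if own_check_ok:
        return False
    failures = sum(1 for ok in witness_checks_ok if not ok)
    return failures >= quorum


def decide(
    own_check_ok: bool,
    witness_checks_ok: Sequence[bool],
    quorum: int = 2,
) -> Action:
    """Pick B's behaviour for incoming connections."""
    if primary_is_known_down(own_check_ok, witness_checks_ok, quorum):
        return Action.PROMOTE_AND_ACCEPT
    return Action.REDIRECT_TO_PRIMARY
```

For example, `decide(False, [True, True])` still redirects: B cannot reach A, but the outside witnesses can, so only the link between the two sites is broken. Failing back after A returns is deliberately not covered, since a promoted standby and a revived primary would otherwise both accept writes.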

  20. moparisthebest

    Because it's a very hard problem that not even Facebook with its billions has managed to adequately solve

  21. jonas’

    moparisthebest, I wouldn't say that. I think they solved it adequately

  22. Licaon_Kter

    jonas’: yeah, by having their ~prisoners~users not know of any alternative and just ignoring the downtime

  23. moparisthebest

    jonas’: Facebook recently took down everything so bad that they had to send people driving to data centers to recover

  24. jonas’

    I know

  25. jonas’

    shit happens

  26. jonas’

    that's the fourth or sixth nine I guess.

  27. jonas’

    more than most XMPP operators have ;)

  28. jonas’

    even DC operators

  30. TheCoffeMaker

    I was thinking of something similar to rozzin, I mean having two or more locations (mine and a few friends of mine) clustering ejabberd over a VPN to let them share a network segment ... didn't have time to build the setup yet, and really don't know how it would perform

  31. TheCoffeMaker

    besides that ... if we are talking about self-hosting ... everyone needs to understand and embrace the outage ... not everything is under our control

  32. TheCoffeMaker

    nor even our budget 😂

  33. Licaon_Kter

    404.city do PM me so I can report a spammer

  34. TheCoffeMaker

    what I'm working on in these latitudes is promoting the rise of XMPP instances run by local tech-savvy communities (hackerspaces/hacklabs/*labs/*spaces), and providing accounts only to trusted people, i.e. I give accounts only to my family and friends and their family and friends. sorry if it bothers, just wanted to share it

  35. Licaon_Kter

    👍

  36. MattJ

    TheCoffeMaker, sounds very similar to what I'm promoting ;)

  37. TheCoffeMaker

    great! we are not the only ones!!!

  38. TheCoffeMaker

    for big instances with public registration, riseup.net has a good rule to handle spam (but it's a shame that they gave up on xmpp) ... they only give accounts by user invites, and if someone abuses the service the account that originated the invite will be closed too ... don't know how deep it can go.