XSF Discussion - 2020-03-26


  1. Ge0rG

    jonas’: looks like we have all the votes needed to issue two Last Calls :D

  2. moparisthebest

    I've just been informed that TCP connections are supposed to be short-lived: opened, used to send or receive data, then immediately closed, and any application that does not use them in this manner is doing it wrong and shouldn't be expected to work

  3. Daniel

    by your ISP?

  4. MattJ

    Wow, thanks. Wish I'd known that sooner

  5. jonas’

    by whom?

  6. moparisthebest

    the network team, who installed some middleboxes that enforce this corporate-wide, which is breaking a ton of stuff

  7. moparisthebest

    I too wish I had known sooner...

  8. jonas’

    my doctor just said that people with this type of opinion are a danger to public health due to high blood pressure and should be put in internet-restricted quarantine.

  9. moparisthebest

    I can't disagree jonas’

  10. pep.

    module:disable('s2s')

  11. pep.

    ah it's :unload

  12. Zash

    My sanity just said that those people need to be quarantined for life, preferably in Darvaza.

  13. Zash

    The middleboxes too.

  14. flow

    moparisthebest, did you ask them *why* those middleboxes enforce this?

  15. moparisthebest

    it's in the name of "network optimization" and "because that's the right way to use TCP"

  16. moparisthebest

    so far this has just been over email but I have a meeting in a few hours... -.-

  17. Zash

    Isn't this going to mess up email access too?

  18. moparisthebest

    our middleboxes are both riverbed and telari; I don't know which one, or both, is doing this in particular, but it might be something to watch out for

  19. Zash

    Or has everyone gone JMAP already?

  20. MattJ

    JMAP IDLE

  21. moparisthebest

    not sure what outlook uses actually

  22. Zash

    OWA maybe :(

  23. Zash

    Web eats everything :(

  24. moparisthebest

    I use the browser version and http does "use tcp correctly" so I guess I'm ok?

  25. Zash

    moparisthebest, you could tell them it breaks all your (paying customers) websocket applications

  26. moparisthebest

    we also have NFS and CIFS file servers, guess we better turn them off

  27. Guus

    kid just soiled itself, unsure if I can make board meeting

  28. pep.

    !

  29. pep.

    It's time indeed

  30. pep.

    Seve, MattJ ^

  31. Seve says hi

  32. MattJ

    o/

  33. pep.

    So we've got Guus and ralph excused

  34. pep.

    0. Agenda

  35. pep.

    Do we have anything for today?

  36. MattJ

    There were some items added to Trello

  37. pep.

    I'll leave ralph items for next week

  38. pep.

    1. Hire communications person

  39. pep.

    Quick update: I've poked peter, waiting for advice on contracting. Nothing else yet

  40. pep.

    2. AOB?

  41. Seve

    None for me, just thanks to iteam for bringing everything back up :)

  42. pep.

    Yep thanks a lot to iteam (MattJ) for handling the outage :)

  43. pep.

    There's an item for discussion in the board, "Iteam budget"

  44. pep.

    MattJ, if you want to talk about it?

  46. Seve

    MattJ, might want to expand that

  47. MattJ

    Yeah. So I opened a DigitalOcean account for the XSF, as a place to bring up a replacement server

  48. MattJ

    This is being billed from the XSF account since yesterday (thanks stpeter)

  49. MattJ

    Cost is pretty low, I don't remember which exact server I selected

  50. MattJ

    But I thought it would be good to have a clear budget for iteam

  51. pep.

    How much do you think you'd need?

  52. MattJ

    Because we may want to add additional servers as we modernize the infrastructure

  53. Zash

    And then jump on the k8s train‽

  54. MattJ

    I was thinking something like $1200/year

  55. MattJ

    As an upper limit, not a goal

  56. Seve

    Modernizing sounds very appealing to me

  57. pep.

    How many machines do you already have as of this weekend, and at what cost?

  58. MattJ

    Just the one server right now, checking the cost

  59. pep.

    Otherwise I'm happy to trust your decisions. $1200 seems very much fine

  60. pep.

    Previous hosting was being sponsored right?

  61. pep.

    Any news of them btw? Maybe different topic

  62. MattJ

    No news yet. Different but very related topic. Obviously their donation for all these years has basically kept our infrastructure costs at close to zero (we currently pay for tarsnap for backups)

  63. MattJ

    And moving stuff back there may make sense in the long run, but only if we can be sure this situation won't happen again

  64. Zash

    (AOB turned into an iteam report?)

  65. pep.

    Zash, kinda

  66. MattJ

    E.g. by providing us with remote access to cycle machines (which we used to have), and extra emergency contact details

  67. pep.

    But it's really another item :)

  68. pep.

    MattJ, for the budget, I agree, and I guess we'd all agree but we can talk about it again. Maybe bring that up on list? It seems we've got board members absent again in meetings

  69. MattJ

    Ok, current replacement server is $20/month

  70. pep.

    Okay

  71. Seve

    Setting a budget is a good idea anyway, so thanks for bringing this up MattJ

  72. MattJ

    If we're to expand, I can imagine adding a couple of smaller (cheaper) servers as well as this

  73. MattJ

    and maybe some third-party monitoring which would also have a cost

  74. MattJ

    But that's about it in my mind right now

  75. pep.

    Right. I don't think we'll get to $1200 right away, but it's also not really difficult if we wanted to :)

  76. MattJ

    I'm fine if we want to set the limit lower than that, and while we still have other servers donated, I think that would be fine

  77. pep.

    Ok, should we put this for next week to see if everybody agrees?

  78. MattJ

    wfm

  79. MattJ

    There was another item

  80. pep.

    k

  81. MattJ

    I thought it was on trello, but apparently not... it's to approve jonas’ joining iteam

  82. pep.

    That was 3. Iteam budget then

  83. pep.

    And 4. Another iteam

  84. pep.

    item, sorry.

  85. MattJ

    :)

  86. pep.

    Yep also sounds good to me

  87. MattJ

    jonas’ has kindly offered to help out with some of the modernization work

  88. pep.

    That team member list is also getting quite long :p

  89. MattJ

    Yeah, that's something that will be under review

  90. pep.

    Seve, thoughts?

  91. MattJ

    Figure out what servers we have -> who needs access to what -> who should/shouldn't be in the team

  92. pep.

    I really wish we were doing votes on-list as well (somewhat like council). Now we're just deferring until we get a full house, kinda

  93. pep.

    Anyway

  94. Seve

    Oh, I think I got disconnected

  95. Seve

    I'm happy for us to move in that direction, so nothing to complain about. I would just prefer to start with a smaller budget and increase it depending on our needs, to make wiser decisions. But I completely trust Matt so all is good. Let's keep going with this!

  96. pep.

    One doesn't have to use all the budget anyway :)

  97. pep.

    Ok, MattJ we're moving that also for next week then?

  98. pep.

    Or on-list if you can bring it there

  99. MattJ

    wfm

  100. pep.

    5. Any other AOB?

  101. MattJ

    I may do some planning/pricing and request a more specific budget, then post on list

  102. pep.

    Thanks

  103. pep.

    6. Close

  104. pep.

    Next: +1w

  105. pep.

    Thanks all

  106. MattJ

    Thanks pep. :)

  107. moparisthebest

    > One doesn't have to use all the budget anyway :)
    maybe you should talk to the government

  108. Guus

    I blame Frozen. She couldn't get out of her princess dress fast enough. Second time I had to clean the restroom today. The other kid was sick in it earlier.

  109. Guus

    MattJ: thanks for taking care of things. I'm not exactly sure on what happened in the first place. Would you mind sharing some details when you have some time?

  110. MattJ

    We have no details :(

  111. MattJ

    I don't even know what time the server went down

  112. Guus

    So, we 'just' have an unresponsive server and no way to get to it?

  113. MattJ

    Yes

  114. MattJ

    For what it's worth I found in my email archive that this also happened in March 2019

  115. MattJ

    And a power cycle brought it back to life

  116. MattJ

    But the guy who did that last year has not been responsive so far (and many DCs are locked down right now anyway)

  117. Guus

    To what extent do we have backups from which we can restore data, if need be?

  118. MattJ

    You're currently chatting on a server restored from backups

  119. MattJ

    The backups are nightly, so we potentially lost up to 24h of chats/wiki edits, depending on exactly when it went down

  120. Guus

    Mail archives?

  121. MattJ

    Mail is on a different server

  122. MattJ

    (and that is also backed up)

  123. Guus

    In the same data center though?

  124. MattJ

    Yes

  125. Guus

    So the data center didn't explode. 😁

  126. MattJ

    Yeah, other stuff is working fine there (and the other servers are in the same rack)

  127. Kev

    Frankly, I think it's a wonder that Matt has managed to get 'everything' back up from a backup, given backups invariably turn out to not work when you need them. So despite being an inconvenience, I think this is a win!

  128. MattJ

    Yeah, this could have been a different story :) (one we've lived through before)

  129. Guus

    Hehe, glass half full kind of guy you are.

  130. MattJ

    So thanks Kev for leading that initiative back then

  131. MattJ

    Also, it's not necessarily the case that data on the down server is lost

  132. MattJ

    Just unreachable until it's powered back on

  133. Kev

    Although at least for wiki edits it effectively is.

  134. MattJ

    I don't plan to merge anything if it comes back, if that's what you mean

  135. Kev

    Because chances of us managing to work out and reapply deltas between two DBs are limited.

  136. Kev

    Chat logs for a single day are probably easier to reinsert.

  137. MattJ

    But I wouldn't mind checking 'Recent changes' on the old server and comparing

  138. Kev

    Yeah, that wouldn't be daft.

  139. MattJ

    and a manual copy/paste in that case would suffice, likely

  140. Guus

    Thanks for taking care of everything. Let's find a way to prevent this from happening again in the future. If budget is needed for that, I think we should consider spending that.

  141. MattJ

    It's a difficult situation - if we can continue to use donated hosting, I'm totally fine with that

  142. Zash

    Given that there were days between the crash and the restore it should be relatively easy to merge the chat logs

  143. Zash

    Says me with my Prosody hat on, knowing how that's stored.

  144. Zash

    (hats!)

  145. MattJ

    We just need to ensure that things like this don't happen, and that we have an action plan when they do

  146. Kev

    I think I suggested something about this? Or did I dream that conversation?

  147. Guus

    Obviously, but I'd like to have a reasonable assurance that we can reach our servers when they're out. Last time, this was an issue too.

  148. MattJ

    I'm pretty sure if the XSF wanted to e.g. pay for a power strip that allows remote power cycling, that's already an improvement

  149. MattJ

    Kev, you did say similar in a private conversation with me

  150. pep.

    MattJ, I doubt you'd need this tbh

  151. pep.

    IPMI/ME should do, if available

  152. pep.

    and I guess that is, on a server

  153. pep.

    Now it does need to be plugged :p

  154. MattJ

    I /think/ the server in question doesn't have that, but I may be wrong

  155. MattJ

    Some of them do, but it's not exactly documented (except in ancient email threads)

  156. Kev

    Typically we've bought hardware that does have ILO-ish things, but this box was different so I don't know.

  157. Guus

    There's probably a bunch of options that we can look at. I'm happy for iteam to first finish the disaster recovery, then come up with a plan for the future.

  158. Guus

    What's confusing to me is that in a DC, we're still limited to one contact person.

  159. Guus

    Isn't this a commercial DC, or are we serviced off the books or something?

  160. MattJ

    Our relationship with them pre-dates my involvement with the XSF, so I don't know the details

  161. Kev

    This is all done out of the goodness of Jerry's heart, and has been since the year Dot. We're not a client or anything.

  162. Guus

    Although I'm very happy for Jerry's efforts, some kind of more robust service might be preferable, given that our hardware seems to become less reliable.

  163. Guus

    I'm thinking we are running on aging hardware?

  164. Kev

    That's also because we live off donated and/or ancient hardware etc., mind.

  165. Guus

    Sure, it is totally understandable

  166. Guus

    We have a pretty good deal on all this

  167. Kev

    I remember exactly one issue with the hosting, as opposed to the machines, and far more instances of the opposite, where Jerry's just found 'random' hard drives/other hardware to patch up our servers to get them straight back up.

  168. Kev

    (And that issue with the hosting was because we were getting DoS'd so Jerry had to do DoS protection stuff)

  169. MattJ

    Another part of me thinks that we don't /need/ dedicated hardware for what we do these days. Maybe in 2004, but virtual servers are cheap and easy these days.

  170. Guus

    But maybe it's time to improve reliability by spending a bit more money

  171. Kev

    So I'm very happy to recommend various changes, but I don't think our hosting environment is a significant part of our issues.

  172. Kev

    Yes, we could probably (other than xmpp.net) cope with a £5/mo servers4u VPS or something.

  173. Guus

    If an important server is unreachable for several days, I think it is somewhat of a problem.

  174. pep.

    MattJ, depends what we want to do with them

  175. MattJ

    It's been a week now

  176. MattJ

    (well, it went offline some time between Thu night and Fri morning)

  177. João Duarte

    I just realized now, after 4 days, that my clients weren't updating any messages

  178. João Duarte

    Was coming here to give Manav some feedback, but now, if I understood correctly, it was some server issue that is causing trouble for the chat rooms?

  179. flow

    João Duarte, yep

  180. João Duarte

    Ah okay. Any way to get things updating again in the clients? Or just wait?

  181. Zash

    Depends on the client, but leaving and rejoining should do the trick

  182. João Duarte

    I'm using Pidgin on desktop and Xabber on mobile

  183. João Duarte

    Already tried rejoining to no avail

  184. João Duarte

    But maybe I can troubleshoot my way through, just wanted to check it was within my reach and not something purely on the server

  185. Zash

    The server that served this chat went down and we don't know why. This chat is now running on a new server.

  186. João Duarte

    Same thing for prosody's chat?

  187. Zash

    No

  188. João Duarte

    I guess so, because the last messages I have were sent on the 22nd

  189. João Duarte

    hm

  190. Zash

    Then it was probably not the XSF server issue.

  191. Link Mauve

    jonas’, you once told me not to include real client names in XEPs right? I’m fixing MIX’s examples which were violating XEP-0030 MUSTs.

  192. Link Mauve

    Ah no, in the end you accepted it in XEP-0402.

  193. Link Mauve

    “A nick MAY be specified as a <nick/> child elements of the <join/> element.”, what happens if the user’s server didn’t specify a nick and the MIX channel requires one?

  194. Ge0rG

    Link Mauve: yes.

  195. Link Mauve

    Ah, “If nick is missing on a channel where nick is mandatory, the join MUST be rejected.”, nvm.

  196. Link Mauve

    Again in MIX, why is it <update-subscription/> but <setnick/>?

  197. Link Mauve

    In section 7.3.4, this is the first time there is no <{mix}destroy/> in the iq result, is this wanted?

  198. Link Mauve

    MIX is missing error cases.