XSF Discussion - 2021-03-31


  1. jcbrand

    Has anyone here thought about or come up with a way to identify messages from bots? Seems like one can just include a special tag and namespace for it. Is there anything like that in the wild already?

  2. jonas’

    jcbrand, bots should publish a bot identity in disco#info and clients should use that

  3. jonas’

    jcbrand, https://xmpp.org/registrar/disco-categories.html#client

  4. jcbrand

    so you should send a disco query for every single message in a MUC?

  5. jcbrand

    not really practical in certain use-cases

  6. jonas’

    '115 caches

  7. jonas’

    or if that’s not feasible because MUC, you can just query when the participant enters the MUC

  8. jonas’

    and cache the information until they leave it

  9. jcbrand

    You can really see that XMPP was designed without ever thinking about web clients

  10. jonas’

    attaching such metadata to each and every message is incredible overhead

  11. Zash

    Hats

  12. jcbrand

    oh, I like that thanks Zash

  13. jcbrand

    We already have hats

  14. Zash

    <hat>🤖️</hat>

  15. jcbrand

    problem solved

  16. Zash

    😀

  17. jonas’

    ... *sigh*

  18. jcbrand

    lol

  19. jonas’

    web making everything terrible again.

  20. jcbrand

    enjoy your ivory tower

  21. jonas’

    jcbrand, to be clear, I’m not blaming *you*

  22. jonas’

    I’m blaming the web.

  23. flow

    jcbrand, out of curiosity: why doesn't the approach suggested by jonas’ work for web clients?

  24. Zash

    jcbrand, isn't this MUC being web-unfriendly rather than XMPP itself?

  25. jcbrand

    jonas’ yeah sorry, I don't want to be hostile, but the anti-web attitude in the XMPP community does grate a little sometimes. No big deal tho

  26. jonas’

    jcbrand, fair. Let me state for the record that I am really glad for your work on converse.js, because despite the web being terrible, XMPP-IM needs a web client which is *not* terrible and you seem to be doing a fairly great job at that.

  27. Ge0rG

    what jonas’ said, 👍 for jcbrand

  28. jcbrand

    aw shucks, thanks guys

  29. Zash

    It's really the anti-everything-except-the-web attitude of the web that irks me.

  30. jcbrand

    Sure, I get that

  31. jonas’

    web-pacman

  32. jonas’

    web-pacman.gif

  33. jcbrand

    flow: my knee-jerk response is that in various situations you can't cache/store everything across sessions like you can with desktop clients

  34. Zash

    jonas’, now I need you to drop everything and make that gif. you may edit the systemd-om-nom-nom.gif 😉

  35. jonas’

    jcbrand, but hypothetically, if you did the disco#info on the first message from a participant and cached that for a session (or until that participant leaves), that would be feasible and good enough, wouldn’t it?
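
A minimal sketch of the approach jonas’ describes here: look up disco#info on the first message from an occupant, then reuse the answer until that occupant leaves. The class name `OccupantDiscoCache` and the `query` callable are hypothetical; `query` stands in for whatever performs the actual disco#info request and returns `(category, type)` identity pairs. The `client`/`bot` identity is the one listed in the disco categories registry linked above.

```python
from typing import Callable, Dict, List, Tuple

Identity = Tuple[str, str]  # (category, type) from a disco#info result


class OccupantDiscoCache:
    """Per-session cache of disco#info identities, keyed by occupant JID."""

    def __init__(self, query: Callable[[str], List[Identity]]) -> None:
        self._query = query  # hypothetical: performs the disco#info request
        self._seen: Dict[str, List[Identity]] = {}

    def is_bot(self, occupant_jid: str) -> bool:
        # Only the first message from an occupant triggers a lookup.
        if occupant_jid not in self._seen:
            self._seen[occupant_jid] = self._query(occupant_jid)
        return ("client", "bot") in self._seen[occupant_jid]

    def on_leave(self, occupant_jid: str) -> None:
        # Forget the occupant; someone else may reuse the nickname later.
        self._seen.pop(occupant_jid, None)
```

Nothing is persisted across sessions, so this sidesteps the web storage limits discussed here at the cost of one query per message author per session.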

  36. jcbrand

    jonas’: yes, but that is then not XEP-0115 compatible right?

  37. jonas’

    I’m not sure '115 even works in MUCs

  38. Zash

    In theory you could cache caps-hash → disco response forever

  39. jonas’

    if it did, you could sometimes skip disco#info lookups if your session has already seen the hash though if '115 worked

  40. jcbrand

    Yes, but in web clients you can't

  41. jcbrand

    Zash: Yes, but in web clients you can't

  42. jonas’

    even that can do a lot (e.g. if the bot only hops into the room briefly and you don’t know the real JID)

  43. jcbrand

    You say that including a tag is wasteful, but in a MUC with an occasional bot post and lots of posts by other users, it seems wasteful to me to do a disco-info for each new poster

  44. jcbrand

    it could be a tag on the presence, not the messages themselves

  45. jcbrand

    But it's ok, the hats thing will work for my current use-case and for the open Jabber network doing a disco query for each new occupant is probably also fine

  46. jcbrand

    not each new occupant but each new message author

  47. flow

    jcbrand, you probably can re-use the disco#info information for other purposes too, so I am not sure if it is that wasteful

  48. jcbrand

    Unless you want to identify bots in your occupants list...

  49. jcbrand

    seems very chatty to me. On a desktop client you can cache all this fairly aggressively, but on web you need to do potentially hundreds or thousands of disco info queries for every session

  50. jcbrand

    On web there is IndexedDB which doesn't have a size limit, so in theory you can use that and then your webapp is more like a desktop app, but having integrated that into Converse (via localForage), I've seen that writes are suuuuper slow.

  51. jcbrand

    And you can't use IndexedDB in all cases

  52. jcbrand

    I have some ideas on how to improve performance on Converse (basically batching writes), but haven't had the time to iron out the kinks

  53. Ge0rG

    jcbrand: what about localStorage?

  54. jcbrand

    localStorage has a 5MB limit, so if you have lots of MUCs it fills up pretty quickly

  55. jcbrand

    Now you could regularly delete older messages, but then you lose your OMEMO history

  56. jcbrand

    So far that has made me reluctant to aggressively get rid of older messages

  57. Zash

    I can't actually use Converse.js anymore, it runs into those limits in a few minutes 😕

  58. jcbrand

    If it wasn't for OMEMO then it would be much easier

  59. Zash

    Not while joining all MUCs anyway

  60. jcbrand

    Zash: If you're up for it, you can try the master branch with IndexedDB

  61. jcbrand

    performance has improved a lot recently

  62. jcbrand

    Still quite a bit slower than localStorage, but usable IMO (although not usable in some use-cases like having it integrated into an existing resource-hungry site)

  63. jcbrand

    I might even make IndexedDB the default in the 8.0.0 release (especially if I can get the batched writes to work well)

  64. jcbrand

    One advantage that disco has over hats is that hats are per MUC, so you have to assign them for every new MUC the bot enters

  65. Zash

    Hm, reminds me of the issue of MUC-component wide affiliations and such.

  66. jcbrand

    yeah, would be great to have

  67. jcbrand

    is any of this fixed or improved by MIX?

  68. jcbrand

    Not that I expect MIX to supplant MUC any time soon

  69. eric

    > eric: for which > screen size is certainly still a problem for KDE apps, like with plasma mobile the wifi setup dialog, no one had tested it with WPA2 Enterprise, the OK/Cancel buttons were off-screen, I had to hook it up to a monitor to set it up moparisthebest:

  70. moparisthebest

    I thought web apps could request more localStorage space nowadays?

  71. moparisthebest

    eric: ah, no, but in general there are many more bugs than things that actually work in Linux phone land so I'm just waiting it out :)

  72. Sam

    Reminder that XMPP Office Hours for this week are today, not Friday! "Soprani.ca: bridging us all together" by Stephen Paul Weber (singpolyma) on Wednesday, 31st March 17:00 UTC

  73. Sam

    Office Hours for this week (about jmp.chat) is starting now! https://socialcoop.meet.coop/sam-pku-dud-niv

  74. flow

    Ge0rG, why are groupchat messages in a user archive a bad idea?

  75. Ge0rG

    flow: because the client needs to support that, be aware that those come from a MUC, and treat them differently. I think that the MUC MAM approach has worked better.

  76. Ge0rG

    Most clients today will probably do weird things when receiving groupchat messages from MAM

  77. flow

    Ge0rG, thanks for the reply. But I've to admit that this appears to be more like a client issue

  78. Zash

    There's also the problem with how a MUC will send one message for each joined resource, which complicates server processing.

  79. Zash

    Way simpler for the MUC to do it, as it already knows when processing that it's a single message, before it broadcasts it.

  80. Ge0rG

    flow: a MUC will only send messages to you while at least one client is connected. If you only have one client, you will only receive from MAM what you already had before. If you have two clients, you will receive a random subset of the room history with no way to know where there are holes

  81. Zash

    There are scenarios where it's desirable that the user only sees history from when they're joined, but it does get weird and should probably be dealt with on the basis of affiliation.

  82. Ge0rG

    It's getting even more weird with MIX MAM, and I've been saying that for years.

  83. Kev

    BTW, saying “no-one does this” isn’t correct :)

  84. Zash

    This is the point where I mention that experiment I did in rewriting outgoing MUC joins so that they come from the bare JID?

  85. Kev

    But you’re right that it doesn’t work terribly well.

  86. Ge0rG

    Kev: I'm eager to read your implementation and interop story on the ML

  87. Zash

    Ge0rG, what about MIX MAM ?

  88. Ge0rG

    Zash: you could convert that from a 20% implementation to an 80% implementation, and I'd actually test it

  89. Ge0rG

    Zash: s2s isn't flawless, it will cause holes in MAM

  90. Zash

    What's those 60% then?

  91. Ge0rG

    Zash: tracking outgoing presence, leaving rooms, nickname management, bookmark sync

  92. Kev

    If we have a MAM meta-search that searches each archive, we could probably get away without having mix history in personal archives.

  93. Zash

    Ge0rG, but how will I be motivated to work on it without at least someone testing it?

  94. Ge0rG

    Zash: well, how am I supposed to test it if it breaks everything?

  95. Zash

    Someone needs to go push the stalemate resolution button!

  96. Zash

    And there's like 3 different bookmarks methods atm, that seems fun to deal with

  97. Zash

    Ge0rG, track what outgoing presence?

  98. Ge0rG

    Zash: room joins from clients?

  99. Zash

    That part should work afaik

  100. Ge0rG

    What about error handling?

  101. Zash

    That goes under "leave"

  102. Zash

    which is under TODO

  103. Zash

    I think it even dedups additional joins, tho I'm not sure that's optimal.

  104. Ge0rG

    Zash: it's a server module in a state I can't possibly deploy on a production server, and I'm only using one single room from my other server

  105. Zash

    Should set up that bleeding edge Prosody instance for client devs ... one day.

  106. Ge0rG

    Yeah, crash me!

  107. Kev

    What rules would be needed? Just rewrite anything going to/from a MUC so that the user part is bare?

  108. Kev

    Determining it’s a MUC by the first thing that’s sent being a presence with muc in it?

  109. Ge0rG

    Kev: you need to implement a bouncer that will multiplex the room between all your clients

  110. Kev

    Oh yeah, that.

  111. Kev

    So, basically implementing MIX-PAM, in other words.

  112. Zash

    Pretty much

  113. Ge0rG

    Kev: kinda sorta

  114. Ge0rG

    Well, it's probably not so much to implement, and doesn't need to persist anything. But it might have to do sophisticated things with room history generation on join

  115. Zash

    Keep track of joined rooms, joined resources to deal with rewriting in- and outgoing stanzas. Then subject and participants which lets you add new local resources and send them a synthesized join presence flood.
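
The bookkeeping Zash lists can be sketched roughly as follows. This is an illustration only, not the experimental Prosody module mentioned in this discussion: `MucBouncer`, `RoomState`, and every method name are hypothetical, stanzas are reduced to plain tuples, and history generation, nickname handling, and error handling are left out entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class RoomState:
    subject: str = ""
    participants: Set[str] = field(default_factory=set)     # nicknames in the room
    local_resources: Set[str] = field(default_factory=set)  # the user's joined clients


class MucBouncer:
    def __init__(self) -> None:
        self.rooms: Dict[str, RoomState] = {}

    def client_join(self, room: str, resource: str) -> Tuple[str, List[Tuple[str, str]]]:
        state = self.rooms.get(room)
        if state is None:
            # First local client: the join really goes out to the MUC.
            self.rooms[room] = RoomState(local_resources={resource})
            return ("forward", [])
        # Later clients: replay known state instead of joining again.
        state.local_resources.add(resource)
        replay = [("presence", nick) for nick in sorted(state.participants)]
        replay.append(("subject", state.subject))
        return ("synthesize", replay)

    def room_presence(self, room: str, nick: str, available: bool) -> None:
        participants = self.rooms[room].participants
        if available:
            participants.add(nick)
        else:
            participants.discard(nick)

    def room_subject(self, room: str, subject: str) -> None:
        self.rooms[room].subject = subject

    def fan_out(self, room: str) -> List[str]:
        # Resources that should receive a copy of an incoming room stanza.
        return sorted(self.rooms[room].local_resources)

    def client_leave(self, room: str, resource: str) -> bool:
        # Returns True when the last client left and the MUC should be left too.
        state = self.rooms[room]
        state.local_resources.discard(resource)
        if not state.local_resources:
            del self.rooms[room]
            return True
        return False
```

All of this state lives in memory only, which matches the remark above that such a bouncer would not need to persist anything.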

  116. Kev

    So, it’s much worse than just implementing MIX-PAM.

  117. Zash

    No, it's better because it works with MUC, which is what Everyone™ is using today.

  118. Zash

    With zero changes required of clients

  119. Kev

    I mean it’s much worse to implement.

  120. Ge0rG

    And with a manageable overhead on the server

  121. Zash

    Sure, it's MUC

  122. Kev

    You know what it is like implementing, actually?

  123. Kev

    It’s basically implementing FMUC.

  124. Kev

    Which is a MUC implementation that can join another MUC.

  125. Zash

    FMUC-on-your-JID

  126. Zash

    Because we need more things-on-your-JID! 😀

  127. Kev

    It’s the bastard child of MIX-PAM and FMUC.

  128. Ge0rG

    And implementing it will give significant immediate benefits

  129. Ge0rG

    Mobile clients don't need to self ping any more

  130. Zash

    Now the server will need to self-ping! 😀

  131. Ge0rG

    You'd get reasonable push integration

  132. Ge0rG

    Less presence flooding of rooms

  133. Ge0rG

    Zash: the server knows about its s2s, so it only needs to self ping after s2s went down

  134. Zash

    Could probably do the presence versioning thing, closer to the user.

  135. Ge0rG

    Zash: needs new protocol

  136. Zash

    Hm?

  137. Ge0rG

    Kev: that solution is 100% transparent to existing implementations both on the client and on the MUC

  138. Zash

    In theory you could do the same thing but have it speak MIX

  139. Kev

    In theory you can do anything with enough monkeys and typewriters, yes :)

  140. Zash

    No idea if MUC-MIX translation is easier on the users' server or the MIX host.

  141. Kev

    MIX. Much :)

  142. Zash

    Ge0rG, FWIW I think it's okay to leave the door open for storing groupchat messages in the users local archive, but Something Needs To Be Done first.

  143. Ge0rG

    Zash: yes, a new XEP needs to be written.

  144. Ge0rG

    No place in 313 for this

  145. Kev

    Is it worth having to bump the namespace for the sake of changing the SHOULD to the MUST?

  146. Zash

    Ugh

  147. Ge0rG

    I remember having to explain to another developer not to do it like written in the XEP

  148. Ge0rG

    Kev: which one?

  149. Kev

    MAM.

  150. Zash

    Let's not bump any namespaces

  151. Kev

    I think changing a SHOULD to a MUST would require that.

  152. Zash

    SHOULD NOT then?

  153. Zash

    or perhaps a PLZ DONT

  154. Ge0rG

    Kev: there are 14 SHOULD and a dozen should.

  155. Ge0rG

    So which one are you talking about?

  156. Zash

    The one closest to the word 'groupchat' I would have presumed?

  157. Ge0rG

    So...
    > A server ~SHOULD~ MUST also include messages of type 'groupchat' that have a <body>
    I really hope that's not the intention

  158. Kev

    You proposed onlist changing into MUST NOT store gcs.

  159. Kev

    I think.

  160. Ge0rG

    Kev: last time I checked, MUST NOT was the opposite of MUST, so please excuse my confusion. But yes, I suggested that.

  161. Ge0rG

    As an escalation of "SHOULD NOT", but I like strong words.

  162. Ge0rG

    I also looked up that the first time I wrote "MAM subscription" on list was in 2015, and probably even earlier in here.

  163. Zash

    This is a wordification of what future (currently unreleased) Prosody will do, if nothing changes before the next major release:
    - **Do not** store messages of type headline since these are supposed to be for transient notifications, most often PEP events.
    - **Do store** messages of type error, since if you sent a message it is of interest to know that delivery failed.
    - **Do not** store messages of type groupchat, as MUCs will send one message per joined resource and most often provide their own MAM.
    - Follow [XEP-0334: Message Processing Hints] advising for or against storage.
    - **Do store** messages with a `<body>` and/or `<subject>` element, as these carry messages for users.
    - **Do store** encrypted messages for the same reason as with `<body>`, as indicated by [XEP-0380: Explicit Message Encryption].
    - **Do store** messages with [XEP-0184: Message Delivery Receipts] requests, and the receipts themselves, as if something is important enough to need such a receipt it is probably important enough to archive.
    - **Do store** messages with [XEP-0333: Chat Markers], for the same reasons as with receipts.
    - **Do store** MUC invites, both mediated and direct.
    - **Do store** messages with an [XEP-0353: Jingle Message Initiation] payload, as users will want to know if they had missed calls.
    - Anything not covered by this point is either something new that may or may not warrant adding to the archive or something that should not be archived.
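
The rules in this message can be read as a single decision function. The sketch below is an illustration and not Prosody's actual code: the function name `should_archive` is hypothetical, and checking the rules in the order they are listed (message type first, then hints, then payloads) is an assumption. The namespaces are the ones defined by the XEPs named in the message.

```python
import xml.etree.ElementTree as ET

NS_CLIENT = "jabber:client"
NS_HINTS = "urn:xmpp:hints"  # XEP-0334
NS_MUC_USER = "http://jabber.org/protocol/muc#user"

# Child elements (in Clark notation) whose presence means "store".
STORE_TAGS = {
    f"{{{NS_CLIENT}}}body",
    f"{{{NS_CLIENT}}}subject",
    "{urn:xmpp:eme:0}encryption",  # XEP-0380
    "{jabber:x:conference}x",      # direct MUC invite
}
# Namespaces where any child element means "store".
STORE_NAMESPACES = {
    "urn:xmpp:receipts",           # XEP-0184 requests and receipts
    "urn:xmpp:chat-markers:0",     # XEP-0333
    "urn:xmpp:jingle-message:0",   # XEP-0353
}


def should_archive(stanza_xml: str) -> bool:
    msg = ET.fromstring(stanza_xml)
    msg_type = msg.get("type", "normal")
    if msg_type in ("headline", "groupchat"):
        return False
    if msg_type == "error":
        return True
    # Explicit hints win over payload inspection (ordering is assumed).
    hints = {c.tag for c in msg if c.tag.startswith(f"{{{NS_HINTS}}}")}
    if hints & {f"{{{NS_HINTS}}}no-store", f"{{{NS_HINTS}}}no-permanent-store"}:
        return False
    if f"{{{NS_HINTS}}}store" in hints:
        return True
    for child in msg:
        namespace = child.tag[1:].partition("}")[0] if child.tag.startswith("{") else ""
        if child.tag in STORE_TAGS or namespace in STORE_NAMESPACES:
            return True
        # Mediated invites travel inside the muc#user <x/> element.
        if child.tag == f"{{{NS_MUC_USER}}}x" and child.find(f"{{{NS_MUC_USER}}}invite") is not None:
            return True
    # Anything not covered above is not archived.
    return False
```

For instance, a bare chat state notification without a body falls through every rule and is not stored, while the same message with a `<body>` is.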

  164. Kev

    And I don’t think you’re even the first person to suggest mamsub ;)

  165. flow

    Ge0rG, I like your response on the ML, but you may want to consider adding reasons against groupchat messages in a user archive. I think your mail contains none, and the reasons are good. I feel we often discuss on the mailing list stating "everyone knows that X is a bad idea", without actually providing any reasons/arguments why it is so. And that probably deters people from participating.

  166. Ge0rG

    flow: thanks! I'm not going to be at a proper computer for the next days, so feel free to summarize, link or copy paste what we discussed here today

  167. Kev

    If you were to do dedupe on the server, or whatever, you could also achieve reasonable groupchats in the archive behaviour.

  168. Kev

    And probably for less work than trying to implement the MUC-PAM discussed earlier.

  169. Zash

    That's a big and messy if tho

  170. Kev

    Unlike finding a way to do dedupe and everything else involved in MUC-PAM? :)

  171. Kev

    I’m just giving another reason that I think MUST NOT would be excessive.

  172. Zash

    Just Unfork Stuff™

  173. Zash

    Kev, I agree tho, if you do go and solve that and other issues, storing groupchats in the users archive is probably okay.

  174. Ge0rG

    Kev: also please solve reliable s2s while you are at it. And World Peace. Whatever is easier.

  175. Zash

    You know what, let's flip that default

  176. Ge0rG

    Kev: you need additional protocol to get MUC messages into the user account, even when no client is connected, which is surprisingly close to MUC-PAM

  177. Ge0rG

    Without that, you get invisible holes in MUC history, which is the opposite of what xmpp needs

  178. flow

    not sure if we MUST keywordify here, it appears a simple "we found that user archives with type=groupchat messages cause trouble if not taken special care of, and clients may want to consider ignoring type=groupchat messages in a user archive"

  179. flow

    could be sufficient

  180. Kev

    Makes sense to me.

  181. Ge0rG

    I still think that servers shouldn't send all the MUC garbage to a client

  182. moparisthebest

    Looks like XMPP infringes on this patent in the USA, </stream:stream> illegal:
    > The claimed invention is a system where at least one robot “generates a goodbye command that terminates a communication session” with another robot and “relinquishes control.”
    https://www.eff.org/deeplinks/2021/03/stupid-patent-month-telehealth-robots-say-goodbye

  183. Ge0rG

    Is it April Fool's already?

  184. moparisthebest

    Unfortunately no