XSF Discussion - 2019-07-20

  1. pep.

    To what <message/> should <origin-id/> be added? All?

  2. pep.

    Now that it was ruled out that LMC must refer to the original message, does it make sense to include an origin-id tag in it? As it will most likely not be referenced. (Same applies for other payloads I guess)

  3. lovetox

    pep. you need origin-id not for lmc

  4. lovetox

    you need it to deduplicate your own mam messages

  5. lovetox

    hm ah i remember now

  7. pep.

    hmm, because I don't get reflected messages in 1:1, I keep the origin-id and next time I fetch MAM I can see where I stopped?

  8. lovetox

    it was because in the MUC xep was a sentence that MUCs dont have to keep the ids that clients set

  9. lovetox

    so message-id was overwritten by muc service, so you had no chance to deduplicate with it

  10. pep.

    Well in MUC it wouldn't apply right? As there is stanza-ids? Well, when the MUC does MAM

  11. pep.

    Otherwise yeah I can use origin-id

  12. lovetox

    no you are joined in a MUC, you set a message id

  13. lovetox

    you send the message

  14. lovetox

    damn, you are right

  15. lovetox

    i forgot why we added this

  16. lovetox

    maybe i remember later, im sure there was a reason

  17. pep.

    Maybe the XEP could be clarified and add a clear motivation

  18. lovetox

    but yeah that reason is maybe gone, maybe ask Daniel i think he remembers

  19. pep.

    I have an implementation in slix, I'll wait for feedback a bit more I guess

  20. Ge0rG

    Too many ids, too much confusion.

  21. Lance

    origin-id was a way to stamp without leaking your jid

  22. Lance

    since stanza-id requires a by

  23. adityaborikar

    Is this site down, or is there anywhere else I could read this white paper https://www.isode.com/whitepapers/xmpp-constrained-bandwidth.html

  24. adityaborikar

    The link is on https://xmpp.org/about/myths.html

  25. mathieui

    adityaborikar, https://web.archive.org/web/20160310073118/https://www.isode.com/whitepapers/xmpp-constrained-bandwidth.html

  26. adityaborikar

    Thanks 😀

  27. ralphm

    I think they replaced it with https://www.isode.com/whitepapers/low-bandwidth-xmpp.html

  28. Zash

    Cool URLs don't change!

  29. ralphm

    But it would sure be nice if they kept URLs working.

  30. ralphm

    FWIW it seems to be a different article

  31. ralphm

    As in newer.

  32. ralphm

    I'm sure Tobias knows.

  33. lovetox

    pep. i think i remember again

  34. lovetox

    first everything previous to mam:2 did not inject stanza-id

  35. pep.

    yeah I assumed there was something something mam and stanza-id

  36. lovetox

    second many clients used message-ids which were not uuids

  37. lovetox

    like 1, 2, 3, 4

  38. lovetox

    so yeah if we live now in a world where most clients use uuids as message-ids, and almost every server uses mam:2

  39. lovetox

    i think you dont need origin-id

  40. larma

    message id and origin-id have different uniqueness guarantees.
    - XEP-0359 origin-id MUST BE globally unique (at least that's the common understanding of XEP-0359), so if everyone plays by the rules it is globally unique, otherwise it's unique per user if at least that user plays by the rules.
    - RFC 6120 stanza id attribute on messages can be unique within that stream or globally unique (per RFC 6120 § 8.1.3). There is no MUST in there, so one may argue that no uniqueness guarantee is provided, but explicitly no global uniqueness.
    Interestingly in XEP-0045 v1.31+, there is a rule that MUCs SHOULD reflect the original message stanza id on reflected messages, which implies that there is not even a by-stream uniqueness guarantee on the message id as other MUC occupants can reuse prior message ids and thus cause a collision of message stanza id on a stream (and still be fully standards compliant).

  41. larma

    Technically there is no uniqueness guarantees on <origin-id> whatsoever, only <stanza-id> are unique per generating entity (as specified in their by attribute)

  42. lovetox

    yes larma, but all you need is unique within an archive

  43. lovetox

    and this is guaranteed by mam

  44. larma

    for mam that's guaranteed through <stanza-id>, not message id or <origin-id>

  45. larma

    but there are many other use cases that use a message id (or origin/stanza id)

  46. lovetox

    yeah but they are not as important as message deduplication

  47. lovetox

    so i always thought if a client does not use unique ids, then this feature will just not work good

  48. larma

    LMC changing the wrong message?

  49. lovetox

    yes, of course you can implement LMC badly, then this results in very bad stuff

  50. lovetox

    but normally you have a timeframe where correction is allowed

  51. lovetox

    and even if you decide not, then you should always give the user the chance to see all corrections and original messages

  52. lovetox

    so worst is, the user sees weird stuff

  53. lovetox

    but most clients use uuids

  54. lovetox

    and if you implement some timeframe, like message is only allowed to be corrected within 5 minutes

  55. lovetox

    the chance of weird stuff happening is almost zero

  56. lovetox

    even with clients who dont use uuids

  57. larma

    I feel I don't agree with most of what you wrote...

  58. larma

    "user sees weird stuff" is literally the worst thing that can happen, after all MAM is only one utility to help not doing this

  59. larma

    "message is only allowed to be corrected within 5 minutes" I'd prefer that LMC was not only for the last message and also allow very old messages (up to days are acceptable for me)

  60. larma

    And even if you allow the user to display corrections, usually users won't do that anyway...

  61. lovetox

    hey i like it too, but this would depend on a uniqueness guarantee that can never exist

  62. lovetox

    you can always only trust the other client

  63. lovetox

    and if you trust message-id

  64. lovetox

    or some other made up element

  65. lovetox

    i mean if you feel better, trust origin-id :)

  66. lovetox

    i think this is the downside of a federated protocol

  67. lovetox

    make up for it by reducing the damage wrong IDs could make

  68. lovetox

    1. timeframe, 2. give user the chance to deactivate lmc for bad contacts

  69. lovetox

    3. give full transparency, dont hide or replace messages

  70. lovetox

    thats how i deal with this

  71. lovetox

    if you got a better idea, im very interested

  72. larma

    origin-id should be unique at least by origin, easiest way to guarantee is to use a proper UUIDv4. If an attacker uses the same origin-id for multiple messages, the later ones can just be ignored completely, nothing to worry about as it is not possible to happen if it's not an attack and attackers are allowed to suffer from their wrongdoings.

  73. larma

    But, you can't do the same for message ids as they are not unique by origin, so duplicate message ids are allowed to happen

  74. lovetox

    im not following, so you plan that i can correct ANY message i ever sent you whenever i want

  75. lovetox

    what is there to attack?!

  76. lovetox

    i can already replace any message

  77. lovetox

    and just in case we are not talking about the same thing, IDs should always be matched in the database with a (jid, id) tuple

  78. lovetox

    there is zero abuse potential other than your contact abusing his own messages he sent you, in that case just stop talking with this dude

  79. larma

    Maybe the name "attacker" was misleading here, I meant an entity that does not correctly implement origin-id and generates the same origin-id twice. Those would suffer from degraded usability (easiest is to ignore the messages with duplicated origin-id). However a user sending the same message id twice should not suffer from degraded usability (i.e. the message should be displayed as usual) as this is perfectly standards compliant behavior

  80. lovetox

    are we still talking about LMC?

  81. lovetox

    if someone sends 2 messages with the same id, and afterwards a correction for that id

  82. lovetox

    just take the more recent one

  83. lovetox

    i dont need origin-id for that and i dont need to ignore a message
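
A minimal sketch of the matching strategy lovetox describes above: look up the correction target by a (jid, id) tuple, restrict it to a time window, pick the most recent match, and keep the replaced text so the user can still see it. The `StoredMessage` class, the in-memory list standing in for a database, and the 300-second window are illustrative assumptions, not taken from Gajim or from XEP-0308.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed window (5 minutes), mirroring the example timeframe in the discussion.
CORRECTION_WINDOW = 300.0


@dataclass
class StoredMessage:
    """Hypothetical archive row; a real client would use a database table."""
    jid: str
    msg_id: str
    timestamp: float
    body: str
    # Earlier versions are kept so corrections stay transparent to the user.
    history: List[str] = field(default_factory=list)


def find_correction_target(archive: List[StoredMessage], jid: str,
                           replace_id: str, now: float,
                           window: float = CORRECTION_WINDOW) -> Optional[StoredMessage]:
    """Return the most recent message matching (jid, id) inside the window."""
    candidates = [
        m for m in archive
        if m.jid == jid and m.msg_id == replace_id and now - m.timestamp <= window
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda m: m.timestamp)


def apply_correction(archive: List[StoredMessage], jid: str, replace_id: str,
                     new_body: str, now: float) -> bool:
    """Apply a correction; returns False if no eligible target exists."""
    target = find_correction_target(archive, jid, replace_id, now)
    if target is None:
        return False
    target.history.append(target.body)  # keep the original visible
    target.body = new_body
    return True
```

With this approach a sender that reuses ids can only ever affect its own recent messages, which is the damage limitation argued for in the discussion; it does not address larma's objection that the latest match may not be the intended one.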

  84. larma

    There is no reason why message correction should always reference the latest instance of a message id, the user might have intended to do something different. Message correction might be a bad example as those always originate from the sender and usually the same client (though this is not necessarily the case). However if you pick other usages of ids, you'll even get more problems. Like message attaching might attach to the wrong message or something

  85. larma

    If you use the message id for message attaching (as suggested as a fallback when there is no origin-id in the XEP) you have the risk of this happening, as a client might not be able to know where to attach that message. So best would be to not allow attaching to messages that don't have an origin-id. But now apparently everyone for some reason tries to not implement origin-id

  86. lovetox

    the reason is easy, there are already 2 other ids we manage

  87. lovetox

    so if we need a third one, it should be a damn good reason

  88. larma

    There is your reason 😉

  89. larma

    And: if you don't want to support it on the receiving end, at least support it as a sender for those clients that want to use it on the receiving end

  90. larma

    If your message id is already generated using UUIDv4, you'd just have to send the origin-id as well and be done with all the work on the sending side (I assume you already store your message id, so you will be able to handle any usages of the origin-id automatically if it is the same)
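
A rough sketch of the sending side larma describes: generate one UUIDv4 and use it both as the message 'id' attribute and as the <origin-id/> id. It uses only the Python standard library; the `build_message` helper is hypothetical and is not the slixmpp or Gajim API. The namespace `urn:xmpp:sid:0` is the one defined by XEP-0359.

```python
import uuid
import xml.etree.ElementTree as ET

SID_NS = "urn:xmpp:sid:0"  # XEP-0359 (Unique and Stable Stanza IDs)


def build_message(to: str, body: str) -> ET.Element:
    """Build a chat message whose 'id' and <origin-id/> share one UUIDv4."""
    msg_id = str(uuid.uuid4())
    msg = ET.Element("message", {"to": to, "type": "chat", "id": msg_id})
    ET.SubElement(msg, "body").text = body
    # Same value as the stanza id, so no extra id has to be stored.
    ET.SubElement(msg, f"{{{SID_NS}}}origin-id", {"id": msg_id})
    return msg
```

Because both values are identical, a client that already stores its message id can resolve references to the origin-id without a separate database column.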

  91. lovetox

    the scenarios you think up seem unrealistic

  92. lovetox

    you talk to a contact, that has a client who sends the same message id over and over

  93. lovetox

    but for some reason its still so advanced that it supports message attaching

  94. lovetox

    so the client should now add origin-id instead of just making his message-id a uuid

  95. lovetox

    why cant the XEP not just say, clients who want to support message attaching should generate unique message-id

  96. larma

    The client sending the message and the client being able to display the message attaching must not necessarily be the same

  97. pep.

    lovetox, yeah I guess that would have the same effect? (mandating that clients generate unique @id)

  98. larma

    How do you mandate legacy clients?

  99. larma

    They are already out there, we can't change it afterwards.

  100. pep.

    If they want to implement XEP-foobar, this XEP can say "if you implement me you need to do X"

  101. pep.

    In practice it's a similar solution to origin-id

  102. pep.

    I guess XEP-foobar could say "you need to implement origin-id"

  103. lovetox

    larma, i dont have a problem with legacy clients having degraded experience

  104. pep.

    Same here

  105. lovetox

    but yeah message attaching is a bit unique, i have to look into it more

  106. lovetox

    LMC is really not a problem at all, with clients that use the same ids over and over

  107. lovetox

    i wonder how i would display message attaching on the sender side

  108. lovetox

    would i also only link in my own database to the id

  109. lovetox

    or would i copy the message im attaching and adding it in some other database column

  110. lovetox

    i think i would just copy the message into some "attached_data" column

  111. lovetox

    that way i dont even have to care about if the other client uses unique ids

  112. lovetox

    if he uses unique ids he sees the correct stuff, if not, he doesnt implement message attaching anyway and will not see it

  113. larma

    OK, completely realistic scenario:
    - On my desktop I am using some legacy client that does reuse message ids and does not support message attaching
    - On my phone I use a modern client that does support message attaching
    I send a message from my desktop, the recipient attaches a message to my message, I send another message from my desktop that happens to have the same message id. I open my phone, the message attachment is attached to the wrong message. Even if the XEP of message attaching would require a unique message id, that would not apply to the legacy client, because it doesn't implement the XEP and the recipient does not know if the message id is unique, so can only guess. If the XEP would require origin-id, it would be missing from the legacy client and thus the recipient would be unable to attach to their message

  114. lovetox

    larma i agree origin-id would solve this case

  115. larma

    We could also replace origin-id by an empty <origin-id /> to signal that the message id was unique, but we need this information to be transported in the message.

  116. lovetox

    just upgrading your desktop client would also

  117. lovetox

    hm actually i like that idea, to add an empty element

  118. larma

    The desktop client is allowed to not send a unique id as long as they don't implement a XEP that does require them to. And I am pretty certain there are still a lot of clients that don't do unique ids

  119. larma

    Maybe we want to add this as on option to XEP-0359

  120. larma

    Make 'id' optional on <origin-id> if the id of the surrounding stanza already provides the uniqueness guarantees for <origin-id>
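
To make the proposal concrete, here is a sketch of how a receiving client might resolve a referenceable id if 'id' were optional on <origin-id/>. This is only the idea floated in this discussion, not current XEP-0359 behaviour, and `effective_origin_id` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from typing import Optional

SID_NS = "urn:xmpp:sid:0"  # XEP-0359 namespace


def effective_origin_id(message: ET.Element) -> Optional[str]:
    """Resolve the id a receiver could treat as unique, under the proposal.

    - <origin-id id='x'/> present: use 'x'.
    - empty <origin-id/> present: the sender vouches for the stanza 'id'.
    - no <origin-id/>: no uniqueness claim, so return None.
    """
    origin = message.find(f"{{{SID_NS}}}origin-id")
    if origin is None:
        return None
    return origin.get("id") or message.get("id")
```

As Lance points out right after, the empty form only works where the stanza 'id' survives transport, so a MUC that rewrites ids would still need the explicit attribute.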

  121. Lance

    empty element would reintroduce the problem the origin-id is there to mitigate: MUCs rewriting the message ids. you'd lose the original id again in that case

  122. lovetox

    Lance, but this was fixed in the MUC XEP

  123. larma

    Lance, true for MUCs, but for non-MUCs the 'id' could still be optional

  124. larma

    lovetox, it's an optional feature of MUCs

  125. larma

    to keep the ID when reflecting

  127. Lance

    let me know when the jdev room follows the latest MUC XEP

  128. Lance

    unfortunately :/

  129. lovetox

    Im not sure what your argument is, we have to stay compatible with old outdated software only because it is compliant with some XEP

  130. larma

    Introducing new requirements in a MUC is not really possible, there will always be legacy servers, only way would be to not connect to those anymore

  131. lovetox

    i couldnt care less about jabber.org

  132. lovetox

    if users tell me Gajim doesnt work with jabber.org, i tell them, go use another server

  133. lovetox

    not rewrite Gajim

  134. larma

    lovetox, > The service SHOULD reflect the message with the same 'id' that was generated by the client, to allow clients to track their outbound messages. If the client did not provide an 'id', the server MAY generate an 'id' and use it for all reflections of the same message (e.g. using a UUID as defined in RFC 4122 [18]).

  135. lovetox

    im fine with indicating with origin-id that a client uses a unique id

  136. lovetox

    i think thats a fine use for origin-id

  137. lovetox

    if you want to change your behavior on that, fine, i think would not have the motivation

  138. lovetox

    and fyi Gajim sets origin-id since forever :)

  139. larma

    I wasn't talking about Gajim, I was talking about you suggesting pep. not to do it

  140. lovetox

    yes, message attaching is too new, i didnt look into it

  141. lovetox

    i stand by my opinion that for all other XEPs, like LMC you dont really need this

  142. lovetox

    and i think, all clients who support mam, support unique ids

  143. lovetox

    and you would never use a client without mam in a multi client setup

  144. larma

    You don't need it for LMC, but need it as soon as it becomes MC without L

  145. lovetox

    so yeah i agree there are edge cases like the one you told, but im not sure we should put too much effort into supporting these

  146. lovetox

    but adding <origin-id/> to indicate unique ids, is not that much effort :)

  147. larma

    I doubt pep. intention was to actually use a different ID for message @id and <origin-id>

  148. lovetox

    what i really would like to prevent is adding another db column "origin-id"

  149. larma

    I understand your point from a dev perspective 😉

  150. lovetox

    omg, jabber.ru does not send the muc subject on join

  151. lovetox

    if no subject is set

  152. lovetox

    do i interpret this correctly

  154. lovetox

    that a room subject MUST be sent, even if there is none?

  156. lovetox

    i guess users will not join mucs without subject anymore with the next Gajim version :D

  157. Zash

    Prosody didn't send empty subjects in some earlier version but it's been fixed for some time. Not sure about other implementations.

  158. eevvoor

    Why does jabber.ru not send it? Is it due to the server they use? I thought it is a client issue, so how does it affect jabber.ru lovetox?

  159. lovetox

    i think jabber.ru uses some very old ejabberd version

  160. lovetox

    so they dont profit from bugfixes

  161. lovetox

    eevvoor, the MUC xep mandates that a subject has to be sent, and gajim waits for it

  162. lovetox

    and if it never comes, the joining process is never completed

  163. ralphm

    That's silly. I've seen Gajim have more issues like that.

  164. eevvoor

    ah thus not the server sw is the prob but the missing updates :D. Reminds me of the so beloved ccc server. Badly administrated ...

  165. lovetox

    ralphm, whats silly about it?

  166. lovetox

    should we expect servers not following xeps now or what

  167. eevvoor

    yes lovetox you should support backcompatibility forever!!!elven111! XD

  168. ralphm

    lovetox: I agree servers must send it, but waiting on it, even if other stuff for the room comes in, seems silly

  169. lovetox

    so i guess self presence that has the 110 status code is also silly?

  170. lovetox

    i mean if there is other stuff coming in, no need for that

  171. lovetox

    subject is the line between history messages, and live messages

  172. lovetox

    its the order of events, it indicates that the history messages are complete and the join is fully completed

  173. lovetox

    that means i can tell the user, now you can send messages

  174. ralphm

    I understand what it is for

  175. lovetox

    and not in between a history message fetch

  177. eevvoor

    so it carries kind of semantics, lovetox, in your opinion.

  178. ralphm

    What I don't understand is that it stays in limbo forever

  179. eevvoor

    But of course it is nice if the client is robust and works well if the xep is not met, ralphm ?

  180. eevvoor

    So to avoid limbo?

  181. lovetox

    ralphm, of course i could add a timeout

  182. lovetox

    but all clients have to add this workaround

  183. eevvoor

    also not nice, in case it is just delayed ...

  184. lovetox

    or one server can just send a subject

  186. eevvoor

    many combinations to be considered ...

  187. ralphm

    lovetox: this is what I mean. A timeout would take care of not keeping the user in limbo in the face of broken servers
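
A small sketch of the timeout ralphm suggests, using asyncio: wait for the room subject to mark the end of the join, but give up after a deadline instead of staying in limbo. The `wait_for_subject` name, the event-based signalling, and the 10-second default are assumptions for illustration; this is not how Gajim implements joins.

```python
import asyncio


async def wait_for_subject(subject_received: asyncio.Event,
                           timeout: float = 10.0) -> bool:
    """Wait for the MUC subject that closes the join sequence.

    Returns True if the subject arrived in time, False if the deadline
    passed, in which case the caller can treat the join as complete anyway.
    """
    try:
        await asyncio.wait_for(subject_received.wait(), timeout)
        return True
    except asyncio.TimeoutError:
        return False
```

The stanza handler would call `subject_received.set()` when the subject message arrives. As eevvoor notes, a subject that is merely delayed past the deadline then has to be handled as a late arrival, so the timeout trades one edge case for another.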

  188. eevvoor

    Hm but timeouts can result in strange bugs.

  189. ralphm

    Another example: when Slack still had an XMPP gateway, it didn't respond to the iq for private storage. This would also hang Gajim indefinitely.

  190. ralphm

    I manually patched Gajim to work around this.

  191. eevvoor

    dining philosophers ...

  192. lovetox

    working around servers that dont answer IQs, ok

  193. ralphm

    Why? That's a MUST too

  194. lovetox

    but you still accept it

  195. lovetox

    and worked around it

  196. lovetox

    i mean answering IQs is one of the most basic rules in all of XMPP

  197. lovetox

    if we cant depend on that anymore ..

  198. ralphm

    I am more of the 'expect failure' variety.

  199. lovetox

    its a fine line

  200. ralphm

    And this example was extra tricky, because it is part of the Gajim connect sequence, not a random iq that is sent amongst other things.

  201. ralphm

    Of course I also reported it as a serious bug with Slack.

  202. ralphm

    And not getting a response for private storage shouldn't block the UI to look like the connection is not ready.

  203. ralphm

    So a timeout would have been a better approach, or async handling.

  204. lovetox

    yes i agree, its not anymore btw, requesting bookmarks is now independent of connection process

  205. lovetox

    only server disco info, roster, and roster delimiter iqs are now in the connection process

  206. ralphm