XSF Discussion - 2019-03-24


  1. Guus

    MattJ / flow : https://github.com/xsf/xeps/pull/771

  2. Guus

    Would something go horribly, horribly wrong, if a server simply adds a stable/unique stanza ID to any message that it processes?

  3. MattJ

    It makes client life hard/impossible, sadly

  4. MattJ

    Clients would no longer know when a message is archived or not

  5. MattJ

    So they don't know if the id can/should be stored for later querying (e.g. for catch-up)

  6. Guus

    I do not like the fact that we're deducing that something is archived by merely detecting the presence of something that's supposed to be an opaque identifier.

  7. Guus

    (also, I don't have a better suggestion)

  8. MattJ

    Guus: that's why <archived> existed

  9. Guus

    What's the impossibility for clients, exactly?

  10. Holger

    Hmm I don't quite see this problem. There's no guarantee archived messages will remain in the archive forever anyway.

  11. Holger

    And I don't quite see how the info whether a locally stored message is also in the archive helps the client.

  12. Holger

    In my book it's fine to add a stanza ID to all messages. It may actually help with non-MAM use cases.

  13. Guus

    It'd make my implementation a lot easier...

  14. Andrew Nenakhov

    > Would something go horribly, horribly wrong, if a server simply adds a stable/unique stanza ID to any message that it processes?
    In short that's the basis of our XEP that we use to ensure message delivery. Works well.

  15. Guus

    Andrew Nenakhov the basis is that something goes wrong if we do (and you found an alternative), or: you do it, and you've seen that nothing goes wrong?

  16. Andrew Nenakhov

    Client sends stanza with provisional id, server stamps it with 0359 unique and stable id, sends this id to client as a confirmation.

  17. Andrew Nenakhov

    Guus, define wrong )

  18. Guus

    Mushroom-clouds on the horizon.

  19. Andrew Nenakhov

    We centralize everything to work via server archive. If archive breaks, kaput, yes

  20. Andrew Nenakhov

    > Mushroom-clouds on the horizon.
    In case of an all scale nuclear weapon attack, IM protocols will hardly matter anymore

  21. Guus

    Andrew Nenakhov I do like to make sure though that IM protocol design decisions do not cause an all scale nuclear weapon attack.

  22. MattJ

    Holger, you're right (there is no guarantee archived messages will remain in the archive forever)

  23. MattJ

    But if it's not in the archive, the client can assume it was purged and it needs to re-fetch the archive

  24. Guus

    MattJ: how can it fetch a purged archive?

  25. MattJ

    Guus, I mean in the sense that old messages are purged

  26. Guus

    Mind you that it is Sunday, I'm an idiot, and did not have enough coffee

  27. Guus

    I don't understand.

  28. MattJ

    Messages in the archive are not kept forever on most deployments

  29. MattJ

    The oldest messages are removed after some expiry time (let's say 30 days)

  30. Guus

    with you so far.

  31. Guus

    but why does the client need to re-fetch the archive if that happens?

  32. Guus

    need/would want to

  33. MattJ

    So a client that wants to receive all messages works by continuously remembering the id of the last archived message it received

  34. MattJ

    When it goes offline for a couple of days, it will come back online and request all messages since the last id it saw

  35. Guus

    if anything, it'd do a massive amount of data transfer only to end up with _less_ local history?

  36. MattJ

    With me still?

  37. Guus

    yes

  38. MattJ

    So now it goes offline for two months

  39. MattJ

    The last id it saw is no longer in the server's archive

  40. MattJ

    So it performs the query and gets item-not-found

  41. MattJ

    so it knows that the message has been expired, and any messages in the archive are messages it has never seen before

  42. MattJ

    because they are all newer

  43. Guus

    right. And if we'd slap a stanza-id on every message without archiving, it'd _always_ get an item-not-found, assume its local cache is older than what's on the server, and it'd download all history, every time.

  44. Guus

    that's what you're saying, right?

  45. MattJ

    Yes
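
The catch-up flow MattJ describes in messages 33 to 42 can be sketched roughly as below. This is only an illustration under stated assumptions, not any client's or server's actual code: `Archive`, `ItemNotFound` and `catch_up` are hypothetical names, and the archive is modelled as a plain in-memory list rather than a real MAM query.

```python
from dataclasses import dataclass, field


class ItemNotFound(Exception):
    """Stands in for the item-not-found error discussed above (hypothetical)."""


@dataclass
class Archive:
    """Toy server-side archive: an ordered list of (archive_id, body) pairs."""
    items: list = field(default_factory=list)

    def query_after(self, last_id=None):
        """Return everything after last_id, or the whole archive if last_id is None."""
        if last_id is None:
            return list(self.items)
        ids = [archive_id for archive_id, _ in self.items]
        if last_id not in ids:
            raise ItemNotFound(last_id)
        return self.items[ids.index(last_id) + 1:]


def catch_up(archive, last_seen_id):
    """Fetch what was missed while offline.

    Returns (messages, gap). gap is True when the remembered id is no longer
    in the archive, so older history may have expired unseen.
    """
    try:
        return archive.query_after(last_seen_id), False
    except ItemNotFound:
        # The remembered id is gone, so the client treats everything still
        # in the archive as newer than what it has and fetches all of it.
        return archive.query_after(None), True


archive = Archive([("a3", "hi"), ("a4", "there")])
short_absence = catch_up(archive, "a3")   # "a3" is still archived
long_absence = catch_up(archive, "a1")    # "a1" has expired
```

In this model an id that was never archived at all (the case Guus raises in message 43) takes the same `ItemNotFound` branch as an expired one, which is why the client would end up downloading the whole archive every time.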

  46. Guus

    I'm guessing that it'd not do this if the server doesn't advertise the MAM feature though.

  47. MattJ

    Sure

  48. waqas

    I feel like the whole "item-not-found means get full archive" thing is a hack. A server could lose a message for other reasons, e.g., storage failure causing recent stuff to be lost, or deletion of specific message due to gdpr, or some bug, etc.

  49. MattJ

    waqas, it's not allowed to

  50. pep.

    > and it'd download all history, every time.
    It would download up to the date it just requested

  51. Guus

    waqas I was trying to formulate a similar remark in my head.

  52. MattJ

    waqas, it can replace with placeholders if it needs to

  53. pep.

    Which may or may not be the whole history

  54. waqas

    MattJ: Storage failure isn't something you can forbid someone from having.

  55. MattJ

    waqas, handling storage failure in defined ways is entirely sensible

  56. Guus

    sure, but solely depending on 'item-not-found' based on a last-known ID still seems ... hackish...

  57. Guus

    sure, but solely depending on 'item-not-found' based on a last-known ID still seems ... hackish...

  58. MattJ

    Guus, it's defined by the XEP to be this way, it's absolutely not a hack

  59. MattJ

    I mean, what else would you guys propose??

  60. waqas

    MattJ: Not really. If you lose a disk and restore from recent'ish backup, you'll have a situation where supposedly every recent message would be item-not-found..

  61. Guus

    MattJ it's a lot easier to disagree with stuff without having to suggest better alternatives 😉

  62. MattJ

    waqas, you can't just rewind time like that in most systems without consequences

  63. waqas

    Yes, and given that laws of physics disallow "messages can't be removed from archive after acked", a protocol shouldn't rely on that.

  64. Guus

    what if the client asks for the last-known ID archived by the server?

  65. waqas

    MattJ: To be clear, I think a sane recommendation would be if item-not-found, get archive by some timestamp based setup, but trying to get archive from beginning of time is silly in such a case.

  66. pep.

    (What I said above?)

  67. Guus

    (removed bad idea)

  68. waqas

    Yep, listen to pep.

  69. MattJ

    Yes, but the server was relocated to a different timezone and the admin forgot to set it to UTC

  70. pep.

    Dates don't include TZs? :s

  71. waqas

    Almost all popular dbs people use (mysql, postgres) in their default replica settings, when the master node is lost and another takes over (or a restoration from backup happens) will potentially lose recent writes. If the MAM XEP wants to assume that wouldn't happen, I'd consider it pretty silly.

  72. MattJ

    waqas, if you want to write your own XEP go ahead

  73. waqas

    MattJ: Do you see the problem I'm pointing out?

  74. Guus

    Maybe 'silly' isn't the best classifier here.

  75. Guus

    > Yes, but the server was relocated to a different timezone and the admin forgot to set it to UTC
    do we need the XEP to account for this?

  76. MattJ

    Guus, do we need the XEP to account for any of this?

  77. Guus

    Well, if we can modify it somehow to be more resilient against data corruption, and allow for easier re-use of stanza-id, I think it'd add considerable value.

  78. MattJ

    waqas, I don't think a server that can't provide a durable store should be able to claim it does

  79. MattJ

    There's a simple fix for this, the XEP already has a flag to tell the client that the results are not necessarily persisted

  80. waqas

    MattJ: I'm asserting that the vast majority of MAM deployments can't guarantee durability in a disk-lost scenario. Recent writes being lost is a fact of life, you can't spec your way around it without mandating things you have no way to mandate.

  81. MattJ

    I look forward to your PR

  82. waqas

    Note that I don't think the MAM XEP has to change, just the assumption that item-not-found always means MAM storage was deleted up to that item is wrong.

  83. MattJ

    So yet another hidden thing for client devs to think about

  84. waqas

    In a world where fsync doesn't necessarily mean data was durably stored, and SQL dbs multi-master replication defaults to async mode (and is rarely used anyway), that's reality.

  85. Guus

    MattJ where in the XEP is what I called the 'hack' described?

  86. Guus

    I was looking to see if the exact wording would give me hints for improvement

  87. MattJ

    Guus, it quite possibly isn't

  88. Guus

    ah ok.

  89. Guus

    I'd love to be able to add stanza-id's everywhere, without implying that this means that MAM is available.

  90. Guus

    but doesn't service discovery sufficiently guard against that?

  91. MattJ

    Adding stanza-id doesn't imply MAM is available

  92. MattJ

    But if MAM is available, it implies you can't put stanza-id on every stanza

  93. Guus

    I'd like to be able to. Is a feasible solution one that allows the client to request the id of the most-recent MAM entry, in order to verify if it has that one in its local archive?

  94. Guus

    If the XEP doesn't currently define the 'store the id of the last message, assuming that it is the last ID in your server-sided archive', there might be room for a change like that?

  95. MattJ

    Guus, one of the main premises of the XEP is history sync, this would break it

  96. MattJ

    Forget the message purging issue for the moment

  97. MattJ

    If the client records the id of the last message it received, and then later uses this to query an archive, what would you propose it do if the id it happened to remember wasn't an archived one?

  98. Guus

    item-not-found

  99. MattJ

    and then what?

  100. Guus

    Naively (I'm not client builder): I'd see up until what date I'd have a local archive, and retrieve from there.

  101. MattJ

    So fetch by timestamp?

  102. Guus

    with some wiggle-room, but yes.

  103. MattJ

    That way you'll either get duplicates or miss messages

  104. MattJ

    And that's not hackish?

  105. Guus

    Duplicates I can de-dupe with the message ID
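
The timestamp fallback with "wiggle-room" and de-duplication that Guus outlines in messages 100 to 105 might look roughly like this. It is a sketch under assumptions: `resync_by_time`, the five-minute `margin` and the dict layout (`id`, `time`, `body`) are all invented for illustration, and `id` simply stands for whatever identifier the client de-dupes on.

```python
from datetime import datetime, timedelta, timezone


def fetch_since(archive, start):
    """Toy stand-in for a time-based archive query: entries at or after start."""
    return [m for m in archive if m["time"] >= start]


def resync_by_time(local, archive, margin=timedelta(minutes=5)):
    """Merge archive entries into the local history, fetching by timestamp.

    The query starts `margin` before the newest local message, so some
    overlap is expected; overlapping entries are dropped by id.
    """
    if local:
        start = max(m["time"] for m in local) - margin
        fetched = fetch_since(archive, start)
    else:
        fetched = list(archive)

    known = {m["id"] for m in local}
    merged = list(local)
    for m in fetched:
        if m["id"] not in known:  # de-dupe on the message id
            known.add(m["id"])
            merged.append(m)
    merged.sort(key=lambda m: m["time"])
    return merged


t0 = datetime(2019, 3, 24, 12, 0, tzinfo=timezone.utc)
local = [
    {"id": "m1", "time": t0, "body": "one"},
    {"id": "m2", "time": t0 + timedelta(minutes=1), "body": "two"},
]
archive = [
    {"id": "m2", "time": t0 + timedelta(minutes=1), "body": "two"},
    {"id": "m3", "time": t0 + timedelta(minutes=2), "body": "three"},
]
merged = resync_by_time(local, archive)
```

Duplicates are handled here, but MattJ's objection still applies: a message whose archive timestamp falls before `start` (for example because clocks disagree by more than the margin) would be missed silently.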

  106. MattJ

    We could have just built the whole XEP on timestamps instead of ids if we're happy with that

  107. Guus

    misses would be bad.

  108. MattJ

    It's an ugly hack

  109. Guus

    well, let's not rewrite everything just yet - I'm fairly certain you've put way more thoughts into this than I have 🙂

  110. MattJ

    This is not something I would accept a rewrite for, for certain

  111. MattJ

    The correct fix is to re-introduce a way for the client to know whether the message is in the server's archive or not

  112. Holger

    > But if MAM is available, it implies you can't put stanza-id on every stanza
    Depends on server implementation, no? The server just must be able to respond to the before/after requests.

  113. Guus

    so, why can't it ask for the last-recorded message id in the archive?

  114. MattJ

    Guus, how does that help?

  115. Guus

    what's my last message? do I have this? no: resync everything.

  116. MattJ

    Guus, that's broken

  117. MattJ

    Just because you don't have the last message in the archive doesn't mean you don't have the first

  118. Holger

    E.g. ejabberd uses timestamps as IDs, so it doesn't matter whether the queried ID is archived, before/after still does the right thing.

  119. Guus

    resync everything from the last one that you have, I mean.

  120. MattJ

    Guus, you don't know what the last one you have is

  121. MattJ

    Holger, multiple stanzas with the same timestamp?

  122. Holger

    Microsecond accuracy, if you hit that in practice then yes it breaks.
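
Holger's point about timestamp-based ids can be illustrated with a small sketch. This is not ejabberd's implementation; it only assumes that ids are integer microsecond timestamps and that the archive keeps them sorted, and `query_after` is a hypothetical helper.

```python
from bisect import bisect_right


def query_after(archived_ids, after_id):
    """Return the archived ids strictly greater than after_id.

    archived_ids must be sorted ascending. after_id does not need to be
    present in the archive: because ids are ordered timestamps, the lookup
    still finds the right position.
    """
    return archived_ids[bisect_right(archived_ids, after_id):]


archived = [1000, 2000, 4000]            # microsecond timestamps used as ids
from_known = query_after(archived, 2000)     # id that is in the archive
from_unknown = query_after(archived, 3000)   # id that was never archived
```

The limits raised in the discussion show up directly: two messages stamped in the same microsecond would share an id, and the ordering only holds if the clock that produces the ids never runs backwards or drifts between nodes.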

  123. MattJ

    Holger, what about clock drift then?

  124. Guus

    MattJ how don't you know what your last message is? You can order your local archive chronologically, use the last one?

  125. MattJ

    I'm not against using timestamps *in* the id, but it's wrong to use them as the id with no extra logic

  126. MattJ

    Guus, the last what? I don't know which ones the server archived

  127. Holger

    MattJ: Clock drift across cluster nodes? That would break as well yes.

  128. Guus

    Hmm, my parents just walked in. Wife is preparing for 'the stare' again.

  129. Holger

    (Or do you mean clock jumping backwards? That can't happen with our clock except server restarts.)

  130. Guus

    MattJ, but if archiving is enabled, you can assume that the messages that you have in ... aah, I don't have the time to further discuss this now, sorry.

  131. Guus

    ('stare')

  132. Guus

    I'd love to pick this up later.

  133. MattJ

    Holger, using the system's monotonic clock? or something custom?

  134. Guus

    got to go now

  135. Holger

    Erlang has a thing that doesn't jump back, not sure how it's implemented.

  136. Holger

    Anyway yes this is not the most robust solution against such pathological cases of course (it just has other nice properties). Whatever, I just wanted to say that MAM doesn't imply only archived messages have an ID per se.

  137. Holger

    (ejabberd doesn't actually add IDs to non-archived messages, though I keep pondering it.)

  138. MattJ

    Holger, as discussed, things will break (read: get hard/impossible) for clients if you add stanza-id to non-archived stanzas

  139. MattJ

    which is not a good situation, and should be fixed

  140. Holger

    Maybe I misunderstood the breakage vector. I would've thought things will be fine as long as the server is aware how the non-archived IDs are ordered compared to the archived messages.

  141. Holger

    MattJ, just in case you're interested, this sounds like a custom clock that (attempts to) adjust towards the OS clock by changing frequency (up to 1%) while avoiding jumps: http://erlang.org/doc/apps/erts/time_correction.html#No_Time_Warp_Mode

  142. MattJ

    Fun

  143. Holger

    (At the cost of risking incorrect offsets of course, so they warn against doing this.)

  144. MattJ

    Holger, the server knowing how to interpret the ids is not really relevant... unless you're saying it should not return item-not-found but quietly accept ids that don't actually exist in the archive

  145. MattJ

    That would cause weirdness with clients that try to fill holes

  146. MattJ

    and probably other stuff

  147. Holger

    > unless you're saying it should not return item-not-found but quietly accept ids that don't actually exist in the archive
    Ah yes that's what I'm saying. IIRC 0059 suggests doing just that (wasn't it even a SHOULD?).

  148. Holger

    But I'm on my phone and the sun is shining. Gonna shut up now 🙂

  149. MattJ

    Is it too late to start over with MAM?

  150. MattJ

    Not using RSM for a start

  151. MattJ

    Trying to use existing building blocks has just caused confusion and unintended consequences

  152. pep.

    Well MAM is still experimental :-°

  153. pep.

    What about another bump?

  154. MattJ

    Everyone would love that

  155. pep.

    That's a thing I don't like in general. The XEP is still experimental but in reality it's just as if it was almost Final. If you change anything everybody is going to grump

  156. MattJ

    It certainly still has open issues, as a spec

  157. pep.

    Sure. I'm not just talking about MAM, that's how I feel about our specs in general

  158. MattJ

    Can't have it both ways

  159. MattJ

    Just this morning it was mentioned that XEP-0313 being Experimental is a reason Pidgin doesn't have support

  160. pep.

    I'd say that's an issue with developer expectations. If you implement it as experimental, know that it's likely going to change

  161. pep.

    (And even more so, really: a draft, even a final spec, can be amended with another spec, so..)

  162. pep.

    MattJ, isn't that just an excuse from pidgin devs? :p

  163. MattJ

    It's not a viewpoint I share, but I'm biased

  164. waqas

    Are devs expected to implement experimental xeps?

  165. MattJ

    If a standard explicitly has a big red warning at the top, and warning or no warning is subject to radical change... if I had a limited amount of free time, would I want to implement it?

  166. waqas

    "While implementation of an Experimental protocol is encouraged in order to determine the feasibility of the proposed solution, it is not recommended for such implementations to be included in the primary release for a software product (as opposed to an experimental branch)." — https://xmpp.org/extensions/xep-0001.html#states-Experimental

  167. pep.

    waqas, in the meantime, it's a needed feature

  168. pep.

    And it's even in the compliance suite..

  169. MattJ

    That's the real problem (that experimental or not, it's a needed feature)

  170. pep.

    I'd say both these criteria (needed feature / compliance suite) put even more pressure on the XEP to go to draft/final. I'm not saying I like it

  171. pep.

    And as you say there are still areas that need to be improved

  172. pep.

    Maybe there should be a rule that compliance suites can't recommend draft specs. In the hope that people focus/provide feedback on XEPs that are needed

  173. Zash

    I thought there was

  174. pep.

    Well if there was, MAM shouldn't be in there

  175. pep.

    nor carbons? (last call ended but it's still proposed)

  176. waqas

    We should stop calling them "compliance" suites

  177. flow

    MattJ, I am not sure if using existing building blocks caused confusion. It appears to me that not clarifying how they are intended to be used and are allowed to be used (think for example whether <before/> and <after/> can be used in the same query) is causing confusion

  178. MattJ

    flow: they can't, the end :)

  179. flow

    That is what I would also say, but it is at least underspecified in XEP-RSM