jdev - 2021-08-23

  1. Sam

    Talking about how to query MAM earlier, someone pointed out that if you're using start/end you can get rid of the duplicate last message by also adding an RSM before or after. This effectively makes your query the same as if you'd used the mam#extended "before-id" and "after-id". This made me wonder: why not just always query for all history and filter using before/after?
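
A minimal sketch of the query shape described in this message: a XEP-0313 `<query/>` filtered by `start`, plus an RSM `<after/>` carrying the last archive ID the client already holds so that message is not returned again. The names `mam_query_since` and `add_field` and their parameters are hypothetical, and only the query payload is built here, not the surrounding `<iq type='set'/>`.

```python
import xml.etree.ElementTree as ET

MAM_NS = "urn:xmpp:mam:2"
FORM_NS = "jabber:x:data"
RSM_NS = "http://jabber.org/protocol/rsm"


def add_field(form, var, value, field_type=None):
    """Append a data form <field/> holding a single <value/> (hypothetical helper)."""
    field = ET.SubElement(form, f"{{{FORM_NS}}}field", {"var": var})
    if field_type:
        field.set("type", field_type)
    ET.SubElement(field, f"{{{FORM_NS}}}value").text = value
    return field


def mam_query_since(start, last_seen_id, page_size=50):
    """Build a MAM <query/> with a 'start' filter and an RSM <after/>."""
    query = ET.Element(f"{{{MAM_NS}}}query")

    # Filter part: the data form understood by MAM itself.
    form = ET.SubElement(query, f"{{{FORM_NS}}}x", {"type": "submit"})
    add_field(form, "FORM_TYPE", MAM_NS, "hidden")
    add_field(form, "start", start)

    # Paging part: RSM, with <after/> set to the last archive ID we have.
    rsm = ET.SubElement(query, f"{{{RSM_NS}}}set")
    ET.SubElement(rsm, f"{{{RSM_NS}}}max").text = str(page_size)
    ET.SubElement(rsm, f"{{{RSM_NS}}}after").text = last_seen_id
    return query


if __name__ == "__main__":
    q = mam_query_since("2021-08-23T00:00:00Z", "example-archive-id")
    print(ET.tostring(q, encoding="unicode"))
```

Dropping the `start` field from the form gives the "query everything, page from a known ID" variant the question is about.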

  2. moparisthebest

    TCP head-of-line blocking ?

  3. Sam

    I don't understand what you mean; how would that be related?

  4. jonas’

    Sam, because you cannot filter with before/after. you'll only get N results there, limited to the page size.

  5. jonas’

    and using before *and* after is underspecified IIRC

  6. jonas’

    and using before *and* after (in the same query) is underspecified IIRC

  7. flow

    some even believing that using both, 'before' and 'after', should be disallowed

  8. Zash

    Hence the mam#extended `before-id` and `after-id`
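
For comparison, a sketch of a range query using the mam#extended form fields instead of RSM. It assumes the server advertises `urn:xmpp:mam:2#extended`; the function name `mam_range_query` is hypothetical and, as before, only the `<query/>` payload is built.

```python
import xml.etree.ElementTree as ET

MAM_NS = "urn:xmpp:mam:2"
FORM_NS = "jabber:x:data"


def mam_range_query(after_id, before_id):
    """Build a MAM <query/> selecting messages between two archive IDs."""
    query = ET.Element(f"{{{MAM_NS}}}query")
    form = ET.SubElement(query, f"{{{FORM_NS}}}x", {"type": "submit"})

    fields = [
        ("FORM_TYPE", MAM_NS, "hidden"),
        # These are filter arguments to MAM, not RSM paging cursors.
        ("after-id", after_id, None),
        ("before-id", before_id, None),
    ]
    for var, value, field_type in fields:
        field = ET.SubElement(form, f"{{{FORM_NS}}}field", {"var": var})
        if field_type:
            field.set("type", field_type)
        ET.SubElement(field, f"{{{FORM_NS}}}value").text = value
    return query


if __name__ == "__main__":
    q = mam_range_query("id-of-older-message", "id-of-newer-message")
    print(ET.tostring(q, encoding="unicode"))
```

Because both bounds live in the form, the RSM `<set/>` stays free for ordinary paging within the selected range.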

  9. Sam

    Makes sense if you need both before and after, but I don't

  10. Sam

    I don't get the thing about paging though, it would start the first page after the one you want, then you'd just page like normal, no? Just like after-id

  11. Sam

    So I don't get why you'd only get N results

  12. jonas’

    feature request: all MAM implementations should encrypt their before/after tags symmetrically so that implementations learn that they are opaque strings and only by coincidence match the archive IDs.

  13. Zash


  14. MattJ

    Ah but XEP-0313 requires that they *are* the archive IDs, sorry :)

  15. jonas’

    MattJ, shush!

  16. Zash

    But whyyyyyy

  17. jonas’

    (that requirement should've been removed with the introduction of after-id/before-id

  18. MattJ

    But I agree, that would have been a possible change for #extended :)

  19. jonas’

    (that requirement should've been removed with the introduction of after-id/before-id)

  20. Zash

    We could have had query cursors!

  21. Sam

    Yah, archive IDs are opaque strings and those are archive IDs, right? I really don't understand the difference apparently

  22. jonas’

    Sam, no

  23. jonas’

    archive IDs are opaque strings you feed into before-id / after-id

  24. Zash


  25. Sam

    If they're not what before/after are for what goes there? It's the id attribute which has to match the archive ID, no?

  26. jonas’

    the things you feed into <before/> and <after/> are opaque strings you get from the RSM result returned by the server (<last/>, <first/> in the <rsm/> tag)
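
The generic RSM flow described here can be sketched as follows: read `<last/>` from the `<set/>` inside the server's `<fin/>` and treat it as an opaque cursor for the next page's `<after/>`. The function name `next_after_cursor` is hypothetical, and the sample `<fin/>` is made-up data in the shape XEP-0313 describes.

```python
import xml.etree.ElementTree as ET

RSM_NS = "http://jabber.org/protocol/rsm"


def next_after_cursor(fin_xml):
    """Return the value to place in <after/> for the next page, or None.

    None means there is nothing more to page through: either the server
    marked the result complete or it returned no <last/> element.
    """
    fin = ET.fromstring(fin_xml)
    if fin.get("complete") == "true":
        return None
    rsm = fin.find(f"{{{RSM_NS}}}set")
    if rsm is None:
        return None
    last = rsm.find(f"{{{RSM_NS}}}last")
    if last is None or not last.text:
        return None
    # Treated as opaque: copied back verbatim, never parsed or compared.
    return last.text


if __name__ == "__main__":
    sample_fin = (
        "<fin xmlns='urn:xmpp:mam:2'>"
        "<set xmlns='http://jabber.org/protocol/rsm'>"
        "<first index='0'>cursor-a</first><last>cursor-b</last>"
        "</set></fin>"
    )
    print(next_after_cursor(sample_fin))
```

A client written this way never needs to know whether the cursor happens to be an archive ID, which is the distinction being drawn in this part of the discussion.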

  27. Sam

    So you can't start a query with before/after?

  28. jonas’


  29. jonas’

    (or: "exactly")

  30. Sam

    Hmm, that's what was recommended to me the other day and it seems to work

  31. Zash

    That depends on the RSM ... combination with other protocol

  32. MattJ

    In normal RSM, you can't. It works because XEP-0313 forces implementations to use archive IDs in RSM.

  33. Sam

    So it seems fine to do in 0313 then

  34. jonas’

    but why would you rely on this specificity of 313 instead of doing it generically?

  35. jonas’

    (especially if it might go away…)

  36. Zash

    Can you take it away tho, without going mam:3

  37. Zash

    (we're at mam:2 still, right? RIGHT?)

  38. MattJ

    It's unlikely to be going anywhere... :)

  39. Sam

    Because I am a normal user who knows nothing about XEP development and doesn't know it will go away and has started using it, making it harder to take away, just like all XEP development

  40. jonas’

    Zash, > Implementation of the protocol described herein is encouraged in exploratory implementations, but production systems are advised to carefully consider whether it is appropriate to deploy implementations of this protocol before it advances to a status of Draft.

  41. Zash

    jonas’, you know as well as me that nobody cares about that 🙂

  42. jonas’

    Zash, it's monday, shush ;P

  43. Zash

    Supply and demand

  44. Zash

    Maybe Draft should be forced once it's discovered that the XEP is used in production

  45. jonas’


  46. Sam

    I mean, if it's bad somehow fine, I won't use it, but this seems like a perfectly reasonable thing to do after reading the XEP. Also none of this is made clear anywhere, and other users are apparently doing it (it was Gajim that recommended using it to me), so if you don't want users doing it, seems like the XEP should have made it clear what the differences are, not just made two after's that are exactly the same ("you" being "people who don't want after/before to be used" not the original authors of MAM)

  47. jonas’

    to me, it's not clear why people think it's a fine thing to do after reading RSM, I'm still trying to figure out (for years) how to make it clear to folks what the difference between query arguments to the backing protocol (e.g. MAM, disco, jabber search) and <before/> and <after/> is.

  48. Sam

    Is there a place that says "after is something different, here's what it is" in MAM?

  49. MattJ

    I really don't think it's important, since it's a per-protocol thing anyway

  50. MattJ

    and MAM explicitly went down this route. I know it doesn't help RSM in the general case, but I think MAM was the first thing to truly use RSM in most implementations.

  51. MattJ

    Which is sad, because the next thing will mean everyone will either reimplement it or need to refactor it to be generic.

  52. Sam

    Hmm, okay, the one place that *might* indicate how after is used is the paragraph that starts "Having previously made a query that returned results limited by the server"

  53. Sam

    That seems to indicate that after can't be used in the first query. However, servers seem to support it, so maybe it's just a case of the XEP not covering edge cases well and servers doing the wrong thing

  54. Sam

    In which case, we either need to outright forbid it and go into that fight with implementations, or just accept it and codify it (which could possibly result in getting rid of after-id since no one uses it anyways yet and this would be identical at that point)

  55. MattJ

    It is codified in the XEP, it's too late for that

  56. MattJ

    "For the purposes of this protocol, the UIDs used by RSM correspond with the UIDs of the stanzas stored in the archive."

  57. MattJ

    and "Note: There is no concept of an "open query", and servers MUST be prepared to receive arbitrary page requests at any time."

  58. Sam

    So I don't see why people care that others are using it then or saying it's wrong, seems fine and removes the need to support archive-id which is just a duplicate ¯\_(ツ)_/¯

  59. Sam

    after-id, I mean

  60. MattJ

    Sam, because people have opinions. Mine isn't strong on this one :)

  61. MattJ

    MAM uses RSM because it needed a paging mechanism, and RSM was exactly that.

  62. Sam

    I mean, yah, sure, I'm just trying to understand if those opinions are grounded in some actual problem I'm going to run into

  63. jonas’

    Sam, unless you need to select a range of messages, because you can't do that with RSM alone.

  64. MattJ

    Later people requested arbitrary range requests, which RSM was not

  65. Sam

    jonas’: sure, that part makes sense, but I don't need to do that.

  66. Sam

    Also, I kind of wonder if that works just fine too, but I haven't tested it

  67. jonas’

    it's at least undefined behaviour as per the spec

  68. MattJ

    In glorious hindsight we'd have supported that from the beginning, and possibly would have decided not to use RSM (or not use it in the same way)

  69. jonas’

    so you might end up with different server behaviour across implementations and releases

  70. Sam

    Right, but I'm not using before and after, so that's fine. As far as I can tell, just querying for after *or* before is well specified and means they are exact duplicates of after-id and before-id (at least, when used on the first page). So that seems like a good strategy to have MAM that works *today* and not have to wait for after-id.

  71. MattJ


  72. Zash

    > querying for after *or* before is well specified
    in RSM, right
    > and means they are exact duplicates of after-id and before-id
    sans RSM related semantics, like how using before means paging in the other direction
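
To illustrate the direction point: in RSM, `<before/>` asks for the page that precedes a cursor, and an empty `<before/>` asks for the last page of the result set. A small sketch of building such a `<set/>`; the function name `rsm_backward` and its parameters are hypothetical.

```python
import xml.etree.ElementTree as ET

RSM_NS = "http://jabber.org/protocol/rsm"


def rsm_backward(page_size, before_cursor=None):
    """Build an RSM <set/> that pages backwards.

    With no cursor, <before/> is left empty, which requests the last page.
    With a cursor (for example the <first/> of the page just received),
    the server returns the page immediately preceding it.
    """
    rsm = ET.Element(f"{{{RSM_NS}}}set")
    ET.SubElement(rsm, f"{{{RSM_NS}}}max").text = str(page_size)
    before = ET.SubElement(rsm, f"{{{RSM_NS}}}before")
    if before_cursor:
        before.text = before_cursor
    return rsm


if __name__ == "__main__":
    print(ET.tostring(rsm_backward(20), encoding="unicode"))
    print(ET.tostring(rsm_backward(20, "cursor-from-first"), encoding="unicode"))
```

This is the behaviour that the `before-id` form field does not carry: the form field only filters, while RSM `<before/>` also changes which end of the result set paging starts from.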

  73. Sam

    oh yah, good point; also doesn't matter for my use case thankfully (in fact, it's desirable), but that is a difference that could matter to some people.

  74. Sam

    Thanks for the help all. The only thing still nagging me is whether there's any benefit to continuing to use start/end to limit the initial query. At least from the client side I don't think there is, and if you're using a relational database I don't think there is on the server (assuming you have an index on the archive ID). With other non-relational databases I don't really know.

  75. Sam

    MattJ: would you be opposed to a summary of this discussion being added to the implementation notes in MAM, or possibly on that page on modernxmpp? I think I finally understand the picture here enough to write it down

  76. MattJ

    I was just thinking that (that it probably warrants an implementation note at this point)

  77. Sam

    *nods* I'll draft some text and send it your way shortly.

  78. MattJ

    Sounds good :)

  79. flow

    Sam, feel free to announce the PR for modernxmpp here as well, I'd be interested to read your text. Just to see if it matches my understanding too

  80. Sam

    flow: will do

  81. flow

    thanks :)

  82. Sam

    flow, MattJ: done: https://github.com/xsf/xeps/pull/1099