-
Sam
Talking about how to query MAM earlier someone pointed out that if you're using start/end you can get rid of the duplicate last message by also adding an RSM before or after. This effectively makes your query the same as if you'd used the mam#extended "before-id" and "after-id". This made me wonder why not just always query for all history and filter using before/after?
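For illustration, the two query shapes being compared here could be built roughly as follows. This is only a sketch: `build_mam_query` and `add_field` are hypothetical helper names, the timestamp and ID values are made up, and the stanza is simplified (no `jabber:client` namespace, no `to` address). The element and field names follow XEP-0313 (`urn:xmpp:mam:2`) and XEP-0059.

```python
import xml.etree.ElementTree as ET

MAM_NS = "urn:xmpp:mam:2"
FORM_NS = "jabber:x:data"
RSM_NS = "http://jabber.org/protocol/rsm"


def add_field(form, var, value, hidden=False):
    """Append one data form <field/> with a single <value/> (hypothetical helper)."""
    attrs = {"var": var}
    if hidden:
        attrs["type"] = "hidden"
    field = ET.SubElement(form, f"{{{FORM_NS}}}field", attrs)
    ET.SubElement(field, f"{{{FORM_NS}}}value").text = value


def build_mam_query(query_id, start=None, rsm_after=None, after_id=None, page_size=50):
    """Return a simplified <iq/> element carrying a MAM query."""
    iq = ET.Element("iq", {"type": "set", "id": query_id})
    query = ET.SubElement(iq, f"{{{MAM_NS}}}query")

    # Filters live in the data form: these are arguments to MAM itself.
    form = ET.SubElement(query, f"{{{FORM_NS}}}x", {"type": "submit"})
    add_field(form, "FORM_TYPE", MAM_NS, hidden=True)
    if start is not None:
        add_field(form, "start", start)
    if after_id is not None:
        add_field(form, "after-id", after_id)  # mam#extended filter field

    # Paging lives in the RSM <set/>: these are arguments to RSM.
    rsm = ET.SubElement(query, f"{{{RSM_NS}}}set")
    ET.SubElement(rsm, f"{{{RSM_NS}}}max").text = str(page_size)
    if rsm_after is not None:
        ET.SubElement(rsm, f"{{{RSM_NS}}}after").text = rsm_after
    return iq


# start + RSM <after/>, the combination described above
with_rsm = build_mam_query("q1", start="2010-08-07T00:00:00Z", rsm_after="09af3-cc343-b409f")
# the mam#extended equivalent, using the after-id form field instead
with_after_id = build_mam_query("q2", after_id="09af3-cc343-b409f")

print(ET.tostring(with_rsm, encoding="unicode"))
print(ET.tostring(with_after_id, encoding="unicode"))
```

The only structural difference between the two is where the ID goes: inside the RSM `<set/>` in the first case, inside the data form in the second.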
-
moparisthebest
TCP head-of-line blocking ?
-
Sam
I don't understand what you mean; how would that be related?
-
jonas’
Sam, because you cannot filter with before/after. you'll only get N results there, limited to the page size.
-
jonas’
and using before *and* after (in the same query) is underspecified IIRC
-
flow
some even believing that using both, 'before' and 'after', should be disallowed
-
Zash
Hence the mam#extended `before-id` and `after-id`
-
Sam
Makes sense if you need both before and after, but I don't
-
Sam
I don't get the thing about paging though, it would start the first page after the one you want, then you'd just page like normal, no? Just like after-id
-
Sam
So I don't get why you'd only get N results
-
jonas’
feature request: all MAM implementations should encrypt their before/after tags symmetrically so that implementations learn that they are opaque strings and only by coincidence match the archive IDs.
-
Zash
Huuh
-
MattJ
Ah but XEP-0313 requires that they *are* the archive IDs, sorry :)
-
jonas’
MattJ, shush!
-
Zash
But whyyyyyy
-
jonas’
(that requirement should've been removed with the introduction of after-id/before-id)
-
MattJ
But I agree, that would have been a possible change for #extended :)
-
Zash
We could have had query cursors!
-
Sam
Yah, archive IDs are opaque strings and those are archive IDs, right? I really don't understand the difference apparently
-
jonas’
Sam, no
-
jonas’
archive IDs are opaque strings you feed into before-id / after-id
-
Zash
mam#extended2
-
Sam
If they're not what before/after are for what goes there? It's the id attribute which has to match the archive ID, no?
-
jonas’
the things you feed into <before/> and <after/> are opaque strings you get from the RSM result returned by the server (<last/>, <first/> in the <rsm/> tag)
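As a sketch of where those opaque strings come from: the server's final response to a MAM query carries an RSM `<set/>` inside `<fin/>`, and the `<first/>` and `<last/>` values there are what a generic RSM client would feed back into `<before/>` and `<after/>`. The function name `page_markers` is hypothetical and the ID values in the sample payload are made up.

```python
import xml.etree.ElementTree as ET

RSM = "{http://jabber.org/protocol/rsm}"

# Sample <fin/> payload in the shape XEP-0313 describes; the IDs are invented.
FIN_XML = """
<fin xmlns='urn:xmpp:mam:2'>
  <set xmlns='http://jabber.org/protocol/rsm'>
    <first index='0'>28482-98726-73623</first>
    <last>09af3-cc343-b409f</last>
  </set>
</fin>
"""


def page_markers(fin_xml):
    """Return (first, last, complete) from a MAM <fin/> element."""
    fin = ET.fromstring(fin_xml)
    # The 'complete' attribute signals that there are no further pages.
    complete = fin.get("complete") == "true"
    rsm_set = fin.find(f"{RSM}set")
    if rsm_set is None:
        return None, None, complete
    first = rsm_set.findtext(f"{RSM}first")
    last = rsm_set.findtext(f"{RSM}last")
    return first, last, complete


first, last, complete = page_markers(FIN_XML)
# To fetch the next page, 'last' would go into <after/> of the next query;
# to page backwards, 'first' would go into <before/>.
print(first, last, complete)
```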
-
Sam
So you can't start a query with before/after?
-
jonas’
no
-
jonas’
(or: "exactly")
-
Sam
Hmm, that's what was recommended to me the other day and it seems to work
-
Zash
That depends on the RSM ... combination with other protocol
-
MattJ
In normal RSM, you can't. It works because XEP-0313 forces implementations to use archive IDs in RSM.
-
Sam
So it seems fine to do in 0313 then
-
jonas’
but why would you rely on this specificity of 313 instead of doing it generically?
-
jonas’
(especially if it might go away…)
-
Zash
Can you take it away tho, without going mam:3
-
Zash
(we're at mam:2 still, right? RIGHT?)
-
MattJ
It's unlikely to be going anywhere... :)
-
Sam
Because I am a normal user who knows nothing about XEP development and doesn't know it will go away and has started using it, making it harder to take away, just like all XEP development
-
jonas’
Zash, > Implementation of the protocol described herein is encouraged in exploratory implementations, but production systems are advised to carefully consider whether it is appropriate to deploy implementations of this protocol before it advances to a status of Draft.
-
Zash
jonas’, you know as well as me that nobody cares about that 🙂
-
jonas’
Zash, it's monday, shush ;P
-
Zash
Supply and demand
-
Zash
Maybe Draft should be forced once it's discovered that the XEP is used in production
-
jonas’
haha
-
Sam
I mean, if it's bad somehow fine, I won't use it, but this seems like a perfectly reasonable thing to do after reading the XEP. Also none of this is made clear anywhere, and other users are apparently doing it (it was Gajim that recommended using it to me), so if you don't want users doing it, seems like the XEP should have made it clear what the differences are, not just made two after's that are exactly the same ("you" being "people who don't want after/before to be used" not the original authors of MAM)
-
jonas’
to me, it's not clear why people think it's a fine thing to do after reading RSM, I'm still trying to figure out (for years) how to make it clear to folks what the difference between query arguments to the backing protocol (e.g. MAM, disco, jabber search) and <before/> and <after/> is.
-
Sam
Is there a place that says "after is something different, here's what it is" in MAM?
-
MattJ
I really don't think it's important, since it's a per-protocol thing anyway
-
MattJ
and MAM explicitly went down this route. I know it doesn't help RSM in the general case, but I think MAM was the first thing to truly use RSM in most implementations.
-
MattJ
Which is sad, because the next thing will mean everyone will either reimplement it or need to refactor it to be generic.
-
Sam
Hmm, okay, the one place that *might* indicate how after is used is the paragraph that starts "Having previously made a query that returned results limited by the server"
-
Sam
That seems to indicate that after can't be used in the first query. However, servers seem to support it, so maybe it's just a case of the XEP not covering edge cases well and servers doing the wrong thing
-
Sam
In which case, we either need to outright forbid it and go into that fight with implementations, or just accept it and codify it (which could possibly result in getting rid of after-id since no one uses it anyways yet and this would be identical at that point)
-
MattJ
It is codified in the XEP, it's too late for that
-
MattJ
"For the purposes of this protocol, the UIDs used by RSM correspond with the UIDs of the stanzas stored in the archive."
-
MattJ
and "Note: There is no concept of an "open query", and servers MUST be prepared to receive arbitrary page requests at any time."
-
Sam
So I don't see why people care that others are using it then or saying it's wrong, seems fine and removes the need to support archive-id which is just a duplicate ¯\_(ツ)_/¯
-
Sam
after-id, I mean
-
MattJ
Sam, because people have opinions. Mine isn't strong on this one :)
-
MattJ
MAM uses RSM because it needed a paging mechanism, and RSM was exactly that.
-
Sam
I mean, yah, sure, I'm just trying to understand if those opinions are grounded in some actual problem I'm going to run into
-
jonas’
Sam, unless you need to select a range of messages, because you can't do that with RSM alone.
-
MattJ
Later people requested arbitrary range requests, which RSM was not
-
Sam
jonas’: sure, that part makes sense, but I don't need to do that.
-
Sam
Also, I kind of wonder if that works just fine too, but I haven't tested it
-
jonas’
it's at least undefined behaviour as per the spec
-
MattJ
In glorious hindsight we'd have supported that from the beginning, and possibly would have decided not to use RSM (or not use it in the same way)
-
jonas’
so you might end up with different server behaviour across implementations and releases
-
Sam
Right, but I'm not using before and after, so that's fine. As far as I can tell, just querying for after *or* before is well specified and means they are exact duplicates of after-id and before-id (at least, when used on the first page). So that seems like a good strategy to have MAM that works *today* and not have to wait for after-id.
-
MattJ
Yes
-
Zash
> querying for after *or* before is well specified
in RSM, right
> and means they are exact duplicates of after-id and before-id
sans RSM related semantics, like how using before means paging in the other direction
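The difference in paging direction can be modelled with a toy in-memory archive. This is a sketch under assumptions, not server code: `rsm_page` and `before_id_filter` are hypothetical names, the archive is a plain list ordered oldest first, and it assumes that a `before-id` filter only narrows the result set while paging still starts from the oldest match by default.

```python
ARCHIVE = ["m1", "m2", "m3", "m4", "m5", "m6", "m7", "m8"]  # oldest first


def rsm_page(items, max_items, after=None, before=None):
    """Return one RSM-style page from an ordered list of IDs.

    <after/> yields the next max_items following the ID;
    <before/> yields the max_items immediately preceding it.
    """
    if after is not None and before is not None:
        # Underspecified in the discussion above, so not modelled here.
        raise ValueError("combining before and after is not modelled")
    if after is not None:
        start = items.index(after) + 1
        return items[start:start + max_items]
    if before is not None:
        end = items.index(before)
        return items[max(0, end - max_items):end]
    return items[:max_items]


def before_id_filter(items, before_id):
    """Model of the mam#extended before-id filter: keep everything older than the ID."""
    return items[:items.index(before_id)]


# RSM <before/>: the page sits directly in front of m6, so paging runs backwards.
print(rsm_page(ARCHIVE, 2, before="m6"))                    # ['m4', 'm5']

# before-id filter, then a default first page: paging starts at the oldest match.
print(rsm_page(before_id_filter(ARCHIVE, "m6"), 2))         # ['m1', 'm2']

# In the forward direction the two approaches return the same first page.
print(rsm_page(ARCHIVE, 2, after="m6"))                     # ['m7', 'm8']
```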
-
Sam
oh yah, good point; also doesn't matter for my use case thankfully (in fact, it's desirable), but that is a difference that could matter to some people.
-
Sam
Thanks for the help all. The only thing still nagging me is whether there's any benefit to continuing to use start/end to limit the initial query. At least from the client side I don't think there is, and if you're using a relational database I don't think there is on the server (assuming you have an index on the archive ID). With other non-relational databases I don't really know.
-
Sam
MattJ: would you be opposed to a summary of this discussion being added to the implementation notes in MAM, or possibly on that page on modernxmpp? I think I finally understand the picture here enough to write it down
-
MattJ
I was just thinking that (that it probably warrants an implementation note at this point)
-
Sam
*nods* I'll draft some text and send it your way shortly.
-
MattJ
Sounds good :)
-
flow
Sam, feel free to announce the PR for modernxmpp here as well, I'd be interested to read your text. Just to see if it matches my understanding too
-
Sam
flow: will do
-
flow
thanks :)
-
Sam
flow, MattJ: done: https://github.com/xsf/xeps/pull/1099