XSF Discussion - 2021-02-24


  1. SamWhited

    Looks like this co-op magazine about the web might be taking submissions if anyone is a good writer and wants to make a few extra dollars while writing about XMPP: https://compost.digital/

  2. MattJ

    SamWhited: > Does it matter if you include <after> in Result Set Management, but also set after-id in the MAM form? Does the latest ID win?
    The form is the filter query; RSM is for paging through the results of the query.

  3. jonas’

    form is where, RSM is limit

  4. jonas’

    form is WHERE, RSM is LIMIT

  5. Zash

    And OFFSET

  6. jonas’

    yep

  7. Zash

    Except how do you offset on an id?

  8. jonas’

    yes, it’s better than SQL in that regard

  9. jonas’

    but semantically it’s that

  10. SamWhited

    RSM has an 'after' concept too though.

  11. Zash

    SamWhited: I think you misunderstood what was said. The RSM 'after' corresponds to OFFSET in SQL

  12. SamWhited

    I don't see how that's different than after-id, still

  13. Zash

    after-id is WHERE, RSM after is OFFSET

  14. jonas’

    SamWhited, that `after` in MAM-RSM corresponds to the stanza ID is an implementation detail you must not rely on

  15. jonas’

    on the surface it doesn’t matter, but there are ugly edge cases and long mailing list discussions which came to be because of the lack of a WHERE which is separate from the RSM-<after/>/<before/>-LIMIT mechanism

  16. Zash

    The MAM stuff selects a subset of messages. RSM shifts a view over that subset.

  17. Zash

    FWIW while implementing extended MAM in Prosody I cheated and just put the after-id in the same slot in the internal API as the RSM after

  18. Zash

    But, implementation detail..

  19. jonas’

    shush!

  20. Zash

    Only matters for <count> I think, which should not be affected by RSM parameters, only MAM ones

  21. SamWhited

    ooh, I need to read RSM again I think then. Are you saying that 'after' is an int, not an ID?

  22. jonas’

    SamWhited, no, after is an opaque string

  23. jonas’

    you mustn’t interpret it (on the client side anyway)

  24. SamWhited

    I still have zero concept of what you mean by 'offset' and 'where' then

  25. Zash

    SamWhited: I think we were talking abstract concepts, not exact SQL mapping

  26. jonas’

    I was for sure

  27. Zash

    OFFSET in SQL is an int, yes. But conceptually you SELECT a range, and then shift the starting point with OFFSET

  28. jonas’

    no, you select a *subset of records*

  29. jonas’

    not necessarily a range (which is thought of to be regular or consecutive even)

  30. Zash

    /correct s/range/subset/

  31. Zash

    I must have thought of a SELECT with only after-id

  32. MattJ

    Step 1 is to run the query and get the results, step 2 is to return the correct page. If you use <after> in RSM with an id that isn't in the result set you'll get an item-not-found: https://xmpp.org/extensions/xep-0059.html#notfound
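
The two steps described above can be sketched in a few lines of Python. This is only a conceptual model, not how any particular server works: `Message`, `run_query`, `page`, and the `sender` field are hypothetical names invented for the illustration, and the archive is just an in-memory list.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


class ItemNotFound(Exception):
    """Raised when the RSM <after/> ID is not part of the result set."""


@dataclass
class Message:
    id: str      # archive ID, opaque to clients
    sender: str  # hypothetical field, used only as a filter in this sketch


def run_query(archive: List[Message], where: Callable[[Message], bool]) -> List[Message]:
    # Step 1: the MAM form decides which messages are in the result set.
    return [m for m in archive if where(m)]


def page(results: List[Message], after: Optional[str], max_items: int) -> List[Message]:
    # Step 2: RSM picks one page out of that result set.
    if after is None:
        return results[:max_items]
    ids = [m.id for m in results]
    if after not in ids:
        # The ID may exist in the archive, but it is not in *this* result set.
        raise ItemNotFound(after)
    start = ids.index(after) + 1
    return results[start:start + max_items]


archive = [
    Message("a", "romeo"),
    Message("b", "juliet"),
    Message("c", "romeo"),
    Message("d", "romeo"),
]
romeo_only = run_query(archive, lambda m: m.sender == "romeo")
first_page = page(romeo_only, after=None, max_items=2)
second_page = page(romeo_only, after=first_page[-1].id, max_items=2)
```

Here `page(romeo_only, after="b", max_items=2)` raises `ItemNotFound` even though "b" exists in the archive, because the filter excluded it from the result set.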

  33. MattJ

    (previous versions of 313 indicated otherwise, but this was fixed in one of the later revisions)

  34. SamWhited

    Okay, no longer on a phone and coffee has been acquired, I was hoping that would help, but I still don't see how that's different from after-id. I mean, they both contain "after" in the name and take an ID, they both return an error if the ID doesn't exist.

  35. SamWhited

    Going back to read the RSM description of "after" now.

  36. Kev

    It’s not a mechanical objection, AFAICT Sam, but a case of dogma not wanting to use RSM for this. Obviously it worked mechanically using RSM, because that was deployed and working.

  37. SamWhited

    Kev: I guess that's what I don't get then, it says "use RSM" but then also says "Also here's more form fields that appear the same"

  38. SamWhited

    Is it just that RSM requires max (I think?) and this is just "after but without requiring a limit"?

  39. SamWhited

    Or is it that RSM doesn't specify what the ID is and it might be an ID local to the result set and not the actual stanza ID like after-id takes? Although in that case I'd question why it didn't just say "use RSM, and use the stanza ID"

  40. Zash

    Isn't that exactly what MAM says?

  41. Zash

    You use RSM, IDs are stanza-ids.

  42. SamWhited

    Okay, then it definitely seems like exactly the same thing in two places in the stanza to me

  43. SamWhited

    Maybe someone can provide a SQL query that includes <after> and after-id how they would translate?

  44. Zash

    Again, conceptually, after-id would map to `WHERE item_id > after-id` and RSM after would map to `OFFSET after-first.id` or something (excuse the math)

  45. SamWhited

    So after-id is some sort of string comparison? I don't get that from reading the XEP

  46. SamWhited

    Nothing says that IDs have to be ordered as far as I can see

  47. Zash

    You're still taking this too literally

  48. SamWhited

    Okay, let's step back then, I don't understand what the difference between WHERE and OFFSET is in that query if it's not "one is doing a string comparison, one is saying 'give me rows that are after this row'"

  49. Zash

    So yeah in actual SQL it'd be more like `WHERE _internal_id > (SELECT _internal_id WHERE id=$after_id)`

  50. Zash

    Unless you actually do have ordered IDs
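
A runnable version of that idea, using Python's built-in `sqlite3`, might look like the following. The table name `archive`, its columns, and the sample IDs are all assumptions made up for this sketch; the only point is that the archive IDs themselves are not sortable and ordering comes from a separate internal key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE archive (_internal_id INTEGER PRIMARY KEY, id TEXT UNIQUE, body TEXT)"
)
# The archive IDs are deliberately not in sorted order;
# only _internal_id reflects the order of the archive.
rows = [(1, "zeta", "one"), (2, "alpha", "two"), (3, "mike", "three"), (4, "bravo", "four")]
conn.executemany("INSERT INTO archive VALUES (?, ?, ?)", rows)


def after_id(conn, after):
    """Return the IDs of all messages archived after the message with ID `after`."""
    cur = conn.execute(
        "SELECT id FROM archive "
        "WHERE _internal_id > (SELECT _internal_id FROM archive WHERE id = ?) "
        "ORDER BY _internal_id",
        (after,),
    )
    return [row[0] for row in cur.fetchall()]


print(after_id(conn, "alpha"))  # ['mike', 'bravo']
```

Note that this sketch quietly returns an empty list for an unknown ID (the subquery yields NULL), whereas a real server would have to detect that case and return an error instead.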

  51. SamWhited

    That seems *exactly* the same as offset then

  52. Zash

    But we can't count on that, it's an implementation detail

  53. Zash

    It's not

  54. Zash

    You could implement it either way, but it's not the same, conceptually.

  55. SamWhited

    I guess I don't see how. Or if they are the same but just "different conceptually" why bother having both?

  56. SamWhited

    Let me try this: would I ever actually *use* both in a MAM query?

  57. Zash

    You could also implement it by not having any limit, collect *all* the results and then throw away items until you go past the RSM after id

  58. Zash

    Then the MAM fields go in the SQL, RSM go into cropping the results afterwards

  59. SamWhited

    Where does it say that?

  60. Zash

    Using both after-id and rsm after would be pointless yes.

  61. SamWhited

    So what is the point of having both if using them both would be pointless?

  62. SamWhited

    Or does it actually specify how I should implement it somewhere?

  63. Zash

    It's written a bit up where MattJ said so.

  64. SamWhited

    That's just an error that gets returned?

  65. Zash

    > Step 1 is to run the query and get the results, step 2 is to return the correct page.

  66. Zash

    That part

  67. SamWhited

    Where does the XEP say that though?

  68. SamWhited

    And why would it force an implementation detail like that on me?

  69. SamWhited

    Eg. wouldn't I just have a single concept of "give me stuff after this ID" and then if I wanted to just directly query for that vs. query for everything, cache those results, and return subsets of them as more queries come in entirely my decision and I wouldn't need two afters to indicate either thing?

  70. MattJ

    Step 1: you want to query the archive between two ids
    Step 2: You formulate the query using after-id and before-id
    Step 3: You send the query to the server
    Step 4: The server returns the results, but there are many, so it returns the first page of 20 stanzas
    Step 5: You make a second request with the same query parameters, but specify <after>last-id-in-the-page</after> in your RSM payload
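
As a rough illustration of steps 2 and 5, the two query payloads could be built like this with Python's `xml.etree.ElementTree`. The helper `mam_query` and the IDs "id-100", "id-900", and "id-120" are invented for the example; the namespaces and the `after-id`/`before-id` field names are given as I understand them from XEP-0313 and XEP-0059 and should be checked against the specs. The surrounding `<iq/>` is left out.

```python
import xml.etree.ElementTree as ET

MAM = "urn:xmpp:mam:2"
FORM = "jabber:x:data"
RSM = "http://jabber.org/protocol/rsm"


def mam_query(after_id, before_id, max_items, rsm_after=None):
    """Build a <query/> whose form selects a range and whose RSM set selects a page."""
    query = ET.Element(f"{{{MAM}}}query")

    # The data form is the filter: it stays the same for every page.
    form = ET.SubElement(query, f"{{{FORM}}}x", {"type": "submit"})
    fields = [("FORM_TYPE", MAM), ("after-id", after_id), ("before-id", before_id)]
    for var, value in fields:
        attrs = {"var": var}
        if var == "FORM_TYPE":
            attrs["type"] = "hidden"
        field = ET.SubElement(form, f"{{{FORM}}}field", attrs)
        ET.SubElement(field, f"{{{FORM}}}value").text = value

    # The RSM set is the paging request: only this part changes between pages.
    rset = ET.SubElement(query, f"{{{RSM}}}set")
    ET.SubElement(rset, f"{{{RSM}}}max").text = str(max_items)
    if rsm_after is not None:
        ET.SubElement(rset, f"{{{RSM}}}after").text = rsm_after
    return query


# Step 2: the initial query between two IDs, first page of 20.
first = mam_query("id-100", "id-900", 20)
# Step 5: same form, plus <after/> set to the last ID of the previous page.
second = mam_query("id-100", "id-900", 20, rsm_after="id-120")
print(ET.tostring(second, encoding="unicode"))
```

The form is byte-for-byte identical in both requests; only the RSM element differs, which is the "tack on the page request" approach rather than recalculating the query parameters.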

  71. MattJ

    You *could* just update after-id in the query parameters, you'll get the same messages

  72. SamWhited

    Why not just make the query specifying <after> to start with, is what I'm asking?

  73. edhelas

    "how to mimic SQL requests on top of a protocol where the server is actually doing SQL requests behind"

  74. MattJ

    after-id didn't always exist, and it has different semantics (e.g. you can't specify before+after in RSM)

  75. MattJ

    But yes, they do overlap, but they are conceptually different

  76. Zash

    It matters more for before, since that implicitly makes it so you start at the end of the results and page backwards
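
A small sketch of why `before` behaves differently from a `before-id` filter, again as a conceptual model only: `page_before` and `first_page` are hypothetical helpers, the result set is a plain list of IDs, and an empty RSM `<before/>` is modelled as the empty string.

```python
from typing import List


def page_before(results: List[str], before: str, max_items: int) -> List[str]:
    # RSM <before/>: return the page that ends just before the given ID.
    # An empty value means "the last page of the result set".
    # (results.index raises ValueError for an unknown ID in this sketch.)
    end = len(results) if before == "" else results.index(before)
    start = max(0, end - max_items)
    return results[start:end]


def first_page(results: List[str], max_items: int) -> List[str]:
    # Default paging direction: start at the beginning of the result set.
    return results[:max_items]


results = ["m1", "m2", "m3", "m4", "m5", "m6", "m7"]

# Paging backwards with RSM <before/>.
last = page_before(results, "", 3)           # ['m5', 'm6', 'm7']
previous = page_before(results, last[0], 3)  # ['m2', 'm3', 'm4']

# A before-id filter only bounds the result set; paging still starts at the front.
bounded = results[:results.index("m5")]
front = first_page(bounded, 3)               # ['m1', 'm2', 'm3']
```

With the same ID and the same page size, the RSM `before` page is m2..m4 while the `before-id` filter yields m1..m3 as its first page, so here the two are not interchangeable.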

  77. SamWhited

    I don't see why MAM couldn't just say "if before and after are in RSM it's a range in this spec", but okay, that's the first thing that's vaguely sounded like a real difference to me, thanks

  78. MattJ

    So I recommend using each one for its intended purpose

  79. SamWhited

    ahhh, okay, before meaning you go backwards makes sense, so it would need a different concept I guess. That's absolutely infuriating, but makes sense.

  80. SamWhited

    Okay, I dunno, I guess I still don't see why MAM couldn't just specify that, but I guess it's a tiny bit weird modifying RSM behavior just for one spec. This seems like a major failure of RSM though that we had to include another concept of the exact same thing, but also sort of use RSM but only sometimes, etc.

  81. MattJ

    I think RSM is pretty good at what it's designed for

  82. MattJ

    It's not designed for querying

  83. SamWhited

    Fair enough; seems like it just shouldn't be used in MAM at all then.

  84. MattJ

    and the alternative would be duplicating all the paging syntax and semantics into MAM

  85. SamWhited

    Maybe I'm really the only one who's confused by this, but that would seem to be better. If RSM just doesn't work for what MAM's trying to do, shoehorning it in and duplicating some of its concepts just seems confusing to me.

  86. MattJ

    It does work

  87. MattJ

    RSM is not for querying, MAM is not for paging

  88. MattJ

    I really don't see the problem :)

  89. MattJ

    Unless you choose to confuse the two

  90. SamWhited

    That seems like a distinction without a difference to me in this particular case.

  91. MattJ

    If you choose to treat them as the same thing, then you're inviting confusion in edge cases

  92. SamWhited

    I dunno, maybe I'm just being obtuse since no one else seems to have a problem with this, but saying "there's after and there's after, they do the same thing but one of them is for querying initially and one is for subsequent queries to get pages later" just seems insane

  93. MattJ

    I don't see a problem with "here is a query", and "here is how you access further pages of the same query"

  94. SamWhited

    But again, I seem to be the only one confused by this so I guess I'm wrong 🤷

  95. MattJ

    You decide what results you are interested in, and then you page through until you're done

  96. SamWhited

    And the spec says *none* of this as far as I can tell. There was one throw away comment about how after/before behavior wasn't defined in RSM, that's it I think

  97. SamWhited

    The spec just leaves you wondering why after-id and after aren't the same thing.

  98. MattJ

    Says none of what exactly?

  99. MattJ

    "A client or server will typically want to limit the number of results transmitted at a time, thereby breaking the result stream into smaller 'pages'. For this purpose a server MUST support Result Set Management (XEP-0059) [4] and MUST support the paging mechanism defined therein."

  100. MattJ

    That's the section that introduces RSM

  101. SamWhited

    Doesn't say that RSM is for querying after the first query and after-id is for the first query or whatever

  102. MattJ

    Because that's not the case

  103. SamWhited

    It just introduces two concepts, after and after-id that at first glance appear identical.

  104. SamWhited

    I swear that's literally what you just said, no?

  105. MattJ

    RSM is not designed for querying

  106. MattJ

    Yes, you can get the same result set two different ways (after-id or using RSM) in specific query types

  107. MattJ

    But you're designing an API here

  108. MattJ

    So this should be of benefit to you, you need to get it right

  109. Zash

    grep vs more ?

  110. SamWhited

    "filtering vs. querying" then. The point is that the spec just says "you can use RSM, it has after, there is also after-id in the form" (not an actual quote, obviously).

  111. MattJ

    1. choose the results you are interested in 2. select the page

  112. MattJ

    In an API these things should be 100% different

  113. MattJ

    For some queries (where you only care about ids), sure, you could do everything with RSM

  114. MattJ

    For anything more complex, they are not the same thing

  115. SamWhited

    I'm not sure if this is offtopic or not, because I honestly can't tell how it applies to after-id vs. after. But when zash says "grep vs more" or MattJ says "1. choose the results you are interested in 2. select the page", does it matter that I'm doing that behind the scenes? I'm not sure how to phrase this, but that just sounds like an implementation detail that doesn't matter.

  116. SamWhited

    I would probably *never* do that in an implementation. I would always just make small queries, return results. Those results include info for making the next query. I would never actually pull out the entire archive within a range and then allow smaller queries to be filtered on that.

  117. SamWhited

    I can't tell if it actually matters for this discussion though.

  118. MattJ

    Just make your API nice, I don't care what mechanics you use if it works and is a nice API

  119. MattJ

    If you do it wrong I just reserve the right to never look at the code :)

  120. SamWhited

    I *think* my API is nice, but as far as I can tell no matter how complex the query you could set the ID in after or after-id *unless* you need a window and want to set after/before. So the XMPP API still seems really weird to me.

  121. SamWhited

    (assuming you're using an ID and want stuff after it at all, I mean, obviously you could also set neither for the initial query)

  122. SamWhited

    So I can do what you say and use after-id the first time and <after> afterwards, it just seems worth clarifying in the XEP that you should always do that.

  123. MattJ

    You don't just use after-id the first time, you can include it every time

  124. SamWhited

    Oh crap, okay, now I'm back to square one and have no idea what you're talking about or why I'd do that.

  125. MattJ

    Ok, if we're back to square one then I have better things to do :)

  126. MattJ

    Just implement it, I'm sure you'll manage

  127. SamWhited

    Maybe it really is just me in which case fair enough, but I really don't see how this distinction isn't confusing and doesn't need to be clarified in the XEP.

  128. MattJ

    Also consider whether you want to make RSM reusable in your codebase

  129. MattJ

    Because it sounds like you're not even considering that

  130. SamWhited

    RSM is already reusable in the codebase.

  131. MattJ

    But you don't want to reuse it for some reason, even though MAM explicitly allows it

  132. MattJ

    You would rather recalculate MAM query parameters instead of just tacking on the RSM page request

  133. SamWhited

    I mean, I want to reuse it if it fits, but it sounds like MAM is having to partially re-invent it which is just confusing.

  134. MattJ

    I don't care what you do, but anything else seems more work

  135. MattJ

    MAM is not partially reinventing it because RSM does *not* do querying

  136. MattJ

    If you read the XEP that's pretty clear

  137. SamWhited

    I don't have to recalculate anything, it sounds like *exactly* the same thing to me which is what I still don't get. You make a query, get results back. You make another query and change <after> (or after-id) to the last one you got from the previous query.

  138. MattJ

    The RSM XEP, that is

  139. SamWhited

    MAM you have to make a query. RSM picks the page you get. I get that, I just don't see how it changes that after and after-id are the same thing.

  140. MattJ

    If you start with a time range query you're not just changing after-id, you're adding it. And if you want the last page first, what are you going to do then?

  141. MattJ

    Yes, I 100% agree that in some cases you can achieve the same query in two different ways

  142. MattJ

    I 100% don't think that this is something that needs to take up an afternoon of discussion

  143. MattJ

    If your API has the concept of a query, and the concept of a page, they map perfectly to MAM and to RSM

  144. SamWhited

    > If you start with a time range query you're not just changing after-id, you're adding it.
    Sure, but that would be the exact same thing still if you use after-id or <after>, right? Yes, you can also make other queries, I don't see how that changes anything.
    > And if you want the last page first, what are you going to do then?
    You make the query using <before/> just like both MAM and RSM say to do, it has nothing to do with after/after-id being duplicated?

  145. MattJ

    if you want to conflate the two, feel free to do that

  146. SamWhited

    Okay, I dunno, I don't think I'm conflating anything and I don't see why that distinction should even exist, which is what's concerning me, but I'll go re-read the relevant sections of MAM again and see if I can find anything that clarifies this that I might have forgotten. Thanks for your help.

  147. SamWhited

    This is just driving me *nuts* because I don't understand the miscommunication. I'm not sure if I'm asking something wrong or there's some assumption others have that I don't, or what. It literally seems like we just have two identical things in the XEP and the only reason is that *technically* there is some slightly semantic difference in the words we use to describe them but no practical difference.

  148. Zash

    I couldn't consume coffee fast enough to keep up.

  149. SamWhited

    Okay, sorry, last thing just to make sure I'm being absolutely clear about what's confusing me with a practical example, then I'll just give up and go re-read MAM again:

  150. SamWhited

    Is there any difference in results that should be returned for the following two queries: https://gist.github.com/SamWhited/bdc7b2465a5f91d92bc881a3367d97d3

  151. Zash

    SamWhited: Same messages, but they might differ on <rsm:count>

  152. SamWhited

    ahhhh, okay, that's the difference, I understand now I think.

  153. SamWhited

    There is actually a thing that would change, not just some vague semantic difference in wording but in practice they're identical.

  154. SamWhited

    I am not sure what I should have asked to get at that sooner, but maybe the lesson is "always include examples".

  155. Zash

    Since that count is the count of the whole set, if no RSM or limits were involved. But it's also not guaranteed to be exact.

  156. MattJ

    or present
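
The difference in `<count/>` can be shown with a toy model. The function `query` and the message IDs are hypothetical; the count here is exact and always returned, which, as noted above, a real server does not have to guarantee.

```python
from typing import List, Optional, Tuple


def query(
    archive: List[str],
    after_id: Optional[str] = None,
    rsm_after: Optional[str] = None,
    max_items: int = 2,
) -> Tuple[List[str], int]:
    # Filter step (MAM form): after-id shrinks the result set itself.
    results = archive
    if after_id is not None:
        results = archive[archive.index(after_id) + 1:]

    # The count describes the whole result set, ignoring any paging.
    count = len(results)

    # Paging step (RSM): <after/> only moves the window inside the result set.
    start = 0
    if rsm_after is not None:
        start = results.index(rsm_after) + 1
    return results[start:start + max_items], count


archive = ["m1", "m2", "m3", "m4", "m5", "m6"]
print(query(archive, after_id="m2"))   # (['m3', 'm4'], 4)
print(query(archive, rsm_after="m2"))  # (['m3', 'm4'], 6)
```

Both calls return the same messages, but the reported count is 4 with `after-id` and 6 with RSM `after`, since only the form changes the size of the result set.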

  157. Zash

    Examples good.

  158. SamWhited

    Sure, I definitely wouldn't include that (it's very expensive in postgres without lots of hackery), but at least there is a real practical difference between the two concepts now.

  159. MattJ

    There already was (besides the few offered earlier)

  160. Zash

    I think in Prosody we only include it if you also ask for rsm:max=0

  161. MattJ

    Anyway, back to work

  162. SamWhited

    Going back I don't see how I would have gotten there from any of the previous replies, they all seem to sound like "sure, they're identical, but we can use different words to describe them"

  163. SamWhited

    I'm sure I was just asking something weird though and not making the confusion clear.

  164. SamWhited

    (ignoring the before/after thing, that one I agree is vaguely a reason, it just also seems like it would have been easy enough to do in MAM)

  165. Kev

    It is possible that not using RSM at all would be sensible. It would certainly be possible.

  166. Kev

    It seemed like a good idea at the time, though.

  167. SamWhited

    To be clear, I am not advocating for a specific solution. I wasn't even sure there was a problem. I was just trying to understand why both after/after-id existed (or, with the reason I was given, why it actually matters). I think "count would be different" is the reason.

  168. SamWhited

    MattJ: are there any pending TODOs on MAM? Would you mind if the editor did another LC?

  169. MattJ

    Nothing pending, no

  170. SamWhited

    Would you mind if a LC happened then? There is one "FIXME" in the text that I can propose an editorial change for after work, but otherwise I'd like to see it advance if possible.

  171. SamWhited

    (unless of course you'd prefer to do it as the author, I just thought I'd offer)

  172. MattJ

    Nope, go for it

  173. SamWhited

    Thanks

  174. SamWhited

    /cc whomever the editor is these days. jonas’ maybe?

  175. MattJ

    The last revision was intended to be the last revision, I just wanted to wait a while for feedback

  176. MattJ

    It's been a while, so I'm happy with moving it on now

  177. Kev

    Maybe I should actually read the latest version :D

  178. jonas’

    SamWhited, ACK, maybe ask for LC together with your PR is easiest

  179. SamWhited

    MattJ: that's sort of what I was thinking. Given how small and mostly-inactive this community is, I figure just doing a LC is probably the best way to actually get people to look at it :)

  180. jonas’

    SamWhited, ACK, maybe ask for LC together with your PR that should be easiest

  181. jonas’

    Kev, well, that’s what the LC is for :)

  182. MattJ

    I also gave plenty of opportunity for feedback before submitting the revision too, so another argument for moving ahead

  183. SamWhited

    Will do.

  184. jonas’

    SamWhited, thanks!

  185. Kev

    jonas’: Well, kinda. I’m Author too ;)

  186. Zash

    Haha

  187. jonas’

    Kev, are you sure? ;)

  188. Kev

    :(

  189. arc

    Meeting time?

  190. jonas’

    arc, wrong day of week?

  191. jonas’

    or did board reschedule?

  192. arc

    We rescheduled

  193. MattJ

    Wait what

  194. jonas’

    oh, was that not announced on board@?

  195. jonas’

    oh, in the minutes

  196. MattJ

    I guess I should have read those more thoroughly

  197. MattJ

    It's very unlikely I'll be able to attend

  198. arc

    I would happily be wrong on this

  199. ralphm

    We could skip though, since we're only half a week on

  200. arc

    I'm cool with that

  201. ralphm

    MattJ, does the new slot work in general?

  202. arc

    I just got out of the ER with extremely high blood pressure

  203. ralphm

    Dude. Take care of yourself first.

  204. ralphm

    I make it motion if necessary

  205. ralphm

    I'll make it a motion if necessary

  206. jonas’

    floor comment: I support that motion

  207. ralphm

    Motion carries :D

  208. jonas’

    ralphm, MR 20201126T15:04:59Z 000 <MattJ>  I generally have a hard cut-off at 1700Z in winter, 1600Z in summer

  209. jonas’

    followed-up by: MR 20201126T15:07:53Z 000 <MattJ>  If we must, I can probably do that on a Thursday or Friday, at least during the winter

  210. ralphm

    Sorry my exocortex didn't raise that

  211. ralphm not banging the gavel for the Board Meeting => +1W

  212. MattJ

    I can't do this time on any Wednesday and sensibly participate. My wife is working and I look after the children 😕

  213. ralphm

    Ok. Back to the drawing board then

  214. emus

    Have you tried using those appointment finding polls?

  215. DebXWoody

    OpenPGP for XMPP - Context IM Is there a valid use case where a user has two OpenPGP keys which both are associated with one XMPP account?

  216. vanitasvitae

    I mean, the spec does not forbid it

  217. vanitasvitae

    DebXWoody ^

  218. vanitasvitae

    So why not

  219. vanitasvitae

    I think that's partly why there is the metadata node with public key list: https://xmpp.org/extensions/xep-0373.html#announcing-pubkey-list

  220. DebXWoody

    (on one device ). To have more keys in general (multi device) yes. But about more than one key in one local keyring.

  221. vanitasvitae

    I think there is no point in that. Then again I see no interop-concerns if a device has two secret keys. There might be a use-case for that as well (different keys presented to different contacts, like a work-key and a free-time key?)

  222. vanitasvitae

    But other devices would simply see two keys as if they would belong to different devices as there is no binding between devices and keys

  223. vanitasvitae

    So if your client wants to allow two secret keys on a device, I see no issues with that 😀

  224. DebXWoody

    I won't 😆 The user has to choose which key he will use to sign a message. I was thinking to show an error, but I don't know if there may be some use cases.

  225. vanitasvitae

    if I'd implement OX, I'd either generate a key for the user, or if the user can provide their own keys, I'd ask them for the key-ID they want to use 🙂

  226. vanitasvitae

    btw. DebXWoody are you subscribed to standards@?

  227. vanitasvitae

    I just dropped a mail asking for possible dates for a regular OX meeting 😉