XSF Discussion - 2021-05-04


  1. Holger

    I'd like to support PubSub nodes with an unlimited number of items. AFAICS there's no `max_items=unlimited` thing. We now have `max_items=max`, but I think the meaning is a little different. (Plus I'm unsure whether there's an agreement on whether `max` is going to stay that way, now?)

  2. Zash

    And so it begins

  3. Kev

    max was as many as the deployment will allow, wasn’t it?

  4. Holger

    So my idea of how to proceed is: I'll submit a PR that adds support for `max_items=unlimited` to 0060, everyone's thrilled by the idea, it's merged this afternoon! That's how things work, right?

  5. Holger

    Kev, yeah.

  6. Kev

    Which, if a deployment allows unlimited items, would mean unlimited.

  7. Kev

    And if the deployment doesn’t allow unlimited would mean as close to unlimited as you can get.

  8. Holger

    So I guess the _idea_ was something like "client configures max_items=max" and sees the resulting limit when querying the config from the server.

  9. Holger

    Where "the resulting limit" is an integer.

  10. MattJ

    Not necessarily (IMHO)

  11. Holger

    Or is it fine for the server to return `max_items=max`?

  12. Zash

    IIRC yes

  13. Zash

    Only the server would know the true limit

  14. Holger

    Might be two questions: (1) is it fine as per the wording, (2) is it fine as per the author's intention.

  15. MattJ

    Given that that value may change over time (based on configuration, implementation default, etc.)

  16. Zash

    Clients would just see 'max'

  17. Holger

    Ok. Works for me. Kinda even better than my idea of how to proceed 🙂

  18. Holger

    So people were fine with the `max` thing, or at least gave up on going `-1` or something instead?

  19. Holger

    I.e. if I add support, there's a slight chance it's still in line with 0060 next month? 🙂

  20. MattJ

    There's a significant chance

  21. Holger

    \o/

  22. Zash

    Where's that PR?

  23. MattJ

    I think we decided it's probably easiest at this point to add a new type for this field

  24. MattJ

    (and similar fields in other XEPs)

  25. Holger

    The `max` thing is already in 0060.

  26. Zash

    🙁

  27. MattJ

    Holger, https://github.com/xsf/xeps/pull/988

  28. Holger

    I just was unsure whether it (a) can be (ab)used for my use case and if so, (b) whether it's there to stay.

  29. Holger

    Ah.

  30. Holger

    MattJ, thanks.

  31. MattJ

    "unlimited" will throw a spanner in the works

  32. Holger

    Heh.

  33. Holger

    I'll shut up then 🙂

  34. Holger

    And work with `max`.

  35. MattJ

    I think in my mind "max" was "unlimited" (or as high as is possible)

  36. MattJ

    which is why it doesn't make sense to translate it back to a number

  37. Holger

    Hrm.

  38. MattJ

    It's unfortunate that the client can't then know whether it's actually unlimited

  39. Holger

    Strictly speaking I would still think it does make sense.

  40. Holger

    Right.

  41. Kev

    It both does and doesn’t.

  42. MattJ

    You're right and wrong

  43. Kev

    I agree with 50% of that :)

  44. Kev

    A disco feature for max-means-unlimited might make sense, I guess. Or some way of making the max discoverable (which I don’t think it currently is, but might misremember)

  45. Holger

    I would still think the proper solution is to have both `unlimited` and `max`. They mean different things. Having max-means-$whatever in disco instead sounds weirdo to me.

  46. MattJ

    Having both seems weird

  47. Kev

    The two reasons that come to mind for it making sense to convert back to a number for display purposes (ideally while persisting ‘max’ internally) are that ‘max’ will break unsuspecting clients, and that it’s nice to know what the max is. The reason for not converting to an int is that you want the client to know it’s pinned at the max.

  48. Holger

    MattJ, why?

  49. MattJ

    If unlimited exists, why would anyone ever use max? Unlimited is higher than max, or at least implies there is no maximum

  50. Kev

    Or, why would anyone ever use unlimited, I guess. Because you probably want either unlimited or as close to it as you can get, which is what max gives you.

  51. Holger

    There's either a limit, expressed as a non-negative integer, or no limit, expressed as `unlimited`. The client won't know. So we have the special value `max` which means `unlimited` or, if not possible, the highest limit allowed.
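
The three kinds of value Holger describes can be sketched as a small parser. This is only an illustration: `max` is the special value already in XEP-0060, while `unlimited` is the hypothetical value proposed in this discussion, and the function name `parse_max_items` is made up for the example.

```python
from typing import Union

MAX = "max"              # special value already present in XEP-0060
UNLIMITED = "unlimited"  # hypothetical value proposed in this discussion


def parse_max_items(raw: str) -> Union[int, str]:
    """Classify a max_items field value as an integer limit or a keyword."""
    value = raw.strip()
    if value in (MAX, UNLIMITED):
        return value
    # Accept only plain non-negative decimal integers.
    if value.isascii() and value.isdigit():
        return int(value)
    raise ValueError(f"invalid max_items value: {raw!r}")
```

A caller would then branch on whether it got an `int`, `"max"`, or `"unlimited"` back.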

  52. MattJ

    The whole "max" thing was added because clients wanted to be able to shove as much into the node as they could, without picking an arbitrary number

  53. Holger

    Right.

  54. Kev

    *Unless* you want unlimited and anything less than that you want to error on.

  56. Holger

    Right.

  57. MattJ

    Which is exactly what you're after with "unlimited"

  58. Holger

    Yes.

  59. Zash

    And we couldn't have advertised the server limit because clients want it pinned to the max, even if it changes

  60. Holger

    In my specific use case that's actually what I'd want.

  61. Holger

    (Return an error if `unlimited` is not allowed.)

  62. MattJ

    I don't think it makes sense for a server that allows unlimited pubsub to also have an arbitrary "max" value

  63. Kev

    Holger: But in that case, the absence of a “max means unlimited” feature would work for that, no?

  64. Holger

    I can live without that error, but it would be the proper solution.

  65. Holger

    Kev, yes, I'm trying to argue for the proper solution being to have both `max` and `unlimited`. In which case max would of course not mean unlimited.

  66. Zash

    So we just shove more enum entries in that?

  67. Zash

    Why not `default` too‽

  68. Holger

    In my book, yes. enum entries are relatively cheap, no?

  69. MattJ

    Especially when mixed into the same data type as arbitrary integers

  70. Holger

    > I don't think it makes sense for a server that allows unlimited pubsub to also have an arbitrary "max" value

    I don't quite understand.

  71. Kev

    > In which case max would of course not mean unlimited.

    No? I’d have expected when ‘unlimited’ was supported for ‘max’ to mean ‘unlimited’.

  72. Holger

    Kev, ah. Yes 🙂

  73. MattJ

    Holger, what's not to understand? What would "max" be if the server allows "unlimited"?

  74. Holger

    Seems I'm having a hard time expressing myself.

  75. MattJ

    Client developers that pushed for "max" didn't want it because they wanted some arbitrary limit

  76. Kev

    Welcome to the Internet :)

  77. Holger

    If the server allows unlimited and the client requests `max` then the result is unlimited.

  78. Zash

    From the client's perspective, if they configure with limit=unlimited, then if it's supported, everything is ok, but if it's not, then they get a proper error.

  79. MattJ

    They pushed for it because they wanted no limit, but aren't necessarily going to get that

  80. Daniel

    Max was certainly intended to mean unlimited (from a client's perspective)

  81. MattJ

    ^

  82. Zash

    Eugh

  83. Kev

    > Max was certainly intended to mean unlimited (from a client's perspective)

    With the realisation that some arbitrarily high limit might exist, because unlimited storage hasn’t been invented yet, except in /dev/null, right? :)

  84. Holger

    Daniel, my assumption was, for the case where the server allows 1000, `max` means 1000.

  85. MattJ

    Holger, that's correct. But such a server wouldn't also allow unlimited entries. Would it?

  86. MattJ

    If it did, then "max" == unlimited

  87. Kev

    What would probably be neatest would be for a server to return an annotation when a client configures the node to ‘max’ saying what the effective value of ‘max’ is, with ‘unlimited’ being a possible value :)

  88. Kev

    Such that a Holger client can abort because 9,223,372,036,854,775,807 isn’t high enough, but everyone else can not care :)

  89. Holger

    MattJ, I don't get it 🙂 If I configure my server to only allow 1000 items for a given node, then this server won't at the same time allow an unlimited number of items for that node, no.

  90. Holger

    But you obviously meant something else 🙂

  91. Daniel

    My wishful thinking was that whatever internal limits the server might have won't matter in practice because they are high enough

  92. Daniel

    So max would be a de facto unlimited

  93. MattJ

    Holger, no, that's exactly what I meant

  94. MattJ

    Just pretend `max` means unlimited and carry on :)

  95. Holger

    So if you guys are telling me to just use a super-large integer in place of `unlimited` because it has the same effect in practice then I at least get the idea. But no I'm not going to implement `if limit < 9,223,372,036,854,775,807 then delete_old_items()` 🙂

  96. Holger

    MattJ, yes yes, works for me.

  97. MattJ

    Are you talking about the client or server perspective here?

  98. Holger

    Server.

  99. MattJ

    I'm not sure why the server needs to care. Either you have a limit, or you don't

  100. Holger

    I'd like to allow clients to configure nodes without limit.

  101. MattJ

    If you don't, then max really does mean unlimited, not some arbitrary value

  102. Zash

    limit=max → high as it goes plz

    limit=unlimited → give me unlimited, or give me ERROR
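
Zash's two rules could look roughly like this on the server side. It is a sketch under assumptions: `unlimited` is not a value XEP-0060 defines, `NotAcceptable` stands in for whatever error the service would actually return, and rejecting an integer above the cap (rather than clamping it) is a choice made for the example, not something stated in the discussion.

```python
from typing import Optional, Union


class NotAcceptable(Exception):
    """Stand-in for whatever error the service would return."""


def resolve_limit(requested: Union[int, str],
                  server_cap: Optional[int]) -> Optional[int]:
    """Return the effective item limit; None means "no limit".

    server_cap is the highest limit the deployment allows, or None if the
    deployment allows an unlimited number of items.
    """
    if requested == "max":
        # As high as it goes, including no limit at all.
        return server_cap
    if requested == "unlimited":
        # Hypothetical value: unlimited, or an error.
        if server_cap is not None:
            raise NotAcceptable("unlimited items are not allowed here")
        return None
    # Plain integer request (assumption: values above the cap are rejected).
    if server_cap is not None and requested > server_cap:
        raise NotAcceptable("requested limit exceeds the server cap")
    return requested
```

With a cap of 1000, `max` resolves to 1000 while `unlimited` fails; with no cap, both resolve to "no limit".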

  103. Zash

    Hm

  104. Zash

    Wouldn't adding entries to this enum require yetanotherxeprevision?

  105. Holger

    Yup. I totally get how this solution works in practice. Assuming it's fine to return `max_items=max` when the client asks for the node config.

  106. MattJ

    Which would error on most servers, because unrestricted resource consumption is unrealistic for most deployments :)

  107. Zash

    Aaaaaaaaah

  108. MattJ

    Holger, it is

  109. Holger

    I'll insist that strictly speaking, it is hacky and wrong, but it works for me 🙂

  110. MattJ

    It would be wrong to not return "max" in the node config

  111. MattJ

    Whether you have a limit or not

  112. Holger

    Wait, you're saying I should return `max` even in the case where there _is_ a limit?

  113. MattJ

    Yes, because that's what the configuration is set to, "max"

  114. Holger

    So for a node with max_items=1000. If the client configured `max_items=1000` I should return `1000`, if the client said `max_items=max` I should return `max`?
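
The behaviour MattJ argues for (echo back whatever was configured, and work out the effective limit separately) might be sketched as follows. The class and method names are hypothetical, and this reflects only one side of a point the participants did not settle.

```python
from typing import Optional, Union


class NodeConfig:
    """Keeps max_items exactly as the client configured it."""

    def __init__(self, max_items: Union[int, str]):
        self.max_items = max_items  # e.g. 1000 or "max"

    def reported_value(self) -> str:
        """Value returned when the client asks for the node config."""
        return str(self.max_items)

    def effective_limit(self, server_cap: Optional[int]) -> Optional[int]:
        """Limit actually enforced right now; None means no limit."""
        if self.max_items == "max":
            return server_cap  # follows the cap even if it changes later
        return self.max_items
```

A node configured with `max` keeps reporting `max`, while its effective limit tracks the current server cap.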

  115. Kev

    Did we bump the xep60 config namespace for this? Because if not, returning max to an unsuspecting client under the previous namespace that only supported ints would be wrong :D

  116. MattJ

    Kev, sure, it broke Prosody assumptions about it being an integer too *shrug* :)

  117. MattJ

    Why do clients get a namespace bump but servers don't?

  118. Kev

    Because the server would reject max as being invalid until it supported max?

  119. flow

    Holger, IIRC 'max' should only be sent by the configuring entity and the real integer value is returned by the pubsub service

  120. flow

    but I could be wrong

  121. Kev

    (Although I note that there are definitely deployments that are doing checking against schemas (not necessarily XEP schemas), and so any changing of protocol like this without a bump breaks them, and that’s bad)

  122. flow

    and I wouldn't be surprised if this is not clearly stated in the XEP

  123. Holger

    flow, that's what I assumed, and then everyone told me I was wrong 🙂

  124. Kev

    (Checking about schemas on the way through a routing entity, that is)

  125. Holger

    Right, it's not clearly stated.

  126. Kev

    Holger: I think only Matt said that’s wrong :)

  127. Zash

    And why can't we do nice things like advertise the types in the config form?

  129. flow

    pep tried to get the types in data forms validation, but that process stalled

  130. MattJ

    There are no schemas, since it's a data form

  131. MattJ

    and it's text-single

  132. Holger

    Kev, ok, and admittedly he only said half-wrong half-right, but in the end, wrong, or something 🙂

  133. flow

    Holger, I am curious: why do you want unlimited? as Kev pointed out, the infinite tape of a Turing machine is a theoretical construct ;)

  134. Zash

    MattJ, dataforms validation is sorta like schema tho?

  135. MattJ

    Zash, yes, but not used by XEP-0060

  136. Zash

    Nice things, can not have.

  137. flow

    and as server dev, I'd expect you to be very cautious about resource usage (and resource domination by a single client) anyway :)

  138. Holger

    flow, well, I'm coming from deletion of old items potentially being expensive.

  139. Holger

    flow, it's precisely about resource usage, just different resources 🙂

  140. flow

    Holger, so in your scenario, a client that continuously creates pubsub items would eventually run into which limit?

  141. Holger

    Right now we delete the oldest item(s) when a new one is published.

  142. flow

    volume size of the volume the database is stored on?

  143. Holger

    That can be an expensive operation if you have millions of items.

  144. Holger

    (Which is why we don't support this with MAM, for example.)

  145. flow

    doesn't Postgres have some vacuum mechanism for that? ;)

  146. Holger

    Just the query to look up the oldest item(s) is too expensive.

  147. Zash

    Stuff them in buckets, sized after about the number of items you expect to delete each time, something something
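
One possible reading of Zash's bucket idea, as an in-memory sketch: items are appended to fixed-size buckets, and when the node goes over its limit a whole bucket is dropped at once instead of looking up individual oldest items. The class name, the in-memory storage, and the choice to let the node hold slightly fewer than `max_items` after a drop are all assumptions for illustration; a real server would do this against its database.

```python
from collections import deque
from typing import Deque, List


class BucketedItems:
    """Stores items in fixed-size buckets and expires whole buckets."""

    def __init__(self, max_items: int, bucket_size: int):
        self.max_items = max_items
        self.bucket_size = bucket_size
        self.buckets: Deque[List[str]] = deque()
        self.count = 0

    def publish(self, item: str) -> None:
        # Start a new bucket when there is none or the newest one is full.
        if not self.buckets or len(self.buckets[-1]) >= self.bucket_size:
            self.buckets.append([])
        self.buckets[-1].append(item)
        self.count += 1
        # Over the limit: drop the oldest bucket in one cheap operation.
        while self.count > self.max_items and len(self.buckets) > 1:
            self.count -= len(self.buckets.popleft())

    def items(self) -> List[str]:
        """All retained items, oldest first."""
        return [item for bucket in self.buckets for item in bucket]
```

The trade-off is that the retained count floats between roughly `max_items - bucket_size` and `max_items` rather than sitting exactly at the limit.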

  148. Holger

    But I'm not sure this matters. I mean I'm highly interested in implementation suggestions for my problem at hand. But I do think we shouldn't enforce having some limit ("just choose a large one!") protocol-wise, either way.

  149. flow

    I am asking because I wonder if your "unlimited" is not really "very high and I can't provide the exact limit but there is one, I guarantee" limited

  151. Holger

    There's no fixed limit, no.

  152. MattJ

    What we're saying is that there is, when you run out of disk space :)

  153. Holger

    The server might clean up every now and then, in my specific case.

  154. flow

    so it's like max="best-effort"

  155. MattJ

    and this is what "max" is for

  156. MattJ

    It's just the client saying it wants no limit, so it will take the highest limit you have to offer (including no limit)

  157. Holger

    Let's not turn in circles 🙂

  158. flow

    I also wouldn't like to turn in circles

  160. Holger

    Either way I do believe that the spec should clarify these things. Esp. whether the server is actually _supposed_ to return `max` if the client specified `max`.

  161. flow

    definitely

  162. Holger

    In which case there's no way for the client to discover the limit right now.

  164. Holger

    Whereas it means the limit might change.

  165. Holger

    (Well I guess it might always change anyway.)

  166. Holger

    Anyway, BBL.

  167. Zash

    And sneaking in some XEP-0122 element doesn't solve this?

  168. flow

    What if: if there is a known max limit, let the pubsub service return it. if there is no known max limit, let the pubsub service return "max"
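
flow's "what if" differs from echoing the configured value: the service would report the concrete number whenever it knows one. A minimal sketch, with a made-up function name and `None` used to mean "no known limit":

```python
from typing import Optional, Union


def reported_max_items(configured: Union[int, str],
                       known_cap: Optional[int]) -> str:
    """Value the service would report for max_items under flow's proposal."""
    if configured != "max":
        # An explicit integer is reported unchanged.
        return str(configured)
    # Configured as "max": report the cap if one is known, else "max".
    return str(known_cap) if known_cap is not None else "max"
```

Under this variant a client that configured `max` cannot tell from the number alone that it is pinned to the maximum, which is the objection raised earlier in the discussion.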

  169. MattJ

    I'd push for the obvious solution of the server responding with the configuration option that the client set...

  170. Daniel

    > I'd push for the obvious solution of the server responding with the configuration option that the client set...

    +1

  171. MattJ

    The problem is that the max limit may change (e.g. because it's set by the operator, or based on available storage capacity), so the returned value is not very useful to the client anyway

  172. flow

    right

  173. Holger

    "Obvious" 🙂

  174. MattJ

    What is far more important is a way to configure the behaviour that occurs when the limit is reached

  175. MattJ

    We don't have that, and the "delete oldest" behaviour is stupid for many applications

  176. flow

    so another max_items enum value, but please do not call it "unlimited", seems sensible

  177. Kev

    I think the only compelling reason to return the current int equivalent rather than ‘max’ is that ‘max’ will break compliant implementations - and I do believe that’s a compelling reason.

  178. MattJ

    Also we need to fix that ejabberd and Prosody disagree about the definition of "oldest" :)

  179. Kev

    But returning ‘max’ (with some additional way to discover what that means) is the conceptually right thing, I think.

  180. flow

    MattJ, you mean we need to fix the specification to be clear and unambiguous when it comes to this

  181. MattJ

    Sure

  183. MattJ

    Both are spec issues

  184. support@mpro.la

    hello

  185. Zash

    👋️

  186. Sam

    lovetox: sure thing; no rush though, if it's like 5 minutes before that's fine.

  187. Sam

    (sorry, I never remember how to PM on here)

  188. Sam

    Reminder to all that the Office Hours got pushed back an hour today, so show up in ~45 minutes and we'll get started! Today's is a demo of the new Gajim 1.4 UI and features! https://socialcoop.meet.coop/sam-pku-dud-niv

  189. jonas’

    Kev, poooooooooooooooooooooooooooke https://github.com/xsf/xeps/pull/1047

  190. Zash

    Needs more XEP-0132

  191. deuill

    Oh man missed this one again! Pretty excited to see the PRs for Gajim flowing in though.