jdev - 2022-08-31


  1. MattJ

    Holger, is xpath currently used in ejabberd? Would there be any objection if it was a XEP dependency? (or at least a subset)

  2. MattJ

    I'm pretty wary of XPath, but also wary of NIH something equivalent

  3. MattJ

    https://github.com/clopen/xpath-receive-email

  4. Kev

    I'm pretty wary of XPath. I don't think there's any danger of us needing to NIH something *equivalent*, which is where the danger in it lies.

  5. Sam

    Was trying to figure out a good way to say just that. What Kev said.

  6. Kev

    What we're likely to need is to NIH a way of referencing a path to an XML element, which ... is a long way short of XPath.

  7. Kev

    Bonus points if we used constrained XPath so that you could either use an xpath library, or implement something non-insane yourself just to do basic path lookups.

  8. Sam

    {space:local}/local/{space:*}>someattr there, wrote a whole spec with all the features you could want.

  9. Kev

    I think even xpath 1 lets you define functions etc. doesn't it? The path formats themselves aren't insane, I think, and are implementable oneself without excessive pain, unless I misremember.

  10. Kev

    But anything that lets you define functions seems...ill-advised.

  11. Sam

    "constrained XPath" sounds exactly as bad as "constrained XML" which I have had multiple bugs as a result of us doing (and found multiple issues in other things). That way leads to madness and implementations just ignoring the constraints.

  12. Sam

    Or security issues when they accidentally ignore the constraints or there's some way to smuggle some escaped function past their constraining code.

  13. Kev

    There ... is some truth in that.

  14. Kev

    Actually, looking at XPath, I think even the Location Paths are probably more than anything we'd need in XMPP.

  15. Kev

    MattJ What's the problem getting solved?

  16. MattJ

    Kev, clients need to tell their server what information should be sent to their push notification server when offline

  17. MattJ

    Current spec approach is "just send <body/>", which nobody actually does because of privacy

  18. Kev

    So something a lot lot simpler than xpath would work fine, presumably?

  19. MattJ

    Non-standard approach by Tigase is to extract some specific stuff, transform it into JSON, and send that

  20. MattJ

    But that loses all extensibility

  21. MattJ

    Yes, definitely (hence why I said "subset")

  22. MattJ

    But as Sam said, "just" supporting a subset when available libraries implement the full thing can be dangerous

  23. MattJ

    People will pick off-the-shelf libraries and either fail to disable the dangerous parts or make inadequate attempts to sanitize things

  24. Kev

    Even ignoring Sam's (I think valid) concerns, using an xpath library for this on the server sounds like some amount of pain - at least for us it'd involve serialising temporarily, parsing that into an appropriate XML doc, running the XPath on it, and then converting into the internal format again, for then reserialising later (or similar).

  25. MattJ

    Sure, it would be some amount of pain for Prosody too

  26. Kev

    Probably (with no data to support this) doubling the cost of processing each stanza or something.

  27. MattJ

    That's a bit over the top, since this wouldn't run on every stanza

  28. Kev

    Every eligible stanza, right.

  29. Kev

    But I think that's probably every message, as 'most people' don't have their phone app open 'most of the time'.

  30. Kev

    But you're right, and that's not on its own a reason to not do it. The inconvenience might be, and the security concerns probably are.

  31. MattJ

    Okay, so say you're right and it doubles... it goes from how many microseconds to how many microseconds?

  32. Kev

    ^
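
The "how many microseconds" question can be measured rather than guessed. Below is a rough sketch of the round trip Kev describes (serialise, re-parse, query), using Python's standard `xml.etree.ElementTree` as a stand-in. The stanza, the function names, and the use of ElementTree are all assumptions for illustration; no server discussed here works this way internally, and the figures will differ by implementation.

```python
import timeit
import xml.etree.ElementTree as ET

# Hypothetical in-memory stanza standing in for a server's internal format.
STANZA = ET.fromstring(
    '<message xmlns="jabber:client" type="chat"><body>Hello</body></message>'
)

def round_trip(stanza):
    """Serialise, re-parse, then query: the extra work an XPath library might need."""
    raw = ET.tostring(stanza)    # internal format -> bytes
    doc = ET.fromstring(raw)     # bytes -> document the query library understands
    return doc.findtext("{jabber:client}body")

def per_call_microseconds(fn, arg, number=10_000):
    """Average wall-clock cost of fn(arg) in microseconds."""
    return timeit.timeit(lambda: fn(arg), number=number) / number * 1e6

if __name__ == "__main__":
    print(f"{per_call_microseconds(round_trip, STANZA):.1f} µs per stanza")
```

Comparing this against the cost of a direct lookup on the in-memory stanza would give the ratio being argued about.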

  33. MattJ

    I'm not saying bringing in XPath is something I'd enjoy. But I'm trying to get a feel for all our options, because this issue has been stalling progress in this area for years now.

  34. MattJ

    We have an internal custom "XPath lite" in Prosody, which is of course non-standard

  35. MattJ

    and as you say, just allows a path to an element, attribute or text content

  36. MattJ

    As far as I'm concerned, standardizing that (or something like it) is the NIH approach

  37. MattJ

    Which is not to say it's automatically wrong, if nothing else is suitable

  38. MattJ

    Thankfully this is also something that would only be needed on servers

  39. Kev

    I’m pretty comfortable with NIHing something that’s just {NS}ele/… or similar if the alternative is xpath. I think integrating full xpath would actively be a mistake if we only need the former, and Sam’s concerns about subsetting it seem valid.
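
To give a feel for how small a `{NS}ele/…` lookup can be, here is a minimal sketch in Python. The syntax is an assumption for illustration only: steps are `{namespace}name` separated by `/`, a trailing `@attr` selects an attribute, and a trailing `#` selects text content. It is not the syntax of any draft or of Prosody's implementation, and `find` is a hypothetical name.

```python
import re
import xml.etree.ElementTree as ET

# One step is "{namespace}name". The namespace may itself contain "/",
# so the path cannot simply be split on "/".
_STEP = re.compile(r"/?\{([^}]*)\}([^/@#{}]+)")

def find(stanza, path):
    """Return the first matching element, attribute value or text, else None."""
    selector = None
    if path.endswith("#"):                      # trailing "#": text content
        path, selector = path[:-1], "#"
    else:
        m = re.search(r"@([^/@#{}]+)$", path)   # trailing "@attr": attribute
        if m:
            path, selector = path[: m.start()], m.group(1)

    current, pos = stanza, 0
    while pos < len(path):
        m = _STEP.match(path, pos)
        if m is None:
            raise ValueError(f"bad path at offset {pos}: {path!r}")
        wanted = "{%s}%s" % (m.group(1), m.group(2))
        # Take the first child with the wanted qualified name.
        current = next((c for c in current if c.tag == wanted), None)
        if current is None:
            return None
        pos = m.end()

    if selector is None:
        return current
    if selector == "#":
        return current.text
    return current.get(selector)
```

For example, `find(msg, "{jabber:client}body#")` would return the text of the first `<body/>` child of a parsed message. There are no functions, axes or predicates, so there is nothing to "smuggle" past a sanitizer.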

  40. MattJ

    So do we have consensus for XEP-xxxx: Element Path Queries?

  41. Kev

    WFM

  42. MattJ

    I'll write something up for discussion

  43. Kev

    Ta.

  44. Sam

    Excellent, thanks MattJ

  45. lovetox

    Sorry for my ignorance, what is the use case for that?

  46. lovetox

    is that not something on the xml lib level, and not the xmpp?

  47. MattJ

    Use case: > clients need to tell their server what information should be sent to their push notification server when offline

  48. MattJ

    And XML libraries provide XPath, if they provide anything at all

  49. MattJ

    SAX parsers like expat (a popular choice for XMPP) tend not to provide XPath because they don't store the document or expose any kind of DOM
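
As a sketch of why a simple path is still workable on a streaming parser, the following uses Python's standard `xml.parsers.expat` binding and tracks a stack of open elements instead of building a DOM. The function name `first_text` and the target format (a list of `(namespace, name)` pairs from the root down) are assumptions made for this example.

```python
import xml.parsers.expat

def first_text(xml_data, target):
    """Return the text of the first element whose ancestry equals `target`.

    `target` is a sequence of (namespace, local_name) pairs, root first.
    Returns None if no such element is seen.
    """
    target = list(target)
    # With a namespace separator, expat reports names as "uri local".
    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
    stack, chunks = [], []
    state = {"capturing": False, "done": False}

    def split(name):
        ns, _, local = name.rpartition(" ")
        return (ns, local)

    def start(name, attrs):
        stack.append(split(name))
        if not state["done"] and stack == target:
            state["capturing"] = True

    def end(name):
        if state["capturing"] and stack == target:
            state["capturing"] = False
            state["done"] = True      # only the first match is kept
        stack.pop()

    def chars(data):
        # Collect only text directly inside the matched element.
        if state["capturing"] and stack == target:
            chunks.append(data)

    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(xml_data, True)
    return "".join(chunks) if state["done"] else None
```

Nothing here needs the document to be stored, which is the property full XPath generally cannot offer.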

  50. lovetox

    ok thanks

  51. goffi

    MattJ: As you're working on this use case, I would like to use push notifications with a trusted component to send email to users when they get a pubsub notification while offline. For instance, say that you have a new comment on a blog post: I would need the pubsub element to construct the email. Privacy is not a problem here as it would be a trusted component with the same admins as the server itself.

  52. goffi

    MattJ: thus if the protoXEP you're working on or planning to work on would take this use case into account, it would be great

  53. MattJ

    Yes, it should suffice for that use-case too

  54. goffi

    neat

  55. goffi

    MattJ: regarding XPath, I also think that we don't need full-featured XPath and can make our own stuff. XPath is handy to check complex XML trees, but in the case of XMPP it's often not deep, and we can probably filter with simple stuff like element name/namespace + optionally attribute or maybe an index if there are several elements with the same name/namespace. Do we need something that can be put in an attribute, or can we use a dedicated element for that? Something like `<find_element name="body">` ? Advantage is that we could extend it if necessary in the future.

  56. goffi

    MattJ: we could also use `id` to put them in an attribute, something like: `<something-useful use-element="find_123" /><find-element id="find_123" name="something-useful" ns="urn:example:bla:0" xmlns="urn:example:find-element:0" />`

  57. MattJ

    That can easily start to get very complex to implement

  58. goffi

    I don't think so, you end up with simple filtering data that you can either use directly in a loop, or transform easily to an XPath or equivalent internally. But anyway, it was just a random thought, you'll end up with something cool I'm sure.
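
A rough sketch of the "filtering data used directly in a loop" idea goffi describes: the `FindElement` record and `apply_filter` function are hypothetical names, and the fields (name, namespace, optional required attribute, index) are taken from his description rather than from any spec.

```python
from dataclasses import dataclass
from typing import Optional
import xml.etree.ElementTree as ET

@dataclass
class FindElement:
    """Hypothetical parsed form of a <find-element/> description."""
    name: str
    ns: str
    attr: Optional[str] = None   # if set, the element must carry this attribute
    index: int = 0               # which match to return, counting from 0

def apply_filter(parent, f):
    """Return the f.index-th child of `parent` matching the filter, or None."""
    wanted = "{%s}%s" % (f.ns, f.name)
    seen = 0
    for child in parent:
        if child.tag != wanted:
            continue
        if f.attr is not None and f.attr not in child.attrib:
            continue
        if seen == f.index:
            return child
        seen += 1
    return None
```

With two `<body/>` children in different languages, `FindElement("body", "jabber:client", index=1)` would select the second one.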

  59. MattJ

    https://pad.nixnet.services/s/g9orugzSq is an initial draft

  60. Kev

    Would it be easier to not have implicit namespace inheritance?

  61. MattJ

    My feeling is "no" :)

  62. MattJ

    But maybe

  63. MattJ

    It would be easier if we actually had a single standard root namespace in XMPP

  64. Kev

    Ah. Because of stream namespace. Right.

  65. MattJ

    But we can always specify in the downstream XEPs that everything is in jabber:client or something

  66. MattJ

    But if it's about implementation complexity for the components without a namespace, we already do that in Prosody and it's not much effort (but I'd be curious to hear if that's not the case in other codebases)
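
For a sense of the effort involved, here is a sketch of namespace inheritance during path parsing. It assumes that a step without `{namespace}` inherits the namespace of the previous step, and that the first step inherits a root namespace supplied by the caller (for example `jabber:client`, if a downstream XEP says so). Both rules and the name `qualify` are assumptions; the draft itself is not reproduced in this log.

```python
import re

# A step is an optional "{namespace}" followed by a local name.
_STEP = re.compile(r"/?(?:\{([^}]*)\})?([^/{}]+)")

def qualify(path, root_ns):
    """Expand a path into [(namespace, name), ...] with inherited namespaces."""
    steps, ns, pos = [], root_ns, 0
    while pos < len(path):
        m = _STEP.match(path, pos)
        if m is None:
            raise ValueError(f"bad path at offset {pos}: {path!r}")
        if m.group(1) is not None:
            ns = m.group(1)          # explicit namespace overrides the inherited one
        steps.append((ns, m.group(2)))
        pos = m.end()
    return steps
```

The inheritance itself is the single `ns` variable carried through the loop; the rest is parsing that an explicit-only syntax would need anyway.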

  67. goffi

    MattJ: what about several elements with the same name/namespace (e.g. in your example, you would have an extra `<entry title="The Hobbit" />`

  68. goffi

    )

  69. MattJ

    Yeah, I intentionally avoided that for now :)

  70. goffi

    but we need to know if we get first or last elements

  71. goffi

    in this case

  72. pep.

    It only matters if the order of elements is guaranteed I guess. But this spec can still specify it

  73. goffi

    with <message> it's important as we can have several bodies (with different languages), for blog we have text and xhtml content

  74. MattJ

    Prosody gives the first

  75. MattJ

    But I suspect we may want to add indexing and predicates

  76. MattJ

    I think we're doing this for extensibility, people are going to find uses for those things

  77. pep.

    Registering to your server features you don't want to see? Curious if that'd be a use-case

  78. pep.

    "no chatstates, no chat markers, no receipts plz"

  79. MattJ

    pep., meet https://xmpp.org/extensions/xep-0273.html :)

  80. pep.

    That title sounds..

  81. MattJ

    It's from 2009, assume innocence

  82. pep.

    ^^'

  83. MattJ

    Okay, I got a bit daring and added some more advanced stuff: https://pad.nixnet.services/s/g9orugzSq

  84. goffi

    MattJ: It looks good to me. It's very similar to lxml's ElementPath.

  85. Zash

    speccing stanza:find() ?

  86. MattJ

    Something like that

  87. Zash

    except it's more limited

  88. Zash

    except stanza:find() is more limited

  89. MattJ

    But still, I'm now going back to https://hg.prosody.im/prosody-modules/file/8231774f5bfd/mod_cloud_notify_encrypted/mod_cloud_notify_encrypted.lua#l88 to see what that logic would look like translated to XML/paths, and... I don't know

  90. Sam

    Is there a behavioral difference between entries/entry@title and entries/entry[title]@title?

  91. Zash

    or upload a util.datamapper schema?

  92. Zash

    my guess would be that the former may match a tag without the attribute?

  93. MattJ

    Exactly

  94. moparisthebest

    Just have the client upload some JavaScript the server can execute to construct what to send...

  95. Sam

    So one returns empty string one returns no match?

  96. Sam

    If the attribute doesn't exist do they both return no match?

  97. MattJ

    Yes
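
One possible reading of the difference between `entries/entry@title` and `entries/entry[title]@title`, sketched in Python. It assumes "first match wins", as Prosody is said to do above, and that a missing attribute means "no match" (`None`) rather than an empty string. The function names are hypothetical and the draft's actual semantics may differ.

```python
import xml.etree.ElementTree as ET

def attr_of_first(parent, tag, attr):
    """entries/entry@title: take the first <entry/>, then look up its attribute."""
    first = parent.find(tag)
    return None if first is None else first.get(attr)

def attr_of_first_having(parent, tag, attr):
    """entries/entry[title]@title: take the first <entry/> that has the attribute."""
    for child in parent.findall(tag):
        if attr in child.attrib:
            return child.get(attr)
    return None
```

The two only disagree when an earlier `<entry/>` lacks the attribute and a later one has it. If no `<entry/>` carries the attribute at all, both return no match.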

  98. Zash

    Any need to collect lists of things?

  99. MattJ

    I'm not sure

  100. MattJ

    I suspect that the answer is "maybe" but can usually be deduced from the context rather than needing some operator

  101. MattJ

    i.e. if only one result is expected, just take the first

  102. MattJ

    Which is probably going to be the most common

  103. MattJ

    But as I say, unless we can figure out how to actually apply this, I'm unconvinced it's useful

  104. Link Mauve

    I remember a discussion recently where someone wanted to send wasm blobs to other entities for processing. :°)

  105. moparisthebest

    Hey that's better than JavaScript let's do it

  106. MattJ

    goffi, I think in your case you'd be happy bouncing the whole stanza to the push server, I assume

  107. MattJ

    Because you're going to email it, and I assume that won't be encrypted, so you don't have anything to hide from the push service

  108. goffi

    MattJ: indeed, I would be happy with the whole stanza, the component is trusted, so there's no privacy problem there.

  109. MattJ

    The problem for the Siskin-style push notifications is that they need to be 1) summarized (Apple has size limits on notifications) and 2) encrypted (so neither Apple nor the push service see any message details)

  110. MattJ

    #2 is easy enough, but #1 is currently implemented using logic on the server side that isn't extensible and right now even depends on experimental XEPs
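
As a small illustration of the summarization step only (not the encryption), here is a sketch that trims message text to a byte budget without cutting a UTF-8 character in half. The function name `summarize`, the ellipsis marker, and the budget are all assumptions; the actual limit depends on the push service and on how much room the rest of the payload needs.

```python
def summarize(text, max_bytes, ellipsis="…"):
    """Trim `text` so its UTF-8 encoding fits in `max_bytes` bytes."""
    raw = text.encode("utf-8")
    if len(raw) <= max_bytes:
        return text
    budget = max_bytes - len(ellipsis.encode("utf-8"))
    if budget < 0:
        raise ValueError("max_bytes is too small to hold the ellipsis")
    # errors="ignore" drops a trailing partial character left by the byte slice.
    return raw[:budget].decode("utf-8", errors="ignore") + ellipsis
```

Which elements feed into the summary is the part that is hard to keep extensible; the trimming itself is mechanical.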

  111. MattJ

    Some discussion is at https://github.com/tigase/tigase-xeps/issues/4

  112. MattJ

    An alternative approach is to spec the summarization process, and make sure that can be negotiated and evolved over time

  113. Zash

    Per XEP treatment? Would be sorta in line with MAM, CSI, Carbons etc

  114. MattJ

    Okay, so instead of all the path query stuff: https://pad.nixnet.services/s/GPBR4xa4k

  115. MattJ

    Encode some rules directly into the server, handling all current use cases, allow for future extensibility if new ones arise

  116. pep.

    "MattJ> Because you're going to email it, and I assume that won't be encrypted, so you don't have anything to hide from the push service" I want to challenge the reasoning about not having to hide from the push service if it's not encrypted :x

  117. pep.

    I mean, I don't find this as obvious

  118. MattJ

    Read it as "you don't have anything that can be hidden"

  119. pep.

    from whom

  120. MattJ

    The push service

  121. pep.

    Well you may trust your server and the MTA but not the push service

  122. MattJ

    Okay. So what are you going to do about it?

  123. pep.

    I don't know, I'm just saying I don't find it as obvious as you made it sound

  124. MattJ

    Okay

  125. pep.

    Text as sibling isn't valid in XMPP right? So you don't need to worry about it?

  126. MattJ

    I don't think anything actually forbids it, other than our collective common sense

  127. MattJ

    I meant to note that it would result in undefined behaviour

  128. pep.

    Good job for the spec :)

  129. MattJ

    I've been working with SASL2, and just wanted to see how much of this tangle of semi-related XEPs we could sort out at the same time

  130. Link Mauve

    Email actually has some deployed e2ee, namely OpenPGP and S/MIME.

  131. Link Mauve

    Both could be used also with a gateway.

  132. MattJ

    Link Mauve: someone else can have the fun of specifying and implementing that 🙂

  133. Link Mauve

    Yup.

  134. MattJ

    For the 5 users it will be used by

  135. Link Mauve

    I’ve successfully avoided e2ee on XMPP altogether so far, hoping to keep under the radar. o:)
