jdev - 2020-08-27


  1. PAP0

    who sells cc

  2. lovetox

    i have some problem understanding the jid escaping

  3. lovetox

    it seems incompatible with splitting a jid into its parts

  4. lovetox

    say a user provides a string "user@host@example.com"

  5. lovetox

    to use jid escaping here, i need first to know what the localpart is

  6. lovetox

    as escaping must only be on localparts

  7. lovetox

    but the localpart here would be, "user" incorrectly

  8. lovetox

    RFC: Remove any portion from the beginning of the string to the first '@' character (if there is an '@' character present).

  9. jonas’

    lovetox, the user needs to provide an escaped JID if you use that input format

  10. jonas’

    JID escaping of stuff like @ is more interesting for transports than anything else

  11. jonas’

    I also think that you shouldn’t allow JID-escaping @. It opens the door for fun impersonation attacks.

  12. lovetox

    im trying to implement the XEP, the XEP goal is to escape userinput

  13. lovetox

    not that users escape it on their own

  14. lovetox

    it explicitly provides test vectors where userinput is unescaped

  15. lovetox

    im asking the question now, how the XEP thinks i should escape that jid, maybe im missing something

  16. jonas’

    well then, you can’t have it both ways ;)

  17. Kev

    In escaping, the hostname isn't escaped unless I'm misremembering.

  18. jonas’

    you can only safely escape the localpart if you take it as separate input

  19. Kev

    So that one you can safely escape because there's no resource part.

  20. Kev

    But if you gave e.g. user@something/user@something/user@something it'd be entirely ambiguous how to escape it.

  21. Kev

    As long as it's a bare JID you're escaping, you're ok.

  22. jonas’

    Kev, `user@host@example.com` is ambiguous already, isn’t it?

  23. jonas’

    (the example provided by lovetox)

  25. lovetox

    Kev, the JID parsing rules split on the first @

  26. Kev

    Only if I misremember about not escaping hosts.

  27. Kev

    Let me see.

  28. jonas’

    Kev, no, you’re right about that

  29. Kev

    lovetox: But you're not parsing a JID.

  30. lovetox

    of course, user input is a JID

  31. Kev

    No, it's not.

  32. jonas’

    but is it localpart=user\40host, domain=example.com or is it localpart=user, domain=host@example.com

  33. Kev

    It's only a JID once you've done the escaping on it.

  34. lovetox

    Ok Kev, how do i find the domainpart in a string that is not a JID

  35. Kev

    jonas’: But host@example.com isn't a valid domainpart.

  36. jonas’

    Kev, aha!

  37. jonas’

    right

  38. jonas’

    so since @ cannot exist in a domainpart && you know that there is no resourcepart, you know where to split

  39. Kev

    So if you know that post-escaping it will be a bare JID, you look right-to-left for an @, and the remainder is escaped as the node

  41. jonas’

    (i.e. you can waive the splitting rules in RFC6122 and rsplit instead of lsplit because of that)
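
The two splitting strategies discussed above can be sketched in Python as follows. This is only an illustration: the function names `split_like_jid` and `split_bare_input` are hypothetical, and the right-hand split assumes that the input has no resourcepart and that a domainpart never contains an `@`.

```python
def split_like_jid(text):
    """Split at the FIRST '@', as the RFC rule quoted earlier describes."""
    local, sep, domain = text.partition("@")
    if not sep:
        # No '@' at all: the whole string is treated as the domainpart.
        return None, text
    return local, domain


def split_bare_input(text):
    """Split unescaped bare-JID user input at the LAST '@'.

    Assumption: no resourcepart is present and '@' cannot occur in the
    domainpart, so everything left of the last '@' is the localpart.
    """
    local, sep, domain = text.rpartition("@")
    if not sep:
        return None, text
    return local, domain


print(split_like_jid("user@host@example.com"))
print(split_bare_input("user@host@example.com"))
```

With lovetox's example, the first function treats `user` as the localpart, while the second keeps `user@host` together so it can be escaped as a whole.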

  42. lovetox

    ok are you sure about that

  43. lovetox

    did you just think that up right now?

  44. jonas’

    >>> aioxmpp.stringprep.nodeprep("@foo")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python3/dist-packages/aioxmpp/stringprep.py", line 159, in nodeprep
        check_prohibited_output(
      File "/usr/lib/python3/dist-packages/aioxmpp/stringprep.py", line 116, in check_prohibited_output
        raise ValueError("Input contains invalid unicode codepoint: "
    ValueError: Input contains invalid unicode codepoint: U+0040

  45. eta

    >_>

  46. jonas’

    @ is not allowed in the domainpart according to nodeprep

  47. Kev

    Yes, I'm sure that a domainpart in a JID can't contain an @

  48. lovetox

    thats not what i meant, you just described a parsing mechanism, that is nowhere described in the XEP, but seems needed to implement that

  49. Kev

    I am sure that if you are trying to escape input in order to produce a bare JID that you can do so in the manner I described, yes.

  50. jonas’

    lovetox, if it’s not allowed in the domainpart (which is true) && you do not support a resourcepart in your input, then it is safe to parse from the right side

  51. jonas’

    if you *do* support a resourcepart, then you cannot do that

  52. Kev

    I'm not pretending that 106 is well-written in how to practically escape stuff, BTW.

  53. jonas’

    oh

  54. jonas’

    Kev, we’re both wrong

  55. jonas’

    the domainpart is *nameprep*, not *nodeprep*

  56. jonas’

    nameprep allows @

  57. jonas’

    (disclaimer: unless buggy implementation on my side)

  58. lovetox

    it only matters if idna allows @ in a domain part

  59. Kev

    I am convinced you can't have a @ in a domainpart. I could be wrong, but I am still currently convinced.

  60. lovetox

    thats correct

  61. lovetox

    IDNA2008 does not allow @

  62. jonas’

    Kev, nothing in https://tools.ietf.org/html/rfc3491 forbids @

  63. jonas’

    idna2003 seems to allow it though

  64. lovetox

    luckily i dont use that then :D

  65. jonas’

    bad thing that you can’t know what a domain registrar uses

  66. lovetox

    lucky that userinput is mostly bare jid

  67. jonas’

    but I think it’s fairly safe to assume that if someone has a domain with an `@` in it, it is going to break way more than just an XMPP client

  68. lovetox

    i cant come up with a case where a user might want to specify a full jid

  69. jonas’

    so I wouldn’t blame a client for breaking on that :)

  70. jonas’

    (especially if "breaking" is just escaping stuff, and not segfaulting or something crazy like that)

  71. jonas’

    lovetox, so go ahead

  72. Ge0rG

    who's using @ in the localpart anyway, instead of %, # or _?

  73. lovetox

    Users?

  74. lovetox

    you can’t know what a user types into the JID field :)

  75. Kev

    If you're doing escaping on it, it's not a JID field, it's an arbitrary input to be turned into a JID field ;)

  76. jonas’

    Kev, -fno-pedantic

  77. Kev

    Sometimes these things do matter.

  78. jonas’

    (I hope this gcc-level joke comes across)

  79. Kev

    And in this case, it actually does, because if you try using JID parsing rules on something that isn't a JID, you have pain - as shown earlier!

  80. lovetox

    hm / is also not allowed with IDNA2008

  81. lovetox

    so could we not do a rsplit on @, afterwards do a first encounter / split on the result

  82. lovetox

    that would always produce domain and resource

  83. lovetox

    then we can even support resource user input

  84. lovetox

    oh damn @ is allowed in the resource

  85. lovetox

    forget what i said

  86. Kev

    I think the question has to be what you're trying to have the user enter.

  87. lovetox

    i think this cant be done perfectly

  88. lovetox

    i can give the user a hint, that he only should input bare jids

  89. lovetox

    but i cant really find out if he did that

  90. lovetox

    <a.example.com/b@example.net>

  91. jonas’

    not with JID escaping, indeed

  93. jonas’

    without JID escaping, that is a full JID with empty localpart

  94. lovetox

    correct, with that example JID parsing and userinput parsing yield different results
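
To make the difference concrete, here is a rough Python sketch of both readings of `a.example.com/b@example.net`. The names `parse_as_jid` and `parse_as_bare_input` are made up for this illustration, and neither function does any stringprep or validation; they only show where each approach cuts the string.

```python
def parse_as_jid(text):
    """Read the string as a JID: cut the resource off at the first '/',
    then split the remainder at the first '@'."""
    rest, _, resource = text.partition("/")
    local, sep, domain = rest.partition("@")
    if not sep:
        # No '@' before the resource: there is no localpart.
        local, domain = "", rest
    return local, domain, resource


def parse_as_bare_input(text):
    """Read the string as unescaped bare-JID user input: split at the
    last '@' and treat everything before it as the localpart."""
    local, sep, domain = text.rpartition("@")
    if not sep:
        local, domain = "", text
    return local, domain


print(parse_as_jid("a.example.com/b@example.net"))
print(parse_as_bare_input("a.example.com/b@example.net"))
```

The first reading gives the domain `a.example.com` with resource `b@example.net` and no localpart; the second gives the localpart `a.example.com/b` on the domain `example.net`.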

  95. lovetox

    but i guess i can live with that :)

  96. jonas’

    yeah

  97. jonas’

    nobody needs to enter full JIDs manually anyways

  98. lovetox

    if / were disallowed to escape

  99. lovetox

    we could do split on first / then do a rsplit on @

  100. lovetox

    and everything would work

  101. lovetox

    and this parsing could then be used for jids and userinput

  102. lovetox

    at least with IDNA2008

  103. lovetox

    :)

  104. lovetox

    further question

  105. lovetox

    XEP says : * Note: The character sequence \20 MUST NOT be the first or last character of an escaped localpart.

  106. lovetox

    but it does not say what we have to do when we encounter such a JID on the wire

  107. lovetox

    because \20@asd.com is a valid JID

  108. lovetox

    does that mean i have to ignore the first \20

  109. lovetox

    and not convert it to a space

  110. jonas’

    lovetox, not convert it is probably the "right" thing

  111. jonas’

    since it would be an invalid JID otherwise

  112. lovetox

    no i meant unescape it

  113. Ge0rG

    at least the one with the least resulting damage

  114. lovetox

    means "\20call\20me@example.com" -> "\20call me@example.com"

  115. jonas’

    lovetox, yeah

  116. Ge0rG

    somebody should add such evil examples to the XEP

  117. lovetox

    jonas’, fyi you are missing that rule in your aioxmpp escaping code

  118. lovetox

    also a client that supports escaping, is not able to generate such a jid on the wire

  119. lovetox

    if the user inputs \20 at the start, it would be \5c20 on the wire

  120. lovetox

    hm so that means a user in my client can’t add a contact with such a JID

  121. lovetox

    thats insane, now i need an exception for that too

  122. lovetox

    This XEP is highly underspecified

  123. jonas’

    lovetox, this XEP is a terrible bandaid and I wouldn’t expose users to it at all

  124. jonas’

    in the sense that in any place where JIDs may be used for direct input, I’d show the JIDs as they are on-the-wire

  125. jonas’

    the only reason the aioxmpp implementation exists is to facilitate search and display on s.j.n, where you have a separate copy button which will copy the correctly-escaped version

  126. lovetox

    i wish i didnt spend the last 3 hours to implement that

  127. lovetox

    maybe i should just ignore that, and fuck the people who choose a JID that starts with \20 :D

  128. lovetox

    the problem is also that there are circumstances where you don’t know if a JID is escaped or not

  129. lovetox

    xmpp:call\20me@example.com

  130. lovetox

    is this escaped or not?

  131. jonas’

    I consider URIs to be on-the-wire

  132. lovetox

    thats the only way, because consider it escaped would not be backwards compatible

  133. lovetox

    hm, that would only influence how i display it

  134. lovetox

    so that would be fine

  135. lovetox

    so i should consider uris as escaped

  136. lovetox

    and take them as is to be backwards compatible

  137. lovetox

    hm maybe i should really only use this for displaying and not for input

  138. lovetox

    seems safer

  139. lovetox

    or i offer a button where user can disable escaping on input

  140. lovetox

    hm or what if i only escape \ when there are not allowed chars in the JID

  142. lovetox

    yeah if someone writes call\20me@example.com , i simply consider it escaped already

  143. jonas’

    that sounds like a reasonable plan

  144. lovetox

    and fuck the people who mix unescape and escaped sequences in the input

  145. lovetox

    why does that rule exist

  146. lovetox

    that we have to escape \20 to \5c20

  147. lovetox

    whats the danger when not doing that

  148. jonas’

    because what if a matrix user is called yo\20life

  149. lovetox

    hm i understand why i would receive \5c20

  150. lovetox

    but why do i have to transform this on my userinput

  151. lovetox

    im not a matrix client or a gateway

  152. lovetox

    hm i guess for the same reason, when the user wants his name displayed as \20

  153. lovetox

    not as space

  154. lovetox

    but hard to know what the user wants here

  155. jonas’

    "In face of ambiguity, refuse the temptation to guess"

  156. lovetox

    hmmm its only a problem for the 10 escaped chars .. i guess when i encounter any of the 10 escape sequences, i have to tell the user that input expects unescaped input

  157. lovetox

    that means if he wants to write yo\20life on matrix, he can put this in, but when he wants to write call\20me on xmpp, he has to write "call me"

  158. lovetox

    not sure if that makes sense or not

  159. eta

    isn't that what the transport interaction XEP was supposed to solve

  160. lovetox

    the more i read this the less it makes sense

  161. lovetox

    Note: The character sequence \20 MUST NOT be the first or last character of an escaped localpart.

  162. lovetox

    how can an addon XEP make such a rule

  163. jonas’

    because that is the document which defines `escaped localpart`

  164. jonas’

    (as opposed to RFC 6122 localpart)

  165. jonas’

    thus, by definition, a localpart starting with \20 is *not* an escaped localpart

  166. lovetox

    ok, so it means if i encounter a jid \20\27asd@example.com

  167. lovetox

    i have to display it like that

  168. lovetox

    and not unescape it, because its not escaped to begin with

  169. jonas’

    exactly
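
The reading agreed on here (a localpart that starts or ends with `\20` is not an escaped localpart, so it is shown as received) could look roughly like this in Python. It is a sketch only: `unescape_localpart` is a hypothetical helper, not the aioxmpp function, and matching only lowercase hex digits is an assumption of this example.

```python
import re

# The ten escape sequences, matched in a single left-to-right pass so that
# "\5c20" becomes "\20" and is not unescaped a second time into a space.
_ESCAPE_RE = re.compile(r"\\(20|22|26|27|2f|3a|3c|3e|40|5c)")


def unescape_localpart(local):
    """Turn an on-the-wire localpart into its display form."""
    # By the definition discussed above, this is not an escaped localpart,
    # so leave it exactly as it is.
    if local.startswith("\\20") or local.endswith("\\20"):
        return local
    return _ESCAPE_RE.sub(lambda m: chr(int(m.group(1), 16)), local)


print(unescape_localpart("call\\20me"))
print(unescape_localpart("\\20\\27asd"))
```

Under these assumptions `call\20me` is displayed as `call me`, while `\20\27asd` is displayed unchanged.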

  170. jonas’

    lovetox, care to file a PR with a test case for that?

  171. jonas’

    for the test vector list in the XEP

  172. lovetox

    hm yeah i'll add it to my list

  173. lovetox

    ok so its clear how to deal with it when i encounter it on the wire

  174. lovetox

    still need a way to write such a JID

  175. jonas’

    well, so if you encounter `\20foo` in an input, it is clear. No need to escape anything here.

  176. jonas’

    if you encounter `\20foo bar` in an input, you need to escape, so you’d get `\5c20foo\20bar`
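
A possible Python sketch of this input heuristic: input without any disallowed character is taken as-is, and otherwise the whole localpart is escaped, including backslashes that would otherwise read as an escape sequence. The helper names `escape_localpart` and `escape_if_needed` are invented for this example, and the pass-through shortcut is the heuristic from this chat, not something the XEP prescribes.

```python
# The nine characters that may not appear in a localpart and their escapes.
FORBIDDEN = {
    " ": "\\20", '"': "\\22", "&": "\\26", "'": "\\27", "/": "\\2f",
    ":": "\\3a", "<": "\\3c", ">": "\\3e", "@": "\\40",
}
# Hex codes that form an escape sequence when they follow a backslash.
CODES = frozenset(["20", "22", "26", "27", "2f", "3a", "3c", "3e", "40", "5c"])


def escape_localpart(local):
    """Escape every disallowed character; a backslash becomes \\5c only
    when the two characters after it would form an escape sequence."""
    out = []
    for i, ch in enumerate(local):
        if ch == "\\" and local[i + 1:i + 3] in CODES:
            out.append("\\5c")
        else:
            out.append(FORBIDDEN.get(ch, ch))
    return "".join(out)


def escape_if_needed(local):
    """Chat heuristic (an assumption, not XEP text): leave input alone
    unless it contains a character that is not allowed in a localpart."""
    if not any(ch in FORBIDDEN for ch in local):
        return local
    return escape_localpart(local)


print(escape_if_needed("\\20foo"))
print(escape_if_needed("\\20foo bar"))
```

This leaves `\20foo` untouched and turns `\20foo bar` into `\5c20foo\20bar`, matching the two cases above.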

  177. lovetox

    what if he wants to write the JID: \5c20foo@bar?

  178. jonas’

    "run"

  179. lovetox

    :D

  180. lovetox

    i think i make a simple, escape checkbox, that is by default checked

  181. MattJ

    Or just don't?

  182. Ge0rG

    checkboxes are awesome

  183. MattJ

    [X] Sarcasm

  184. Martin

    ❎ Obviously

  185. Zash

    ✅ True fact

  186. flow

    You could probably just assume that JIDs entered by the user are unescaped and run the escape algorithm over them. Same goes in the other direction: Every JID shown to the user, e.g. in the MUC member list, is unescaped first.

  187. lovetox

    flow the problem with an assumption without giving the user the choice is

  188. lovetox

    that you can’t talk to some jids anymore

  189. mac

    Peace be upon you