jdev - 2020-08-27

  79. lovetox i have some problem understanding the jid escaping
  80. lovetox it seems incompatible with splitting a jid into its parts
  81. lovetox say a user provides a string "user@host@example.com"
  82. lovetox to use jid escaping here, i need first to know what the localpart is
  83. lovetox as escaping must only be on localparts
  84. lovetox but the localpart here would be, "user" incorrectly
  85. lovetox RFC: Remove any portion from the beginning of the string to the first '@' character (if there is an '@' character present).
  86. jonas’ lovetox, the user needs to provide an escaped JID if you use that input format
  87. jonas’ JID escaping of stuff like @ is more interesting for transports than anything else
  88. jonas’ I also think that you shouldn’t allow JID-escaping @. It opens the door for fun impersonation attacks.
  89. lovetox im trying to implement the XEP, the XEP goal is to escape userinput
  90. lovetox not that users escape it on their own
  91. lovetox it explicitly provides test vectors where userinput is unescaped
  94. lovetox im asking the question now, how the XEP thinks i should escape that jid, maybe im missing something
  95. jonas’ well then, you can’t have it both ways ;)
  96. Kev In escaping, the hostname isn't escaped unless I'm misremembering.
  97. jonas’ you can only safely escape the localpart if you take it as separate input
  98. Kev So that one you can safely escape because there's no resource part.
  99. Kev But if you gave e.g. user@something/user@something/user@something it'd be entirely ambiguous how to escape it.
  100. Kev As long as it's a bare JID you're escaping, you're ok.
  101. jonas’ Kev, `user@host@example.com` is ambiguous already, isn’t it?
  102. jonas’ (the example provided by lovetox)
  104. lovetox Kev, the JID parsing rules split on the first @
  105. Kev Only if I misremember about not escaping hosts.
  106. Kev Let me see.
  107. jonas’ Kev, no, you’re right about that
  108. Kev lovetox: But you're not parsing a JID.
  109. lovetox of course, user input is a JId
  110. Kev No, it's not.
  111. jonas’ but is it localpart=user\40host, domain=example.com or is it localpart=user, domain=user@example.com
  112. Kev It's only a JID once you've done the escaping on it.
  113. lovetox Ok Kev, how do i find the domainpart in a string that is not a JID
  114. Kev jonas’: But user@example.com isn't a valid domainpart.
  115. jonas’ Kev, aha!
  116. jonas’ right
  117. jonas’ so since @ cannot exist in a domainpart && you know that there is no resourcepart, you know where to split
  118. Kev So if you know that post-escaping it will be a bare JID, you look right-to-left for an @, and the remainder is escaped as the node
  120. jonas’ (i.e. you can waive the splitting rules in RFC6122 and rsplit instead of lsplit because of that)
  121. lovetox ok are you sure about that
  122. lovetox did you just think that up right now?
  123. jonas’ >>> aioxmpp.stringprep.nodeprep("@foo")
       Traceback (most recent call last):
         File "<stdin>", line 1, in <module>
         File "/usr/lib/python3/dist-packages/aioxmpp/stringprep.py", line 159, in nodeprep
           check_prohibited_output(
         File "/usr/lib/python3/dist-packages/aioxmpp/stringprep.py", line 116, in check_prohibited_output
           raise ValueError("Input contains invalid unicode codepoint: "
       ValueError: Input contains invalid unicode codepoint: U+0040
  124. eta >_>
  125. jonas’ @ is not allowed in the domainpart according to nodeprep
  126. Kev Yes, I'm sure that a domainpart in a JID can't contain an @
  127. lovetox thats not what i meant, you just described a parsing mechanism, that is nowhere described in the XEP, but seems needed to implement that
  128. Kev I am sure that if you are trying to escape input in order to produce a bare JID that you can do so in the manner I described, yes.
  129. jonas’ lovetox, if it’s not allowed in the domainpart (which is true) && you do not support a resourcepart in your input, then it is safe to parse from the right side
  130. jonas’ if you *do* support a resourcepart, then you cannot do that
  131. Kev I'm not pretending that 106 is well-written in how to practically escape stuff, BTW.
  132. jonas’ oh
  133. jonas’ Kev, we’re both wrong
  134. jonas’ the domainpart is *nameprep*, not *nodeprep*
  135. jonas’ nameprep allows @
  136. jonas’ (disclaimer: unless buggy implementation on my side)
  137. lovetox it does only matter if idna allows @ in a domain part
  138. Kev I am convinced you can't have a @ in a domainpart. I could be wrong, but I am still currently convinced.
  140. lovetox thats correct
  141. lovetox IDNA2008 does not allow @
  142. jonas’ Kev, nothing in https://tools.ietf.org/html/rfc3491 forbids @
  144. jonas’ idna2003 seems to allow it though
  145. lovetox luckily i dont use that then :D
  146. jonas’ bad thing that you can’t know what a domain registrar uses
  147. lovetox lucky that userinput is mostly bare jid
  148. jonas’ but I think it’s fairly safe to assume that if someone has a domain with an `@` in it, it is going to break way more than just an XMPP client
  149. lovetox i cant come up with a case where a user might want to specify a full jid
  150. jonas’ so I wouldn’t blame a client for breaking on that :)
  151. jonas’ (especially if "breaking" is just escaping stuff, and not segfaulting or something crazy like that)
  152. jonas’ lovetox, so go ahead
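The right-to-left split that Kev and jonas’ settle on above can be sketched in a few lines of Python. This is only an illustration under the assumptions stated in the discussion: the input is unescaped text that should become a bare JID (no resourcepart), and the domainpart never contains an `@` (true for IDNA2008). The function name `split_bare_jid_input` is hypothetical and not part of aioxmpp or any other library.

```python
from typing import Tuple


def split_bare_jid_input(text: str) -> Tuple[str, str]:
    """Split unescaped user input that is meant to become a bare JID.

    Assumption: no resourcepart, and no '@' inside the domainpart.
    """
    # rpartition splits on the LAST '@', unlike the RFC 6122 parsing rule,
    # which splits on the first one.
    localpart, sep, domainpart = text.rpartition("@")
    if not sep:
        # No '@' at all: treat the whole input as a domain.
        return "", text
    return localpart, domainpart


print(split_bare_jid_input("user@host@example.com"))
```

For lovetox's example the localpart comes out as `user@host`, which can then be escaped on its own before the JID is assembled.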
  155. Ge0rG who's using @ in the localpart anyway, instead of %, # or _?
  156. lovetox Users?
  157. lovetox you can’t know what a user types into the JID field :)
  162. jonas’ Kev, -fno-pedantic
  164. Kev Sometimes these things do matter.
  165. jonas’ (I hope this gcc-level joke comes across)
  166. Kev And in this case, it actually does, because if you try using JID parsing rules on something that isn't a JID ,you have pain - as shown earlier!
  189. lovetox hm / is also not allowed with IDNA2008
  190. lovetox so could we not do a rsplit on @, afterwards do a first encounter / split on the result
  191. lovetox that would always produce domain and resource
  192. lovetox then we can even support resource user input
  193. lovetox oh damn @ is allowed in the resource
  194. lovetox forget what i said
  195. Kev I think the question has to be what you're trying to have the user enter.
  196. lovetox i think this cant be done perfectly
  197. lovetox i can give the user a hint, that he only should input bare jids
  198. lovetox but i cant really find out if he did that
  199. lovetox <a.example.com/b@example.net>
  200. jonas’ not with JID escaping, indeed
  202. jonas’ without JID escaping, that is a full JID with empty localpart
  203. lovetox correct with that example, JID parsing, and userinput parsing yield different results
  204. lovetox but i guess i can live with that :)
  207. jonas’ nobody needs to enter full JIDs manually anyways
  208. lovetox if / were disallowed to escape
  209. lovetox we could do split on first / then do a rsplit on @
  210. lovetox and everything would work
  211. lovetox and this parsing could then be used for jids and userinput
  212. lovetox at least with IDNA2008
  213. lovetox :)
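A rough sketch of the parsing lovetox describes here, under his hypothetical: `/` may never appear unescaped in a localpart, and the domainpart contains neither `@` nor `/` (IDNA2008). Note that XEP-0106 as written does list `/` among the escapable characters, so this is an illustration of the idea rather than something the XEP allows. The name `split_user_input` is made up for this example.

```python
from typing import Tuple


def split_user_input(text: str) -> Tuple[str, str, str]:
    """Return (localpart, domainpart, resourcepart) for user input.

    Hypothetical rule: the first '/' always starts the resourcepart.
    """
    # Everything after the first '/' is the resource; it may contain '@'.
    head, _, resource = text.partition("/")
    # Within the remainder, the last '@' ends the localpart.
    local, _, domain = head.rpartition("@")
    return local, domain, resource


print(split_user_input("a.example.com/b@example.net"))
print(split_user_input("user@host@example.com/phone"))
```

With this order of operations, `a.example.com/b@example.net` yields an empty localpart, the domain `a.example.com`, and the resource `b@example.net`, which matches what plain JID parsing gives for that string.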
  217. lovetox further question
  218. lovetox XEP says : * Note: The character sequence \20 MUST NOT be the first or last character of an escaped localpart.
  219. lovetox but it does not say what we have to do when we encounter such a JID on the wire
  220. lovetox because \20@asd.com is a valid JID
  221. lovetox does that mean i have to ignore the first \20
  222. lovetox and not convert it to a space
  223. jonas’ lovetox, not convert it is probably the "right" thing
  224. jonas’ since it would be an invalid JID otherwise
  225. lovetox no i meant unescape it
  226. Ge0rG at least the one with the least resulting damage
  227. lovetox means "\20call\20me@example.com" -> "\20call me@example.com"
  228. jonas’ lovetox, yeah
  231. Ge0rG somebody should add such evil examples to the XEP
  232. lovetox jonas’, fyi you are missing that rule in your aioxmpp escaping code
  234. lovetox also a client that supports escaping, is not able to generate such a jid on the wire
  235. lovetox if the user inputs \20 at the start, it would be \5c20 on the wire
  236. lovetox hm so that means a user in my client can’t add a contact with such a JID
  237. lovetox thats insane, now i need an exception for that too
  238. lovetox This XEP is highly underspecified
  239. jonas’ lovetox, this XEP is a terrible bandaid and I wouldn’t expose users to it at all
  240. jonas’ in the sense that in any place where JIDs may be used for direct input, I’d show the JIDs as they are on-the-wire
  241. jonas’ the only reason the aioxmpp implementation exists is to facilitate search and display on s.j.n, where you have a separate copy button which will copy the correctly-escaped version
  242. lovetox i wish i didnt spend the last 3 hours to implement that
  243. lovetox maybe i should just ignore that, and fuck the people who choose a JID that starts with \20 :D
  246. lovetox the problem is also that there are circumstances where you don’t know if a JID is escaped or not
  248. lovetox xmpp:call\20me@example.com
  249. lovetox is this escaped or not?
  250. jonas’ I consider URIs to be on-the-wire
  251. lovetox thats the only way, because consider it escaped would not be backwards compatible
  253. lovetox hm, that would only influence how i display it
  254. lovetox so that would be fine
  255. lovetox so i should consider uris as escaped
  256. lovetox and take them as is to be backwards compatible
  258. lovetox hm maybe i should really only use this for displaying and not for input
  259. lovetox seems safer
  262. lovetox hm or what if i only escape \ when there are not allowed chars in the JID
  267. lovetox yeah if someone writes call\20me@example.com , i simply consider it escaped already
  268. jonas’ that sounds like a reasonable plan
  269. lovetox and fuck the people who mix unescape and escaped sequences in the input
  270. lovetox why does that rule exist
  271. lovetox that we have to escape \20 to \5c20
  274. jonas’ because what if a matrix user is called yo\20life
  275. lovetox hm i understand why i would receive \5c20
  276. lovetox but why do i have to transform this on my userinput
  277. lovetox im not a matrix client or a gateway
  278. lovetox hm i guess for the same reason, when the user wants his name displayed as \20
  279. lovetox not as space
  280. lovetox but hard to know what the user wants here
  284. jonas’ "In face of ambiguity, refuse the temptation to guess"
  285. lovetox hmmm its only a problem for the 10 escaped chars .. i guess when i encounter any of the 10 escape sequences, i have to tell the user that input expects unescaped input
  286. lovetox that means if he wants to write yo\20life on matrix, he can put this in, but when he want to write call\20me on xmpp, he has to write "call me"
  289. lovetox not sure if that makes sense or not
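The check lovetox is considering (warn the user when the input already contains one of the ten escape sequences) could look roughly like the sketch below. It assumes the sequences are written with lowercase hex digits, and the helper name `find_escape_sequences` is hypothetical.

```python
import re
from typing import List

# The ten XEP-0106 escape sequences: space " & ' / : < > @ and backslash.
_ESCAPE_SEQ = re.compile(r"\\(?:20|22|26|27|2f|3a|3c|3e|40|5c)")


def find_escape_sequences(text: str) -> List[str]:
    """Return every escape sequence found in the raw user input."""
    return _ESCAPE_SEQ.findall(text)


for raw in ("call me", "call\\20me", "yo\\20life"):
    found = find_escape_sequences(raw)
    if found:
        print(f"{raw!r}: contains {found}, ask the user what they meant")
    else:
        print(f"{raw!r}: no escape sequences, safe to treat as unescaped")
```

The check cannot tell `call\20me` (already escaped) apart from `yo\20life` (a literal backslash), which is exactly the ambiguity discussed above; it only tells the client when it has to ask.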
  290. eta isn't that what the transport interaction XEP was supposed to solve
  294. lovetox Note: The character sequence \20 MUST NOT be the first or last character of an escaped localpart.
  295. lovetox how can an addon XEP make such a rule
  296. jonas’ because that is the document which defines `escaped localpart`
  297. jonas’ (as opposed to RFC 6122 localpart)
  298. jonas’ thus, by definition, a localpart starting with \20 is *not* an escaped localpart
  299. lovetox ok, so it means if i encounter a jid \20\27asd@example.com
  300. lovetox i have to display it like that
  301. lovetox and not unescape it, because its not escaped to begin with
  302. jonas’ exactly
  303. jonas’ lovetox, care to file a PR with a test case for that?
  304. jonas’ for the test vector list in the XEP
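A minimal sketch of unescaping for display that follows the conclusion reached just above: a localpart that starts or ends with `\20` is by definition not an escaped localpart, so it is shown exactly as it is on the wire. The sketch assumes lowercase hex digits in the sequences, and the names `UNESCAPE_MAP` and `unescape_localpart` are invented for this example (this is not the aioxmpp implementation).

```python
import re

# The ten XEP-0106 escape sequences and the characters they stand for.
UNESCAPE_MAP = {
    "\\20": " ", "\\22": '"', "\\26": "&", "\\27": "'", "\\2f": "/",
    "\\3a": ":", "\\3c": "<", "\\3e": ">", "\\40": "@", "\\5c": "\\",
}
_ESCAPE_SEQ = re.compile(r"\\(?:20|22|26|27|2f|3a|3c|3e|40|5c)")


def unescape_localpart(localpart: str) -> str:
    """Unescape a localpart received on the wire, for display only."""
    # Leading or trailing \20 means this is not an escaped localpart:
    # leave every sequence in it untouched.
    if localpart.startswith("\\20") or localpart.endswith("\\20"):
        return localpart
    # Single left-to-right pass, so the backslash produced by \5c is
    # never re-read as the start of another sequence.
    return _ESCAPE_SEQ.sub(lambda m: UNESCAPE_MAP[m.group(0)], localpart)


print(unescape_localpart("call\\20me"))
print(unescape_localpart("\\20\\27asd"))
```

This differs from the partial unescaping floated earlier in the log (`\20call\20me` becoming `\20call me`); it implements the stricter reading jonas’ confirms with "exactly".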
  308. lovetox ok so its clear how to deal with it when i encounter it on the wire
  309. lovetox still need a way to write such a JID
  310. jonas’ well, so if you encounter `\20foo` in an input, it is clear. No need to escape anything here.
  311. jonas’ if you encounter `\20foo bar` in an input, you need to escape, so you’d get `\5c20foo\20bar`
  312. lovetox what if he wants to write the JID: \5c20foo@bar?
  315. jonas’ "run"
  316. lovetox :D
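The heuristic from this exchange (only escape when the input contains a character that is not allowed in a localpart, and otherwise take the input as already being in wire form) might be sketched like this. The names `ESCAPE_MAP` and `escape_localpart_if_needed` are hypothetical, and lowercase hex digits are assumed.

```python
import re

# Characters that cannot appear in a localpart, with their escape sequences.
ESCAPE_MAP = {
    " ": "\\20", '"': "\\22", "&": "\\26", "'": "\\27", "/": "\\2f",
    ":": "\\3a", "<": "\\3c", ">": "\\3e", "@": "\\40",
}
# A backslash only needs escaping when it would otherwise be read as the
# start of one of the ten escape sequences.
_AMBIGUOUS_BACKSLASH = re.compile(r"\\(?=20|22|26|27|2f|3a|3c|3e|40|5c)")


def escape_localpart_if_needed(localpart: str) -> str:
    """Escape user input only if it contains a disallowed character."""
    if not any(ch in ESCAPE_MAP for ch in localpart):
        # Nothing to escape: assume the input is already in wire form.
        return localpart
    # Protect literal backslashes first, then escape the other characters.
    protected = _AMBIGUOUS_BACKSLASH.sub(lambda m: "\\5c", localpart)
    return "".join(ESCAPE_MAP.get(ch, ch) for ch in protected)


print(escape_localpart_if_needed("\\20foo"))
print(escape_localpart_if_needed("\\20foo bar"))
```

The two calls reproduce jonas’ examples: `\20foo` is passed through unchanged and `\20foo bar` becomes `\5c20foo\20bar`. Input that mixes escaped and unescaped forms stays ambiguous under this heuristic.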
  317. lovetox i think i make a simple, escape checkbox, that is by default checked
  319. MattJ Or just don't?
  322. Ge0rG checkboxes are awesome
  329. MattJ [X] Sarcasm
  330. Martin ❎ Obviously
  349. flow You could probably just assume that JIDs entered by the user are unescaped and run the escape algorithm over them. The same goes in the other direction: Every JID shown to the user, e.g. in the MUC member list, is unescaped first.
  401. lovetox flow the problem with an assumption without giving the user the choice is
  402. lovetox that you can’t talk to some jids anymore
