jdev - 2024-09-19


  1. lovetox

    singpolyma, https://www.rfc-editor.org/rfc/rfc5122.html#section-2.2
    > Certain characters are allowed in XMPP node identifiers and XMPP resource identifiers but not in the relevant portion of an IRI or URI. The characters are as follows:
    >
    > In node identifiers: [ \ ] ^ ` { | }
    >
    > In resource identifiers: " < > [ \ ] ^ ` { | }
    >
    > The node identifier characters are not allowed in userinfo by the sub-delims rule and the resource identifier characters are not allowed in segment by the pchar rule. These characters MUST be percent-encoded when transforming an XMPP address into an XMPP IRI or URI.

  2. lovetox

    so "xmpp:foo\40example.com@otherexample.com" is not a valid IRI

  3. lovetox

    because \ is not percent encoded
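
The point about `\` can be sketched in Python. This is a minimal illustration, not Gajim's code: `jid_to_xmpp_uri`, `NODE_SAFE` and `RES_SAFE` are hypothetical names, and the safe sets are a conservative subset picked for the example rather than the full inodeid / iresid grammar of RFC 5122. Note that `urllib.parse.quote` also percent-encodes non-ASCII characters, so the result is the URI form rather than the IRI form.

```python
from urllib.parse import quote

# Assumption: conservative sets of characters left literal in each part.
# Everything else outside letters, digits and "_.-~" gets percent-encoded,
# including the characters RFC 5122 section 2.2 lists ([ \ ] ^ ` { | } ...).
NODE_SAFE = "!$()*,;="
RES_SAFE = "!$&'()*+,:;="


def jid_to_xmpp_uri(node, domain, resource=None):
    """Encode each JID part on its own, then join with literal '@' and '/'."""
    uri = "xmpp:"
    if node:
        uri += quote(node, safe=NODE_SAFE) + "@"
    uri += quote(domain, safe="")
    if resource:
        uri += "/" + quote(resource, safe=RES_SAFE)
    return uri


# The backslash from the JID in message 2 becomes %5C; the '@' delimiter stays literal.
print(jid_to_xmpp_uri("foo\\40example.com", "otherexample.com"))
# xmpp:foo%5C40example.com@otherexample.com
```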

  4. singpolyma

    lovetox: sure that's possible. But the browser is percent encoding it anyway so that's not the main source of the issue here

  5. lovetox

    Gajim does treat it like a JID, and not as an IRI. I fixed that

  6. dwd

    > I thought you mean eg xmpp:someone%40example.com which is ok of course
    I mean, it probably works, but technically it's not right.

  7. lovetox

    it's not right i think, and Gajim would not parse this

  8. lovetox

    @ as delimiter between node and domain, is not to be percent encoded, this would prevent a parser from splitting the jid into its parts

  9. singpolyma

    you can percent encode every single char in a URI if you want

  10. singpolyma

    a correct parser will decode every encoded thing

  11. singpolyma

    and so it will still be an @ by the time it hits the jid parser
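
The generic approach singpolyma describes (decode everything, then hand the result to the JID parser) might look like the following sketch. `parse_decode_first` is a hypothetical helper; it ignores query and fragment parts and does no JID validation.

```python
from urllib.parse import unquote


def parse_decode_first(uri):
    """Percent-decode the whole path first, then split it like a plain JID."""
    if not uri.startswith("xmpp:"):
        raise ValueError("not an xmpp URI")
    jid = unquote(uri[len("xmpp:"):])
    bare, _, resource = jid.partition("/")
    node, at, domain = bare.partition("@")
    if not at:
        # No '@' at all: the whole bare part is the domain.
        node, domain = "", node
    return node, domain, resource


# An encoded '@' has already been decoded by the time the split happens.
print(parse_decode_first("xmpp:someone%40example.com"))
# ('someone', 'example.com', '')
```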

  12. singpolyma

    dwd: In what way would you think it's not right?

  13. lovetox

    i was talking about an IRI

  14. lovetox

    that's what we normally use in an xmpp environment, no need to convert something to a URI

  15. lovetox

    Gajim has no support for the URI form

  16. singpolyma

    Those are equivalent for this purpose, IRI vs URI. They use the same escaping

  17. lovetox

    no, seems not, if you say you can percent encode everything in a URI

  18. lovetox

    the IRI spec is quite clear to me, and it nowhere says to percent encode the @ in an xmpp address

  19. lovetox

    there are two sections, on processing an IRI and on generating an IRI, and all it says is to encode by the inode, ires, ihost rules

  20. lovetox

    and respectively decode by these rules, but if you encode the @ you cannot get node and host, because jid splitting does not work anymore

  21. singpolyma

    I don't understand. The @ is not special at all. Why wouldn't you be able to encode it?

  22. singpolyma

    You split on the @ once you have the jid you are parsing. That's not part of the IRI syntax

  23. lovetox

    what do you mean by it's not special? there is a syntax to create an IRI, and it says to place an @ between node and host

  24. lovetox

    nowhere does it say to percent encode it later

  25. singpolyma

    Ah I see, rfc5122 makes it special even though it normally wouldn't be. Interesting. So if you specifically support xmpp IRI then technically it's special even though if you use a generic IRI parser it will work. Hence dwd's "probably works but technically wrong"
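
The other reading, where `@` and `/` are delimiters of the xmpp IRI syntax itself, can be sketched as split first, decode each part afterwards. `parse_split_first` is a hypothetical helper written for this illustration, not Gajim's parser, and it ignores query and fragment parts.

```python
from urllib.parse import unquote


def parse_split_first(iri):
    """Split on the literal '/' and '@' delimiters, then decode each part."""
    if not iri.startswith("xmpp:"):
        raise ValueError("not an xmpp IRI")
    path = iri[len("xmpp:"):]
    bare, _, resource = path.partition("/")
    node, at, domain = bare.partition("@")
    if not at:
        # No literal '@': everything is taken as the domain.
        node, domain = "", node
    return unquote(node), unquote(domain), unquote(resource)


# With a literal '@' the parts come out as expected.
print(parse_split_first("xmpp:foo%5C40example.com@otherexample.com"))
# ('foo\\40example.com', 'otherexample.com', '')

# With an encoded '@' there is nothing to split on, so no node is found.
print(parse_split_first("xmpp:someone%40example.com"))
# ('', 'someone@example.com', '')
```

A decode-first parser would return node `someone` and domain `example.com` for the second input, which matches the "probably works, but technically it's not right" verdict.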

  26. lovetox

    i implemented the IRI syntax because that's what we usually use to share, URIs do not make much sense in the context of sending xmpp links. And there is a special paragraph on transforming an IRI to a URI and back, but i read that, and it was complicated enough that i didn't want to consider it

  27. lovetox

    the problem with that is that most implementors do url_encode("xmpp:" + JID)

  28. lovetox

    and think that's valid
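
To see what that shortcut produces, here is a small Python sketch that uses `urllib.parse.quote` as a stand-in for a generic `url_encode` (an assumption: other languages' helpers leave different characters untouched, so their output may differ).

```python
from urllib.parse import quote

jid = "foo\\40example.com@otherexample.com"

# Encoding the whole string also hits the scheme separator and the '@'.
whole = quote("xmpp:" + jid, safe="")
print(whole)
# xmpp%3Afoo%5C40example.com%40otherexample.com

# Encoding only the JID keeps the scheme, but the '@' delimiter is still lost.
jid_only = "xmpp:" + quote(jid, safe="")
print(jid_only)
# xmpp:foo%5C40example.com%40otherexample.com
```

Neither result keeps the literal `@` between node and host that the xmpp IRI syntax expects.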

  29. singpolyma

    IRI vs URI isn't relevant in most contexts yeah

  30. singpolyma

    What's different here is generic IRI syntax vs rfc5122 syntax

  31. lovetox

    yeah i don't understand the use case where it would, so i would try to agree that we all use IRI, or someone eli5 to me how that conversion from URI to IRI works :)

  32. singpolyma

    URI vs IRI is just about Unicode or not

  33. lovetox

    ah, because IRI allows unicode chars

  34. lovetox

    and URI percent encodes them all

  35. singpolyma

    Yes

  36. singpolyma

    That's all
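
The "Unicode or not" difference can be sketched as a one-way conversion: every non-ASCII character in the IRI is replaced by the percent-encoded bytes of its UTF-8 form, and ASCII (including existing escapes) is left alone. `iri_to_uri` is a hypothetical helper; it assumes the input is already a well-formed IRI and skips Unicode normalization and any IDNA handling of the host.

```python
def iri_to_uri(iri):
    """Percent-encode the UTF-8 bytes of every non-ASCII character."""
    out = []
    for ch in iri:
        if ord(ch) < 0x80:
            out.append(ch)  # ASCII, including existing %XX escapes, is kept
        else:
            out.extend(f"%{b:02X}" for b in ch.encode("utf-8"))
    return "".join(out)


print(iri_to_uri("xmpp:jiři@čechy.example/v%20Praze"))
# xmpp:ji%C5%99i@%C4%8Dechy.example/v%20Praze
```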