XSF Discussion - 2020-09-15


  1. mdosch

    Some iteam members present? https://mail.jabber.org/pipermail/operators/ seems to be offline. At least I can't reach it.

  2. mdosch

    Meh, once you complain, all of a sudden it works again. :-/

  3. eevvoor

    mdosch, yeah, the presentation phenomenon ;)

  4. flow

    jonas’> "MattJ, note that XML forbids control characters, even in escaped form." Do you have a reference for me?

  5. flow

    I just tried to find the corresponding part via a quick look at the XML 1.0 spec, but failed. I can very well believe that something like that is in there.

  6. MattJ

    flow, https://www.w3.org/TR/REC-xml/#sec-references

  7. MattJ

    "Character references" (i.e. &#nnn; escapes) are only allowed to represent legal characters

  8. flow

    MattJ, "Legal characters are tab, carriage return, line feed, and the legal characters of Unicode and ISO/IEC 10646"

  9. flow

    not sure (yet) if this definition of "legal character" includes e.g. control characters

  10. MattJ

    "Characters referred to using character references MUST match the production for Char."

  11. MattJ

    Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
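
A minimal sketch of the Char production quoted above, written as a Python predicate. The helper names `is_xml_char` and `parses` are hypothetical and only serve this illustration; `parses` relies on the standard library's expat-based parser to show that a character reference to a code point outside Char is rejected.

```python
import xml.etree.ElementTree as ET


def is_xml_char(cp: int) -> bool:
    """Return True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def parses(document: str) -> bool:
    """Return True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False
```

With these helpers, `is_xml_char(0x9)` is true and `is_xml_char(0x7)` is false, and correspondingly `parses("<a>&#9;</a>")` succeeds while `parses("<a>&#7;</a>")` fails.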

  12. flow

    ahh

  13. flow

    so I guess everything that does not match this production must be escaped via JSON escaping

  14. MattJ

    Yes
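
A rough sketch of that idea, assuming Python's standard `json` module as the JSON encoder. With its default `ensure_ascii=True`, `json.dumps` emits only printable ASCII, so control characters such as U+0007 end up as `\u0007` inside the JSON text. The remaining XML-significant characters are then handled by ordinary XML escaping. The function names `json_to_xml_text` and `xml_text_to_json` are hypothetical.

```python
import json
from xml.sax.saxutils import escape, unescape


def json_to_xml_text(value) -> str:
    """Serialize value as JSON, then escape it for use as XML character data."""
    # Step A: JSON escaping turns control characters into \uXXXX sequences.
    encoded = json.dumps(value)
    # Step B: XML escaping covers '&', '<' and '>'.
    return escape(encoded)


def xml_text_to_json(text: str):
    """Reverse of json_to_xml_text: undo XML escaping, then parse the JSON."""
    return json.loads(unescape(text))


payload = {"note": "bell\x07 & <tag>"}
wire_text = json_to_xml_text(payload)
```

Here `wire_text` contains no raw U+0007, and `xml_text_to_json(wire_text)` returns the original payload.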

  15. flow

    although I could imagine that it could be disputed whether "\u0007" is valid in JSON

  16. flow

    but that's not our discussion :)

  17. flow

    now i wonder if there are code points that are legal in XML 1.0 but have to be escaped using hex entity encoding

  18. Link Mauve

    flow, depends on the context, for instance in attributes, U+0022 and U+0027 are both legal, but one of them has to be escaped.

  19. Link Mauve

    U+003C in most contexts too of course.
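
To illustrate the attribute case, here is a small sketch using `xml.sax.saxutils.quoteattr` from Python's standard library, which picks a quote style and escapes only what that choice requires. The wrapper `attr` is a hypothetical helper for this example.

```python
from xml.sax.saxutils import quoteattr


def attr(name: str, value: str) -> str:
    """Render name=value as an XML attribute, escaping as the context requires."""
    return f"{name}={quoteattr(value)}"


only_apostrophe = attr("title", "it's")           # double quotes, U+0027 left as-is
only_quote = attr("title", 'say "hi"')            # single quotes, U+0022 left as-is
both_quotes = attr("title", 'both " and \'')      # U+0022 escaped as &quot;
less_than = attr("title", "a < b")                # U+003C escaped as &lt;
```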

  20. flow

    yep, i mean besides the set of general entities (amp, lt, gt, apos, quot)

  21. jonas’

    flow, I don’t think so

  22. Link Mauve

    Hmm, depending on the document encoding, characters outside of its representable set of Unicode?

  23. Link Mauve

    For instance if your document is latin1, any character above that would have to be escaped.
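
The latin1 case can be sketched with Python's built-in `xmlcharrefreplace` error handler, which replaces every character the target encoding cannot represent with a decimal character reference. The function name `encode_text_latin1` is hypothetical, and the sketch only covers character data, not a whole document.

```python
from xml.sax.saxutils import escape


def encode_text_latin1(text: str) -> bytes:
    """Encode XML character data as ISO-8859-1, using character references
    for anything above U+00FF."""
    # Escape '&', '<' and '>' first so that the references added by the
    # encoder are not escaped a second time.
    return escape(text).encode("latin-1", errors="xmlcharrefreplace")


encoded = encode_text_latin1("caf\u00e9 \u20ac & co")
```

U+00E9 fits into latin1 and is written as a single byte, while U+20AC does not and becomes `&#8364;`.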

  24. flow

    encoding happens one layer below, shouldn't be relevant

  25. Link Mauve

    Obviously irrelevant in the context of XMPP.

  26. jonas’

    flow, does it?

  27. jonas’

    it does

  28. jonas’

    nevermind

  30. flow

    the encoding, on the wire or within the file, is not relevant for the discussion how we shove JSON in XMPP's XML

  31. Link Mauve

    flow, because it is set to UTF-8 in XMPP, but e.g. if you use an XML editor and save as latin1, said editor would have to replace any codepoint above U+00FF with an escaped version of it, giving the exact same result as if it had been escaped previously.

  32. Link Mauve

    Am I wrong?

  33. Link Mauve

    What the character data represents is irrelevant in this case.

  34. flow

    Link Mauve: JSON -A> XMPP -B> XML -C> wire

  35. flow

    we are only talking about A and B here

  36. Link Mauve

    Ok.

  38. flow

    that the resulting XML in this chain is encoded using UTF-8 before being put on the wire is irrelevant for the discussion about what and how to escape JSON in XMPP's XML

  40. Link Mauve

    Well not so much, it can be considered irrelevant only because UTF-8 allows representing all legal characters in XML 1.0 in unescaped form, so only illegal characters (but legal in JSON) have to be taken care of.

  41. flow

    Link Mauve, but the set of illegal XML 1.0 characters does not depend on the encoding used to store the XML, right?

  42. Link Mauve

    No, it doesn’t.

  43. flow

    ahh double negation

  45. flow

    not sure if in agreement or not

  46. Link Mauve

    It is an agreement. ^^

  47. flow

    phew

  48. Link Mauve

    :)

  49. jonas’

    language!

  50. jonas’

    (is hard)

  51. Zash

    something something langsec

  53. emus

    Mid-month reminder to send your project news, articles and updates to the XMPP Newsletter for September! 💌️ https://github.com/xsf/xmpp.org/pull/787