jdev - 2023-03-06


  1. pep.

    https://github.com/mastodon/mastodon/pull/23913 Mastodon implementing rich text formatting with an ok wire format. hint hint

  2. jonas’

    Transform h1 through h6 tags to <p><strong>contents</strong></p>

  3. jonas’

    m(

  4. Zash

    Wasn't it already HTML embedded in JSON?

  5. pep.

    Well they're not respecting the semantics of html, but at least they're not just stuffing markdown in there

  6. pep.

    Zash, yeah but a very much restricted set

  7. nicoco

    I'm curious, why would markdown be a bad thing? too limited?

  8. Zash

    Not limited enough!

  9. Zash

    nicoco, we abolished XHTML-IM because web developers can't be trusted not to pass through <script> and stuff, and markdown is an HTML superset with the same problems

  10. Zash

    Even with XEP-0393, there was at least one case of someone blindly passing it through a markdown library with HTML pass-through enabled

  11. Zash

    Thus we can't have nice ~things~ styling, it's not safe!

  12. Ge0rG

    we can't have nice ~things~ people

  13. nicoco

    thanks for the context. html pass-through sounds terrible indeed.

  14. Zash

    Ge0rG, we can't fix people tho, only take away their nice things

  15. Ge0rG

    that's a good start.

  16. Menel

    What was the reason not to just add many big warnings about what not to do in the XEP? And if someone does it anyway, it is on them?

  17. MSavoritias (fae,ve)

    🤷 good question

  18. MSavoritias (fae,ve)

    > I'm curious, why would markdown be a bad thing? too limited?
    Xhtml helps with accessibility https://github.com/mastodon/mastodon/pull/23913#issuecomment-1455180785

  19. MSavoritias (fae,ve)

    Html*

  20. MSavoritias (fae,ve)

    Plus it's easier on the client to use the format, instead of having to guess with whatever we currently have

  22. jonas’

    it would help if they didn't translate headers to <p><strong/></p>

  24. pep.

    nicoco, to me it's not about the input format. It doesn't matter at all. It's all about wire format

  25. MSavoritias (fae,ve)

    Yeah. Standard format to parse instead of arbitrary characters

  26. Link Mauve

    jonas’, this is probably at display time, not in the wire.

  27. jonas’

    sure

  28. Peter Waher

    Zash: The solution is not to ban the entire (and useful) technology, but to recognize the problem and make sure to highlight/require that the script tag not be permitted (i.e. pass HTML or Markdown through a preprocessor that weeds out illegal/unwanted stuff). To this list, I would add the style attribute and the style element, and any element defined for the HTML header.
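
A minimal sketch of the kind of preprocessor described here, using only Python's standard `html.parser`. The allowlist, the URL scheme list, and the names `AllowlistSanitizer` and `sanitize` are assumptions made for illustration; they are not the XHTML-IM element set or any client's actual implementation. Unknown tags are stripped while their text is kept, and `script`, `style`, and header elements are dropped together with their content.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist for illustration; not the XHTML-IM profile.
ALLOWED_TAGS = {"p", "br", "strong", "em", "a", "ul", "ol", "li",
                "blockquote", "code", "pre"}
ALLOWED_ATTRS = {"a": {"href"}}          # no style, no on* handlers
DROP_WITH_CONTENT = {"script", "style", "head", "title"}
SAFE_URL_PREFIXES = ("http://", "https://", "xmpp:", "mailto:")
VOID_TAGS = {"br"}


class AllowlistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # drop the tag itself; text inside is still handled
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, set()):
                continue
            if name == "href" and not value.strip().lower().startswith(SAFE_URL_PREFIXES):
                continue  # rejects javascript:, data:, and similar schemes
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if not self.skip_depth and tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            # The parser has already decoded entities, so re-escape the text.
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

This sketch does not rebalance unclosed tags, so real deployments would more likely rely on whatever maintained sanitizer their framework already ships.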

  29. pep.

    Basically what we had in xhtml-im

  30. Peter Waher

    correct

  31. Peter Waher

    It might be included in compliance requirements that such security vulnerabilities are addressed (i.e. compliance requirements for clients)

  32. MSavoritias (fae,ve)

    Sounds like a nice step to me

  33. moparisthebest

    > Xhtml helps with accessibility
    > https://github.com/mastodon/mastodon/pull/23913#issuecomment-1455180785
    Except it doesn't because markdown is HTML so... (No one that I'm aware of is advocating for a markdown wire format)

  34. singpolyma

    XHTML-IM is only "abolished" in that the spec says deprecated on it. It's still the best thing we've got, and it's implemented by several clients and even by memberbot :)

  35. moparisthebest

    xhtml-im is XMPP's "malloc/free is fine, just don't make mistakes" :)

  36. Zash

    Someone motivated just needs to un-deprecate it with a bigger security warning 🤷

  37. singpolyma

    Honestly I would remove all the "restricted subset" stuff from the spec, that's up to the implementation. Lots and lots of apps have to sanitize HTML inputs already, including feed readers etc, it's not like this problem is unique to us or hard to solve

  38. singpolyma

    Just use whatever obvious sanitizer your framework/context already has, the one every non-XMPP app is using

  39. Peter Waher

    PS: Markdown is not HTML, but is a content format (and is often used to generate HTML, but can be used to generate other formats as well). Most Markdown dialects support the embedding of HTML elements (even though that restricts its usefulness when it comes to generating other presentation formats). Still, it is not HTML, and it is parsed (and often pre-processed) before HTML is generated. There are very restricted Markdown dialects also, just as XHTML-IM was a very restricted version of XHTML. Benefits of Markdown include (apart from being easy to write) being extensible with a loosely coupled structure. It might make interoperability a challenge, unless interoperability is structured in a similar way as is done with XMPP (with extensions for different types of constructs).

  40. moparisthebest

    Peter Waher: a valid html document is a valid markdown document

  41. Link Mauve

    Peter Waher, Markdown is an HTML superset, in the sense that every valid HTML document is also valid Markdown. Markdown just extends HTML with other formatting, like "*foo*" in CDATA becoming synonymous with <em>foo</em>.

  42. Link Mauve

    Some Markdown implementations also reject HTML elements present in the text.

  43. Link Mauve

    It is usually configurable.

  44. Peter Waher

    No it's not

  45. Peter Waher

    from the creator of the original Markdown:

  46. Peter Waher

    https://daringfireball.net/projects/markdown/

  48. singpolyma

    No one is proposing markdown for xmpp anyway :P

  49. Peter Waher

    yes

  50. Peter Waher

    but it was not added to the list of proposed extensions

  51. Peter Waher

    A content-extension where multiple content formats could be embedded in a message (a more abstract and generic version than the existing)

  52. Zash

    multipart/alternative? feels like it may be overkill while at the same time not too unlike what we already have with plain ol' <body> and xhtml

  53. moparisthebest

    Multiple copies of the body in different places is also an antifeature: see: lang

  54. Zash

    Or email-style <body>Click here to see the newsletter</body>

  55. pep.

    > Multiple copies of the body in different places is also an antifeature: see: lang
    xml:lang is actually a proper accessibility feature :)

  56. pep.

    Maybe someday we'll stop dismissing those

  57. jonas’

    "we"?

  58. MSavoritias (fae,ve)

    Do we actually have a way for me to hint what language i am sending?

  59. jonas’

    I'd like to be excluded from that particular "we"

  60. MSavoritias (fae,ve)

    In the message

  61. jonas’

    MSavoritias (fae,ve), yes, xml:lang on your stream is by default transferred down to the <body/> of your message, unless your client overrides it for the stanza or the body.

  62. Zash

    This is also a privacy leak! Such fun!

  63. MSavoritias (fae,ve)

    Why? Can't you always set it to English?

  64. jonas’

    you can, but what's the point then :)

  65. MSavoritias (fae,ve)

    Then it's not a leak ;)

  66. jonas’

    (and also breaks accessibility)

  67. MSavoritias (fae,ve)

    For the privacy nuts

  68. jonas’

    sure

  69. jonas’

    well, ideally, you'd set it to whatever (klingon or so) on the stream, and then explicitly for each message/body

  70. jonas’

    privacy-wise

  71. pep.

    jonas’: individually maybe, but as a community I think we largely fail

  73. jonas’

    MSavoritias (fae,ve), you can even have multiple <body/> elements with different xml:lang to send translations of your message!

  74. MSavoritias (fae,ve)

    Sounds cool though. More clients should hint like that imo and have a setting for it

  75. jonas’

    and then clients can pick based on local preference what to display
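
A rough sketch of that selection step, using Python's standard `xml.etree.ElementTree` rather than aioxmpp's API. The helper names `bodies_by_lang` and `pick_body`, the sample stanza, and the fallback policy are hypothetical; only the `jabber:client` namespace and the `xml:lang` inheritance (stream default, then stanza, then body) come from the specifications mentioned in this discussion.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
CLIENT_NS = "jabber:client"

# Illustrative stanza; addresses and texts are made up.
STANZA = """
<message xmlns="jabber:client" xml:lang="en" to="juliet@example.com">
  <body>Hello!</body>
  <body xml:lang="de">Hallo!</body>
</message>
"""


def bodies_by_lang(stanza_xml, stream_lang=None):
    """Map each body's effective xml:lang to its text."""
    message = ET.fromstring(stanza_xml)
    # xml:lang is inherited: stream default, overridden by the stanza,
    # overridden again by an explicit attribute on the body.
    message_lang = message.get(XML_LANG, stream_lang)
    return {
        body.get(XML_LANG, message_lang): body.text or ""
        for body in message.findall(f"{{{CLIENT_NS}}}body")
    }


def pick_body(bodies, preferred):
    """Return the body for the first matching language preference."""
    for lang in preferred:
        if lang in bodies:
            return bodies[lang]
        # Loose match on the primary subtag, so "en-GB" accepts "en".
        primary = lang.split("-")[0].lower()
        for have, text in bodies.items():
            if have and have.split("-")[0].lower() == primary:
                return text
    # Nothing matched: show some body rather than nothing (assumed policy).
    return next(iter(bodies.values()), None)


bodies = bodies_by_lang(STANZA)
print(pick_body(bodies, ["de", "en"]))
print(pick_body(bodies, ["fr", "en-GB"]))
```

Two recipients with different preference lists can end up seeing different bodies of the same message, which is the whole point of the feature.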

  76. MSavoritias (fae,ve)

    😱 where is this stuff? It sounds cooler than stickers almost

  77. jonas’

    MSavoritias (fae,ve), aioxmpp has a complete implementation of that, it's in RFC 6120

  80. jonas’

    or RFC 6121 maybe

  81. jonas’

    (plus the corresponding dependencies)

  82. jonas’

    (such as the xml:lang inheritance rules in XML 1.0)

  83. MSavoritias (fae,ve)

    Heh. Lots of rfcs to cover

  84. jonas’

    indeed

  85. jonas’

    flipside: you can have multiple <body/> elements which say completely different things in different languages

  86. jonas’

    <body xml:lang="en">We want peace!</body><body xml:lang="de">Hafen um 9 angreifen!</body>

  87. jonas’

    nice covert channel.

  88. Zash

    Same with xhtml-im

  89. Zash

    same with email and multipart messages

  90. jonas’

    indeed

  91. jonas’

    and OMEMO, in fact :)

  92. jonas’

    instead of <body>YoUR ClIENt DoeS NOt SUpPOrT OMEmO</body>, you could also send something completely different <3

  93. pep.

    Yeah, in fact most OMEMO messages already have covert messages!!

  94. jonas’

    but this being one of those things where you have to trade absolute security and solving social issues with technical means vs. nice things, I'm ok with erring on the side of nice things

  95. Zash

    OTR in the <body> of OMEMO messages?

  96. jonas’

    Zash, door's there, hush, out

  97. Zash

    FINE

  98. jonas’

    most cursed message

  99. jonas’

    Zash, and then add an OX element, for good measure!

  100. pep.

    jonas’, yeah, that's also a trade-off I'm happy to make if it means including more people in

  101. Zash

    YESSsS

  102. Zash continues eating popcorn

  103. pep.

    For some reason I don't get, some people are completely immune to this

  104. jonas’

    I'm beyond the ~popcorn~ chocolated-peanut stage

  105. jonas’

    pep., immune to what?

  106. jonas’

    popcorn?

  107. pep.

    hah

  109. pep.

    To understanding that their solution to these "Security issues" is to shut some of our users out

  110. MSavoritias (fae,ve)

    Balance and all that

  111. jonas’

    it's ~turtles~ trade-offs all the way down

  112. MSavoritias (fae,ve)

    Question though: what is bad with 394?

  113. pep.

    Yeah, except when that balance affects more than just you, but also every other client around you

  114. MSavoritias (fae,ve)

    Xep 394

  115. jonas’

    MSavoritias (fae,ve), it's a solution to a problem which shouldn't exist in the first place

  116. jonas’

    it introduces all kinds of nuances, and we should just go back to properly sanitized XHTML-IM

  117. jonas’

    with nice test vectors

  118. MSavoritias (fae,ve)

    Heh you are one of *these* people /s

  119. Zash

    server-injected exploits in every message you mean

  120. pep.

    Testing suites could actually be a useful addition to the XSF's roles.

  121. jonas’

    MSavoritias (fae,ve), *these*?

  122. jonas’

    pep., *subtly points at '392*

  123. MSavoritias (fae,ve)

    That want xhtml :P

  124. jonas’

    MSavoritias (fae,ve), yeah sure

  125. jonas’

    nothing else makes sense

  126. jonas’

    it's semantic, it's well-established, it's well-documented, it embeds perfectly in XMPP, which is already XML-based

  127. jonas’

    all other formats also come with their own non-trivial injection security issues.

  128. jonas’

    we should "just" solve that by providing test vectors which implementations need to pass for security checks

  129. jonas’

    '394 is a huge overly complex mess, '392 is a huge underly complex mess

  130. Zash

    the mess is constant, you can only move it around

  131. jonas’

    (note who is the author of '394 ...)

  132. jonas’

    s/'392/'393/ obviously, '392 is the best™

  133. pep.

    jonas’, yeah 392, but again, that's not "The XSF", that's just you :)

  134. pep.

    Council could enforce this on accepting specs

  135. jonas’

    oh, I'm not listed as author of '394 anymore

  136. pep.

    I think you removed yourself

  137. jonas’

    yeah

  138. MSavoritias (fae,ve)

    Yeah. Accessibility, privacy and tests

  139. jonas’

    MSavoritias (fae,ve), then maybe for context, I wrote the initial version of '394 before I realized that it's not a good way forward

  140. MSavoritias (fae,ve)

    Yeah. It's not a good idea sending stuff out of band imo

  141. MSavoritias (fae,ve)

    The text should be structured itself

  142. jonas’

    ack

  143. MSavoritias (fae,ve)

    Like markdown or xhtml

  144. jonas’

    markdown is not structuring of the text itself

  145. jonas’

    (like '393 is not)

  147. jonas’

    there is no clear separation of text and structure

  148. jonas’

    but I'm not going down *that* rabbithole tonight

  149. MSavoritias (fae,ve)

    In the sense that the text itself tells you how to render it. 394 seems like it adds it in the metadata

  150. MSavoritias (fae,ve)

    Which is an odd place

  151. jonas’

    pep. can explain all that to you

  152. pep.

    jonas’, yeah we've been going over it again in other channels..

  153. pep.

    For the past few days

  154. MSavoritias (fae,ve)

    :)

  155. jonas’

    good thing I was too distracted to pay attention

  156. jonas’

    I guess I'll distract myself for another few days until this storm blows over again

  157. pep.

    You're not in there!

  158. pep.

    Someday it'll be big enough for someone(tm) to resubmit it, and then it'll be chaos again for some time

  160. pep.

    I'm trying to gather some thoughts around why conveying intent is important (and actually the whole point of a protocol to me)