XMPP Council - 2023-12-16


  1. moparisthebest

    How would current council feel about un-deprecating xhtml-im ? I don't even know if that's the right word or if we have a way to do so :)

  2. singpolyma

    I would be in favour

  3. larma

    Why would we want that over something that's less of a mess to parse, implement and correctly display?

  4. moparisthebest

    larma: such as?

  5. singpolyma

    XHTML is the standard for this kind of thing basically everywhere and I'm not aware of good alternatives anyway

  6. moparisthebest

    ^

  7. larma

    XEP-0394 is easier to parse and easier to implement safely with most toolkits

  8. larma

    and it has far fewer features, which makes it more likely that we can agree on one feature set.

  9. singpolyma

    it looks harder to implement, and of course is a brand new thing we'd be making up, but probably possible to convert to/from yes

  10. moparisthebest

    394 has never seen any implementations and as far as I know still suffers from basic "can't even implement" problems due to no one agreeing how to index into the body

  11. Zash

    https://xmpp.org/extensions/xep-0426.html ?

  12. larma

    singpolyma: should be easy for Android for example: https://developer.android.com/reference/android/text/Spannable#setSpan(java.lang.Object,%20int,%20int,%20int)

  13. singpolyma

    moparisthebest: we have a whole XEP about how to index into a body, and it's being used for the fallback XEP as well, so while I'm not a fan of 394 I don't think that's an issue per se
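
The indexing question can be made concrete with a small sketch. It assumes offsets into the body are counted in Unicode code points (my reading of XEP-0426, linked above) and shows how a client whose toolkit counts UTF-16 code units, as Java strings on Android do, might translate them. The function name `codepoint_to_utf16_offset` is hypothetical and not taken from any XEP or library.

```python
def codepoint_to_utf16_offset(body: str, cp_offset: int) -> int:
    """Translate an offset counted in code points into UTF-16 code units."""
    if not 0 <= cp_offset <= len(body):
        raise ValueError("offset outside body")
    # Python strings index by code point, so slicing yields the prefix directly.
    prefix = body[:cp_offset]
    # Every UTF-16 code unit is 2 bytes; characters outside the BMP take two units.
    return len(prefix.encode("utf-16-le")) // 2


body = "\U0001F600 hi"   # one emoji, a space, then "hi"
start, end = 2, 4        # code-point span covering "hi"
utf16_span = (
    codepoint_to_utf16_offset(body, start),
    codepoint_to_utf16_offset(body, end),
)
print(utf16_span)        # (3, 5): the emoji occupies two UTF-16 units
```

If sender and receiver disagree on the counting unit, every span after the first non-BMP character shifts, which is the kind of interoperability problem raised above.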

  14. moparisthebest

    Maybe...

  15. singpolyma

    larma: for basic stuff yes. I'm not sure lists will work that way for example

  16. larma

    If you don't plan to use a webview for rendering on Android (which I strongly suggest not to do) the only thing that you have is Spannable. But yes, Android comes with a parser that converts something that is like HTML to Spannable, so if you just shovel XHTML-IM into that, you likely are safe, but the feature set will be exactly the same as if you applied spans directly (and not parsing is faster).

  17. singpolyma

    yes, I'm using android's built in html to spans parser right now. I think I want to fork it because it's a bit limited, but it's fine for a lot of cases and very safe

  18. singpolyma

    the only place xhtml-im really has a chance to give major problems is web apps, and even these routinely must deal with xhtml input from user anyway (contenteditable, comment forms, etc etc) so every framework has evolved one or more ways to deal with this

  19. singpolyma

    of course I don't doubt the ability of devs to get this wrong when they try hard enough, but it's not unique to us

  20. larma

    But then it would be easier for you to have your spans just created directly from XML instructions rather than parsing the XHTML-IM manually and then creating spans from that, no?

  21. singpolyma

    Easier on android specifically? Not actually since the html support is built in and this custom format wouldn't be. But it wouldn't be terribly hard, no

  22. larma

    I meant, because you mentioned you want to fork the HTML parser of Android, aka do it yourself

  23. moparisthebest

    To be clear I wouldn't un-deprecate xhtml-im without a ton of security considerations added in first

  24. singpolyma

    well, I wouldn't do it myself, just fork

  25. singpolyma

    moparisthebest: I dunno about "a ton" just one: consider this markup to be user input and process safely just as you do all other xhtml user input in your application

  26. moparisthebest

    Maybe some links to CVEs too

  27. larma

    I mean, the best would be to make explicit that you are supposed to always parse the XHTML-IM in your code and then create the respective UI elements (HTML tags, Spans, whatever) from your own parsing result. Never shovel XHTML-IM directly into a renderer without understanding what that renderer could do with it.
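
A minimal sketch of the "parse it yourself, then build your own UI elements" approach, under stated assumptions: the allowlist `ALLOWED`, the style names, and the function `to_spans` are all hypothetical, and a real client would walk the element it already received from its stream parser rather than call `ET.fromstring` on a string as the demo does.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"
# Hypothetical allowlist: XHTML tag -> style name this client can render.
ALLOWED = {"strong": "bold", "b": "bold", "em": "italic", "i": "italic", "code": "code"}


def to_spans(element, active=frozenset()):
    """Flatten an XHTML-IM element into (text, styles) pairs.

    Tags outside the allowlist contribute their text but no styling,
    and all attributes (style, onclick, href, ...) are ignored.
    """
    tag = element.tag
    if isinstance(tag, str) and tag.startswith(XHTML_NS):
        tag = tag[len(XHTML_NS):]
    else:
        tag = None  # not an XHTML element: never treated as markup
    styles = active | {ALLOWED[tag]} if tag in ALLOWED else active
    spans = []
    if element.text:
        spans.append((element.text, styles))
    for child in element:
        spans.extend(to_spans(child, styles))
        if child.tail:
            # Tail text belongs to the parent, so it gets the parent's styles.
            spans.append((child.tail, styles))
    return spans


body = ET.fromstring(
    '<body xmlns="http://www.w3.org/1999/xhtml">'
    '<p style="color:red">Hi <strong>there</strong>, <script>alert(1)</script>ok</p>'
    "</body>"
)
spans = to_spans(body)
```

Here `spans` holds plain text runs with at most a `bold` style on "there"; the `style` attribute is dropped and the `<script>` content survives only as inert text, because nothing is ever handed to an HTML renderer.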

  28. singpolyma

    Yes mostly, though many systems have safe html user input stuff built in, such as android and ios/macos

  29. singpolyma

    which do what you say internally, but are fine to use, you don't need to "parse" per se

  30. larma

    But also, XHTML-IM has a ton of features I would want to remove, because they go beyond semantic relevance and don't work reliably cross platform. colors, anything that sizes in pixel values, etc

  31. singpolyma

    that's up to the renderer/app, no?

  32. larma

    Support for most of them is SHOULD in the current XEP, not MAY

  33. singpolyma

    sure, I would honestly just remove all the lists of tags and attributes stuff from the XEP

  34. singpolyma

    And maybe not reference any particular version of XHTML directly

  35. singpolyma

    But I'd be happy to leave it and change a bunch of stuff to MAY. whatever makes people happy :)

  36. daniel

    Sure you could strip xhtml-im down to its bones to a point where it has almost nothing to do with the current xep. Or implement 394, which has the added benefit of not having two sources of truth in a single message

  37. daniel

    My main issue with xhtml-im is not the security issues but the fact that it includes things like color

  38. singpolyma

    I'm not really sure why that's an issue. I'd want optional colour in 394 also if we went that way

  39. singpolyma

    In fact, colour is one of the most used xhtml-im things right now because of biboumi

  40. singpolyma

    I'm sure I can write something to convert between the two formats without too much issue, so it's not a disaster for me either way. XHTML-IM just feels cleaner and more normal and also already working due to having existed for so long

  41. larma

    For color, I was considering to describe a feature where you can change the color without specifying it directly. You could set color to something like primary accent, secondary accent, success, warning, danger, info, muted.

  42. larma

    This would allow for reasonable color features for chat bots and similar, but it would be up to the receiving app to decide what exactly those values mean
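
One way the receiving side of such colour hints might look, as a rough sketch: the hint names come from the message above, while the hex values, the `LOCAL_THEME` table, and the `resolve_color` function are made up for illustration and not part of any XEP.

```python
# Hint names from the discussion; the hex values are an invented local theme.
LOCAL_THEME = {
    "primary": "#3584e4",
    "secondary": "#9141ac",
    "success": "#2ec27e",
    "warning": "#e5a50a",
    "danger": "#e01b24",
    "info": "#1c71d8",
    "muted": "#77767b",
}


def resolve_color(hint, theme=LOCAL_THEME):
    """Return the local colour for a semantic hint, or None to keep the default."""
    return theme.get(hint.strip().lower())


print(resolve_color("Warning"))   # "#e5a50a" in this example theme
print(resolve_color("#ff0000"))   # None: raw colours from the sender are not honoured
```

Because the sender only names a role, the receiving app keeps control over contrast and can avoid colours that collide with its own link or accent colours.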

  43. Zash

    larma, sounds a bit like what terminals do? :)

  44. larma

    Zash, yes, or many toolkits actually

  45. larma

    I was literally checking https://getbootstrap.com/docs/4.0/utilities/colors/ 😉

  46. Zash

    Picking a point on a color wheel might also be interesting, as long as you can't fail with contrast

  47. larma

    But you don't know what the local accent/link colors are, so you might be accidentally hitting those

  48. larma

    Contrast is not the only issue 😉

  49. larma

    You don't want to make it possible for the other end to send something that looks like a link without it being a link

  50. larma

    or look like a button, blockquote, ...

  51. Zash

    Reasonable

  52. dan.caseley

    I can't read both standards to contribute real-time to the conversation, but...
    * I agree about not shoving unparsed XHTML into rendering
    * If the main value of XHTML-IM is coloured messages, why not solve that specifically. A glance over 394 also feels like overshooting.

  53. singpolyma

    I wouldn't call that the main value. The main value is adding formatting to messages in a modular way. Biboumi just happens to use a lot of colour because irc

  54. MSavoritias (fae,ve)

    Also a big value is that it already exists as the biggest markup in use today: HTML. Which can bring a *lot* of accessibility benefits too

  55. dan.caseley

    So then is the purpose about doing a thing that's 2-way useful for Biboumi? What I mean is... Outside of bridging, will it get much use?

  56. dan.caseley

    I can see the value of using HTML rather than rolling our own

  57. singpolyma

    dan.caseley: well, it used to be very much used in xmpp land, but some of the clients have removed support since the xep was deprecated and new clients have mostly not added it

  59. larma

    For Dino we once planned to support a very minimal subset of XHTML-IM, but that was essentially what is also supported by XEP-0394, so it really didn't make a lot of sense. Also even for just that small subset, we'd need to implement CSS as some clients used CSS font-weight for bold rather than HTML <strong>, etc. XHTML-IM really overcomplicates things if you don't want to use any existing HTML+CSS parsers, but imposes high risk if you do. That's why it makes sense to not use it IMO

  60. singpolyma

    Yeah, I could certainly see a useful level of implementation that ignores all css

  61. singpolyma

    I'm glad on android some css was there because of the colour thing, but if you don't want colours it's probably quite skippable

  62. singpolyma

    There's another thing I use xhtml-im for we haven't mentioned yet and that is for inline images. Specifically "custom emoji"

  63. moparisthebest

    I'm also for not doing CSS and against remote clients choosing colors for me, though what larma said re color hints or whatever doesn't sound bad

  64. moparisthebest

    inline images are sorely needed

  65. larma

    Are they? Most popular IM systems don't have a generic support for inline images. Some do custom emojis, but those have very specific restrictions on dimensions (square, height must match font height). Generic inline images again are very risky as they might resemble UI elements and thus could be used to trick users

  66. larma

    I'd be fine with something specifically for custom emojis, but not inline images in general.

  67. singpolyma

    Those are the same thing IMO

  68. singpolyma

    A custom emoji can maybe look like a button but since you can't control size or position it won't actually in any useful sense

  69. pep.

    "Never shovel XHTML-IM directly into a renderer without understanding what that renderer could do with it." < Never shovel anything? directly into anything else without understanding what that anything else could do with it? That can surely be added to security considerations but it's hardly xhtml-im specific.

  70. singpolyma

    Indeed

  71. larma

    well, there are things that are more likely to happen than others. But yes, we already had to explain people that they also shouldn't shovel XEP-0393 message styling blindly into markdown parsers...

  72. pep.

    We had to do that again today in xmpp-rs@

  73. theTedd

    The proper way to un-deprecate a XEP is to submit a new one which just happens to have a lot of similar content 😉 (Resurrecting will lead to poor assumptions about what hasn't changed, and old implementations will be reused.)

  74. theTedd

    The main issue with XHTML-IM is that few people are going to want to write a parser for it, so they will be passing it straight to a HTML renderer - no matter how many warnings and restrictions you put in the XEP, that's the fastest and easiest route. But obviously developers always read the documentation in detail and never just look at the examples, so there's no need to worry. Even for those who do try to sanitise, consider the fun people have just trying to parse XML safely.

  75. theTedd

    As for colours: you may find them an insult to the senses, but clearly some users do want to use them, so allow them, but also give clients the option to ignore them.

  76. pep.

    "old implementations will be reused." < You mean current implementations? :P

  77. theTedd

    Sure

  78. pep.

    Poezio never removed it, it's still very useful

  79. pep.

    fwiw I'm happy to rework colors too. I'm all for improving accessibility

  80. singpolyma

    > well, there are things that are more likely to happen than others. But yes, we already had to explain people that they also shouldn't shovel XEP-0393 message styling blindly into markdown parsers...
    ... 393 isn't even markdown compatible? What?

  81. larma

    but on first sight it looks similar 😉

  82. pep.

    singpolyma, yeah not that was an explicit goal

  83. singpolyma

    > The proper way to un-deprecate a XEP is to submit a new one which just happens to have a lot of similar content 😉 (Resurrecting will lead to poor assumptions about what hasn't changed, and old implementations will be reused.)
    Yes, reusing old implementations would be one of the good benefits

  84. theTedd

    Only where those implementations are safe and sanitize the content, not just pass it on to a HTML renderer which conveniently executes embedded javascript

  85. singpolyma

    theTedd: sure, but since there are very few popular xmpp web things that's unlikely to be most of the current implementations

  86. theTedd

    Since few people want to write the parser, yes they embed a HTML parser for displaying the content

  87. singpolyma

    larma, pep.: so does this manifest as people asking why their markdown parser isn't working?

  88. pep.

    singpolyma, for us today it was "heh it's easy to add markdown support in Rust! It's just a new crate and a oneliner!"

  89. pep.

    "Yes it's markdown? Look if I do *foo* I get bold" "well in markdown this would italicize"

  90. pep.

    I think people call markdown stuff that vaguely uses sigils to get somewhat rich text

  91. singpolyma

    And **foo** is bold in markdown but in 393 is unstyled

  92. theTedd

    And there are several different dialects of markdown, each with _some_ overlap

  93. singpolyma

    (though apparently in conversations **foo** is bold. I should see how hard that is to fix...)