XSF Discussion - 2017-10-20

  1. pep.

    When are we going to get out of the loop :(

  2. moparisthebest

    pep.: *Never* muhuhahaha

  3. pep.

    When are we going to get out of the loop :(

  4. zinid

    Just deprecate XML, it's insecure

  5. MattJ

    along with Javascript and SQL

  6. zinid

    that would be great

  7. Zash

    And x86

  8. Zash

    And neurons

  9. jonasw

    Link Mauve, my issue with colors is that they are not portable at all. your choice of colors may be hard to read on my background

  10. jonasw

    since people tend to forget that, I had to block inbound colors on my desktop client.

  11. jonasw

    I’m not against color support, but we need to define a way that plays nicely with themes (Ge0rG suggested supporting a palette of XEP-0392 colors which applications can then adapt to their backgrounds according to XEP-0392).

  12. daniel

    jonasw: just set a background color as well

  13. jonasw

    daniel, yes, that makes things *so* much better :P

  14. daniel

    Xhtml is perfect. People are just using it wrong

  15. daniel

    It's the people. Not the protocol

  16. jonasw

    Link Mauve, re JSON based protocol-break: I think it can be done in a way which makes arbitrary XHTML injection much harder to let happen than with XHTML-Im.

  17. zinid

    jonasw: so you think it's worth redoing everything just because the new format will have fewer issues? Not everyone agrees with that, which is why this discussion is not going to stop

  18. jonasw

    zinid, *shrug*. I’d still be okay with the "provide an audited reference implementation for XHTML-IM solution", but I feel that won’t get past SamWhited. I’m trying to compromise here without losing what we can do today.

  19. zinid

    if SamWhited doesn't pass it that would mean nothing will ever happen

  20. zinid

    and we're back to the beginning

  21. Ge0rG

    Wow, that discussion.

  22. Ge0rG

    It seems to be so fundamental, maybe we should question the use of XML as well...

  23. jonasw

    Ge0rG, you love to pour oil into fires, don’t you?

  24. Ge0rG

    jonasw: okay, that was a bit harsh. Let's only question session binding and message routing.

  25. jonasw

    Ge0rG, that sounds good.

  26. jonasw

    didn’t we do that already?

  27. Ge0rG

    jonasw: or maybe we need a different approach to the "database synchronization" thought: XMPP 2.0 is an HTML document that's slowly loading from the server as new messages arrive.

  28. Ge0rG

    That way we can implement dumb clients in Electron.

  29. jonasw

    Ge0rG, you’re describing Comet

  30. Zash

    jonasw: Pouring oil? More like feeding it an optimally mixed solution of pure oxygen and aerosolized oil

  31. Ge0rG

    Because the server is trustworthy by default, there is no need to cover XSS

  32. jonasw

    Zash, :)

  33. Zash

    Yes, trust in the server. We're all trustworthy people making them.

  34. Ge0rG

    Well, problem solved. Back to real work.

  35. zinid

    Ge0rG: of course we can question the use of XML, because it's shit

  36. zinid

    now we need to rewrite everything?

  37. Zash

    Yeah, starting over from scratch is so much fun!

  38. pep.

    Why don't we use http and json already

  39. Ge0rG

    Wait, there is a ready-made solution already. Matrix!

  40. pep.

    Ge0rG, I think we came to the same conclusion

  41. zinid

    pep.: because jabber was initially designed with flaws: there is no separation between encoding rules and data types, so we cannot simply change encoding rules

  42. zinid

    I've been talking about it since 2004

  43. pep.

    zinid: that was sarcastic

  44. pep.

    If it wasn't obvious enough :)

  45. zinid

    pep.: right, but that would be possible if xmpp wasn't designed the way it was

  46. pep.

    Sure, but who would do that :x

  47. zinid

    matrix has exactly the same problem btw, when json is no longer modern and fancy it will be abandoned

  48. Zash


  49. Ge0rG

    So we need to use an established standard that won't vanish. ASN.1!

  50. Zash


  51. jonasw


  52. jonasw

    I always get attacks of pain when I read ASN.1, and I don’t even know why.

  53. Ge0rG

    So recently I started reading that old "Easy introduction into the subset of ASN.1 relevant for [application]", and I gave up after twenty pages.

  54. jonasw


  55. Zash

    jonasw: Spend some time with it. It should turn into a mild itch eventually.

  56. Zash

    I hope you all read x509guide.txt

  57. Steve Kille

    ASN.1 is wonderful. MUCH easier to write specs in than XML and compact on the wire

  58. Zash

    Steve Kille: Which encoding rules? ;)

  59. zinid

    Steve Kille: the learning curve is too high, that's why they are crying :)

  60. jonasw

    Steve Kille, reminds me, I (with my editor hat on) would like to fix some XML issues in the MIX XEP at some point. Let me know when that’d work for you so that we don’t produce conflicts.

  61. zinid

    Zash: why would encoding rules bother an application programmer? You don't need to deal with them directly

  62. Steve Kille

    jonasw: now would be an excellent time for you to do this. I do not have MIX XEP checked out.

  63. jonasw

    Steve Kille, mhm, right, now won’t work for me though :-). I’ll ask you again when I’m ready (I expected a reply along the lines of "I have some update prepared which I’ll push at some point")

  64. Ge0rG

    jonasw: you could ask for a time window as well ;)

  65. Steve Kille

    There are various changes I want to make, but they all need discussion with my co-author

  66. jonasw

    Ge0rG, indeed; I’m away for a week now though

  67. jonasw

    so this was probably not the smartest time for me to ask :)

  68. Ge0rG

    jonasw: you could ask something like "is it okay if I do it in a week" :P

  69. jonasw

    Ge0rG, in two weeks rather

  70. Steve Kille

    I have one editorial task, which you are welcome to take on (but I can also do easily) which is to ensure that ALL of the examples use .example (following IETF guidelines)

  71. Steve Kille

    I got editorial changes forced for:

  72. Steve Kille


  73. Steve Kille

    jonasw: ping me on MIX editing. I am not expecting to make changes over the next few weeks

  74. jonasw

    Steve Kille, okay thanks

  75. pep.

    (What is a white page service?)

  76. Ge0rG

    pep.: a public phone book?

  77. pep.


  78. pep.


  79. Ge0rG

    I think there is one very important twist missing from the current XHTML-IM vs. markdown vs. body-hints debate: We also need an explicit encoding for Emoji in ASCII form. Some clients will parse `(c)` as a mug of coffee, others as ©. We should standardize how Emoji are to be transmitted and translated.

  80. Kev

    As unicode. Job done.

  81. dwd

    Ge0rG, A good point to raise, and also Kev's right.

  82. Kev

    I suggest <no-emoticons xmlns='...'/> as a child element of the message.

  83. edhelas

    in Unicode we trust

  84. Kev

    So that recipients know that the sending client doesn't use text emoticons, only unicode emoji.

  85. Ge0rG

    Kev: in theory, you are right. In practice, it is impossible for a client implementation to figure out which subset of Unicode is supported by its platform.

  86. Kev

    That's ok, in a sense. Clients can e.g. swap out unicode emoji for images locally.

  87. Ge0rG

    Kev: we could of course define entity caps for a client to show which subset it supports.

  88. Kev

    I think (much as I hate it) that that's what Swift's going to have to do on Linux, at least, and probably Windows.

  89. Ge0rG

    I'm also a proponent of emoji hugification in IM. Because you can't read them at 8x16 pixels.

  90. Ge0rG

    TIL about emoji CLDR short names: "The CLDR short name for the character or sequence. Short names vary by language, and are from the CLDR data."

  91. Ge0rG

    😄 = :glimlaggende_gesig_met_oop_mond_en_glimlaggende_oë:

  92. Kev

    Or "Bob", to his friends.

  93. Ge0rG

    (I really like the Slack notation used for Emoji, where you use `:short_name:`)

  94. edhelas

    until you start to use things like jabber:x:data namespace in your message

  95. jonasw

    aaand we’re back to the plain text markup story

  96. edhelas

    personally, I think that this discussion is pointless

  97. jonasw

    edhelas, note though that emoji names are (a) always surrounded by whitespace and (b) not to be used in transport, only in display when no emoji capability is there by the rendering engine

  98. Kev

    Ge0rG: That's great, but that's a client-side thing, no reason for us not to then transmit as unicode.

  99. Zash

    Are we going in circles or some kind of 5-dimensional figure 8?

  100. dwd

    jonasw, What? No, :stuff: is a client-side thing, surely?

  101. jonasw

    dwd, I sure hope though!

  102. edhelas

    if we have XHTML-IM security issues, we open tickets in the bugtrackers of the buggy clients, and that's it, end of story

  103. jonasw

    dwd, I sure hope so!

  104. Ge0rG

    Actually, Kev is right. Unicode should be the default transport format for Emoji. However, it would be great to have input conversion from ":)" to "😀" etc.

  105. Kev

    I agree wholeheartedly with that.

  106. jonasw


  107. jonasw

    it’s also lovely how this discussion messes with poezio

  108. Zash

    Client UX/UI issue?

  109. edhelas

    remember the Carbons security issue ? did we remove Carbons ? no

  110. Ge0rG

    Kev: that also means that clients without Emoji support need to keep a mapping table of all Emoji symbols to some other representation, like ASCII
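
The mapping table Ge0rG mentions could be approximated without shipping any data, as a rough sketch: Python's standard `unicodedata` module exposes Unicode character names, which can be turned into `:alias:` style text. Note the assumptions: the function name `emoji_to_ascii` is hypothetical, Unicode character names are not the same thing as CLDR short names, and treating every astral-plane "Symbol, other" character as an emoji is only a heuristic.

```python
import unicodedata


def emoji_to_ascii(text: str) -> str:
    """Replace emoji-like characters with a :name: fallback (heuristic sketch)."""
    out = []
    for ch in text:
        # Assumption: astral-plane characters of category "So" count as emoji.
        if ord(ch) > 0xFFFF and unicodedata.category(ch) == "So":
            name = unicodedata.name(ch, None)
            if name:
                out.append(":" + name.lower().replace(" ", "_") + ":")
                continue
        # Everything else (including unnamed characters) passes through unchanged.
        out.append(ch)
    return "".join(out)


print(emoji_to_ascii("ok \U0001F600"))
```

Modifiers, variation selectors and ZWJ sequences are not handled here, which is part of why such a fallback table is uglier in practice than it sounds.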

  111. jonasw

    edhelas, one could argue that the vulnerability produced by XSS in XHTML-IM-Web-Clients is worse

  112. Kev

    Ge0rG: ugly, but sadly true.

  113. jonasw

    (up to stealing your password)

  114. edhelas

    we just added a small paragraph, released a security message, fixed all the buggy clients, and that was it

  115. Kev

    I'm actually tempted to write a XEP, given otherwise I'm busy setting up an AD this morning.

  116. jonasw

    Kev, oh the pain.

  117. edhelas

    jonasw define "worse"

  118. Ge0rG

    Kev: or we use client caps to indicate emoji support, and then the server can automagically translate unicode into :ascii: in outgoing messages!

  119. jonasw

    setting up LDAP is fun already, I don’t want to know what AD is like

  120. dwd

    Kev, You need some solid displacement activity.

  121. jonasw

    edhelas, stealing your password is arguably worse than being able to impersonate peers.

  122. Ge0rG

    Kev: write an XEP for what?

  123. Kev

    <no-emoticons xmlns='...'/>

  124. edhelas

    it's the responsibility of client and server developers to properly sanitize their I/O

  125. jonasw

    edhelas, yes, but do we need to make it hard for them?

  126. Ge0rG

    Kev: how would you handle ":)" with such an XEP marker?

  127. Kev

    Ge0rG: By rendering ":)"

  128. edhelas

    it's not "hard" to sanitize, most of those clients didn't even have any kind of security layer

  129. Ge0rG

    Kev: but that's not what people expect.

  130. jonasw

    Ge0rG, on the UI input or when received over the network?

  131. jonasw

    edhelas, source?

  132. Ge0rG

    jonasw: in both situations, because backward compatibility!11!

  133. jonasw

    I personally find @style hard to sanitize. You can’t do that with regexes alone.

  134. Kev

    Ge0rG: It's the sending client saying to the recipient "I have already done emoji conversion locally, so if there's something that looks like an ASCII emoticon in here, just render it as-is, because it's what the sender intended".
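
The receiving-side behaviour Kev describes might look roughly like the sketch below. Everything here is an assumption for illustration: the `EMOTICONS` table and the `render_body` function are hypothetical, and the marker is passed in as a plain boolean because the namespace of the proposed `<no-emoticons/>` element was never specified in the discussion.

```python
# Hypothetical emoticon table; a real client would ship a much larger one.
EMOTICONS = {
    ":)": "\U0001F642",  # slightly smiling face
    ":(": "\U0001F641",  # slightly frowning face
    ";)": "\U0001F609",  # winking face
}


def render_body(body: str, sender_set_no_emoticons: bool) -> str:
    """Convert ASCII emoticons unless the sender marked the message as final."""
    if sender_set_no_emoticons:
        # The sender already did its own conversion: render the text as-is.
        return body
    # Legacy sender: replace whitespace-delimited emoticons locally.
    return " ".join(EMOTICONS.get(token, token) for token in body.split(" "))


print(render_body("see you :)", sender_set_no_emoticons=True))
print(render_body("see you :)", sender_set_no_emoticons=False))
```

With the marker present, a literal `:)` survives untouched; without it, the client falls back to whatever local conversion it did before.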

  135. jonasw

    (I’m pretty sure that the subset of CSS allowed by @style is not regular)

  136. edhelas

    jonasw if you code a client and simply do .innerHTML = message.body, well you should seriously go take some Web dev courses again
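
For a plain-text body, the baseline alternative to assigning the raw body as markup is to escape it first. A minimal sketch using Python's standard `html.escape` (the function name `body_to_html` is hypothetical, and a web client would use the equivalent facility of its own platform instead):

```python
import html


def body_to_html(body: str) -> str:
    """Escape a plain-text message body so it can be embedded in HTML safely."""
    # html.escape replaces &, <, > and (by default) both quote characters.
    return html.escape(body)


print(body_to_html("<img src=x onerror=alert(1)>"))
```

This only covers plain text; it does nothing for the XHTML-IM payload itself, which is where the harder sanitizing questions in this discussion start.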

  137. Ge0rG

    Kev: that makes sense to me.

  138. jonasw

    edhelas, and what if I don’t, but @style contains a background-image with a URL which makes the browser execute javascript?

  139. jonasw

    it’s a thing.

  140. edhelas

    seriously you want to reinvent your own markup ?

  141. jonasw

    edhelas, actually, I’m replaying the arguments of others here.

  142. edhelas

    looks like some JS hipster project

  143. jonasw

    my primary goal in this situation is to retain the capability for well-defined rich markup. If that’s by inventing our own rich markup or by providing a solid reference sanitizer for XHTML-IM, I don’t care.

  144. dwd

    edhelas, Most (decent) frameworks make that hard, but possible. The problem is that none of them make sanitizing HTML easy. The correct solution for embedding unknown-origin HTML is to enclose it in an iframe, but the problem there is that UX takes a hit.

  145. Ge0rG

    dwd: is copy&paste of message histories the only UX problem with iframes?

  146. jonasw

    Ge0rG, scrolling, sizing the iframe, …

  147. jonasw

    (but copy&paste is the worst, I think)

  148. jonasw

    and also iframes make me furious, I have to be able to select all the text!

  149. Ge0rG

    scrolling. ewwww.

  150. jonasw

    (I don’t think that you can tell an iframe to behave like a <div/> regarding layout, so that’s a whole can of worms there)

  151. jonasw

    (but maybe it’s possible I haven’t checked)

  152. edhelas

    personally I just find XHTML-IM not great for users in general

  153. jonasw

    edhelas, why?

  154. edhelas

    and as I said, you'll have the exact same problem when users use Atom in Pubsub

  155. jonasw

    for atom in pubsub, using an iframe is probably more realistic

  156. edhelas

    I have to deeply sanitize Atom content to ensure that I don't have JS/CSS/iframe injections in it

  157. jonasw

    Ge0rG, and I’m not sure you can access the dom of an iframe from the outside

  158. edhelas

    no, hell no

  159. edhelas

    just sanitize things, you have nice libs for that

  160. jonasw

    edhelas, where are those nice libs, and why doesn’t the XHTML-IM XEP mention them?

  161. edhelas

    https://github.com/ezyang/htmlpurifier for PHP

  162. jonasw

    for JS that is

  163. jonasw

    because we’re talking about actual web clients

  164. jonasw

    things running in HTML+JS

  165. edhelas


  166. edhelas

    jonasw https://code.google.com/archive/p/google-caja/wikis/JsHtmlSanitizer.wiki ?

  167. jonasw

    There was an error obtaining wiki data: {"data":{"text":null},"status":-1,"config":{"method":"GET","transformRequest":[null],"jsonpCallbackParam":"callback","url":"https://www.googleapis.com/storage/v1/b/google-code-archive/o/v2%2Fcode.google.com%2Fgoogle-caja%2Fwiki%2FJsHtmlSanitizer.wiki?alt=media","headers":{"Accept":"application/json, text/plain, */*"}},"statusText":""}

  168. jonasw

    edhelas, did you just google that or is that a library with a good security track record?

  169. edhelas

    just googling around

  170. jonasw

    yeah, that’s not how this will work

  171. Ge0rG

    Wow. a11n for Emoji is a black art on its own: http://unicode.org/repos/cldr/trunk/specs/ldml/tr35-general.html#SynthesizingNames

  172. edhelas

    but as I said, more than XHTML, I prefer to strip more of the tags and style, because I don't want my client discussion UI to look ugly

  173. jonasw

    Ge0rG, a11y for unicode is a black art on its own

  174. jonasw

    edhelas, we agree on burning @style with fire, right?

  175. edhelas

    I just agree to stop trying to fix a problem by bringing another one

  176. edhelas

    we have XHTML-IM, just bring good practice and fix those clients

  177. jonasw

    convince council :-)

  178. jonasw

    deep inside I’m still thinking that XHTML-IM can be "fixed" to the extent that developers are well-aware that it is tricky to get right and they’re also given the tools to do it right.

  179. Ge0rG

    web developers don't give a shit. They wouldn't be web developers otherwise.

  180. jonasw

    that’s a bit harsh

  181. dwd

    Ge0rG, I don't think that's true. I'm on a call with one now.

  182. edhelas

    Ge0rG thanks, much appreciated <3

  183. jonasw

    Ge0rG, I think there are web developers out there who actually want to make a good thing and who actually care. It’s not the majority though.

  184. jonasw

    (from what I feel when using web applications)

  185. Ge0rG

    edhelas: sorry. That was not intended to insult you. You are doing great work, actually.

  186. Ge0rG

    Oh my. They even distinguish AE and BE in the CLDR annotations: "😋" = "face savoring food" = "face savouring food"

  187. jonasw


  188. Ge0rG

    "🥖" = "baguette bread" = "French stick" = "baguette" in English, depending on your locale.

  189. jonasw

    en_GB everywhere!

  190. Ge0rG

    An i18n a11y nightmare in five Emojis!

  191. Ge0rG

    (not that you could conclusively count the number of Emojis in a given UTF8 string)
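
A small illustration of why counting is not straightforward: a single visible emoji can consist of several code points, so neither the byte length nor the code point count of a string matches the number of symbols a user sees. The two sequences below (a face palm with a skin tone modifier, and a family joined with zero-width joiners) are chosen only as examples.

```python
# One visible symbol: face palm + dark skin tone modifier.
facepalm_dark = "\U0001F926\U0001F3FF"

# One visible symbol on supporting platforms: man + ZWJ + woman + ZWJ + girl.
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"

for label, s in (("facepalm", facepalm_dark), ("family", family)):
    code_points = len(s)                  # Python counts code points
    utf8_bytes = len(s.encode("utf-8"))   # size on the wire
    print(label, code_points, utf8_bytes)
```

Getting the user-perceived count requires grapheme cluster segmentation as defined by Unicode, which the Python standard library does not provide.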

  192. jonasw


  193. jonasw

    why can’t we have nice things :(

  194. Ge0rG

    jonasw: because 🤦🏿🤖💩🤰

  195. jonasw


  196. Ge0rG

    I wonder if "🤰♂" will be translated into "Arnold Schwarzenegger"

  197. dwd

    I might have persuaded our FE dev to wade in on the XHTML issue.

  198. jonasw

    dwd, in which way "wade in"?

  199. dwd

    jonasw, State his opinion. Given that he's a web/js developer first and foremost, it might be a useful perspective.

  200. jonasw


  201. zinid

    edhelas also a web dev ;)

  202. edhelas

    damn, you unveiled me

  203. jonasw

    cue dramatic fanfare

  204. edhelas

    my evil plan of deploying broken XHTML-IM everywhere is falling apart

  205. Ge0rG

    world domination through XMPP XSS

  206. dwd

    zinid, It is possible there is more than just one web dev indeed.

  207. zinid

    dwd: so we need moar opinions? we haven't got enough yet? :)

  208. dwd

    zinid, We don't need opinions, so much as a consensus.

  209. Link Mauve

    “10:17:01 jonasw> I personally find @style hard to sanitize. You can’t do that with regexes alone.”, indeed, you first have to split on “;” and then to split each value on “:” and then to pick only the elements you want to support from the left part. I would assume there is no language in which this is at all difficult, though.

  210. Link Mauve

    background-image is explicitly not allowed by XHTML-IM, you wouldn’t allow it.

  211. Ge0rG

    Link Mauve: couldn't you still inject function calls into the right part?

  212. Link Mauve

    Ge0rG, none of the allowed properties support any URI or function call, I checked that the other day.
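
The split-and-whitelist approach Link Mauve describes, combined with Ge0rG's concern about the value side, can be sketched as follows. This is an illustration under stated assumptions, not a vetted sanitizer: the `ALLOWED_PROPERTIES` set is an illustrative whitelist that should be checked against XEP-0071 before use, the function name `sanitize_style` is hypothetical, and the value check is deliberately blunt (it also rejects harmless values such as `rgb(...)` colors).

```python
# Illustrative whitelist (an assumption; verify against XEP-0071 before relying on it).
ALLOWED_PROPERTIES = {
    "background-color", "color", "font-family", "font-size", "font-style",
    "font-weight", "margin-left", "margin-right", "text-align", "text-decoration",
}

# Characters that could introduce function calls, URIs, escapes or markup.
FORBIDDEN_VALUE_CHARS = "()\\:/<>"


def sanitize_style(style: str) -> str:
    """Keep only whitelisted declarations whose values contain no risky characters."""
    kept = []
    for declaration in style.split(";"):
        name, sep, value = declaration.partition(":")
        name = name.strip().lower()
        value = value.strip()
        if not sep or not value or name not in ALLOWED_PROPERTIES:
            continue
        if any(ch in value for ch in FORBIDDEN_VALUE_CHARS):
            continue
        kept.append(f"{name}: {value}")
    return "; ".join(kept)


print(sanitize_style(
    "color: red; background-image: url(javascript:alert(1)); font-weight: bold"
))
```

Rejecting suspicious values outright, rather than trying to repair them, keeps the logic small enough to audit, at the cost of dropping some legitimate styling.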

  213. Ge0rG

    Link Mauve: does that guarantee that browsers won't execute function calls / URIs when encountered there?

  214. Link Mauve

    I’ve found that browsers take security issues seriously, if the specification doesn’t allow such things and a bug is found in a browser, I’d expect it to be fixed very quickly.

  215. Ge0rG

    XHTML-IM it is, then!

  216. Zash


  217. Ge0rG

    Zash: it looks like the only item on the https://xmpp.org/extensions/xep-0038.html#sect-idm139548995353376 list that can't be mapped to Unicode is :jabber:. What an irony.

  218. Kev

    💡 doesn't quite cut it, does it?

  219. Zash

    Throw in some zwj stuff and call it a day

  220. Ge0rG

    💡 + Variant Selector 16.

  221. Ge0rG

    So I've written a poezio plugin that replaces all incoming Emojis with their respective :alias:. And now I see how ugly it looks and that I still need to look on my phone to see if it was an Emoji originally.

  222. moparisthebest

    but markup in <body> is bad, someone said

  223. Zash

    Unicode isn't markup

  224. Ge0rG

    moparisthebest: yes, and what I wrote underscores that point

  225. moparisthebest

    Ge0rG‎: I still need to look on my phone to see if it was an Emoji originally

  226. moparisthebest

    my question is, why do you care

  227. Zash

    @nickname is markup, right?

  228. Link Mauve


  229. moparisthebest

    yep that's bad too, need a fancy UI and protocol established for highlighting someone

  230. Link Mauve

    moparisthebest, mentions already got a XEP.

  231. SamWhited

    Did it?

  232. SamWhited

    Link please!

  233. Ge0rG

    moparisthebest: I don't know why I care, but it turned out I do

  234. Link Mauve

    I don’t think it was this one I was thinking about: https://xmpp.org/extensions/inbox/jid-mention.html

  235. Link Mauve

    Maybe references?

  236. Zash

    Heh, Link

  237. Link Mauve

    Yes? :p

  238. moparisthebest

    Ge0rG, sounds like a personal problem vs a protocol problem :)

  239. SamWhited

    *snort* I didn't mean to do that.

  240. SamWhited

    I'd forgotten about this one, thanks

  241. Zash

    Can {xep attention} be directed at a MUC participant?

  242. Bunneh

    Zash: XEP-0224: Attention (Standards Track, Draft, 2008-11-13) See: https://xmpp.org/extensions/xep-0224.html

  243. moparisthebest

    jid-mention looks like a great way to spam more, that looks like about all it's useful for

  244. Link Mauve

    Zash, sadly no.

  245. goffi

    Link Mauve: jid-mention has been vetoed in favor of reference

  246. Link Mauve


  247. SamWhited

    Just got kicked from mucs on this server when I joined in another client. No idea why.

  248. Link Mauve

    SamWhited, a lot of people just left with “Kicked: remote server not found: Server-to-server connection failed: DNS resolution failed” as their status.

  249. Link Mauve

    The same second as yours.

  250. SamWhited

    ah, maybe my joining from another client was a red herring

  251. dwd

    In terms of "marking up" a mention, XEP-0372 is the best thinking we have currently.

  252. SamWhited

    References seems nice, but I wish the bit that reads "TODO: define character appropriately" were expanded. I don't think it could be implemented as-is in an interoperable manner without that bit being expanded.