it is simply not legal in XML 1.0 character data
Zash
UNACCEPTABLE
jonas’
so if you tried to escape it with  or somesuch, that would still be not-well-formed
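A quick way to check this, as a minimal sketch: Python's standard-library ElementTree parser (backed by expat) rejects both the raw control character and a numeric character reference to it. U+0016 is only an assumed example here (it is what CTRL+V typically produces in a terminal), and `is_well_formed` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document: str) -> bool:
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


# Plain text and an allowed control character (TAB) are accepted.
print(is_well_formed("<body>hi</body>"))
print(is_well_formed("<body>a&#x9;b</body>"))

# The raw control character and its escaped form are both rejected.
print(is_well_formed("<body>\x16</body>"))
print(is_well_formed("<body>&#x16;</body>"))
```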
Martin
It's also interesting how it ends up there:
https://bugs.debian.org/974205
jonas’
hah
Martin
> Switch to console, run
> profanity, and try some escape sequence such as hitting CTRL+V twice,
> then enter. Disconnects from the server again.
This one triggered it for me too.
debacle
Martin, IMHO such sequences should be filtered by the UI already, before it ever reaches the XML or XMPP library. I.e. ncurses.
jubalh
how will one define 'such sequences'?
jubalh
list all of them? only allow certain characters? what about unicode then?
debacle
jubalh, not sure. Check whether the input is valid UTF-8? I hope either glib, ncurses or expat has a function to check that. In case of invalid input, blame the user and throw away their input.
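A minimal sketch of such a check, in Python rather than the C libraries named above, and with a hypothetical helper name `decode_utf8_or_none`. It also shows why a UTF-8 check alone would not catch this bug: a control character such as U+0016 is perfectly valid UTF-8.

```python
from typing import Optional


def decode_utf8_or_none(raw: bytes) -> Optional[str]:
    """Decode raw input strictly; return None when it is not valid UTF-8."""
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return None


# Valid UTF-8 decodes normally.
print(decode_utf8_or_none(b"wei\xc3\x9f"))

# Invalid byte sequences are rejected, so the input can be thrown away.
print(decode_utf8_or_none(b"\xff\xfe"))

# A lone control byte is valid UTF-8 and passes this check unchanged.
print(repr(decode_utf8_or_none(b"\x16")))
```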
Martin
>Is the German letter ß a real letter or just a fancy way of writing ss?
Eszett, not SS! OMG…
Link Mauve
Martin, uppercasing might not agree with you. :p
Martin
Sorry, I don't get it.
Link Mauve
uppercase("weiß") might give "WEISS".
Link Mauve
I think it depends on the Unicode version.
Martin
We have an uppercase Eszett now!
Martin
https://en.wikipedia.org/wiki/Capital_%E1%BA%9E
Link Mauve
Turns out, Unicode is from before 2017.
Link Mauve
So it had to support the only existing rule back then.
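For illustration, this is how the case mapping behaves in Python, whose `str.upper()` follows the Unicode full case mappings. The variable names are just for this sketch; other languages and libraries may behave differently.

```python
word = "weiß"

# Uppercasing expands ß to SS, so the result is one character longer.
shouted = word.upper()
print(shouted)

# The mapping is lossy: lowercasing again does not bring the ß back.
round_trip = shouted.lower()
print(round_trip)

# The capital letter ẞ (U+1E9E) exists and lowercases to ß,
# but upper() does not produce it.
capital_eszett = "\u1e9e"
print(capital_eszett.lower() == "ß")
```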
Martin goes on the street and demands inclusion of ẞ
jonas’
jubalh, so, easy. On input, you convert everything to Unicode (please see the link). You’ll then have to filter out all codepoints between U+0000 and U+001F (incl.) except U+0009, U+000A and U+000D
jonas’
then you pass that to the XML library for serialisation as XML
jonas’
(the XML library should hit you if you don’t do the filtering; if it doesn’t, fix it)
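The steps above can be sketched roughly as follows, using Python and its standard-library ElementTree purely for illustration. The names `strip_illegal_c0` and `build_body` and the `<body>` element are assumptions made for this example, not anything taken from profanity or its libraries.

```python
import xml.etree.ElementTree as ET

# The three C0 control characters that XML 1.0 allows: TAB, LF, CR.
ALLOWED_C0 = {0x09, 0x0A, 0x0D}


def strip_illegal_c0(text: str) -> str:
    """Drop U+0000..U+001F except TAB, LF and CR.

    This covers only the range named above; XML 1.0 also excludes
    surrogates, U+FFFE and U+FFFF, which are not handled here.
    """
    return "".join(
        ch for ch in text if ord(ch) > 0x1F or ord(ch) in ALLOWED_C0
    )


def build_body(raw: bytes) -> str:
    """Decode user input, filter it, and let the XML library serialise it."""
    text = raw.decode("utf-8")  # step 1: convert everything to Unicode
    body = ET.Element("body")
    body.text = strip_illegal_c0(text)  # step 2: filter the code points
    return ET.tostring(body, encoding="unicode")  # step 3: serialise


print(build_body(b"hi\x16 there"))
```

The filter runs before serialisation deliberately: a library that does not reject such characters on output would otherwise emit them as they are.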
jubalh
jonas’: will note it down, thanks
flow
the problem is already that the "XMPP (or XML) library" allows such codepoints in CDATA, is there even an XMPP (or XML) library involved?
Zash
If you think there isn't, then *YOU* are the XML library!
flow
well depends, is printf(SOCKET, "<foo bar='baz'>asdf</foo>") an XML library?
flow
*fprintf
Ge0rG
flow: you forgot some format strings that get passed attacker-supplied data
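To make the point about format strings concrete, here is a small sketch in Python rather than C. `naive_stanza` and `escaped_stanza` are hypothetical names; `escape` and `quoteattr` come from the standard-library `xml.sax.saxutils` module.

```python
from xml.sax.saxutils import escape, quoteattr


def naive_stanza(bar: str, text: str) -> str:
    """String formatting with no escaping: the fprintf approach."""
    return "<foo bar='%s'>%s</foo>" % (bar, text)


def escaped_stanza(bar: str, text: str) -> str:
    """Same template, but markup characters in the data are escaped."""
    return "<foo bar=%s>%s</foo>" % (quoteattr(bar), escape(text))


attack = "</foo><evil/><foo>"

# Attacker-supplied text closes the element and injects a new one.
print(naive_stanza("baz", attack))

# With escaping, the same text stays inside the element as character data.
print(escaped_stanza("baz", attack))
```

Note that `escape` only handles `&`, `<` and `>`; it does not remove control characters, so it does not replace filtering the input.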