XSF Discussion - 2019-09-13

jonas’ 07:43:36
flow, I don’t see how your solution suggested on xmpp@ietf is practical. 1. "let’s change all the existing unicode interface implementations!" does not seem practical to begin with 2. it still leaves the issue of software not being on the same version all the time across the internet, and updating the unicode database is a *breaking* update [1], so it won’t happen e.g. on debian [1]: https://postgresql.verite.pro/blog/2018/08/27/glibc-upgrade.html
flow 07:47:31
jonas’, I did not suggest changing all existing unicode interface implementations, nor doing it system wide. The point is, I could take CSH's precis library, add a feature which allows it to (dynamically) load a new Unicode Character Database (UCD) and be done. Yes, probably not everyone will always be on the same Unicode version, but there is not much you can do about it, besides reducing the number of services which are not on the same unicode version. And increasing agility regarding the supported unicode standard helps here
Daniel 07:55:07
yes to my understanding this is the benefit of precis. you can just ship a new unicode database with a minor update of the app. you don’t have to write new code
Daniel 07:55:30
that should be as easy as increasing a build dependency somewhere (or something)
jonas’ 07:55:31
let’s load the unicode database from the xmpp server
Daniel 07:55:46
so one could hope that proliferation of new databases goes rather quickly
jonas’ 07:55:55
well, at least the only implementation of Precis I know for python just uses the db shipped with python
Daniel 07:56:49
well if we identify that early as a potential source of problems we could hand out implementation notes that say please make it pluggable
jonas’ 07:56:57
true
flow 07:57:12
jonas’, i actually pondered putting the unicode database into dns as a PoC. It may not be suited for clients, due to privacy reasons, but maybe for servers. And it's fun ;)
dwd 07:57:24
flow, I agree that your proposal would be an improvement, but like others, I see it as a mitigation given the circumstances rather than an outright solution.
flow 07:58:48
dwd, point of view i guess, I don't think there is a solution which does not involve agility of the supported unicode standard, so I see agility as a solution
flow 07:58:53
but happy to be proven wrong
dwd 07:59:36
flow, I think we have to be aiming for what you're suggesting. I do not think it'll solve things, just lessen their occurrence.
jonas’ 07:59:42
https://github.com/byllyfish/precis_i18n/issues/8
jonas’ 08:00:00
let’s see how they react :)
dwd 08:01:41
jonas’, I suspect it's a matter of having unicodedata able to dynamically load.
jonas’ 08:02:08
dwd, that’s what I’m asking for, ain’t I?
dwd 08:03:07
jonas’, I mean, in the standard Python library, not so much in precis_i18n.
jonas’ 08:03:31
dwd, python uses it for parsing, so I doubt that’s happening
jonas’ 08:03:40
(I think)
jonas’ 08:04:41
having unicodedata load dynamically in a separate object (like you can instantiate separate `random` objects) would be a measure of course
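[Editor's sketch of the "separate object" idea, not something posted in the room. CPython's standard library already exposes one extra, frozen database object, `unicodedata.ucd_3_2_0`, next to the module-level default, which is roughly the shape jonas’ describes, only without any way to load another version. The helper `category_in` is a hypothetical name used for illustration.]

```python
import random
import unicodedata

# The `random` analogy: two independent generator objects with their own state.
rng_a = random.Random(42)
rng_b = random.Random(42)
same_sequence = (
    [rng_a.random() for _ in range(3)] == [rng_b.random() for _ in range(3)]
)

# The stdlib ships exactly two database "objects": the module itself
# (whatever Unicode version this Python build bundles) and a frozen
# Unicode 3.2.0 object kept around for stringprep/IDNA.
current = unicodedata
legacy = unicodedata.ucd_3_2_0


def category_in(db, ch):
    """Hypothetical helper: general category of `ch` according to `db`."""
    return db.category(ch)


emoji = "\U0001F600"  # GRINNING FACE, assigned after Unicode 3.2
print(current.unidata_version, category_in(current, emoji))
print(legacy.unidata_version, category_in(legacy, emoji))
```

[A pluggable PRECIS library could take such a `db` object as a parameter instead of importing the module-level functions directly; how that object would be built from a newer UCD file is left open here.]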
jonas’ 08:12:36
Daniel, you have a working prototype for MIX? Is it using :core:0 or :core:1 for joining? MIX-PAM uses :core:0, but MIX-CORE uses :core:1 :(
Daniel 08:13:49
haven’t noticed that. i seem to be using core:0 and pam:0
jonas’ 08:14:00
nice, it’s pam:1 in the spec though
Daniel 08:14:13
and it was working with the one and only server implementation
Daniel 08:14:37
it has been 3/4 of a year though since i tested things
ralphm 08:18:18
dwd: I'm curious if we could do something with service discovery and/or stream features.
dwd 08:23:21
ralphm, Possibly. But - taking "schloß@example.org" for a moment - if we assume that the canonical form is decided by "example.org", then a local user might type "schloss@example.org", and at some point we ask example.org what the correct version is. At that point, do we need heavyweight canonicalisation on other servers? (We might do - I'm just exploring).
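[Editor's sketch of why "schloß" and "schloss" may or may not compare equal, depending on which mapping a server applies. It uses only Python's built-in string methods and the stdlib `stringprep` tables; it is not a PRECIS implementation, and which rule "example.org" actually uses is exactly the open question in dwd's message.]

```python
import stringprep

typed = "schloss"   # what a local user might type
stored = "schloß"   # what example.org might consider canonical

# Simple lowercasing leaves ß untouched.
lowered = stored.lower()

# Full Unicode case folding expands ß to "ss".
folded = stored.casefold()

# The stringprep case-folding table B.3 (RFC 3454) also maps ß to "ss".
b3_mapped = "".join(stringprep.map_table_b3(ch) for ch in stored)

print(lowered == typed, folded == typed, b3_mapped == typed)
```

[Two servers that disagree on the mapping step would therefore disagree on whether these are the same localpart, which is why asking the owning domain for the canonical form is attractive.]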
Zash 08:26:06
Dear lazyxmpp, I wish for a PRECIS implementation suitable for use in C
ralphm 08:28:02
Did the normalization of ß change between Unicode versions?
U+061C 08:29:32
that response on the ietf mailing list, i believe, misses the point. this is a security issue of denial-of-service, not just compatibility
ralphm 08:30:29
Indeed Nameless RTL Person.
Zash 08:31:17
Should we adopt U+061C instead of the speech bubbles? 🙂
Zash 08:31:25
Or maybe U+00DF
Ge0rG 08:31:59
Zash: U+FFFD was also suggested.
U+061C 08:32:51
(btw, U+061C was the first character that I could find to have an empty nickname. some other ones were already blocked when i tried)
U+061C 08:33:08
so it's not that bad ^^
Zash 08:34:51
Space characters that are in Unicode 3.2 (or whatever stringprep uses) would be normalized to U+20 and then the nickname is forbidden if that's all there is
Zash 08:35:08
Space-ish codepoints outside of that tho
ralphm 08:35:58
Well, sure. That's why PRECIS came about
U+061C 08:35:58
well, a nickname that looks empty
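[Editor's sketch, one way to check Zash's point with the Python standard library: a space-like character that Unicode 3.2 already knew is caught by the stringprep tables and folds to U+0020 under NFKC, while U+061C is unassigned in the 3.2 database and so is invisible to those rules. The variable names are illustrative only.]

```python
import stringprep
import unicodedata

ALM = "\u061c"       # ARABIC LETTER MARK, the nickname in question
EM_SPACE = "\u2003"  # a space character that exists in Unicode 3.2

# NFKC folds EM SPACE to a plain space, but leaves U+061C unchanged.
em_space_nfkc = unicodedata.normalize("NFKC", EM_SPACE)
alm_nfkc = unicodedata.normalize("NFKC", ALM)

# stringprep table C.1.2 (non-ASCII space characters) covers EM SPACE.
em_space_in_c12 = stringprep.in_table_c12(EM_SPACE)

# U+061C: a format character today, unassigned in the 3.2 database.
alm_category_now = unicodedata.category(ALM)
alm_category_3_2 = unicodedata.ucd_3_2_0.category(ALM)
alm_unassigned_3_2 = stringprep.in_table_a1(ALM)  # table A.1: unassigned

print(repr(em_space_nfkc), em_space_in_c12)
print(alm_category_now, alm_category_3_2, alm_unassigned_3_2)
```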
Ge0rG 09:42:41
At least it doesn't crash iPhones
moparisthebest 12:19:50
Nice one jonas’ : > it is required that users of PRECIS have precise control over the unicode database version used.
Seve 12:21:06
precisely
flow 12:22:05
what a precious comment
jonas’ 12:26:23
oh my god, I didn’t even notice
Seve 12:30:13
suuure :D
ralphm 12:34:45
Nice one