XSF Discussion - 2019-09-13


  1. jonas’ flow, I don’t see how your solution suggested on xmpp@ietf is practical. 1. "let’s change all the existing unicode interface implementations!" does not seem practical to begin with 2. it still leaves the issue of software not being on the same version all the time across the internet, and updating the unicode database is a *breaking* update [1], so it won’t happen e.g. on debian [1]: https://postgresql.verite.pro/blog/2018/08/27/glibc-upgrade.html
  2. flow jonas’, I did not suggest changing all existing unicode interface implementations, nor doing it system-wide. The point is, I could take CSH's precis library, add a feature which allows it to (dynamically) load a new Unicode Character Database (UCD) and be done. Yes, probably not everyone will always be on the same Unicode version, but there is not much you can do about it, besides reducing the number of services which are not on the same unicode version. And increasing agility regarding the supported unicode standard helps here
  3. Daniel yes to my understanding this is the benefit of precis. you can just ship a new unicode database with a minor update of the app. you don’t have to write new code
  4. Daniel that should be as easy as increasing a build dependency somewhere (or something)
  5. jonas’ let’s load the unicode database from the xmpp server
  6. Daniel so one could hope that proliferation of new databases goes rather fast
  7. jonas’ well, at least the only implementation of PRECIS I know of for Python just uses the db shipped with Python
  8. Daniel well if we identify that early as a potential source of problems we could hand out implementation notes that say please make it pluggable
  9. jonas’ true
  10. flow jonas’, i actually pondered putting the unicode database into dns as a PoC. It may not be suited for clients, due to privacy reasons, but maybe for servers. And it's fun ;)
  11. dwd flow, I agree that your proposal would be an improvement, but like others, I see it as a mitigation given the circumstances rather than an outright solution.
  12. flow dwd, point of view i guess, I don't think there is a solution which does not involve agility of the supported unicode standard, so I see agility as a solution
  13. flow but happy to be proven wrong
  14. dwd flow, I think we have to be aiming for what you're suggesting. I do not think it'll solve things, just lessen their occurrence.
  15. jonas’ https://github.com/byllyfish/precis_i18n/issues/8
  16. jonas’ let’s see how they react :)
  17. dwd jonas’, I suspect it's a matter of having unicodedata able to dynamically load.
  18. jonas’ dwd, that’s what I’m asking for, ain’t I?
  19. dwd jonas’, I mean, in the standard Python library, not so much in precis_i18n.
  20. jonas’ dwd, python uses it for parsing, so I doubt that’s happening
  21. jonas’ (I think)
  22. jonas’ having unicodedata load dynamically in a separate object (like you can instantiate separate `random` objects) would be a measure of course
  23. jonas’ Daniel, you have a working prototype for MIX? Is it using :core:0 or :core:1 for joining? MIX-PAM uses :core:0, but MIX-CORE uses :core:1 :(
  24. Daniel haven’t noticed that. i seem to be using core:0 and pam:0
  25. jonas’ nice, it’s pam:1 in the spec though
  26. Daniel and it was working with the one and only server implementation
  27. Daniel it has been 3/4 year though since i tested things
  28. ralphm dwd: I'm curious if we could do something with service discovery and/or stream features.
  29. dwd ralphm, Possibly. But - taking "schloß@example.org" for a moment - if we assume that the canonical form is decided by "example.org", then a local user might type "schloss@example.org", and at some point we ask example.org what the correct version is. At that point, do we need heavyweight canonicalisation on other servers? (We might do - I'm just exploring).
  30. Zash Dear lazyxmpp, I wish for a PRECIS implementation suitable for use in C
  31. ralphm Did the normalization of ß change between Unicode versions?
  32. U+061C that response on the ietf mailing list, i believe, misses the point. this is a security issue of denial-of-service, not just compatibility
  33. ralphm Indeed Nameless RTL Person.
  34. Zash Should we adopt U+061C instead of the speech bubbles? 🙂
  35. Zash Or maybe U+00DF
  36. Ge0rG Zash: U+FFFD was also suggested.
  37. U+061C (btw, U+061C was the first character that I could find to have an empty nickname. some other ones were already blocked when i tried)
  38. U+061C so it's not that bad ^^
  39. Zash Space characters that are in Unicode 3.2 (or whatever stringprep uses) would be normalized to U+0020 and then the nickname is forbidden if that's all there is
  40. Zash Space-ish codepoints outside of that tho
  41. ralphm Well, sure. That's why PRECIS came about
  42. U+061C well, a nickname that looks empty
  43. Ge0rG At least it doesn't crash iPhones
  44. moparisthebest Nice one jonas’ : > it is required that users of PRECIS have precise control over the unicode database version used.
  45. Seve precisely
  46. flow what a precious comment
  47. jonas’ oh my god, I didn’t even notice
  48. Seve suuure :D
  49. ralphm Nice one
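
The point about Unicode database versions and "empty-looking" nicknames can be illustrated with a minimal Python sketch. This is not code from the discussion and not how any XMPP server or PRECIS library actually does it. The helper `looks_empty` and the `INVISIBLE` category set are hypothetical names chosen for illustration. The sketch relies only on the standard `unicodedata` module, which exposes the database bundled with the interpreter plus a frozen Unicode 3.2.0 database as `unicodedata.ucd_3_2_0`, so the same check can be run against two database versions.

```python
import unicodedata

# Hypothetical choice for this sketch: general categories treated as
# "invisible" (space/line/paragraph separators, control, format).
INVISIBLE = {"Zs", "Zl", "Zp", "Cc", "Cf"}


def looks_empty(nickname, ucd=unicodedata):
    """Return True if every code point is invisible according to `ucd`.

    `ucd` is any object with a `category(ch)` method, which makes the
    Unicode database "pluggable" in the sense asked for in the chat.
    """
    return all(ucd.category(ch) in INVISIBLE for ch in nickname)


ALM = "\u061c"  # ARABIC LETTER MARK, the code point used as a nickname

# Which database versions are being compared.
print("bundled UCD:", unicodedata.unidata_version)
print("frozen UCD: ", unicodedata.ucd_3_2_0.unidata_version)

# Same nickname, same check, two database versions.
print("bundled:", unicodedata.category(ALM), looks_empty(ALM))
print("3.2.0:  ", unicodedata.ucd_3_2_0.category(ALM),
      looks_empty(ALM, ucd=unicodedata.ucd_3_2_0))
```

Passing the database in as a parameter is the design choice worth noting: two peers running this check with different databases can disagree about the same nickname, which is the kind of version skew the thread is worried about.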