-
Ge0rG
Also something about backward compatibility and changing unicode versions
-
lovetox
but is precis not better equipped to deal with changing unicode versions?
-
lovetox
i read precis defines what code points are valid, while stringprep defines what code points are illegal
-
lovetox
so whats legal changes with each unicode version in stringprep
-
jonas’
lovetox, no, stringprep is pinned to unicode 3.2, so that’s no big deal
-
lovetox
ah ok
-
lovetox
didnt know that
-
jonas’
lovetox, PRECIS on the other hand is not pinned to any unicode version, has no logic to deal with differing versions and thus PRECIS on Unicode X.Y may easily produce/validate strings which are not allowed with PRECIS Unicode X'.Y' for X' != X
-
jonas’
stringprep being pinned to 3.2 is why we cannot have robotface ;)
-
lovetox
sure? when precis defines whats legal instead of whats illegal
-
lovetox
it does not matter what unicode version
-
jonas’
it defines it in terms of unicode categories
-
lovetox
if we assume that new unicode versions only add codepoints
-
jonas’
yes
-
jonas’
if you run PRECIS on 3.2, it will reject strings PRECIS on 9.0 allows
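
To make the version dependence concrete, here is a minimal sketch using Python's standard `unicodedata` module, which exposes both the database bundled with the interpreter and a frozen Unicode 3.2 view (`ucd_3_2_0`). The helper name `category_by_version` is only illustrative, and the result for the newer database assumes an interpreter that ships Unicode 8.0 or later.

```python
import unicodedata

ROBOT_FACE = "\U0001F916"  # the robot face emoji, first assigned in Unicode 8.0


def category_by_version(ch):
    """Return the General_Category of ch under Unicode 3.2 and under the
    database bundled with this interpreter (hypothetical helper)."""
    return {
        "3.2.0": unicodedata.ucd_3_2_0.category(ch),
        unicodedata.unidata_version: unicodedata.category(ch),
    }


# Under 3.2 the code point is unassigned (Cn); newer databases know it as a
# symbol (So). Any rule written in terms of categories therefore gives a
# different answer depending on the database it is evaluated against.
print(category_by_version(ROBOT_FACE))
```

`unicodedata.unidata_version` reports which Unicode version the running interpreter was built with.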
-
jonas’
and updating unicode versions in an app is even harder than updating TLS, I’m afraid
-
flow
could be as easy as "apt-get install unicode-data"
-
lovetox
as i understand the problem is not some unicode library, or a dependency
-
lovetox
its simply that in the standard sometimes things change
-
flow
the standard changes because with newer unicode versions codepoints that were previously unassigned become assigned
-
lovetox
yes, but thats not the only way it changes, and thats not the problem with precis
-
flow
I am not aware of other ways
-
lovetox
Changes to the properties of Unicode code points can occur as the Unicode Standard is modified from time to time. For example, three code points underwent changes in their GeneralCategory between Unicode 5.2 (current at the time IDNA2008 was originally published) and Unicode 6.0, as described in [RFC6452]
-
lovetox
new unicode versions are not just adding stuff on top, sometimes existing stuff changes
-
lovetox
or at least there is no guarantee that existing stuff does not change by the unicode consortium or whoever decides the stuff
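
As a rough illustration of properties changing on already-assigned code points, the sketch below compares Python's frozen Unicode 3.2 view with the database bundled in the interpreter. It cannot reproduce the 5.2 to 6.0 changes from RFC 6452 directly, since Python only ships these two views; the function name `changed_categories` is made up for this example.

```python
import sys
import unicodedata


def changed_categories(limit=sys.maxunicode + 1):
    """List code points that were already assigned in Unicode 3.2 but carry a
    different General_Category in the interpreter's bundled database."""
    changes = []
    for cp in range(limit):
        ch = chr(cp)
        old = unicodedata.ucd_3_2_0.category(ch)
        new = unicodedata.category(ch)
        # "Cn" means unassigned in 3.2; those are pure additions, not changes.
        if old != "Cn" and old != new:
            changes.append((cp, old, new))
    return changes


changes = changed_categories()
print(len(changes), "assigned code points changed category since Unicode 3.2")
for cp, old, new in changes[:5]:
    print(f"U+{cp:04X}: {old} -> {new}")
```

One known case is U+200B ZERO WIDTH SPACE, a space separator (Zs) in 3.2 and a format character (Cf) in current versions.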
-
flow
true, and that could mean that a string that was previously valid as JID part becomes invalid
-
flow
I am not sure how frequent that is
-
flow
What usually happens is that a string that was previously invalid as JID part becomes valid
-
lovetox
yes, also im not sure if this is really something that should hold us back
-
lovetox
we are not designing something for eternity here
-
flow
I'd expect that Unicode tries to prevent re-assigning codepoints whenever possible, for obvious reasons, and only does so if it is decided that the advantages of the re-assignment outweigh the disadvantages
-
lovetox
also i think adding new codepoints is not a problem for precis, as its already defined on classes
-
lovetox
and there will likely be no new classes
-
lovetox
what is a problem are for example mapping rules
-
mdosch👁🗨
> Future version of Prosody won't allow 👁🗨 or robot face in nicknames, thus solving that problem. Seems it's not yet in trunk.
-
lovetox
which can’t be set in stone without knowing all future codepoints
- mdosch can't change his nick back.
-
flow
right, but it means that apps should use the system unicode database, e.g. via Python's unicodedata library that IIRC uses /usr/share/unicode as source
-
mdosch
Ah, it worked.
-
lovetox
i think its very unlikely with precis that something that was valid becomes invalid
-
lovetox
hence i dont see why not use precis
-
lovetox
yes servers and applications need a recent unicode version
-
flow
yeah, it also does not concern me much, and there is nothing you can do about it
-
lovetox
but they also need X other recent libraries
-
Zash
mdosch: It's the MUC that enforces it, and this channel isn't on trunk.
-
Ge0rG
The real problem isn't which version to check against but who's responsible for the check at which border. If the check is only performed by the server owning any given JID, no problems will arise.
-
Ge0rG
If you enforce precis on another server's JIDs, you'll end up kicking people because of mdosch👁🗨 in your MUC
-
Zash
Hence being strict on creation of users, chat rooms, MUC nicknames.
-
Ge0rG
We could solve the problem by requiring a baseline set that's forward compatible, like no " in JIDs, and leave everything extended unicode to the authoritative server
-
Ge0rG
But I want 🤖
-
Zash
AFAIK the problem is mostly about unassigned codepoints, where you don't know if it's valid or forbidden.
-
Zash
So someone on Unicode 3.2 doesn't know whether 🤖 is valid or not, so it'll end up allowing it in JIDs received by others, while forbidding local things from using it.
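
The "allowed from remote, forbidden locally" behaviour can be sketched with Python's standard `stringprep` module, which exposes the RFC 3454 tables pinned to Unicode 3.2. This is a simplified illustration only: the set of prohibited tables follows one reading of the Resourceprep profile, and the mapping, normalization and bidi steps are skipped entirely. The names `classify` and `accept` are hypothetical.

```python
import stringprep

# Prohibited-output tables assumed for Resourceprep (C.1.2, C.2.x, C.3 to C.9).
PROHIBITED = (
    stringprep.in_table_c12,
    stringprep.in_table_c21_c22,
    stringprep.in_table_c3,
    stringprep.in_table_c4,
    stringprep.in_table_c5,
    stringprep.in_table_c6,
    stringprep.in_table_c7,
    stringprep.in_table_c8,
    stringprep.in_table_c9,
)


def classify(ch):
    """Three-way verdict for one character under Unicode 3.2."""
    if stringprep.in_table_a1(ch):  # unassigned in Unicode 3.2
        return "unassigned"
    if any(check(ch) for check in PROHIBITED):
        return "forbidden"
    return "allowed"


def accept(nick, *, local):
    """Strict for locally created names, lenient for names from remote servers."""
    for ch in nick:
        verdict = classify(ch)
        if verdict == "forbidden":
            return False
        if verdict == "unassigned" and local:
            return False
    return True


print(accept("robot\U0001F916", local=True))   # rejected when created locally
print(accept("robot\U0001F916", local=False))  # tolerated when received from remote
```

The point of the third verdict is that an old database cannot tell whether an unassigned code point will later be valid or prohibited, so the only safe choices are to refuse it for local names and pass it through for remote ones.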
-
Ge0rG
Zash: allowed from remote servers, forbidden on yours
-
Zash
Exactly
-
Ge0rG
And then your admin can install unicode 11
-
Ge0rG
And allow fancy new names
-
Zash
I wonder if Someone™ should write an Informational XEP on this whole mess.
-
Ge0rG
Zash: you should
-
lovetox
why do we need to write a XEP
-
lovetox
either all should use precis or all should use stringprep
-
Zash
That's not what reality looks like
-
lovetox
finished, the details that there is one codepoint in a million that from unicode version X to Y changed
-
lovetox
really, thats only a problem in some people's minds
-
Zash
I'm not that worried about Unicode redefining characters between versions.
-
Zash
But versions add new characters, which moves code points from Undefined to either Allowed or Forbidden (for each JID part)
-
lovetox
Zash i dont think thats how it works
-
Zash
Ok.
-
Zash
I revert to my earlier statement of not wanting to discuss this.
-
lovetox
but even if, whats the problem with that
-
lovetox
say an unassigned codepoint is moved to valid
-
lovetox
your server simply does not accept it because you are on a older unicode version
-
lovetox
there is no problem there
-
lovetox
its like jabber.org does not allow connection with another server because it is weird and runs not current software or is misconfigured
-
lovetox
the solution is not, to find a standard where this can never ever happen
-
lovetox
its that servers need to upgrade from time to time
-
lovetox
and it happens that we do this already
-
Zash
Oh look, it's been a year https://mailarchive.ietf.org/arch/msg/xmpp/a-WhzOTyOq168GujQHgzQ1-DURI/
-
lovetox
yeah i really dont know what the problem here is
-
lovetox
its like, a server comes along that supports only TLS 1.3, but the other server does not yet support TLS 1.3
-
lovetox
and the question answer is probably b)
-
lovetox
server should validate jids
-
lovetox
this means sending errors if the validation fails
-
Zash
That's what jabber.org does, which was the problem highlighted earlier.
-
Zash
^ happens
-
Zash
For those who don't have joins & parts shown: ---> jabberdotorguser joined the room <--- jabberdotorguser has left the room due to an error (Kicked: jid malformed)
-
Ge0rG
lovetox: enforcing validation on entities outside of the user's control is going to cause pain. This is what it's all about
-
Ge0rG
Which is why "allowed" and "forbidden" are too few decision choices
-
lovetox
about what pain are you talking?
-
lovetox
informing the user he cant join this channel because the jid is not valid
-
lovetox
is not pain in my book
-
lovetox
its a 5 second thing to change the nick to something else
-
Zash
They can't change *someone elses nickname*
-
lovetox
?! the user cant change his nickname?!
-
lovetox
and yes also the MUC can change his nick, its in 0045
-
lovetox
simply remove the offending chars
-
Zash
You still misunderstand.
-
Zash
This is not about the one that is joining a MUC
-
Zash
This is about someone else that is already a participant.
-
Zash
When the MUC sends the participant list, their server rejects that stanza and the MUC responds to that error by kicking YOU.
-
lovetox
yeah and ? as a client i validate JIDs, and of course simply drop all invalid
-
lovetox
you can fill a whole MUC with invalid participants, not a problem in my book
-
lovetox
but even that should not happen
-
Zash
But it does.
-
lovetox
So i cant connect to servers, if mine is outdated and uses old unicode data
-
Zash
Get a jabber.org account, join this MUC, get kicked the instant the presence of the participant with "👁🗨" in their nick is sent.
-
lovetox
its the same, right now, i cant even connect to most mucs because my cert is expired
-
Ge0rG
lovetox: but you can't change a remote server and which level of unicode that accepts.
-
Ge0rG
Also the unicode level supported by a server is neither indicated nor negotiated
-
Ge0rG
Instead your connection gets terminated later on due to somebody else sending presence
-
Ge0rG
And just moving on with the latest and greatest unicode will break your interop
-
Ge0rG
In all sorts of non obvious ways
-
Ge0rG
it's like showing an annoying popup every time you receive something from an "invalid" JID :D
-
eevvoor
> Get a jabber.org account, join this MUC, get kicked the instant the presence of the participant with "👁🗨" in their nick is sent. What a joy.
-
lovetox
Ge0rG, i still dont see the "pain", all that stuff is dependent on how often this happens
-
lovetox
and i would say it probably happens as often as you want to use a muc on a server like jabber.org
-
lovetox
you try it, ok server doesnt work, is outdated, whatever, then you simply dont use it anymore
-
Ge0rG
lovetox: some implementations don't switch from stringprep to precis because of this sort of issues that it would cause.
-
Ge0rG
yeah, let's just abandon large parts of our ecosystem
- larma notes how this would be solved by not using the resource part of the JID for the nickname and instead use something like 172 and random resource for joining (like Jitsi Meet does)
-
lovetox
but stringprep causes the same issues
-
Ge0rG
larma: how do you prevent everyone from using the same nickname with 0172?
-
lovetox
stringprep is obsoleted, no new client would implement it, there is no note that says: Hold up please implement stringprep
-
lovetox
if a client uses precis, and the server validates for stringprep
-
lovetox
you have the same issue already, now
-
larma
Ge0rG, either not allow it server side (filter stanzas that do try to mimic another user) or use 0421 to spot the different users
-
Ge0rG
lovetox: yes, and I bet most clients won't even tell the user what the problem is
-
eevvoor
yeay, just like my problem persists with trashserver <-> jabber.fr
-
lovetox
so you acknowledge that the problem is already here right now, and *not* changing to precis
-
lovetox
does not make anything better
-
Ge0rG
lovetox: the problem is there because some implementations changed to precis, yes.
-
Ge0rG
lovetox: what you ask for is called a "forklift upgrade" and is not going to work.
-
lovetox
it already works
-
lovetox
users use precis day in and out
-
lovetox
this is a drop in the bucket of s2s problems out there
-
lovetox
you make it seem like the whole xmpp ecosystem breaks down, because people cant join mucs anymore
-
Ge0rG
it's also about contacts with JIDs according to a different spec
-
larma
IMO clients should never try to join using a unicode resource, but servers still need to handle it. Yet every client that allows to do it should be named as the main issue
-
larma
(which means about every client nowadays is to blame)
-
lovetox
ok larma interesting take, no client should allow a user to use a valid JID as per RFC.
-
larma
the weird thing is that resource is meant to be something "technical" yet it's also used as a display name
-
lovetox
i guess you wont win that one
-
Ge0rG
larma: what about that clients should only warn the user when they try to set a nickname that is outside of the client's supported PRECIS, but the servers have ultimate authority?
-
Ge0rG
congratulations for finding out that MUC is a set of dirty hacks.
-
🅶🅴0🆁🅶
hi!
-
larma
so let's add some other dirty hack to solve that problem? Clients somehow encode unicode chars using ascii as a resource and add a 0172 nick. Clients that see that the 0172 nick matches the ascii encoding will display and use the 0172 nick instead
-
larma
As long as the ascii encoding is somewhat human readable, this would be sufficiently backwards compatible
-
larma
then we would need MUCs to not allow joining with non-ascii resources and issue is mostly solved
-
mdosch
Forbidding non ASCII is bad for Russians, Arabs, Vietnamese, Thai…
-
🅶🅴0🆁🅶
also for emoji
-
larma
mdosch, it's not forbidden, you can still read it in the 0172 nick field
-
larma
just like domains don't forbid non ascii, you just need to encode using punycode
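
A minimal sketch of that idea, assuming the resource simply carries the Punycode form of the nickname while the readable form travels in the 0172 nick. It uses Python's built-in `punycode` codec; the function names are hypothetical and nothing here is a specified XMPP mechanism.

```python
def to_ascii_resource(nick: str) -> str:
    """Encode a Unicode nickname as an ASCII-only string (Punycode, RFC 3492)."""
    return nick.encode("punycode").decode("ascii")


def from_ascii_resource(resource: str) -> str:
    """Recover the Unicode nickname from its Punycode form."""
    return resource.encode("ascii").decode("punycode")


nick = "b\u00fccher"  # "bücher"
resource = to_ascii_resource(nick)
print(resource)                       # ASCII-only form for the resource
print(from_ascii_resource(resource))  # round-trips to the original nick
```

A client could compare the decoded resource against the 0172 nick and only display the latter when the two match. As noted in the discussion, Punycode output is not very human readable, so a different fallback encoding may be preferable.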
-
🅶🅴0🆁🅶
what about encoding punycode nicknames in the resource?
-
larma
also fine with me, but I believe there could be better legacy fallbacks than punycode
-
🅶🅴0🆁🅶
like... PRECIS?
-
Zash
Went for a walk. TL;DR let's solve the problem of not everyone upgrading at the exact same time with "just upgrade at the same time"? :)
-
lovetox
its not a matter of upgrading at a certain point in my opinion
-
lovetox
right now prosody does not do jid validation at all or?
-
lovetox
that means it already is a upgraded precis like server, it does exactly what Ge0rG fears, it sends precis muc resources to other servers that dont understand it
-
Zash
It does, but it allows unassigned characters.
-
Zash
Mostly because this is the library default
-
lovetox
yeah, so nobody cared, the ecosystem did not break down
-
Zash
No PRECIS
-
🅶🅴0🆁🅶
Somebody needs to care about the small things as well
-
lovetox
you allowed resources that were not stringprep valid
-
lovetox
and send them to other servers and clients
-
Zash
This goes under "historical reasons" now
-
Zash
The plan is to change it to be strict about things created locally.
-
lovetox
and in all the years i never saw one issue, of clients or server operators
-
🅶🅴0🆁🅶
yeah, let's just pin XMPP to stringprep and carve that in stone
-
lovetox
that complained that users cant join your MUCs
-
🅶🅴0🆁🅶
Zash: strict according to what spec?
-
Zash
This /did/ happen, for years, if you compiled your Prosody differently
-
Zash
🅶🅴0🆁🅶: Ancient STRINGPREP
-
🅶🅴0🆁🅶
Zash: did I get this right, you want to make prosody strict according to stringprep?
-
Zash
Yes
-
🅶🅴0🆁🅶
Insanity!
-
Zash
No
-
🅶🅴0🆁🅶
But why?
-
Zash
I'm running a version with this enabled and I'm having no problems.
-
Zash
You just can't use 🤖 as nickname on my local MUC instance
-
🅶🅴0🆁🅶
do you even have a MUC domain?
-
Zash
It's strict about *local entities*
-
🅶🅴0🆁🅶
I want my robot face back!
-
Zash
Local users, local MUC JIDs, local MUC participant nicknames.
-
Zash
Anything coming from a remote server and isn't known to be invalid is accepted.
-
🗨
Zash: yes, but then you release it and everybody goes back into 2002
-
🗨
I know there are people who wish for xmpp to be like it was in 2002.
-
Zash
ITYM 2006*
-
lovetox
seems ejabberd does validate strictly for stringprep
-
lovetox
cant join any muc with robotface
-
🗨
Zash: https://tools.ietf.org/html/rfc3454 "December 2002"
-
lovetox
although it does with a weird error
-
Zash
🤷
-
lovetox
bad-request
-
lovetox
instead of jid-malformed
-
lovetox
maybe the XMPP needs its own Precis profile
-
lovetox
that is in some way a better upgrade path
-
lovetox
for mucs the discovery problem is easily solved
-
lovetox
just add a feature into disco info
-
Zash
This topic causes me endless pain, I'll be under my desk, crying, for the rest of the weekend.
-
lovetox
only in our heads, i dont think there are actual users having problems with that
- mdosch is watching 🏒 rolls some beers under Zаshs desk.
-
lovetox
i guess there is not even someone out there that gets the idea that weird emojis are allowed in jids
-
Ge0rG
I do
-
Ge0rG
and other people as well
-
mdosch is watching 🏒
Ask Rixon 👁🗨:
-
Ge0rG
as you can see on the occupant list. Unless your client filters out "invalid" JIDs from MUCs
-
mdosch is watching 🏒
I wonder why Rixon 👁🗨 never participated in this discussion although being highlighted frequently in the last days.
-
Ge0rG
mdosch: maybe their client fails to highlight on complex unicode? 😁
-
mdosch is watching 🏒
😂
-
Neustradamus
I will do a little test for MattJ
-
Neustradamus
^ connection and disconnection in less than 1s
-
sss
Hey anyone here
-
sss
heyy
-
sss
hey sony
-
sss
andrey.g andrey.g \