Also something about backward compatibility and changing unicode versions
lovetox
but is precis not better equipped to deal with changing unicode versions?
lovetox
i read precis defines what code points are valid, while stringprep defines what code points are illegal
lovetox
so whats legal changes with each unicode version in stringprep
jonas’
lovetox, no, stringprep is pinned to unicode 3.2, so that’s no big deal
lovetox
ah ok
lovetox
didnt know that
jonas’
lovetox, PRECIS on the other hand is not pinned to any unicode version, has no logic to deal with differing versions and thus PRECIS on Unicode X.Y may easily produce/validate strings which are not allowed with PRECIS Unicode X'.Y' for X' != X
jonas’
stringprep being pinned to 3.2 is why we cannot have robotface ;)
lovetox
sure? when precis defines whats legal instead of whats illegal
lovetox
it does not matter what unicode version
jonas’
it defines it in terms of unicode categories
lovetox
if we assume that new unicode versions only add codepoints
jonas’
yes
jonas’
if you run PRECIS on 3.2, it will reject strings PRECIS on 9.0 allows
jonas’
and updating unicode versions in an app is even harder than updating TLS, I’m afraid
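The version dependence jonas’ describes can be seen from Python, whose `unicodedata` module ships both the runtime's current Unicode tables and a frozen copy of 3.2 (`unicodedata.ucd_3_2_0`). This is only a minimal sketch of a category lookup, not PRECIS or stringprep themselves; the helper names `general_category` and `is_assigned` are made up for illustration.

```python
import unicodedata

ROBOT_FACE = "\U0001F916"  # assigned in Unicode 8.0, long after 3.2


def general_category(ch, database=unicodedata):
    """Return the General_Category of ch according to the given database."""
    return database.category(ch)


def is_assigned(ch, database=unicodedata):
    # "Cn" is the category reported for code points the database does not know.
    return general_category(ch, database) != "Cn"


# Frozen Unicode 3.2 tables, the version stringprep is pinned to.
old = unicodedata.ucd_3_2_0

print("runtime Unicode version:", unicodedata.unidata_version)
print("3.2 knows robot face:", is_assigned(ROBOT_FACE, old))
print("runtime knows robot face:", is_assigned(ROBOT_FACE))
```

Any rule written in terms of categories therefore gives different answers for the same string depending on which tables the implementation was built with.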
flow
could be as easy as "apt-get install unicode-data"
lovetox
as i understand the problem is not some unicode library, or a dependency
lovetox
its simply that in the standard sometimes things change
flow
the standard changes because with newer unicode versions codepoints that were previously unassigned become assigned
lovetox
yes, but thats not the only way it changes, and thats not the problem with precis
flow
I am not aware of other ways
lovetox
Changes to the properties of Unicode code points can occur as the
Unicode Standard is modified from time to time. For example, three
code points underwent changes in their GeneralCategory between
Unicode 5.2 (current at the time IDNA2008 was originally published)
and Unicode 6.0, as described in [RFC6452]
lovetox
new unicode versions are not just adding stuff on top, sometimes existing stuff changes
lovetox
or at least there is no guarantee that existing stuff does not change by the unicode consortium or whoever decides the stuff
flow
true, and that could mean that a string that was previously valid as JID part becomes invalid
flow
I am not sure how frequent that is
flow
What usually happens is that a string that was previously invalid as JID part becomes valid
lovetox
yes, also im not sure if this is really something that should hold us back
lovetox
we are not designing something for eternity here
flow
I'd expect that Unicode tries to prevent re-assigning codepoints whenever possible, for obvious reasons, and only does so if it is decided that the advantages of the re-assignment outweigh the disadvantages
lovetox
also i think adding new codepoints is not a problem for precis, as its already defined on classes
lovetox
and there will likely be no new classes
lovetox
what is a problem are for example mapping rules
mdosch👁🗨
> Future version of Prosody won't allow 👁🗨 or robot face in nicknames, thus solving that problem.
Seems it's not yet in trunk.
mdosch👁🗨has left
lovetox
which can’t be set in stone without knowing all future codepoints
right, but it means that apps should use the system unicode database, e.g. via pythons unicodedata library that IIRC uses /usr/share/unicode as source
mdosch
Ah, it worked.
lovetox
i think its very unlikely with precis that something that was valid becomes invalid
lovetox
hence i dont see why not use precis
lovetox
yes, servers and applications need a recent unicode version
flow
yeah, it also does not concern me much, and there is nothing you can do about it
lovetox
but they also need X other recent libraries
Zash
mdosch: It's the MUC that enforces it, and this channel isn't on trunk.
Ge0rG
The real problem isn't which version to check against but who's responsible for the check at which border. If the check is only performed by the server owning any given JID, no problems will arise.
Ge0rG
If you enforce precis on another server's JIDs, you'll end up kicking people because of mdosch👁🗨 in your MUC
Zash
Hence being strict on creation of users, chat rooms, MUC nicknames.
Ge0rG
We could solve the problem by requiring a baseline set that's forward compatible, like no " in JIDs, and leave everything extended unicode to the authoritative server
Ge0rG
But I want 🤖
Zash
AFAIK the problem is mostly about unassigned codepoints, where you don't know if it's valid or forbidden.
Zash
So someone on Unicode 3.2 doesn't know whether 🤖 is valid or not, so it'll end up allowing it in JIDs received by others, while forbidding local things from using it.
Ge0rG
Zash: allowed from remote servers, forbidden on yours
Zash
Exactly
Ge0rG
And then your admin can install unicode 11
Ge0rG
And allow fancy new names
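The "allowed from remote servers, forbidden on yours" behaviour amounts to a three-way verdict per code point plus a policy that depends on who owns the name. A rough sketch, assuming a deliberately tiny and hypothetical rule set (`ALLOWED_MAJOR`) rather than the real stringprep or PRECIS tables; `Verdict`, `classify` and `accept` are illustrative names only.

```python
import unicodedata
from enum import Enum


class Verdict(Enum):
    ALLOWED = "allowed"
    FORBIDDEN = "forbidden"
    UNKNOWN = "unknown"  # unassigned in the local Unicode database


# Hypothetical toy rule: letters, numbers and symbols are fine, the rest is not.
ALLOWED_MAJOR = {"L", "N", "S"}


def classify(ch, database=unicodedata):
    cat = database.category(ch)
    if cat == "Cn":  # the local tables have never heard of this code point
        return Verdict.UNKNOWN
    return Verdict.ALLOWED if cat[0] in ALLOWED_MAJOR else Verdict.FORBIDDEN


def accept(text, local, database=unicodedata):
    """Strict for names created locally, lenient for names owned by a remote server."""
    verdicts = {classify(ch, database) for ch in text}
    if Verdict.FORBIDDEN in verdicts:
        return False      # known-bad is rejected everywhere
    if Verdict.UNKNOWN in verdicts:
        return not local  # unknown: trust the remote owner, refuse locally
    return True


old = unicodedata.ucd_3_2_0  # a server still on Unicode 3.2
print(accept("bot\U0001F916", local=True, database=old))
print(accept("bot\U0001F916", local=False, database=old))
```

Swapping `old` for the default database is the "admin installs a newer Unicode" step: the same nickname then becomes acceptable locally as well.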
Zash
I wonder if Someone™ should write an Informational XEP on this whole mess.
Ge0rG
Zash: you should
lovetox
why do we need to write a XEP
lovetox
either all should use precis or all should use stringprep
Zash
That's not what reality looks like
lovetox
finished, the details that there is one codepoint in a million that from unicode version X to Y changed
lovetox
really, thats only a problem in some people minds
Zash
I'm not that worried about Unicode redefining characters between versions.
Zash
But versions add new characters, which moves code points from Undefined to either Allowed or Forbidden (for each JID part)
lovetox
Zash i dont think thats how it works
Zash
Ok.
Zash
I revert to my earlier statement of not wanting to discuss this.
lovetox
but even if, whats the problem with that
lovetox
say an unassigned codepoint is moved to valid
lovetox
your server simply does not accept it because you are on a older unicode version
lovetox
there is no problem there
lovetox
its like jabber.org does not allow connection with another server because it is weird and runs not current software or is misconfigured
lovetox
the solution is not, to find a standard where this can never ever happen
lovetox
its server need to upgrade from time to time
lovetox
and it happens that we do this already
Zash
Oh look, it's been a year
https://mailarchive.ietf.org/arch/msg/xmpp/a-WhzOTyOq168GujQHgzQ1-DURI/
lovetox
yeah i really dont know what the problem here is
lovetox
its like, a server comes along that supports only TLS 1.3, but the other server does not yet support TLS 1.3
lovetox
and the question answer is probably b)
lovetox
server should validate jids
lovetox
this means sending errors if the validation fails
Zash
That's what jabber.org does, which was the problem highlighted earlier.
jabberdotorguser has joined
jabberdotorguser has left
Zash
^ happens
Zash
For those who don't have joins & parts shown:
---> jabberdotorguser joined the room
<--- jabberdotorguser has left the room due to an error (Kicked: jid malformed)
Ge0rG
lovetox: enforcing validation on entities outside of the user's control is going to cause pain. This is what it's all about
Ge0rG
Which is why "allowed" and "forbidden" are too few decision choices
lovetox
about what pain are you talking?
lovetox
informing the user he cant join this channel because the jid is not valid
lovetox
is not pain in my book
lovetox
its a 5 second thing to change the nick to something else
Zash
They can't change *someone elses nickname*
lovetox
?! the user cant change his nickname?!
lovetox
and yes also the MUC can change his nick, its in 0045
lovetox
simply remove the offending chars
Zash
You still misunderstand.
Zash
This is not about the one that is joining a MUC
Zash
This is about someone else that is already a participant.
Zash
When the MUC sends the participant list, their server rejects that stanza and the MUC responds to that error by kicking YOU.
lovetox
yeah and? as a client i validate JIDs, and of course simply drop all invalid ones
lovetox
you can fill a whole MUC with invalid participants, not a problem in my book
lovetox
but even that should not happen
Zash
But it does.
lovetox
So i cant connect to servers, if mine is outdated and uses old unicode data
Zash
Get a jabber.org account, join this MUC, get kicked the instant the presence of the participant with "👁🗨" in their nick is sent.
lovetox
its the same, right now, i cant even connect to most mucs because my cert is expired
Ge0rG
lovetox: but you can't change a remote server and which level of unicode that accepts.
Ge0rG
Also the unicode level supported by a server is neither indicated nor negotiated
Ge0rG
Instead your connection gets terminated later on due to somebody else sending presence
Ge0rG
And just moving on with the latest and greatest unicode will break your interop
Ge0rG
In all sorts of non obvious ways
Ge0rG
it's like showing an annoying popup every time you receive something from an "invalid" JID :D
eevvoor
> Get a jabber.org account, join this MUC, get kicked the instant the presence of the participant with "👁🗨" in their nick is sent.
What a joy.
lovetox
Ge0rG, i still dont see the "pain", all that stuff is dependent on how often this happens
lovetox
and i would say it does probably happens as often as you want to use a muc on a server like jabber.org
lovetox
you try it, ok server doesnt work, is outdated, whatever, then you simply dont use it anymore
Ge0rG
lovetox: some implementations don't switch from stringprep to precis because of this sort of issues that it would cause.
Ge0rG
yeah, let's just abandon large parts of our ecosystem
larma notes how this would be solved by not using the resource part of the JID for the nickname and instead using something like 0172 and a random resource for joining (like Jitsi Meet does)
lovetox
but stringprep causes the same issues
Ge0rG
larma: how do you prevent everyone from using the same nickname with 0172?
lovetox
stringprep is obsoleted, no new client would implement it, there is no note that says: Hold up please implement stringprep
lovetox
if a client uses precis, and the server validates for stringprep
lovetox
you have the same issue already, now
larma
Ge0rG, either not allow it server side (filter stanzas that do try to mimic another user) or use 0421 to spot the different users
Ge0rG
lovetox: yes, and I bet most clients won't even tell the user what the problem is
eevvoor
yeay, just like my problem persists with trashserver <-> jabber.fr
lovetox
so you acknowledge that the problem is already here right now, and *not* changing to precis
lovetox
does not make anything better
Ge0rG
lovetox: the problem is there because some implementations changed to precis, yes.
Ge0rG
lovetox: what you ask for is called a "forklift upgrade" and is not going to work.
lovetox
it already works
lovetox
users use precis day in and out
lovetox
this is a drop in the bucket of s2s problems out there
lovetox
you make it seem like the whole xmpp ecosystem breaks down, because people cant join mucs anymore
Ge0rG
it's also about contacts with JIDs according to a different spec
larma
IMO clients should never try to join using a unicode resource, but servers still need to handle it. Yet every client that allows to do it should be named as the main issue
larma
(which means about every client nowadays is to blame)
lovetox
ok larma interesting take, no client should allow a user to use a valid JID as per RFC.
larma
the weird thing is that resource is meant to be something "technical" yet it's also used as a display name
lovetox
i guess you wont win that one
Ge0rG
larma: what about that clients should only warn the user when they try to set a nickname that is outside of the client's supported PRECIS, but the servers have ultimate authority?
Ge0rG
congratulations for finding out that MUC is a set of dirty hacks.
🅶🅴0🆁🅶
hi!
larma
so let's add some other dirty hack to solve that problem? Clients somehow encode unicode chars using ascii as a resource and add a 0172 nick. Clients that see that the 0172 nick matches the ascii encoding will display and use the 0172 nick instead
larma
As long as the ascii encoding is somewhat human readable, this would be sufficiently backwards compatible
larma
then we would need MUCs to not allow joining with non-ascii resources and issue is mostly solved
mdosch
Forbidding non ASCII is bad for Russians, Arabs, Vietnamese, Thai…
🅶🅴0🆁🅶
also for emoji
larma
mdosch, it's not forbidden, you can still read it in the 0172 nick field
larma
just like domains don't forbid non ascii, you just need to encode using punycode
🅶🅴0🆁🅶
what about encoding punycode nicknames in the resource?
larma
also fine with me, but I believe there could be better legacy fallbacks than punycode
🅶🅴0🆁🅶
like... PRECIS?
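For reference, larma's idea of an ASCII-encoded resource paired with an 0172 nick might look roughly like the following, using Python's built-in `punycode` codec as the encoding (larma suggests a more readable fallback may exist). The function names and the matching rule in `display_nick` are assumptions for illustration, not anything specified in a XEP.

```python
def nick_to_resource(nick):
    """Encode a nickname as an ASCII-only resource via the punycode codec."""
    return nick.encode("punycode").decode("ascii")


def resource_to_nick(resource):
    """Inverse of nick_to_resource for resources produced by it."""
    return resource.encode("ascii").decode("punycode")


def display_nick(resource, advertised_nick=None):
    """Show the advertised (0172-style) nick only if the resource really encodes it.

    Otherwise fall back to the plain ASCII resource, so a mismatching
    nick cannot be used to impersonate another occupant.
    """
    if advertised_nick is not None and nick_to_resource(advertised_nick) == resource:
        return advertised_nick
    return resource


nick = "mdosch\U0001F916"
resource = nick_to_resource(nick)
print(resource)
print(display_nick(resource, nick))
print(display_nick(resource, "someone-else"))
```

A legacy client that knows nothing about the scheme would simply show the ASCII resource, which is where the readability of the encoding matters.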
Zash
Went for a walk. TL;DR let's solve the problem of not everyone upgrading at the exact same time with "just upgrade at the same time"? :)
lovetox
its not a matter of upgrading at a certain point in my opinion
lovetox
right now prosody does not do jid validation at all or?
lovetox
that means it already is a upgraded precis like server, it does exactly what Ge0rG fears, it sends precis muc resources to other servers that dont understand it
Zash
It does, but it allows unassigned characters.
Zash
Mostly because this is the library default
lovetox
yeah, so nobody cared, the ecosystem did not break down
Zash
No PRECIS
🅶🅴0🆁🅶
Somebody needs to care about the small things as well
lovetox
you allowed resources that were not stringprep valid
lovetox
and send them to other servers and clients
Zash
This goes under "historical reasons" now
Zash
The plan is to change it to be strict about things created locally.
lovetox
and in all the years i never saw one issue, of clients or server operators
🅶🅴0🆁🅶
yeah, let's just pin XMPP to stringprep and carve that in stone
lovetox
that complained that users cant join your MUCs
🅶🅴0🆁🅶
Zash: strict according to what spec?
Zash
This /did/ happen, for years, if you compiled your Prosody differently
Zash
🅶🅴0🆁🅶: Ancient STRINGPREP
🅶🅴0🆁🅶
Zash: did I get this right, you want to make prosody strict according to stringprep?
Zash
Yes
🅶🅴0🆁🅶
Insanity!
Zash
No
🅶🅴0🆁🅶
But why?
Zash
I'm running a version with this enabled and I'm having no problems.
Zash
You just can't use 🤖 as nickname on my local MUC instance
🅶🅴0🆁🅶
do you even have a MUC domain?
Zash
It's strict about *local entities*
🅶🅴0🆁🅶
I want my robot face back!
Zash
Local users, local MUC JIDs, local MUC participant nicknames.
🅶🅴0🆁🅶has left
Zash
Anything coming from a remote server and isn't known to be invalid is accepted.
🗨
Zash: yes, but then you release it and everybody goes back into 2002
🗨
I know there are people who wish for xmpp to be like it was in 2002.
Zash
ITYM 2006*
lovetox
seems ejabberd does validate strictly for stringprep