XSF Discussion - 2019-07-20

pep. 12:09:28
To what <message/> should <origin-id/> be added? All?
pep. 12:10:59
Now that it was ruled out that LMC must refer to the original message, does it make sense to include an origin-id tag in it? It will most likely not be referenced. (Same applies to other payloads, I guess)
lovetox 14:23:51
pep. you need origin-id not for lmc
lovetox 14:24:08
you need it to deduplicate your own mam messages
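A minimal sketch of the deduplication lovetox describes, assuming the client records the origin-id of every message it sends and later compares archive results against that set. `ArchivedMessage` and `drop_own_duplicates` are hypothetical names for illustration, not part of any client or library mentioned here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ArchivedMessage:
    """Hypothetical, simplified view of one MAM result."""
    sender: str               # bare JID of the sender
    origin_id: Optional[str]  # value of <origin-id id='...'/>, if present
    body: str


def drop_own_duplicates(results, own_jid, sent_origin_ids):
    """Return only the archive results that are not already stored locally.

    A result counts as a duplicate when we sent it ourselves and its
    origin-id is one we recorded at send time.
    """
    fresh = []
    for msg in results:
        is_own = msg.sender == own_jid
        if is_own and msg.origin_id in sent_origin_ids:
            continue  # already in the local history
        fresh.append(msg)
    return fresh
```

Messages from other senders are never dropped here, even if they happen to carry the same origin-id value.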
lovetox 14:25:10
hm ah i remember now
pep. 14:25:26
hmm, because I don't get reflected messages in 1:1, I keep the origin-id and next time I fetch MAM I can see where I stopped?
lovetox 14:25:36
it was because in the MUC xep was a sentence that MUCs dont have to keep the ids that clients set
lovetox 14:26:03
so message-id was overwritten by muc service, so you had no chance to deduplicate with it
pep. 14:26:14
Well in MUC it wouldn't apply right? As there is stanza-ids? Well, when the MUC does MAM
pep. 14:26:27
Otherwise yeah I can use origin-id
lovetox 14:26:34
no you are joined in a MUC, you set a message id
lovetox 14:26:40
you send the message
lovetox 14:27:05
damn, you are right
lovetox 14:27:09
i forgot why we added this
lovetox 14:27:20
maybe i remember later, im sure there was a reason
pep. 14:27:39
Maybe the XEP could be clarified and add a clear motivation
lovetox 14:27:42
but yeah that reason is maybe gone, maybe ask Daniel i think he remembers
pep. 14:29:29
I have an implementation in slix, I'll wait for feedback a bit more I guess
Ge0rG 14:33:51
Too many ids, too much confusion.
Lance 15:08:19
origin-id was a way to stamp without leaking your jid
Lance 15:08:33
since stanza-id requires a by
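To illustrate Lance's point, here is a small sketch that reads both XEP-0359 elements from a message. The `urn:xmpp:sid:0` namespace and the `id`/`by` attributes come from XEP-0359; the sample stanza, its addresses, and the helper name `extract_ids` are made up for the example.

```python
import xml.etree.ElementTree as ET

SID_NS = "urn:xmpp:sid:0"  # namespace defined by XEP-0359

# Illustrative stanza: origin-id carries only an id, stanza-id also names
# the entity that assigned it in its 'by' attribute.
STANZA = """
<message xmlns='jabber:client' id='m1' type='groupchat'>
  <body>hello</body>
  <origin-id xmlns='urn:xmpp:sid:0' id='de305d54-75b4-431b-adb2-eb6b9e546013'/>
  <stanza-id xmlns='urn:xmpp:sid:0' id='5f3dbc5e' by='room@muc.example.org'/>
</message>
"""


def extract_ids(xml_text):
    """Return (origin_id, {by: stanza_id}) for one message stanza."""
    msg = ET.fromstring(xml_text.strip())
    origin = msg.find(f"{{{SID_NS}}}origin-id")
    origin_id = origin.get("id") if origin is not None else None
    stanza_ids = {
        el.get("by"): el.get("id")
        for el in msg.findall(f"{{{SID_NS}}}stanza-id")
    }
    return origin_id, stanza_ids
```

The origin-id exposes no address at all, while every stanza-id is keyed by the JID that stamped it.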
adityaborikar 15:57:02
Is this site down, or is there anywhere else I could read this white paper https://www.isode.com/whitepapers/xmpp-constrained-bandwidth.html
adityaborikar 15:57:39
The link is on https://xmpp.org/about/myths.html
mathieui 15:58:57
adityaborikar, https://web.archive.org/web/20160310073118/https://www.isode.com/whitepapers/xmpp-constrained-bandwidth.html
adityaborikar 15:59:46
Thanks 😀
ralphm 17:22:27
I think they replaced it with https://www.isode.com/whitepapers/low-bandwidth-xmpp.html
Zash 17:22:42
Cool URLs don't change!
ralphm 17:22:52
But it would sure be nice if they kept URLs working.
ralphm 17:23:08
FWIW it seems to be a different article
ralphm 17:23:16
As in newer.
ralphm 17:25:14
I'm sure Tobias knows.
lovetox 17:57:33
pep. i think i remember again
lovetox 17:57:53
first everything previous to mam:2 did not inject stanza-id
pep. 17:58:10
yeah I assumed there was something something mam and stanza-id
lovetox 17:58:11
second many clients used message-ids which were not uuids
lovetox 17:58:16
like 1, 2, 3, 4
lovetox 17:58:56
so yeah if we live now in a world where most clients use uuids as message-ids, and almost every server uses mam:2
lovetox 17:59:05
i think you dont need origin-id
larma 18:26:34
message id and origin-id have different uniqueness guarantees.
- XEP-0359 origin-id MUST be globally unique (at least that's the common understanding of XEP-0359), so if everyone plays by the rules it is globally unique; otherwise it's unique per user, provided at least that user plays by the rules.
- RFC 6120 stanza id attribute on messages can be unique within that stream or globally unique (per RFC 6120 § 8.1.3). There is no MUST in there, so one may argue that no uniqueness guarantee is provided, but explicitly no global uniqueness.
Interestingly in XEP-0045 v1.31+, there is a rule that MUCs SHOULD reflect the original message stanza id on reflected messages, which implies that there is not even a by-stream uniqueness guarantee on the message id as other MUC occupants can reuse prior message ids and thus cause a collision of message stanza id on a stream (and still be fully standards compliant).
larma 18:31:20
Technically there are no uniqueness guarantees on <origin-id> whatsoever, only <stanza-id> are unique per generating entity (as specified in their by attribute)
lovetox 18:44:57
yes larma, but all you need is unique within an archive
lovetox 18:45:13
and this is guaranteed by mam
larma 18:48:03
for mam that's guaranteed through <stanza-id>, not message id or <origin-id>
larma 18:48:44
but there are many other use cases that use a message id (or origin/stanza id)
lovetox 18:54:36
yeah but they are not as important as message deduplication
lovetox 18:55:03
so i always thought if a client does not use unique ids, then this feature will just not work well
larma 18:55:10
LMC changing the wrong message?
lovetox 18:56:08
yes, of course you can implement LMC badly, then this results in very bad stuff
lovetox 18:56:18
but normally you have a timeframe where correction is allowed
lovetox 18:56:37
and even if you decide not, then you should always give the user the chance to see all corrections and original messages
lovetox 18:57:01
so worst is, the user sees weird stuff
lovetox 18:57:54
but most clients use uuids
lovetox 18:58:19
and if you implement some timeframe, like message is only allowed to be corrected within 5 minutes
lovetox 18:58:27
the chance of weird stuff happening is almost zero
lovetox 18:58:39
even with clients who dont use uuids
larma 19:00:43
I feel I don't agree with most of what you wrote...
larma 19:01:33
"user sees weird stuff" is literally the worst thing that can happen, after all MAM is only one utility to help not doing this
larma 19:02:30
"message is only allowed to be corrected within 5 minutes" I'd prefer that LMC was not only for the last message and also allow very old messages (up to days are acceptable for me)
larma 19:03:20
And even if you allow the user to display corrections, usually users won't do that anyway...
lovetox 19:03:23
hey i like it too, but this would depend on a uniqueness guarantee that can never exist
lovetox 19:03:29
you can always only trust the other client
lovetox 19:03:34
and if you trust message-id
lovetox 19:03:38
or some other made up element
lovetox 19:03:52
i mean if you feel better, trust origin-id :)
lovetox 19:04:24
i think this is the downside of a federated protocol
lovetox 19:04:37
make up for it by reducing the damage wrong IDs could make
lovetox 19:05:03
1. timeframe, 2. give user the chance to deactivate lmc for bad contacts
lovetox 19:05:27
3. give full transparency, dont hide or replace messages
lovetox 19:05:53
thats how i deal with this
lovetox 19:06:03
if you got a better idea, im very interested
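The three mitigations lovetox lists (timeframe, per-contact opt-out, full transparency) could be sketched roughly as follows. This is not Gajim's code; `StoredMessage`, `apply_correction`, and the 5-minute window (taken from the example in the chat) are assumptions, and `history` is assumed to be ordered oldest first.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

CORRECTION_WINDOW = timedelta(minutes=5)  # the 5-minute example from the chat


@dataclass
class StoredMessage:
    sender: str
    message_id: str
    body: str
    received: datetime
    corrections: list = field(default_factory=list)  # kept for transparency


def apply_correction(history, sender, replace_id, new_body, now,
                     lmc_disabled_for=frozenset()):
    """Attach a correction to the newest matching message, or return None."""
    if sender in lmc_disabled_for:
        return None  # 2. the user turned LMC off for this contact
    for msg in reversed(history):  # newest first
        # Match on the (sender, id) pair, never on the id alone.
        if msg.sender == sender and msg.message_id == replace_id:
            if now - msg.received > CORRECTION_WINDOW:
                return None  # 1. outside the allowed timeframe
            msg.corrections.append(new_body)  # 3. original body stays visible
            return msg
    return None
```

With a reused id, the correction lands on the most recent message from that sender, and the original text is still available for display.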
larma 19:09:19
origin-id should be unique at least by origin, easiest way to guarantee is to use a proper UUIDv4. If an attacker uses the same origin-id for multiple messages, the later ones can just be ignored completely, nothing to worry about as it is not possible to happen if it's not an attack and attackers are allowed to suffer from their wrongdoings.
larma 19:10:23
But, you can't do the same for message ids as they are not unique by origin, so duplicate message ids are allowed to happen
lovetox 19:11:41
im not following, so you plan that i can correct ANY message i ever sent you whenever i want
lovetox 19:11:50
what is there to attack?!
lovetox 19:11:57
i can already replace any message
lovetox 19:14:10
and just in case we are not talking about the same thing, IDs should always be matched in the database with a (jid, id) tuple
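A minimal sketch of the (jid, id) lookup, assuming a SQLite store. The table layout and the function names are hypothetical and only meant to show that a message id is never looked up without its sender.

```python
import sqlite3


def open_store():
    """Create an in-memory message store indexed by (jid, message_id)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE messages ("
        " seq INTEGER PRIMARY KEY AUTOINCREMENT,"
        " jid TEXT NOT NULL,"
        " message_id TEXT NOT NULL,"
        " body TEXT NOT NULL)"
    )
    db.execute("CREATE INDEX by_sender_and_id ON messages (jid, message_id)")
    return db


def store(db, jid, message_id, body):
    db.execute(
        "INSERT INTO messages (jid, message_id, body) VALUES (?, ?, ?)",
        (jid, message_id, body),
    )


def find_latest(db, jid, message_id):
    """Return the newest body stored for this (jid, id) pair, or None."""
    row = db.execute(
        "SELECT body FROM messages WHERE jid = ? AND message_id = ?"
        " ORDER BY seq DESC LIMIT 1",
        (jid, message_id),
    ).fetchone()
    return row[0] if row else None
```

Two contacts using the same id "1" do not collide, because the sender's JID is always part of the key.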
lovetox 19:14:50
there is zero abuse potential other than your contact abuses his own messages he sent you, in that case just stop talking with this dude
larma 19:17:46
Maybe the name "attacker" was misleading here, I meant an entity that does not correctly implement origin-id and generates the same origin-id twice. Those would suffer from degraded usability (easiest is to ignore the messages with duplicated origin-id). However a user sending the same message id twice should not suffer from degraded usability (i.e. the message should be displayed as usual) as this is perfectly standards compliant behavior
lovetox 19:21:52
are we still talking about LMC?
lovetox 19:22:14
if someone sends 2 messages with the same id, and afterwards a correction for that id
lovetox 19:22:24
just take the more recent one
lovetox 19:22:52
i dont need origin-id for that and i dont need to ignore a message
larma 19:27:32
There is no reason why message correction should always reference the latest instance of a message id, the user might have intended to do something different. Message correction might be a bad example as those always originate from the sender and usually the same client (though this is not necessarily the case). However if you pick other usages of ids, you'll even get more problems. Like message attaching might attach to the wrong message or something
larma 19:29:44
If you use the message id for message attaching (as suggested as a fallback when there is no origin-id in the XEP) you have the risk of this happening, as a client might not be able to know where to attach that message. So best would be to not allow attaching to messages that don't have an origin-id. But now apparently everyone for some reason tries to not implement origin-id
lovetox 19:30:55
the reason is easy, there are already 2 other ids we manage
lovetox 19:31:11
so if we need a third one, it should be a damn good reason
larma 19:31:34
There is your reason 😉
larma 19:32:18
And: if you don't want to support it on the receiving end, at least support it as a sender for those clients that want to use it on the receiving end
larma 19:33:33
If your message id is already generated using UUIDv4, you'd just have to send the origin-id as well and be done with all the work on the sending side (I assume you already store your message id, so you will be able to handle any usages of the origin-id automatically if it is the same)
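What larma suggests for the sending side might look like this: generate one UUIDv4 and use it for both the stanza `id` and the `<origin-id/>`. The helper `build_message` is hypothetical, and the element is built with the standard library rather than any particular XMPP library.

```python
import uuid
import xml.etree.ElementTree as ET

SID_NS = "urn:xmpp:sid:0"  # namespace defined by XEP-0359


def build_message(to, body):
    """Build a chat message whose id and origin-id are the same UUIDv4."""
    unique_id = str(uuid.uuid4())
    msg = ET.Element("message", {"to": to, "type": "chat", "id": unique_id})
    ET.SubElement(msg, "body").text = body
    # Same value as the stanza id, so no extra id has to be stored locally.
    ET.SubElement(msg, f"{{{SID_NS}}}origin-id", {"id": unique_id})
    return msg
```

Because both values are identical, a sender that already stores its message id needs no additional bookkeeping for the origin-id.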
lovetox 19:39:30
the scenarios you think up seem unrealistic
lovetox 19:39:57
you talk to a contact, that has a client who sends the same message id over and over
lovetox 19:40:13
but for some reason its still so advanced that it supports message attaching
lovetox 19:41:05
so the client should now add origin-id instead of just making his message-id a uuid
lovetox 19:41:29
why cant the XEP just say, clients who want to support message attaching should generate unique message-id
larma 19:41:58
The client sending the message and the client being able to display the message attaching must not necessarily be the same
pep. 19:42:57
lovetox, yeah I guess that would have the same effect? (mandating that clients generate unique @id)
larma 19:43:10
How do you mandate legacy clients?
larma 19:43:30
They are already out there, we can't change it afterwards.
pep. 19:43:32
If they want to implement XEP-foobar, this XEP can say "if you implement me you need to do X"
pep. 19:43:58
In practice it's a similar solution to origin-id
pep. 19:44:08
I guess XEP-foobar could say "you need to implement origin-id"
lovetox 19:45:12
larma, i dont have a problem with legacy clients having degraded experience
pep. 19:45:40
Same here
lovetox 19:46:01
but yeah message attaching is a bit unique, i have to look into it more
lovetox 19:46:20
LMC is really not a problem at all, with clients that use the same ids over and over
lovetox 19:47:10
i wonder how i would display message attaching on the sender side
lovetox 19:47:22
would i also only link in my own database to the id
lovetox 19:47:39
or would i copy the message im attaching and adding it in some other database column
lovetox 19:48:26
i think i would just copy the message into some "attached_data" column
lovetox 19:48:39
that way i dont even have to care about if the other client uses unique ids
lovetox 19:49:05
if he uses unique ids he sees the correct stuff, if not, he doesnt implement message attaching anyway and will not see it
larma 19:49:57
OK, completely realistic scenario:
- On my desktop I am using some legacy client that does reuse message ids and does not support message attaching
- On my phone I use a modern client that does support message attaching
I send a message from my desktop, the recipient attaches a message to my message, I send another message from my desktop that happens to have the same message id. I open my phone, the message attachment is attached to the wrong message. Even if the XEP of message attaching would require a unique message id, that would not apply to the legacy client, because it doesn't implement the XEP and the recipient does not know if the message id is unique, so can only guess. If the XEP would require origin-id, it would be missing from the legacy client and thus the recipient would be unable to attach to their message
lovetox 19:51:49
larma i agree origin-id would solve this case
larma 19:52:01
We could also replace origin-id by an empty <origin-id /> to signal that the message id was unique, but we need this information to be transported in the message.
lovetox 19:52:09
just upgrading your desktop client would also
lovetox 19:52:49
hm actually i like that idea, to add an empty element
larma 19:53:01
The desktop client is allowed to not send a unique id as long as they don't implement a XEP that does require them to. And I am pretty certain there are still a lot of clients that don't do unique ids
larma 19:53:38
Maybe we want to add this as an option to XEP-0359
larma 19:54:21
Make 'id' optional on <origin-id> if the id of the surrounding stanza already provides the uniqueness guarantees for <origin-id>
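On the receiving side, the idea floated here could be sketched as below. Note that an `<origin-id/>` without an `id` attribute is only a proposal made in this discussion, not something XEP-0359 defines; `effective_origin_id` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

SID_NS = "urn:xmpp:sid:0"  # namespace defined by XEP-0359


def effective_origin_id(msg):
    """Return the id the sender vouches for as unique, or None.

    Assumption (proposal from the chat, not in XEP-0359): an <origin-id/>
    without an 'id' attribute means the stanza's own id is unique.
    """
    origin = msg.find(f"{{{SID_NS}}}origin-id")
    if origin is None:
        return None  # sender makes no uniqueness claim at all
    return origin.get("id") or msg.get("id")
```

The fallback to the stanza id is only meaningful if that id reaches the recipient unmodified.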
Lance 19:54:23
empty element would reintroduce the problem the origin-id is there to mitigate: MUCs rewriting the message ids. you'd lose the original id again in that case
lovetox 19:55:22
Lance, but this was fixed in the MUC XEP
larma 19:55:22
Lance, true for MUCs, but for non-MUCs the 'id' could still be optional
larma 19:55:32
lovetox, it's an optional feature of MUCs
larma 19:55:42
to keep the ID when reflecting
Lance 19:55:51
let me know when the jdev room follows the latest MUC XEP
Lance 19:55:59
unfortunately :/
lovetox 19:56:42
Im not sure what your argument is, we have to stay compatible with old outdated software only because it is compliant with some XEP
larma 19:56:44
Introducing new requirements in a MUC is not really possible, there will always be legacy servers, only way would be to not connect to those anymore
lovetox 19:56:54
i couldnt care less about jabber.org
lovetox 19:57:15
if users tell me Gajim doesnt work with jabber.org, i tell them, go use another server
lovetox 19:57:19
not rewrite Gajim
larma 19:58:52
lovetox,
> The service SHOULD reflect the message with the same 'id' that was generated by the client, to allow clients to track their outbound messages. If the client did not provide an 'id', the server MAY generate an 'id' and use it for all reflections of the same message (e.g. using a UUID as defined in RFC 4122).
lovetox 20:00:42
im fine with indicating with origin-id that a client uses a unique id
lovetox 20:00:50
i think thats a fine use for origin-id
lovetox 20:02:21
if you want to change your behavior on that, fine, i think i would not have the motivation
lovetox 20:02:51
and fyi Gajim sets origin-id since forever :)
larma 20:04:10
I wasn't talking about Gajim, I was talking about you suggesting pep. not to do it
lovetox 20:04:54
yes, message attaching is too new, i didnt look into it
lovetox 20:05:11
i stand by my opinion that for all other XEPs, like LMC you dont really need this
lovetox 20:05:54
and i think, all clients who support mam, support unique ids
lovetox 20:06:10
and you would never use a client without mam in a multi client setup
larma 20:06:27
You don't need it for LMC, but need it as soon as it becomes MC without L
lovetox 20:06:50
so yeah i agree there are edge cases like the one you told, but im not sure we should put too much effort into supporting these
lovetox 20:09:33
but adding <origin-id/> to indicate unique ids, is not that much effort :)
larma 20:10:27
I doubt pep. intention was to actually use a different ID for message @id and <origin-id>
lovetox 20:10:34
what i really would like to prevent is adding another db column "origin-id"
larma 20:11:05
I understand your point from a dev perspective 😉
lovetox 20:17:56
omg, jabber.ru does not send the muc subject on join
lovetox 20:18:00
if no subject is set
lovetox 20:18:28
do i interpret this correctly
lovetox 20:18:29
https://xmpp.org/extensions/xep-0045.html#order
lovetox 20:18:43
that a room subject MUST be sent, even if there is none?
Zash 20:18:58
correct
lovetox 20:19:43
i guess users will not join mucs without subject anymore with the next Gajim version :D
Zash 20:21:14
Prosody didn't send empty subjects in some earlier version but it's been fixed for some time. Not sure about other implementations.
eevvoor 20:47:32
Why does jabber.ru not send it? Is it due to the server they use? I thought it is a client issue, so how does it affect jabber.ru lovetox?
lovetox 21:04:27
i think jabber.ru uses some very old ejabberd version
lovetox 21:04:56
so they dont profit from bugfixes
lovetox 21:05:34
eevvoor, the MUC xep mandates that a subject has to be sent, and gajim waits for it
lovetox 21:05:44
and if it never comes, the joining process is never completed
ralphm 21:06:25
That's silly. I've seen Gajim have more issues like that.
eevvoor 21:06:44
ah thus not the server sw is the prob but the missing updates :D. Reminds me of the so beloved ccc server. Badly administrated ...
lovetox 21:06:58
ralphm, whats silly about it?
lovetox 21:07:11
should we expect servers not following xeps now or what
eevvoor 21:07:39
yes lovetox you should support backcompatibility forever!!!elven111! XD
ralphm 21:08:34
lovetox: I agree servers must send it, but waiting on it, even if other stuff for the room comes in, seems silly
lovetox 21:10:16
so i guess self presence with the 110 status code is also silly?
lovetox 21:10:26
i mean if there is other stuff coming in, no need for that
lovetox 21:10:53
subject is the line between history messages, and live messages
lovetox 21:11:10
its the order of events, it indicates that the history messages are complete and the join is fully completed
lovetox 21:11:30
that means i can tell the user, now you can send messages
ralphm 21:11:33
I understand what it is for
lovetox 21:11:40
and not in between a history message fetch
eevvoor 21:11:59
so it carries kind of semantics, lovetox, in your opinion.
ralphm 21:12:13
What I don't understand is that it stays in limbo forever
eevvoor 21:12:32
But of course it is nice if the client is robust and works well if the xep is not met, ralphm ?
eevvoor 21:12:48
So to avoid limbo?
lovetox 21:13:07
ralphm, of course i could add a timeout
lovetox 21:13:22
but all clients have to add this workaround
eevvoor 21:13:22
also not nice, in case it is just delayed ...
lovetox 21:13:27
or one server can just send a subject
lovetox 21:13:28
...
eevvoor 21:13:57
many combinations to be considered ...
ralphm 21:14:01
lovetox: this is what I mean. A timeout would take care of not keeping the user in limbo in the face of broken servers
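A rough sketch of the timeout ralphm has in mind, assuming the client signals arrival of the room subject through an `asyncio.Event`. The function name and the way the result is used are hypothetical; this is not how Gajim is implemented.

```python
import asyncio


async def wait_for_join(subject_received: asyncio.Event, timeout: float) -> bool:
    """Wait for the room subject, but give up after `timeout` seconds.

    Returns True if the subject arrived (the join completed in order),
    False if the server never sent one and the caller should fall back,
    for example by treating the join as complete anyway.
    """
    try:
        await asyncio.wait_for(subject_received.wait(), timeout)
        return True
    except asyncio.TimeoutError:
        return False
```

A compliant server sets the event as soon as its subject message is parsed, so the timeout only costs time against servers that never send one.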
eevvoor 21:14:33
Hm but timeouts can result in strange bugs.
ralphm 21:15:02
Another example: when Slack still had an XMPP gateway, it didn't respond to the iq for private storage. This would also hang Gajim indefinitely.
ralphm 21:15:28
I manually patched Gajim to work around this.
eevvoor 21:16:09
dining philosophers ...
lovetox 21:16:36
working around servers that dont answer IQs, ok
ralphm 21:16:58
Why? That's a MUST too
lovetox 21:17:43
but you still accept it
lovetox 21:17:49
and worked around it
lovetox 21:18:15
i mean answering IQs is one of the most basic rules in all of XMPP
lovetox 21:18:28
if we cant depend on that anymore ..
ralphm 21:19:49
I am more of the 'expect failure' variety.
lovetox 21:20:28
its a fine line
ralphm 21:20:54
And this example was extra tricky, because it is part of the Gajim connect sequence, not a random iq that is sent amongst other things.
ralphm 21:21:22
Of course I also reported it as a serious bug with Slack.
ralphm 21:23:06
And not getting a response for private storage shouldn't block the UI to look like the connection is not ready.
ralphm 21:23:40
So a timeout would have been a better approach, or async handling.
lovetox 21:23:49
yes i agree, its not anymore btw, requesting bookmarks is now independent of connection process
lovetox 21:24:28
only server disco info, roster, and roster delimiter iqs are now in the connection process
ralphm 21:28:57
Good