Would something go horribly, horribly wrong, if a server simply adds a stable/unique stanza ID to any message that it processes?
MattJ
It makes client life hard/impossible, sadly
MattJ
Clients would no longer know when a message is archived or not
MattJ
So they don't know if the id can/should be stored for later querying (e.g. for catch-up)
Guus
I do not like the fact that we're deducing that something is archived by merely detecting the presence of something that's supposed to be an opaque identifier.
Guus
(also, I don't have a better suggestion)
MattJ
Guus: that's why <archived> existed
Guus
What's the impossibility for clients, exactly?
Holger
Hmm I don't quite see this problem. There's no guarantee archived messages will remain in the archive forever anyway.
Holger
And I don't quite see how the info whether a locally stored message is also in the archive helps the client.
Holger
In my book it's fine to add a stanza ID to all messages. It may actually help with non-MAM use cases.
Guus
It'd make my implementation a lot easier...
Andrew Nenakhov
> Would something go horribly, horribly wrong, if a server simply adds a stable/unique stanza ID to any message that it processes?
In short that's the basis of our XEP that we use to ensure message delivery. Works well.
Guus
Andrew Nenakhov the basis is that something goes wrong if we do (and you found an alternative), or: you do it, and you've seen that nothing goes wrong?
Andrew Nenakhov
Client sends stanza with provisional id, server stamps it with 0359 unique and stable id, sends this id to client as a confirmation.
Andrew Nenakhov
Guus, define wrong )
Guus
Mushroom-clouds on the horizon.
Andrew Nenakhov
We centralize everything to work via server archive. If archive breaks, kaput, yes
Andrew Nenakhov
> Mushroom-clouds on the horizon.
In case of an all scale nuclear weapon attack, im protocols will hardly matter anymore
Guus
Andrew Nenakhov I do like to make sure though that IM protocol design decisions do not cause an all scale nuclear weapon attack.
MattJ
Holger, you're right (there is no guarantee archived messages will remain in the archive forever)
MattJ
But if it's not in the archive, the client can assume it was purged and it needs to re-fetch the archive
Guus
MattJ: how can it fetch a purged archive?
MattJ
Guus, I mean in the sense that old messages are purged
Guus
Mind you that it is Sunday, I'm an idiot, and did not have enough coffee
Guus
I don't understand.
MattJ
Messages in the archive are not kept forever on most deployments
MattJ
The oldest messages are removed after some expiry time (let's say 30 days)
Guus
with you so far.
Guus
but why does the client need to re-fetch the archive if that happens?
Guus
need/would want to
MattJ
So a client that wants to receive all messages works by continuously remembering the id of the last archived message it received
MattJ
When it goes offline for a couple of days, it will come back online and request all messages since the last id it saw
Guus
if anything, it'd do a massive amount of data transfer only to end up with _less_ local history?
MattJ
With me still?
Guus
yes
MattJ
So now it goes offline for two months
MattJ
The last id it saw is no longer in the server's archive
MattJ
So it performs the query and gets item-not-found
MattJ
so it knows that the message has been expired, and any messages in the archive are messages it has never seen before
MattJ
because they are all newer
Guus
right. And if we'd slap a stanza-id on every message without archiving, it'd _always_ get an item-not-found, assume its local cache is older than what's on the server, and it'd download all history, every time.
Guus
that's what you're saying, right?
MattJ
Yes
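The catch-up logic MattJ describes, and the failure Guus points out, can be sketched roughly as follows. This is a toy in-memory model, not a real MAM implementation: `Archive`, `ItemNotFound` and `catch_up` are hypothetical names invented for illustration, and a real client would issue an archive query over XMPP instead of calling methods on a local object.

```python
class ItemNotFound(Exception):
    """Raised when the requested id is not in the archive."""


class Archive:
    """Toy stand-in for a server-side archive (hypothetical, for illustration)."""

    def __init__(self):
        self.items = []  # list of (archive_id, body), oldest first

    def store(self, archive_id, body):
        self.items.append((archive_id, body))

    def expire_oldest(self, count):
        # Simulates the server purging its oldest messages.
        del self.items[:count]

    def query_after(self, archive_id):
        ids = [i for i, _ in self.items]
        if archive_id not in ids:
            raise ItemNotFound(archive_id)
        return self.items[ids.index(archive_id) + 1:]

    def query_all(self):
        return list(self.items)


def catch_up(archive, last_seen_id):
    """Return the messages the client believes it has not seen yet."""
    if last_seen_id is None:
        return archive.query_all()
    try:
        return archive.query_after(last_seen_id)
    except ItemNotFound:
        # The remembered id is gone. The client reads this as "expired",
        # so everything still in the archive is treated as unseen.
        return archive.query_all()


archive = Archive()
for n in range(1, 6):
    archive.store(f"a{n}", f"message {n}")

# Offline for a couple of days: only the newer messages come back.
short_absence = catch_up(archive, "a3")

# The client remembered an id that was stamped on a message that was
# never archived: it re-downloads the whole archive.
never_archived = catch_up(archive, "x-not-archived")

# Offline for two months: a1..a4 have expired in the meantime.
archive.expire_oldest(4)
long_absence = catch_up(archive, "a3")
```

In this model the client cannot tell the "expired" case from the "never archived" case, which is the ambiguity being discussed.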
Guus
I'm guessing that it'd not do this if the server doesn't advertise the MAM feature though.
MattJ
Sure
waqas
I feel like the whole "item-not-found means get full archive" thing is a hack. A server could lose a message for other reasons, e.g., storage failure causing recent stuff to be lost, or deletion of specific message due to gdpr, or some bug, etc.
MattJ
waqas, it's not allowed to
pep.
> and it'd download all history, every time.
It would download up to the date it just requested
Guus
waqas I was trying to formulate a similar remark in my head.
MattJ
waqas, it can replace with placeholders if it needs to
pep.
Which may or may not be the whole history
waqas
MattJ: Storage failure isn't something you can forbid anyone from having.
MattJ
waqas, handling storage failure in defined ways is entirely sensible
Guus
sure, but solely depending on 'item-not-found' based on a last-known ID still seems ... hackish...
MattJ
Guus, it's defined by the XEP to be this way, it's absolutely not a hack
MattJ
I mean, what else would you guys propose??
waqas
MattJ: Not really. If you lose a disk and restore from recent'ish backup, you'll have a situation where supposedly every recent message would be item-not-found..
Guus
MattJ it's a lot easier to disagree with stuff without having to suggest better alternatives 😉
MattJ
waqas, you can't just rewind time like that in most systems without consequences
waqas
Yes, and given that laws of physics disallow "messages can't be removed from archive after acked", a protocol shouldn't rely on that.
Guus
what if the client asks for the last-known ID archived by the server?
waqas
MattJ: To be clear, I think a sane recommendation would be if item-not-found, get archive by some timestamp based setup, but trying to get archive from beginning of time is silly in such a case.
pep.
(What I said above?)
Guus
(removed bad idea)
waqas
Yep, listen to pep.
MattJ
Yes, but the server was relocated to a different timezone and the admin forgot to set it to UTC
pep.
Dates don't include TZs? :s
waqas
Almost all popular dbs people use (mysql, postgres) in their default replica settings, when the master node is lost and another takes over (or a restoration from backup happens) will potentially lose recent writes. If the MAM XEP wants to assume that wouldn't happen, I'd consider it pretty silly.
MattJ
waqas, if you want to write your own XEP go ahead
waqas
MattJ: Do you see the problem I'm pointing out?
Guus
Maybe 'silly' isn't the best classifier here.
Guus
> Yes, but the server was relocated to a different timezone and the admin forgot to set it to UTC
do we need the XEP to account for this?
MattJ
Guus, do we need the XEP to account for any of this?
Guus
Well, if we can modify it somehow to be more resilient against data corruption, and allow for easier re-use of stanza-id, I think it'd add considerable value.
MattJ
waqas, I don't think a server that can't provide a durable store should be able to claim it does
MattJ
There's a simple fix for this, the XEP already has a flag to tell the client that the results are not necessarily persisted
waqas
MattJ: I'm asserting that the vast majority of MAM deployments can't guarantee durability in a disk-lost scenario. Recent writes being lost is a fact of life, you can't spec your way around it without mandating things you have no way to mandate.
MattJ
I look forward to your PR
waqas
Note that I don't think the MAM XEP has to change, just the assumption that item-not-found always means MAM storage was deleted up to that item is wrong.
MattJ
So yet another hidden thing for client devs to think about
waqas
In a world where fsync doesn't necessarily mean data was durably stored, and SQL dbs multi-master replication defaults to async mode (and is rarely used anyway), that's reality.
Guus
MattJ where in the XEP is the what I called 'hack' described?
Guus
I was looking to see if the exact wording would give me hints for improvement
MattJ
Guus, it quite possibly isn't
Guus
ah ok.
Guus
I'd love to be able to add stanza-id's everywhere, without implying that this means that MAM is available.
Guus
but doesn't service discovery sufficiently guard against that?
MattJ
Adding stanza-id doesn't imply MAM is available
MattJ
But if MAM is available, it implies you can't put stanza-id on every stanza
Guus
I'd like to be able to. Is a feasible solution one that allows the client to request the id of the most-recent MAM entry, in order to verify if it has that one in its local archive?
Guus
If the XEP doesn't currently define the 'store the id of the last message, assuming that it is the last ID in your server-sided archive', there might be room for a change like that?
MattJ
Guus, one of the main premises of the XEP is history sync, this would break it
MattJ
Forget the message purging issue for the moment
MattJ
If the client records the id of the last message it received, and then later uses this to query an archive, what would you propose it do if the id it happened to remember wasn't an archived one?
Guus
item-not-found
MattJ
and then what?
Guus
Naively (I'm not a client builder): I'd see up until what date I have a local archive, and retrieve from there.
MattJ
So fetch by timestamp?
Guus
with some wiggle-room, but yes.
MattJ
That way you'll either get duplicates or miss messages
MattJ
And that's not hackish?
Guus
Duplicates I can de-dupe with the message ID
MattJ
We could have just built the whole XEP on timestamps instead of ids if we're happy with that
Guus
misses would be bad.
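The timestamp fallback Guus proposes, with the trade-off MattJ names (duplicates or misses), might look like the sketch below. Everything here is an assumption made for illustration: messages are modelled as `(msg_id, timestamp, body)` tuples, `fetch_since` stands in for a server query by start time, and the five-minute `overlap` is an arbitrary value for the "wiggle room".

```python
from datetime import datetime, timedelta, timezone


def fetch_since(archive, start):
    """Hypothetical server query: every message stamped at or after start."""
    return [m for m in archive if m[1] >= start]


def resync_by_timestamp(local, archive, overlap=timedelta(minutes=5)):
    """Fetch from the newest local timestamp minus an overlap, then de-dupe by id."""
    if local:
        start = max(ts for _, ts, _ in local) - overlap
    else:
        start = datetime.min.replace(tzinfo=timezone.utc)
    known_ids = {msg_id for msg_id, _, _ in local}
    fetched = fetch_since(archive, start)
    # The overlap window produces duplicates; drop them by message id.
    new = [m for m in fetched if m[0] not in known_ids]
    return sorted(local + new, key=lambda m: m[1])


t0 = datetime(2019, 1, 1, 12, 0, tzinfo=timezone.utc)
archive = [
    ("m1", t0, "one"),
    ("m2", t0 + timedelta(minutes=1), "two"),
    ("m3", t0 + timedelta(minutes=2), "three"),
    ("m4", t0 + timedelta(minutes=10), "four"),
]
# The client saw m1 and m3 but missed m2, which falls inside the overlap.
local = [archive[0], archive[2]]
merged = resync_by_timestamp(local, archive)

# A message the client never received that is older than the overlap
# window stays missing: this is the "miss messages" case.
archive_with_gap = [("m0", t0 - timedelta(hours=1), "zero")] + archive
merged_with_gap = resync_by_timestamp(local, archive_with_gap)
```

A larger overlap trades more duplicate transfer for fewer misses, but no fixed window rules misses out entirely.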
MattJ
It's an ugly hack
Guus
well, let's not rewrite everything just yet - I'm fairly certain you've put way more thoughts into this than I have 🙂
MattJ
This is not something I would accept a rewrite for, for certain
MattJ
The correct fix is to re-introduce a way for the client to know whether the message is in the server's archive or not
Holger
> But if MAM is available, it implies you can't put stanza-id on every stanza
Depends on server implementation, no? The server just must be able to respond to the before/after requests.
Guus
so, why can't it ask for the last-recorded message id in the archive?
MattJ
Guus, how does that help?
Guus
what's my last message? do I have this? no: resync everything.
MattJ
Guus, that's broken
MattJ
Just because you don't have the last message in the archive doesn't mean you don't have the first
Holger
E.g. ejabberd uses timestamps as IDs, so it doesn't matter whether the queried ID is archived, before/after still does the right thing.
Guus
resync everything from the last one that you have, I mean.
MattJ
Guus, you don't know what the last one you have is
MattJ
Holger, multiple stanzas with the same timestamp?
Holger
Microsecond accuracy, if you hit that in practice then yes it breaks.
MattJ
Holger, what about clock drift then?
Guus
MattJ how don't you know what your last message is? You can order your local archive chronologically, use the last one?
MattJ
I'm not against using timestamps *in* the id, but it's wrong to use them as the id with no extra logic
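One way to read "timestamps *in* the id, with extra logic" is a timestamp plus a tie-breaking sequence number. The sketch below is only an assumption about what such logic could look like, not how ejabberd or any other server actually generates ids; `ArchiveIdGenerator` is a hypothetical name, and the scheme does nothing about clock drift between cluster nodes.

```python
import time


class ArchiveIdGenerator:
    """Timestamp-based ids that stay unique and ordered on a single node."""

    def __init__(self, clock=time.time_ns):
        self._clock = clock  # returns nanoseconds; injectable for testing
        self._last_us = 0
        self._seq = 0

    def next_id(self):
        now_us = self._clock() // 1000
        if now_us > self._last_us:
            self._last_us = now_us
            self._seq = 0
        else:
            # Same microsecond, or the clock stepped backwards: keep the
            # last timestamp and bump the sequence number instead.
            self._seq += 1
        # Zero-padding keeps lexicographic order equal to generation order
        # (as long as the sequence stays below 10000 per microsecond).
        return f"{self._last_us:016d}-{self._seq:04d}"


# Fake clock in nanoseconds: two stanzas in the same microsecond,
# then a small backwards jump, then normal progress.
ticks = iter([1_000_000, 1_000_000, 999_000, 2_000_000])
gen = ArchiveIdGenerator(clock=lambda: next(ticks))
ids = [gen.next_id() for _ in range(4)]
```

With the fake clock above, all four ids are distinct and sort in the order they were generated, which a bare microsecond timestamp would not guarantee.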
MattJ
Guus, the last what? I don't know which ones the server archived
Holger
MattJ: Clock drift across cluster nodes? That would break as well yes.
Guus
Hmm, my parents just walked in. Wife is preparing for 'the stare' again.
Holger
(Or do you mean clock jumping backwards? That can't happen with our clock except server restarts.)
Guus
Mattj, but if archiving is enabled, you can assume that the messages that you have in ... aah, I don't have the time to further discuss this now, sorry.
Guus
('stare')
Guus
I'd love to pick this up later.
MattJ
Holger, using the system's monotonic clock? or something custom?
Guus
got to go now
Holger
Erlang has a thing that doesn't jump back, not sure how it's implemented.
Holger
Anyway yes this is not the most robust solution against such pathological cases of course (it just has other nice properties). Whatever, I just wanted to say that MAM doesn't imply only archived messages have an ID per se.
Holger
(ejabberd doesn't actually add IDs to non-archived messages, though I keep pondering it.)
MattJ
Holger, as discussed, things will break (read: get hard/impossible) for clients if you add stanza-id to non-archived stanzas
MattJ
which is not a good situation, and should be fixed
Holger
Maybe I misunderstood the breakage vector. I would've thought things will be fine as long as the server is aware how the non-archived IDs are ordered compared to the archived messages.
Holger
MattJ, just in case you're interested, this sounds like a custom clock that (attempts to) adjust towards the OS clock by changing frequency (up to 1%) while avoiding jumps: http://erlang.org/doc/apps/erts/time_correction.html#No_Time_Warp_Mode
MattJ
Fun
Holger
(At the cost of risking incorrect offsets of course, so they warn against doing this.)
MattJ
Holger, the server knowing how to interpret the ids is not really relevant... unless you're saying it should not return item-not-found but quietly accept ids that don't actually exist in the archive
MattJ
That would cause weirdness with clients that try to fill holes
MattJ
and probably other stuff
Holger
> unless you're saying it should not return item-not-found but quietly accept ids that don't actually exist in the archive
Ah yes that's what I'm saying. IIRC 0059 suggests doing just that (wasn't it even a SHOULD?).
Holger
But I'm on my phone and the sun is shining. Gonna shut up now 🙂
MattJ
Is it too late to start over with MAM?
MattJ
Not using RSM for a start
MattJ
Trying to use existing building blocks has just caused confusion and unintended consequences
pep.
Well MAM is still experimental :-°
pep.
What about another bump?
MattJ
Everyone would love that
pep.
That's a thing I don't like in general. The XEP is still experimental but in reality it's just as if it was almost Final. If you change anything everybody is going to grump
MattJ
It certainly still has open issues, as a spec
pep.
Sure. I'm not just talking about MAM, that's how I feel about our specs in general
MattJ
Can't have it both ways
MattJ
Just this morning it was mentioned that XEP-0313 being Experimental is a reason Pidgin doesn't have support
pep.
I'd say that's an issue with developer expectations. If you implement it as experimental, know that it's likely going to change
pep.
(And even more, really, draft, even a final spec can be amended with another spec, so..)
pep.
MattJ, isn't that just an excuse from pidgin devs? :p
MattJ
It's not a viewpoint I share, but I'm biased
waqas
Are devs expected to implement experimental xeps?
MattJ
If a standard explicitly has a big red warning at the top, and warning or no warning is subject to radical change... if I had a limited amount of free time, would I want to implement it?
waqas
"While implementation of an Experimental protocol is encouraged in order to determine the feasibility of the proposed solution, it is not recommended for such implementations to be included in the primary release for a software product (as opposed to an experimental branch)." — https://xmpp.org/extensions/xep-0001.html#states-Experimental
pep.
waqas, in the meantime, it's a needed feature
pep.
And it's even in the compliance suite..
MattJ
That's the real problem (that experimental or not, it's a needed feature)
pep.
I'd say both these criteria (needed feature / compliance suite) put even more pressure on the XEP to go to draft/final. I'm not saying I like it
pep.
And as you say there are still areas that need to be improved
pep.
Maybe there should be a rule that compliance suites can't recommend draft specs. In the hope that people focus/provide feedback on XEPs that are needed
Zash
I thought there was
pep.
Well if there was, MAM shouldn't be in there
pep.
nor carbons? (last call ended but it's still proposed)
waqas
We should stop calling them "compliance" suites
flow
MattJ, I am not sure if using existing building blocks caused confusion. It appears to me that not clarifying how they are intended and allowed to be used (think for example of whether <before/> and <after/> can be used in the same query) is causing confusion
MattJ
flow: they can't, the end :)
flow
That is what I would also say, but it is at least underspecified in XEP-RSM