-
jonas’
larma, lovetox, I am of the opinion that '394 is a dead-end and we should instead revive a different subset of XHTML, with clearer and stricter rules plus maybe a reference cleanup implementation in JavaScript.
-
Seve
+1
-
MattJ
I used to agree, but these days I see all the problems that come with allowing multiple representations of the body
-
larma
jonas’: why do you think so? I think 394 has great potential because it's extensible and has a properly defined fallback behavior. It's a good design and easy to implement. What are the issues you see?
-
jonas’
larma, I doubt the "easy to implement" part
-
jonas’
it has nasty corner cases (though counting itself is not one of them, since it’s clearly defined to count code points), e.g. what happens if a markup span ends right between two codepoints belonging to one emoji?
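
To make the corner case above concrete, here is a minimal Python sketch. It assumes spans are given as code point offsets, as the message says; `split_at` is a hypothetical helper for illustration and not something defined by '394.

```python
# Hypothetical helper, not taken from '394: split a message body at a
# markup span given as code point offsets. Python str indices already
# count code points, so plain slicing matches that counting rule.
def split_at(text, start, end):
    """Return (before, inside, after) for the span [start, end)."""
    return text[:start], text[start:end], text[end:]


# WOMAN + ZERO WIDTH JOINER + PERSONAL COMPUTER is typically drawn as a
# single "woman technologist" emoji, but it consists of three code points.
emoji = "\U0001F469\u200D\U0001F4BB"
message = "hi " + emoji

# A span covering offsets 0..4 ends right after WOMAN, inside the sequence.
before, inside, after = split_at(message, 0, 4)
print(len(emoji))                    # 3
print([hex(ord(c)) for c in after])  # ['0x200d', '0x1f4bb']
```

Counting is unambiguous here, but the text with the span applied and the text without it may be rendered differently, which is the concern being raised.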
-
jonas’
I also don’t particularly like the fallback characters, they are bound to cause annoyances.
-
lovetox
jonas’, thats not a nasty corner case
-
lovetox
you try to find *something* that maybe does not work
-
lovetox
which never happens in real life
-
jonas’
lovetox, it may not be for an emoji, but I don’t dare to say it can’t cause problems in some scripts. Unicode is a strange thing.
-
lovetox
and even if it did, people would not see it as "OMG, I can't even end a span in the middle of an emoji"
-
jonas’
lovetox, yeah, that’s how you design robust things, not by saying "ah, that’s never going to happen!!"
-
jonas’
lovetox, of course, nobody will complain that in XHTML-IM, they can’t do onload="alert()". But if an attacker does, all hell breaks loose.
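
The kind of cleanup step being argued for could look roughly like the sketch below. It is written in Python for illustration (the message above suggests JavaScript for a reference implementation), and the allowlist and the accepted URL schemes are assumptions made up for this example, not taken from any XEP.

```python
import xml.etree.ElementTree as ET

# Assumed allowlist for illustration only: tag -> permitted attributes.
ALLOWED = {
    "p": set(),
    "span": set(),
    "em": set(),
    "strong": set(),
    "br": set(),
    "a": {"href"},
}
# Assumed set of acceptable link schemes.
SAFE_SCHEMES = ("http://", "https://", "xmpp:", "mailto:")


def clean(elem):
    """Return a sanitized copy of elem, or None if its tag is not allowed."""
    tag = elem.tag.split("}")[-1]  # drop any "{namespace}" prefix
    if tag not in ALLOWED:
        return None  # the element and everything inside it is dropped
    out = ET.Element(tag)
    for name, value in elem.attrib.items():
        if name not in ALLOWED[tag]:
            continue  # e.g. onload, onclick, style
        if name == "href" and not value.lower().startswith(SAFE_SCHEMES):
            continue  # e.g. javascript: URLs
        out.set(name, value)
    out.text = elem.text
    out.tail = elem.tail
    for child in elem:
        kept = clean(child)
        if kept is not None:
            out.append(kept)
        elif child.tail:
            # Keep the text that followed a dropped element.
            if len(out):
                out[-1].tail = (out[-1].tail or "") + child.tail
            else:
                out.text = (out.text or "") + child.tail
    return out


dirty = ET.fromstring(
    '<p onload="alert()">hi <script>evil()</script>there '
    '<a href="javascript:x()" onclick="y()">link</a></p>'
)
result = ET.tostring(clean(dirty), encoding="unicode")
print(result)  # <p>hi there <a>link</a></p>
```

The point of an allowlist is that anything not explicitly named is removed, so an attacker-supplied attribute nobody thought of is dropped by default.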
-
lovetox
btw i also would love a subset of xhtml
-
lovetox
but i guess everyone can already implement that
-
lovetox
there is no need for a new XEP?
-
jonas’
how can everyone implement that already?
-
lovetox
just implement only, a subset ...
-
lovetox
like everyone does already
-
jonas’
but which subset?
-
jonas’
ok, I don’t want to discuss this with you right now
-
lovetox
the one you care about
-
lovetox
thats why im asking, i can just trim my xhtml impl
-
lovetox
or do a totally new 394
-
larma
jonas’: iirc, the same issue of multi-codepoint emoji being split across multiple spans is not well defined in any popular markup system including HTML.
-
larma
So honestly that seems to be a non-issue. And I can hardly imagine a proper design that would have specific handling and not cause immense developer work as you'd have to tap into the font rendering library which you usually try not to do
-
jonas’
larma, it is irrelevant as long as there’s only a single possible representation of the text. However, with 394, there are two: one with markup applied and one without.
-
jonas’
if the goal of '394 is to avoid different meaning of text with and without markup applied, these corner cases need to be investigated
-
dwd
MattJ, I think the root problem is multiple displays of the body - that is, a message that can mean one thing to one person (or intermediary filtering system) and something else to another. I think that problem is way worse with XHTML-IM and similar, minimal with 393, and 394 represents a reasonable compromise (assuming it gains the "this was markup" tag; without it, it should be minimal as well).
-
MattJ
dwd: that's basically my opinion too
-
jonas’
larma, can I convince you to take over '394? ;)
-
larma
I still have my work on "sims 2" and stickers pending ;)
-
larma
Oh, and reactions
-
larma
But beside that, I'd be OK to take it over
-
jonas’
larma, sims 2 the game?
-
larma
No
-
jonas’
oh damn
-
larma
Sims as in 385
-
jonas’
right
-
larma
http://larma.de/xeps/sfs.html#intro
-
jonas’
I like most of that, though I do see some use for mixed content.
-
jonas’
larma, immediate feedback: - allow more than one <file-sharing/> element per message; - add an ID to each file-sharing element so that it can be referenced by future specifications.
-
jonas’
then it would be trivial to build a combination of that + 394 which allows inline images :)
-
emus
> http://larma.de/xeps/sfs.html#intro
👍
-
LNJ
larma: Great work, I really like the changes you made and especially the attaching of new sources. Have you thought about using message fastening for this? And another point that I was also missing in SIMS is that there is no example for including the thumbnail data (BoB would also allow communication of the data via IQs).
-
LNJ
+1 for allowing multiple files
-
Daniel
Looks good on first glance.
-
Daniel
Cleans up the issues I have with sims
-
Daniel
Will take a closer look later in the day
-
Daniel
Oh 2.3 is interesting. especially but not only in a group context. would be interesting to see if and how successfully that will actually get implemented
-
Daniel
but good downwards compatibility to the x-oob+body method
-
Daniel
and will solve the weird standstill we currently have with SIMS
-
larma
jonas’: mixed content only makes sense for media files, you can't mix a random binary with normal text. So mixed content is intentionally out of scope there.
-
dwd
"Here's the PDF you wanted" - seems pretty useful to me.
-
dwd
The other problem that we run into is "Here is the current COVID-19 protocol" - we want later "hits" on that file identifier to get the latest version, ideally.
-
larma
dwd: you can still send message and file and files can have descriptions
-
dwd
Yes, you can (and that's what we do), but it means a search for the message doesn't naturally locate the file.
-
larma
If you use file description it could.
-
dwd
Yes, true. But that means the file description would end up being used as a message, if you're not careful. Depends very heavily on the UI; a file sharing extension on iOS, for example, is likely to end up using a description, whereas sharing a file inline to a chat is more likely to be showing a message+attachment kind of metaphor.
-
larma
Also searching for files is inherently a huge problem. For PDFs you'd want to actually search the content of the file as well. For pictures this would also be great but you'd need to OCR + image detection which is not easy.
-
dwd
Well, that's another problem entirely.
-
dwd
But anyway, I think overall, our greater problem is (essentially) versioning rather than the message+file case.
-
larma
I totally agree this XEP is not to solve all possible scenarios. It intentionally does not replace sims, but provides an alternative that is more like an evolution of what we are doing now (oob)
-
larma
Versioning to me seems like a total sidecase. You are typically also not versioning messages (even with lmc, you don't edit messages from weeks ago)
-
larma
If you need file versioning, oob seems like a sane approach to me
-
dwd
I know you're not versioning messages. But it would be nice if this could just share, essentially, a link in the same format.
-
dwd
Which ought to be a simple matter of having much of the file metadata element optional, and possibly a mechanism [given some stable identifier] of finding the current detailed metadata.
-
larma
My suggestion for versioning files if you want to stick with xmpp only would be to put the file metadata on a pubsub node and then just send a reference to that pubsub node in the message. You can update it (including getting a history) at any time and have a pointer to the actual file content (as http link) in it. Thereby you get proper versioning with possibility to fetch historic file versions.
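
As a rough illustration of that idea, the sketch below models the node as a plain in-memory object. `MetadataNode`, its methods, the node name and the example URLs are all hypothetical; a real deployment would publish items to an actual pubsub service.

```python
class MetadataNode:
    """Hypothetical in-memory stand-in for a pubsub node holding file metadata."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.items = []  # oldest first; each publish adds a new version

    def publish(self, name, url, sha256):
        """Store a new version and return its 1-based version number."""
        self.items.append({"name": name, "url": url, "sha256": sha256})
        return len(self.items)

    def current(self):
        """Latest metadata, i.e. what a later lookup of the reference gets."""
        return self.items[-1]

    def version(self, number):
        """Historic metadata for a given version number."""
        return self.items[number - 1]


node = MetadataNode("files/covid-protocol")
node.publish("protocol.pdf", "https://example.org/f/1", "aaa")
node.publish("protocol.pdf", "https://example.org/f/2", "bbb")

# The chat message would carry only a stable reference to the node.
reference = {"node": node.node_id}
```

Because the message holds only the reference, the file can be updated without touching the message, while older versions stay retrievable from the node history.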
-
larma
But that's totally overkill for the one shot file sharing I believe most users want.
-
dwd
Oh, sure, we *could*. But then the client has to be able to understand that.
-
larma
You are talking about a niche feature. It's not going to be implemented by all clients, no matter how hard you push, so better to keep the basic feature alone for those only interested in that.
-
dwd
Yeah, sure, but I'm suggesting that allowing some metadata to be optional means versioning is then supported (alongside named URL sharing, actually) but in a uniform format that will "just work" for most receiving clients.
-
dwd
So I don't *need* to push on receivers to support anything.
-
larma
Well, you'd need to get rid of the file hash at least which means the file is not authenticated through the message anymore (assuming the message is) and you also can't make use of the 2.3 feature
-
jonas’
larma, I always find it annoying when I have to write text messages separately from the media or files I send, since the file upload typically happens asynchronously.
-
jonas’
so I can’t know when my text message arrives related to the blob
-
jonas’
which is meh, also for the flow of reading on the receiving end
-
!XSF_Martin
Yeah, the way other messengers do it is nice.
-
!XSF_Martin
Where the image and your comment are one message.
-
eevvoor
!XSF_Martin how do they do it exactly?
-
eevvoor
Are the technical details known?
-
!XSF_Martin
On the protocol level no idea. Threema and WhatsApp are closed source. Maybe signal does this too? Then one might have a look how they do it.
-
MattJ
It's not exactly rocket science to put text and a URL in a single message :)
-
Zash
Inline vs attached vs singletons?
-
!XSF_Martin
Probably they have some uploaded file url in some field and the text/body in another.
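
For XMPP, a single message carrying both a text and a file URL could be built along these lines. This is only a sketch: it uses the `jabber:x:oob` element from XEP-0066 for the URL, and `message_with_file`, the example JID and the example URL are made up for illustration.

```python
import xml.etree.ElementTree as ET

OOB_NS = "jabber:x:oob"  # XEP-0066: Out of Band Data


def message_with_file(to, text, url):
    """Serialize one message stanza holding a text body and a file URL."""
    msg = ET.Element("message", {"to": to, "type": "chat"})
    ET.SubElement(msg, "body").text = text
    x = ET.SubElement(msg, "{%s}x" % OOB_NS)
    ET.SubElement(x, "{%s}url" % OOB_NS).text = url
    return ET.tostring(msg, encoding="unicode")


stanza = message_with_file(
    "juliet@example.org",
    "Here's the PDF you wanted",
    "https://upload.example.org/abc/protocol.pdf",
)

# Reading it back: the text and the URL travel in the same message.
parsed = ET.fromstring(stanza)
body = parsed.findtext("body")
url = parsed.findtext("{%s}x/{%s}url" % (OOB_NS, OOB_NS))
```

Since both parts are in one stanza, the receiver never has to guess how a separately sent text relates to the upload.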
-
dwd
MattJ, Yes, but what about a series of hashes in multiple algorithms?
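
A sketch of what "a series of hashes in multiple algorithms" means in practice, using only Python's `hashlib`. The function names are hypothetical, the algorithm names are `hashlib`'s own rather than any wire format, and the acceptance rule in `verify` is an assumption for this example.

```python
import base64
import hashlib


def file_hashes(data, algos=("sha256", "sha3_256")):
    """Map each algorithm name to the base64-encoded digest of data."""
    return {
        algo: base64.b64encode(hashlib.new(algo, data).digest()).decode("ascii")
        for algo in algos
    }


def verify(data, advertised):
    """Assumed rule: every hash we can compute must match, and at least one must exist."""
    checked = 0
    for algo, expected in advertised.items():
        if algo not in hashlib.algorithms_available:
            continue  # unknown algorithm: skip it rather than fail
        checked += 1
        actual = base64.b64encode(hashlib.new(algo, data).digest()).decode("ascii")
        if actual != expected:
            return False
    return checked > 0


hashes = file_hashes(b"hello")
```

Sending several hashes lets receivers with different algorithm support each verify the file, at the cost of more metadata than a bare URL.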
-
dwd
Also, yes, that problem of slow links and file/image uploads is a problem we have to solve as well.
-
MattJ
In MIX the messages are in the participants' archives... does the MIX server *also* keep an archive? Would clients ever query it?
-
jonas’
yes, always
-
jonas’
because we don’t have a reliable s2s sync protocol, so the user’s archive is bound to be incomplete.
-
jonas’
also for history before you joined
-
MattJ
So why store them in the user's archive then?
-
jonas’
no idea :)
-
jonas’
(that was a bit of a snark. Actually, having them in the user’s archive is very convenient for the user and we should maybe see if we can fix the s2s sync)
-
dwd
MattJ, Yes, for example if a MIX channel were used as a pager replacement, someone newly allocated to the pager will query the archive to find previous messages in a conversation.
-
MattJ
Sounds simple™
-
MattJ
from a client perspective
-
dwd
Dealing with S2S sync is certainly a good problem to tackle, BTW.
-
Zash
S2S MAM?
-
jonas’
for certain definitions of "good"
-
Zash
"sync"... ugh
-
dwd
Well, reliability.
-
dwd
I think we don't want to be tackling more than dropped connections. Sustained disconnected-mode operation for S2S isn't something that's worth tackling - things like FMUC handle those kinds of cases.
-
Zash
So do we need more than s2s stream management?
-
dwd
Zash, Probably not, actually.
-
jonas’
Zash, yes, because you’ll miss messages while the server is restarting
-
jonas’
unless you can persist s2s state && the remote will hold their state long enough so that your kernel upgrade + 10 minute reboot time is covered
-
dwd
jonas’, So do we need an outgoing buffer and retry semantics rather than bounce-with-error?
-
jonas’
dwd, I don’t think that’d be a good idea
-
jonas’
maybe s2s SM + pull-based MAM sync from MIX-enabled user-servers after a server restart would be sufficient?
-
Zash
MIX-side delivery tracking ?
-
Ge0rG
what about a thing that's somewhere in between 0198 and MAM; keep a queue of "important" stanzas, use unique IDs instead of counter values, resync after session setup.
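
One way to read that proposal is sketched below. `ImportantQueue` and its methods are hypothetical, and the size limit and the behaviour on an unknown ID are assumptions; nothing here is specified by '198 or MAM.

```python
import uuid
from collections import OrderedDict


class ImportantQueue:
    """Hypothetical sender-side queue keyed by unique stanza IDs."""

    def __init__(self, limit=1000):
        self.limit = limit
        self.pending = OrderedDict()  # id -> stanza, in send order
        self.last_acked = None

    def send(self, stanza):
        """Remember an important stanza and return its unique ID."""
        sid = str(uuid.uuid4())
        self.pending[sid] = stanza
        while len(self.pending) > self.limit:
            self.pending.popitem(last=False)  # oldest entry is evicted
        return sid

    def resync(self, last_seen_id):
        """After session setup: drop what the peer has, return what to replay."""
        if last_seen_id is None or last_seen_id == self.last_acked:
            return list(self.pending.items())
        if last_seen_id not in self.pending:
            # Evicted or never sent: the queue cannot tell what is missing.
            raise LookupError(last_seen_id)
        ids = list(self.pending)
        for sid in ids[: ids.index(last_seen_id) + 1]:
            del self.pending[sid]
        self.last_acked = last_seen_id
        return list(self.pending.items())


q = ImportantQueue(limit=3)
a, b, c = q.send("a"), q.send("b"), q.send("c")
replay = q.resync(a)  # peer has "a"; "b" and "c" must be replayed
```

Unlike a counter, a unique ID still means something after a restart, but the `LookupError` branch shows that a bounded queue still needs some fallback once entries have been evicted.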
-
jonas’
dwd, problem with the outbound queues is their size will be limited at some point, pushing the problem only back. A highly active MIX could exhaust that limit during your kernel upgrade.
-
jonas’
though SM and outbound queues are still problematic anyways...
-
jonas’
so in fact, a server would have to do pull-based MAM sync whenever it is not able to SM-resume with the other side, no matter the reason.
-
jonas’
look, that sounds like what clients do!
-
Zash
MIX is server side MUC?
-
dwd
jonas’, For all its users?
-
jonas’
and then the server would have to both sync the messages into MAM *and* replay them live to already-connected clients (which may have synced with the *local* MAM already and think they’re up-to-date) *and* queue and delay any *live* messages from the MIX so that everything arrives in order
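
The ordering requirement described here can be sketched as a small state machine on the user's server. Everything in this example is an assumption for illustration: the class name, the method names, the use of message IDs for deduplication, and the single list standing in for both the local MAM store and live delivery to clients.

```python
class MixChannelSync:
    """Hypothetical per-channel state on the user's server."""

    def __init__(self):
        self.archive = []   # stands in for local MAM and live fan-out
        self.seen = set()   # message IDs already stored
        self.held = []      # live messages held back during catch-up
        self.catching_up = False

    def on_s2s_restored(self, resumed):
        # If stream management resumed the session, nothing was lost.
        self.catching_up = not resumed

    def on_live(self, msg_id, body):
        if self.catching_up:
            self.held.append((msg_id, body))  # delay to preserve order
        else:
            self._store(msg_id, body)

    def on_catchup_page(self, page, complete):
        """Handle one page of results pulled from the MIX archive."""
        for msg_id, body in page:
            self._store(msg_id, body)
        if complete:
            self.catching_up = False
            held, self.held = self.held, []
            for msg_id, body in held:
                self._store(msg_id, body)

    def _store(self, msg_id, body):
        if msg_id in self.seen:
            return  # already arrived via the other path
        self.seen.add(msg_id)
        self.archive.append((msg_id, body))


sync = MixChannelSync()
sync.on_live("m1", "a")
sync.on_s2s_restored(resumed=False)  # connection dropped, resume failed
sync.on_live("m4", "d")              # arrives live while still catching up
sync.on_catchup_page([("m2", "b"), ("m3", "c"), ("m4", "d")], complete=True)
order = [msg_id for msg_id, _ in sync.archive]
```

Holding live messages until catch-up completes is what keeps the archive append-only and in order, and it is also where the extra complexity comes from.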
-
dwd
jonas’, And you think this is better scaling than a 0198 queue?
-
jonas’
dwd, I think this serves a different purpose than a '198 queue
-
dwd
jonas’, Is that purpose to make everyone's life harder? If so, mission accomplished. :-)
-
jonas’
dwd, the purpose is to achieve reliable message delivery
-
jonas’
I’m not sure how you’re going to achieve that with '198 alone. It hasn’t sufficed for c2s, it won’t suffice for s2s either.
-
Zash
Ugh, sync :(
-
jonas’
dwd, there are two key guarantees which need to be held which make this very hard: - In-order message delivery from the MIX to the client - No insertions into the middle of the user’s MAM archive
-
Ge0rG
Can't we just have forever-persistent 0198?
-
jonas’
and this is why Ge0rG (sorry for putting words in your mouth again) and I have been saying that the user’s local archive is a terrible idea and only going to cause pain.
-
Zash
Can't we just embrace fast delivery or failure notification?
-
MattJ
Notify the MIX that delivery failed
-
Ge0rG
Zash: there was a one-message thread on message errors some time ago...
-
Ge0rG
MattJ: and then the MIX can kick the user out! Win-win!
-
MattJ
Sounds like a plan
-
Ge0rG
And when the user rejoins, they just do a full sync!
-
Zash
Message attachments in the form of delivery statuses?
-
MattJ
I knew MIX could solve all the problems of MUC in the simplest possible way
-
Ge0rG
Or you just add a tombstone to all local user archives whenever s2s fails
-
dwd
Ge0rG, Put in a tombstone for every missed message. Seems legit.
-
Ge0rG
dwd: but you don't know which / how many messages you missed!
-
dwd
Ge0rG, You would if you had tombstones.
-
jonas’
dwd, how is the server to know how many messages it missed?
-
dwd
jonas’, Because of the tombstones. I fail to see how you can fault my logic here.
-
Ge0rG
dwd: aaah, right! The tombstones! It's obvious to me now!
-
jonas’
sorry, my sarcasmometer is out-of-service due to the heatwave
-
dwd
jonas’, Storm Ellen here, my smilies have been blown away.
-
Ge0rG
surely just a random weather phenomenon not related in any way to ocean heating or the CO2 amounts in the atmosphere
-
jonas’
Ge0rG, surely.
-
jonas’
Ge0rG, ThiS iS a SAfE sPaCe gO AWaY wiTH yOUr ClIMAtE ChaNGE IdeOLogY!
-
jonas’
s/ChaNGE/CaTAsTrOPHy/, too
-
Ge0rG
</trigger-warning>
-
Holger
> and this is why Ge0rG (sorry for putting words in your mouth again) and I have been saying that the user’s local archive is a terrible idea and only going to cause pain.
I'm saying that all day long (can't remember saying anything else since I first heard of MIX) and would still very much prefer to keep this feature optional.
-
Ge0rG
Holger: but you are not the XEP author.
-
MattJ
Still, if we have community consensus that this design is flawed, we can still change it, right? If the MIX has an archive anyway, clients just need to query that instead
-
MattJ
...right...?
-
MattJ
Relatedly: https://framapiaf.org/@debacle/104713005724817353 (thanks debacle)
-
Kev
The only reason for MIX to need to be in the user's archive is for search.
-
Kev
(From memory)
-
Kev
Well, bandwidth too, but mostly search.
-
MattJ
If that's the case, I'm happy with that
-
Holger
MattJ: At least during the Summit it seemed to me the consensus is to have user archives. But yes I'd think we should just have clients check a feature (IIRC there is one in MIX-PAM) to decide which archive to query. Increases complexity on the client side which I'm usually all for avoiding at all cost, but seems the least evil to me in this case.
-
Kev
If someone could come up with a decent solution to search it would be nice to drop it.
-
Kev
(Ok, I think I'm up to three reasons - search, bandwidth, and persistence)
-
MattJ
Yeah, I think I'd rather tackle the search problem than turn the current architecture on its head and face a whole set of other problems
-
Holger
Kev: I also remember scalability arguments, i.e. the case where the client is joined to thousands of rooms.
-
Holger
And persistence, yes.
-
Holger
I get all that but I think we'd need to solve a couple of problems before at least I would be able to implement user archives. And it would be nice if that wouldn't block MIX.
-
Kev
You really don't want to be in a room that keeps history for a day, come back two days later and not see the replies to your messages.
-
Kev
It's one of those 'no ideal solutions' things, I think, that just hurts because of federated architectures.
-
Holger
Right now the user server would either duplicate to death or have to implement black dedup magic. Plus the sync issues.
-
Kev
User archives seemed like the least bad solution.
-
jonas’
Kev, so a user archive which seems complete, but has gaps nobody knows about is better than an archive which tells you "sorry, I don’t have your newest message, there is a gap here"
-
Holger
So I think right now it's a horrible combination of the downsides of Matrix with those of XMPP.
-
jonas’
that logic seems flawed to me
-
Holger
I think.
-
Kev
jonas’: Dear Strawman, love...?
-
eta
so, the thing I never got with XEP-0045 is
-
jonas’
Kev, I can’t follow, sorry
-
jonas’
my english fails me
-
Kev
jonas’: You're presenting a position that isn't mine, then arguing that it's wrong, and therefore I am.
-
eta
why can't the server just keep a log of what was said after a resource left, persist that, and then replay it as join history?
-
jonas’
Kev, sorry, not my intention, maybe I missed something
-
Kev
I don't think user archives should have gaps in them, which is why we needed the sync logic between MIX and User archive.
-
Holger
eta: It could. MAM is just nicer because of the paging.
-
jonas’
Kev, ok, I missed that, sorry
-
jonas’
ignore me & carry on :)
-
Kev
It was a discussion at the Summit about how we needed to ensure we could detect holes in the user's view of the MIX archive, and plug them.
-
MattJ
eta, paging and tracking is harder than you think (usually people "leave" the MUC long after they lost their connection)
-
Kev
Maybe it was a side-room discussion, I forget at this point.
-
jonas’
I might’ve missed that discussion :/
-
eta
MattJ, ah right, reasonable enough
-
jonas’
eta, also unbounded storage requirements on the server side. What if the resource never comes back?
-
eta
jonas’, well I guess you'd need some cap
-
eta
but nvm, I'm convinced MAM is probably ideal
-
jonas’
eta, also, MAM-MUC is pretty much that, except that the client has to explicitly ask ;)
-
MattJ
Just like with MAM. In fact Prosody (and I'd be surprised if not other servers) uses the MAM archive to fulfil MUC history requests these days
-
Holger
BTW 0369, 7.2.1 says the user's server MUST archive, 7.2.2 says it MAY?
-
dwd
Holger, Take the average, it's a SHOULD.
-
Holger
🙂
-
dwd
Although if it says MUST twice and MAY once, it's a REALLY SHOULD.
-
Holger
Makes sense.
-
Holger
I'm okay with MUST archive as long as I MAY apply an arbitrary expiry period of 0 or more seconds.
-
jonas’
:>
-
jonas’
Holger, you are aware that you can’t do that as a user server, right?
-
jonas’
not without purging all the other messages too
-
jonas’
due to how MAM works ;)
-
Holger
Well with our specific implementation (stanza ID being a timestamp) things wouldn't break.
-
jonas’
holes in MAM are forbidden
-
Holger
Yes my statement was just that "things wouldn't break".
-
jonas’
only with an expiry of exactly 0s implemented by not visibly storing the messages
-
jonas’
otherwise you have to violate '313 by not returning the correct response for an ID not in your archive.
-
jonas’
> If any UID requested by the client in any of the 'before-id', 'after-id' or 'ids' form fields is not present in the archive, the server MUST return an item-not-found error in response to the query.
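
The two behaviours being contrasted can be sketched as follows. The archive is modelled as a plain list, `ItemNotFound` stands in for the item-not-found error, and `query_after_lenient` is only one reading of the timestamp-ID approach mentioned earlier; all names are hypothetical.

```python
class ItemNotFound(Exception):
    """Stands in for the item-not-found error quoted above."""


def query_after(archive, after_id, limit=50):
    """Strict variant. archive is a list of (uid, stanza) in archive order."""
    if after_id is None:
        return archive[:limit]
    for index, (uid, _stanza) in enumerate(archive):
        if uid == after_id:
            return archive[index + 1:index + 1 + limit]
    # Unknown UID: the client cannot assume its copy is contiguous.
    raise ItemNotFound(after_id)


def query_after_lenient(archive, after_id, limit=50):
    """Lenient variant. Assumes UIDs sort in archive order, e.g. timestamps."""
    newer = [item for item in archive if item[0] > after_id]
    return newer[:limit]


archive = [(1001, "a"), (1002, "b"), (1003, "c")]
expired = archive[1:]  # pretend 1001 has been purged from the archive
```

With the strict variant, a client asking for everything after a purged ID gets an error and knows it may have a gap; the lenient variant answers without a lookup for the ID itself, but the client cannot tell whether anything between its ID and the first result was purged.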
-
Holger
Yes that's what I'm doing.
-
jonas’
Shame on you! Violating a MUST!
-
jonas’
(also, yes it is friday)
-
Holger
Wasn't in earlier revisions IIRC, and isn't in 0059. We have generic 0059 code to do 0059 with MAM and other things, and I'm not keen on "if MAM then this else that" special casing.
-
jonas’
nothing to do with '59 that bit of text
-
jonas’
gotta go now tho
-
Holger
Not sure how to parse that. 0059 says what to do if a UID requested is not present in the archive. 0313 says something else on the same topic.
-
Holger
Generic implementation is one reasoning, the other is avoiding the additional SQL query on each and every MAM request.
-
MattJ
The error is to allow clients to detect gaps in the sync
-
MattJ
if that's not obvious
-
Holger
I understand the reasoning but that doesn't make me like the solution ;-)
-
Guus
Does anyone know an administrator for jabber.cz?
-
Guus
They might want to review their user using smash55 for its username
-
dwd
What would things look like if MIX messages did not go into the user's archive ever, and instead were fetched on demand - could this be managed by the server or would it have to be managed by the client?
-
Guus
dwd: re your question on twitter: try Greg and Dele
-
!XSF_Martin
Guus: What's the issue with that username?
-
Guus
!XSF_Martin: it is the point of contact that's advertised in spam messages
-
!XSF_Martin
Ah, I misinterpreted "review their user … for its username", so I thought the username itself was the problem. Sorry, not a native speaker.
-
dwd
Guus, Oh, good call. Though i think that's maybe the wrong bit of BT entirely.
-
Guus
dwd: could be, but I believe they've both moved around
-
Guus
Worth a shot