-
lovetox
whats the strategy on preventing duplicates on mam fetch vs live messages in MUCs?
-
lovetox
Previously i simply checked for each message if the stanza id is already known in the database
-
lovetox
but lately we learned that biboumi reuses stanzaids
-
Guus
'regular' stanza IDs, or XEP-0359 stanza IDs? iirc, MAM states that unique archive IDs MUST be used server-sided, so hopefully you can use the XEP-0359 ones?
-
Guus
if your XEP-0359 stanza IDs are not unique (per archive), well, ... then... err.... let me put it this way; people once suggested to me that I start a career in potato farming instead.
-
lovetox
yes it reuses xep-0359 ids
-
lovetox
as this is the most used irc bridge, and its not a very active project at the moment
-
lovetox
i really dont want to break it
-
singpolyma
They reuse mam IDs to the same user for unrelated messages? That's far too broken to be worth a workaround imo, I wonder how it works at all
-
Zash
Wasn't it multi-line messages broken into many?
-
lovetox
if you send a message which is over the irc protocol limit, they split the message, and send you back 2 messages with the same stanza id
-
lovetox
also same message-id
-
singpolyma
Well, that's a bug and I suppose unexpected things will happen
-
Zash
It's a feature!
-
singpolyma
lol
-
lovetox
what unexpected things .. if you dont have a deduplicating mechanism on the stanza id
-
lovetox
everything works ..
-
singpolyma
But as you say you must have such a mechanism if you do mam sync with muc
-
lovetox
my question was not really about biboumi, i just wanted to know in general how you deal with getting live messages while you fetching history from mam
-
lovetox
singpolyma, do you deduplicate on stanza-id?
-
singpolyma
I use mam id as primary key, so yes
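-
(illustrative sketch, not part of the original exchange)
A minimal sketch of deduplicating on the XEP-0359 stanza-id by making it part of the primary key, so a live message and its later MAM copy collapse into one row. The table layout and the `store` helper are assumptions made for illustration; this is not the schema of any client mentioned here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages ("
    " archive_jid TEXT NOT NULL,"   # the MUC (archive) the id belongs to
    " stanza_id   TEXT NOT NULL,"   # XEP-0359 stanza-id assigned by that archive
    " body        TEXT,"
    " PRIMARY KEY (archive_jid, stanza_id))"
)


def store(archive_jid, stanza_id, body):
    """Insert a message; return True if it was new, False if the id was known."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO messages VALUES (?, ?, ?)",
        (archive_jid, stanza_id, body),
    )
    conn.commit()
    return cur.rowcount == 1


# The same message arriving live and then again from a MAM query is kept once.
live = store("room@muc.example", "id-1", "hello")
from_mam = store("room@muc.example", "id-1", "hello")

# Two different messages sharing one stanza-id: the second is silently dropped.
part_one = store("room@muc.example", "id-2", "first half of a long line")
part_two = store("room@muc.example", "id-2", "second half of a long line")
```

Here `from_mam` is False as intended, but `part_two` is False as well: with reused ids the trailing part of a split message is lost.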
-
lovetox
ok, so at least in the irc case, we could say this happens very rarely, because you need to type messages over the protocol limit
-
MattJ
The stanza-id MUST be unique
-
singpolyma
Yeah, this is a bug in biboumi that needs fixed
-
MattJ
Previously I assumed biboumi was only showing this behaviour on the @id
-
singpolyma
On the @id it's a bad choice but *maybe* not a bug
-
singpolyma
On mam id it's a bug
-
lovetox
yeah .. i cannot work around this without breaking other stuff ..
-
lovetox
its just too much of a hassle
-
singpolyma
I'm not sure there's even any workaround possible
-
lovetox
you could special case behavior in your code when joined to an IRC channel
-
lovetox
but yeah dont want to do that
-
singpolyma
To what? Show dupes with live messages instead of dedupe?
-
lovetox
yes, having a duplicate on that rare occasion where you send messages over 512 bytes
-
lovetox
is probably less of a problem, than missing messages
-
singpolyma
You'll only miss trailing parts of messages from yourself
-
singpolyma
The other way could get dupes in every join since no dedupe in place
-
lovetox
really ? i didnt try how biboumi sends this message to other participants, but i would assume also with doubled stanza id
-
singpolyma
I didn't test either, but biboumi does not have multiple participants really. There is no shared state. Separate mam storage etc
-
singpolyma
Other participants see only what came over their own IRC connection
-
lovetox
there is one storage per room or not?
-
lovetox
if multiple xmpp users are connected to the same bridge service, they will all draw messages from that storage
-
Zash
no
-
singpolyma
One storage per user per channel
-
lovetox
ok, so this only affects messages i send
-
singpolyma
Yeah, I expect so
-
lovetox
then i could replace the stanza id with something different, only in the case i find a duplicate on messages i sent, or something
-
singpolyma
But how do you know it's not just a duplicate?
-
lovetox
as said there is probably not a perfect way, but one duplicate in this rare case is the lesser problem
-
lovetox
i could also limit messages to irc channels to 512 bytes
-
lovetox
so the user does not get into this situation at all
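-
(illustrative sketch, not part of the original exchange)
A rough sketch of such a client-side limit. Since a newline split reportedly yields a fresh stanza id, only individual lines need checking. The 512-byte IRC limit covers the whole protocol line (prefix, command, target, CRLF), so the usable payload is smaller; the `budget` of 400 bytes and the helper name are assumptions, not values taken from biboumi or Gajim.

```python
def oversized_lines(text, budget=400):
    """Return the lines of `text` whose UTF-8 encoding exceeds `budget` bytes."""
    return [
        line
        for line in text.split("\n")
        if len(line.encode("utf-8")) > budget
    ]


message = "short line\n" + "x" * 401
too_long = oversized_lines(message)
can_send = not too_long
```

A client could refuse to send, or warn, whenever `can_send` is False for a room known to be an IRC bridge.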
-
singpolyma
And also limit newlines? And any other rules this IRC server might have
-
lovetox
i tested newlines, funny thing is the newline split results in a message with a new stanza id
-
singpolyma
Wut
-
lovetox
its only the split inside one line
-
singpolyma
So it really is a small bug. Should be not hard to fix on biboumi side
-
singpolyma
This explains why such long messages always look broken to me in every client I use
-
lovetox
https://pastebin.com/3hCjfnir
-
singpolyma
Because they are 😛
-
lovetox
here a sample exchange
-
lovetox
first stanza is sent message outgoing, a really long text with multiline and long lines
-
lovetox
notice how the first returned message has a different stanza id than the second one, and the second and third one are the same again
-
lovetox
seems this is a trivial bug then, it seems to support generating new stanza ids on splitting on new line
-
lovetox
i added an issue https://codeberg.org/poezio/biboumi/issues/3471
-
lovetox
hope some soul that can write C++ will tackle this
-
lovetox
i think i go the route with deduplicating on stanza-id, and limiting user to smaller messages
-
lovetox
to not get into this situation
-
rom1dep
seems to be broken in Conversations also. I just got one instance of this where a long message sent from gajim and that got split didn't have its tail shown on Conversations. That said the MUC reflection is weird in Gajim, because the sent message appeared whole, and then the split messages came repeating parts of it.
-
rom1dep
> This explains why such long messages always look broken to me in every client I use
what I was replying to ↑
-
singpolyma
rom1dep: specifically it turns out to be broken in biboumi
-
rom1dep
Agreed, I was just meaning to say that Conversations seems a bit better than gajim at updating its message history with what is actually received by others (as long as it was sent from that client in the first place)
-
lovetox
as seen in this paste https://pastebin.com/3hCjfnir
-
lovetox
you see the message we sent out with a specific message id, but biboumi returns the first message with a different message id
-
lovetox
its another thing biboumi should *not* do, modifying message ids for no reason
-
singpolyma
Should be the same when the same and different when different. Right now seems arbitrarily broken both ways
-
praveen
singpolyma, Thilo confirms we can indeed go with xep 0389 https://github.com/monal-im/Monal/issues/565#issuecomment-2084344689
-
singpolyma
I think no one knows until it's built :) then I guess we change the xep to match whatever got done
-
praveen
singpolyma, thanks for your feedback, we can add any missing bits to the xep I guess
-
rom1dep
so, I have the impression that the interesting bits about biboumi's line splitting misery happens here: https://codeberg.org/poezio/biboumi/src/branch/master/src/bridge/bridge.cpp#L212 but that doesn't make sense (to me, a layman): on one hand biboumi splits messages by lines (with \n as separator) and stores that (L230) and sends that to XMPP clients (L244) but the actual cutting of long messages (and what ends up being sent to IRC) happens elsewhere: https://codeberg.org/poezio/biboumi/src/branch/master/src/irc/irc_client.cpp#L562
-
rom1dep
so my take is that biboumi isn't faithful to what IRC users see (news to me), and I wonder if that should be fixed before fiddling with IDs
-
rom1dep
oh, now I think I get it (maybe? I'm still not a C++ person), IRC's send_message ( https://codeberg.org/poezio/biboumi/src/branch/master/src/irc/irc_client.cpp#L464 ) takes a callback from https://codeberg.org/poezio/biboumi/src/branch/master/src/bridge/bridge.cpp#L236 which is how the reflection dance works
-
rom1dep
lovetox: would one problem be that the ID on lines 23 and 31 of https://pastebin.com/3hCjfnir is the same? (i.e. d8348140-2773-49a1-97b1-29f8bf992730)
-
lovetox
no 24 and 32 is the problem
-
lovetox
stanza-id needs to always be unique per message
-
rom1dep
got it
-
rom1dep
at least I now understand why they are identical
-
lovetox2
but just so you know in Gajim i think i found a fix, you should not experience the problem anymore
-
rom1dep
that's great! I guess it would be even better if every client didn't have to add new quirks
-
rom1dep
I suspect movim also gets pretty annoyed by those judging from repeated weird behaviors of biboumi chans (CC edhelas )
-
singpolyma
Indeed, fixing biboumi is better. Workarounds in clients will just delay the fix
-
rom1dep
can those IDs be arbitrary? Like, could I make it so that 0fb4ee3a-f714-4b09-b092-d0cfc03085be becomes 0fb4ee3a-f714-4b09-b092-d0cfc03085be.1?
-
singpolyma
Yes
-
lovetox
yeah just add a number, close enough :d
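-
(illustrative sketch, not part of the original exchange)
A sketch of the "append a number" idea: split one over-long line into byte-limited chunks and give every chunk its own id. This is Python for illustration only, not biboumi's C++ code; the function name, the `limit` default and the `.N` suffix scheme are assumptions.

```python
def split_with_unique_ids(base_id, line, limit=400):
    """Split `line` into chunks of at most `limit` UTF-8 bytes, each with its own id."""
    if limit < 4:
        raise ValueError("limit must fit at least one UTF-8 character (4 bytes)")
    data = line.encode("utf-8")
    chunks = []
    start = 0
    while start < len(data):
        end = min(start + limit, len(data))
        # Back up so the cut never lands inside a multi-byte character.
        while end < len(data) and (data[end] & 0xC0) == 0x80:
            end -= 1
        chunks.append(data[start:end].decode("utf-8"))
        start = end
    if not chunks:
        chunks = [""]
    # First chunk keeps the base id, later chunks get ".1", ".2", ...
    ids = [base_id] + [f"{base_id}.{n}" for n in range(1, len(chunks))]
    return list(zip(ids, chunks))


parts = split_with_unique_ids(
    "0fb4ee3a-f714-4b09-b092-d0cfc03085be", "x" * 900
)
```

Every element of `parts` carries a distinct id, so a client that deduplicates on stanza-id keeps all chunks.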
-
rom1dep
if I knew how to concatenate strings with ints, we'd have a dirty fix
-
lovetox
does anyone have merge rights on the biboumi repo?
-
lovetox
last time someone told me the maintainer went missing
-
singpolyma
It's been forked under the same name effectively i think
-
rom1dep
I think Link Mauve merged something some months ago?
-
moparisthebest
Yep the whole poezio team I think, the French gang we call them
-
rom1dep
moparisthebest: you know, for a while I was expecting you to be French because of what I was reading as "Paris" in your nick :)
-
moparisthebest
> moparisthebest: you know, for a while I was expecting you to be French because of what I was reading as "Paris" in your nick :)
⚰️
-
rom1dep
moparisthebest: don't tell me I'm the first one to tell you that :)
-
rom1dep
> [100%] Built target biboumi
damn, what did I do.
-
rom1dep
https://codeberg.org/rom1dep/biboumi/commit/98fa07fa3e4a1db6971492b38b8790b894e3424b
-
lovetox
👍️
-
moparisthebest
rom1dep: sadly you are not 😁💀
-
wgreenhouse
lovetox: https://codeberg.org/poezio/biboumi is the friendly fork afaik
-
rom1dep
> https://codeberg.org/rom1dep/biboumi/commit/98fa07fa3e4a1db6971492b38b8790b894e3424b
don't use this branch, I've yet to work around some weird things