
Martin
From the OX XEP (XEP-0373):
> The <signcrypt/> and <crypt/> elements SHOULD furthermore contain a 'rpad' element which text content is a random-length random-content padding.
Are there any best practices about the length range for rpads?

Martin
Mine is in the range of 20 to 49 chars, but as long as it is random I think I could just use a 0 to 10 chars rpad and avoid unnecessary bloat.
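
A minimal sketch of what a random-length, random-content rpad like the one Martin describes could look like. The function name `make_rpad`, the alphanumeric `ALPHABET`, and the default range are assumptions made for illustration; the quoted XEP text only asks for random length and random content.

```python
import secrets
import string

# Assumption: an alphanumeric alphabet. The quoted spec text does not
# prescribe which characters the padding should consist of.
ALPHABET = string.ascii_letters + string.digits


def make_rpad(min_len: int = 0, max_len: int = 10) -> str:
    """Return padding with a random length in [min_len, max_len] and random content."""
    if not 0 <= min_len <= max_len:
        raise ValueError("need 0 <= min_len <= max_len")
    # secrets.randbelow(k) returns an int in [0, k), so the +1 makes max_len reachable.
    length = min_len + secrets.randbelow(max_len - min_len + 1)
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

`make_rpad(20, 49)` would correspond to the 20 to 49 range mentioned above, `make_rpad(0, 10)` to the shorter one.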

flow
Martin, depends on your paranoia level

flow
For example, you could calculate the length of the pad so that a certain minimum total length is guaranteed

Zash
beware statistics!

Martin
> For example, you could calculate the length of the pad so that a certain minimum total length is guaranteed
Yes, but is 20 to 40 chars better than 0 to 20 chars? I don't think so, but I'm no cryptography expert.

Zash
I feel like the thing is to pad up to the next multiple of X, but best ask some cryptographer about how to safely use padding

flow
Martin, the idea is that you take the actual payload length into account when calculating rpad's length

flow
There are near endless possibilities for how to determine rpad, and given that most cryptographic messaging systems don't even have a thing like rpad, it is potentially not super important, but still nice to have

flow
especially in IM communication where the length of the response may provide some insights to an outside observer

Zash
What's it for here?

Zash
AIUI you can counteract padding meant to hide the length of a message using statistics

flow
primarily to conceal the length of the plaintext

flow
Zash, I'd be happy about some pointers to research in that direction

Zash
"yes".length+rand(10) > "no".length+rand(10) given enough samples, that kind of thing

Zash
I've got no pointers, sorry. RFCs for TLS &c might have references

Zash
Wikipedia! 🙂

flow
the one true truth

flow
:)

thomaslewis
Couldn’t you randomize the argument to rand()? 🤔

pep.
Then one would "just" have to run stats with the same method? :p

thomaslewis
Well, if the padding is of random length and content, it would make statistical analysis near meaningless, no?

jonas’
incorrect

jonas’
the point of statistics is to get signal out of noise :)
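
To make the "signal out of noise" point concrete, here is a small simulation of Zash's "yes" versus "no" example. It is only a sketch under assumptions: the pad length is drawn uniformly from 0 to `max_pad` inclusive, the observer sees nothing but total lengths, and the names `observed_length` and `guess_longer` are made up for illustration.

```python
import random
from statistics import mean


def observed_length(message: str, max_pad: int, rng: random.Random) -> int:
    """Length an observer sees: payload plus a uniformly random pad in [0, max_pad]."""
    return len(message) + rng.randint(0, max_pad)


def guess_longer(samples_a: list[int], samples_b: list[int]) -> str:
    """Guess which sample set belongs to the longer plaintext by comparing means."""
    return "a" if mean(samples_a) > mean(samples_b) else "b"


rng = random.Random(1234)  # fixed seed so the run is repeatable
yes_lengths = [observed_length("yes", 10, rng) for _ in range(10_000)]
no_lengths = [observed_length("no", 10, rng) for _ in range(10_000)]

# A single observation is ambiguous (the ranges 3..13 and 2..12 overlap),
# but the averages drift apart by roughly the one-character difference.
print(mean(yes_lengths), mean(no_lengths), guess_longer(yes_lengths, no_lengths))
```

Randomizing the upper bound of the pad as well only changes the shape of the noise distribution; the averages still differ by the difference in plaintext length.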

Martin
So rpadlength=messagelength%100 would make sense?

Martin
Instead of using random length.

Zash
Martin, that's what I think is the sensible thing.

jonas’
Martin, there is a `100 -` missing in that, but yes.

jonas’
`rpadlength = 100 - (messagelength % 100)`
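
The formula can be checked with a short sketch. The function name `rpad_length` and the `block` parameter are illustrative assumptions, and lengths are assumed to be counted in the same unit for message and padding. Taken literally, the formula gives a message whose length is already a multiple of the block a full extra block of padding.

```python
def rpad_length(message_length: int, block: int = 100) -> int:
    """Pad length that brings message_length up to the next multiple of block.

    Implements rpadlength = block - (messagelength % block) literally, so a
    length that is already a multiple of block gets a full block of padding.
    """
    if message_length < 0 or block <= 0:
        raise ValueError("message_length must be >= 0 and block must be > 0")
    return block - (message_length % block)


# Every padded total ends up on a multiple of the block size.
for length in (0, 1, 99, 100, 120, 199):
    total = length + rpad_length(length)
    print(length, rpad_length(length), total)
```

For a 120-character message this yields 80 characters of padding and a total of 200, whereas using `messagelength % 100` alone would give 20 and a total of 140.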

Zash
that reduces the available entropy, right?

Martin
Why? Using modulo 100 would make messagelength + rpadlength always a multiple of 100 or am I wrong?
 Zash throws some math and statistics words around

jonas’
Martin, you're wrong

Martin
Oh yeah

jonas’
len = 120, 120 % 100 = 20, 120 + 20 != 200

Zash
if you pad or crop all messages to the same size, that would leak the least data, right?

jonas’
yes

jonas’
requires you to pad them all to the maximum stanza size though, otherwise you lose data, obviously :)

Zash
indeed

Zash
hence the pad to multiple of X

jonas’
I'm still trying to figure out how much entropy that gives you

Zash
depends on ... distribution of message sizes, no?

jonas’
possibly

jonas’
also possibly on the maximum message size

jonas’
another way to think about it: it conceals log_2(modulus) bits of the real message length

jonas’
(the logic is easy to see if you assume modulus = 2^n, i.e. a power of two: the message length L is a k-bit number. If you pad to a multiple of 2^n, the new message length is L' = m*2^n, with L + 2^n >= L' >= L. Multiplication by 2^n is identical to a left shift by n bits, hence the lowest n bits of the padded length are zero, hence the lowest n bits of the real length are concealed.)
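
The bit argument can be illustrated with a few lines of Python. This is a sketch under the same power-of-two assumption; `padded_length` is a hypothetical helper that applies the `block - (length % block)` rule with `block = 2**n`.

```python
def padded_length(length: int, n: int) -> int:
    """Total length after padding up to a multiple of 2**n."""
    block = 1 << n  # 2**n
    return length + (block - length % block)


n = 4  # pad to multiples of 16, i.e. hide the lowest 4 bits
low_mask = (1 << n) - 1

for length in (1, 15, 16, 17, 120):
    padded = padded_length(length, n)
    # The lowest n bits of the padded length are always zero.
    print(f"{length:>3} = {length:08b}  ->  {padded:>3} = {padded:08b}")

# All 2**n real lengths that differ only in their lowest n bits
# collapse onto a single observable length.
bucket = {padded_length(length, n) for length in range(16, 32)}
print(bucket)
```

An observer who sees only the padded length learns which bucket of 2^n lengths the message fell into, but not the position inside that bucket.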

Martin
😳