bencoding - Bencoded string length in Java


I'm a bit confused about bencoding.

According to the specification, when you bencode a string you need to use the following format:

length:string

The string spam becomes 4:spam

My question: is 4 the number of characters in the bencoded string, or the number of UTF-8 bytes?

For instance, if I'm going to bencode the string gâteau,

what number should be specified as the length of the string?

I think I have to specify 7, and the final form should be 7:gâteau

That is because the character â takes 2 bytes in UTF-8, and each of the remaining characters in the string takes 1 byte.
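To make the byte count concrete, here is a minimal sketch that compares the character count with the UTF-8 byte count in Java. The class and method names (`Utf8LengthDemo`, `utf8Length`) are made up for illustration, and it assumes â is the single precomposed code point U+00E2. A decomposed form (a followed by a combining circumflex) would give a different byte count.

```java
import java.nio.charset.StandardCharsets;

public class Utf8LengthDemo {
    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // \u00e2 is the precomposed character â (assumption: no combining marks).
        String word = "g\u00e2teau";
        System.out.println(word.length());    // UTF-16 code units: 6
        System.out.println(utf8Length(word)); // UTF-8 bytes: 7
    }
}
```

So `String.length()` is not the number to put in front of the colon. The byte length of the encoded data is.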

I have also heard that it is not recommended to store bencoded data in a Java String instance.

In other words, when I bencode a data block, I should store it as a byte array and should not convert it to a Java String value, to avoid encoding issues.

Are my assumptions correct?

According to the specification, a bencoded string is a sequence of bytes, and you have to specify the number of bytes in that sequence as its length.

And, from the specification: "All character string values are UTF-8 encoded."

So in the case of "gâteau" you should specify 7 as the length, because the character â takes 2 bytes.
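As a rough sketch of what that looks like in Java, the following keeps everything as bytes and only uses a String for the input text. The class name `BencodeStringSketch` and the `bencode` methods are hypothetical, not taken from any particular library. The sketch assumes the length prefix is written as plain ASCII decimal digits.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class BencodeStringSketch {
    // Bencode a raw byte string: "<length in bytes>:<bytes>".
    static byte[] bencode(byte[] payload) {
        byte[] prefix = (payload.length + ":").getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream out =
                new ByteArrayOutputStream(prefix.length + payload.length);
        out.write(prefix, 0, prefix.length);
        out.write(payload, 0, payload.length);
        return out.toByteArray();
    }

    // Convenience overload for text: encode to UTF-8 first, then count bytes.
    static byte[] bencode(String text) {
        return bencode(text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        byte[] encoded = bencode("g\u00e2teau"); // \u00e2 is precomposed â
        System.out.println(encoded.length); // 2 prefix bytes + 7 payload bytes = 9
    }
}
```

Keeping the result as a `byte[]` also matters for binary values, which are not valid UTF-8 in general and can be corrupted by a round trip through a Java String.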

