Serialization
Serialization of consensus-relevant data types
Amounts
Amounts may take any integral value between 0 and 2^64 - 1. Amount serialization takes into account the fact that assets are often manipulated in quantities that are multiples of powers of ten.
Amount serialization has the following properties:
Every value with 3 or fewer significant decimal figures has a two-byte representation.
Every value with 8 or fewer significant decimal figures has a four-byte representation.
Every value less than 2^56 has an eight-byte representation.
Every value up to 2^64 - 1 has a nine-byte representation.
For comparison, an uncompressed uint64 always occupies eight bytes.
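The size properties above can be sketched as a small helper that reports how many bytes an amount needs. This is only an illustration: the function names are hypothetical, and it assumes the encoder always picks the shortest form the list allows, which the text does not state explicitly.

```python
def significant_figures(amount: int) -> int:
    """Count decimal digits that remain after stripping trailing zeros."""
    if amount == 0:
        return 0
    while amount % 10 == 0:
        amount //= 10
    return len(str(amount))


def shortest_amount_size(amount: int) -> int:
    """Return the serialized size in bytes, assuming the shortest form is used."""
    if not 0 <= amount < 2**64:
        raise ValueError("amounts range from 0 to 2^64 - 1")
    figures = significant_figures(amount)
    if figures <= 3:
        return 2  # two-byte form: up to 3 significant figures
    if figures <= 8:
        return 4  # four-byte form: up to 8 significant figures
    if amount < 2**56:
        return 8  # marker byte plus 7 bytes
    return 9      # marker byte plus 8 bytes
```

Under these assumptions a round quantity such as 1,000,000 fits in two bytes, while 123,456,789 (nine significant figures) falls through to the eight-byte form.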
Two-byte/16-bit representation: up to 3 significant figures
15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | Value | Notes |
0 | 1 | 1 | 1 | 1 | 1 | 0 | x | x | x | x | x | x | x | x | x | NaN | "quiet" NaN |
0 | 1 | 1 | 1 | 1 | 0 | x | x | x | x | x | x | x | x | x | x | Infinity | Or overflow, underflow. Causes deserialization failure. |
0 | a | b | e | e | e | t | t | t | t | t | t | t | t | t | t | tttttttttt_2 * 10^(abeee_2) | Subject to ab != 11 (base 2) |
0 | 1 | 1 | a | b | e | e | e | t | t | t | t | t | t | t | t | 100tttttttt_2 * 10^(abeee_2) | Subject to ab != 11 (base 2). This is a non-canonical representation (the client never produces it). |
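As an illustration of the canonical row of the table above (top bit 0, five exponent bits `abeee`, ten significand bits), a decoder might look like the following. The function name is hypothetical, the two bytes are assumed to be stored most significant byte first, and range checks against 2^64 - 1 are left out. The NaN, infinity, and non-canonical rows are rejected rather than decoded.

```python
def decode_two_byte_amount(data: bytes) -> int:
    """Decode the canonical 16-bit form: significand * 10^exponent."""
    if len(data) != 2:
        raise ValueError("expected exactly two bytes")
    word = int.from_bytes(data, "big")  # assumption: big-endian byte order
    if word >> 15:
        raise ValueError("top bit set: not a two-byte amount")
    exponent = (word >> 10) & 0x1F      # bits 14..10, 'abeee' in the table
    if exponent >> 3 == 0b11:           # ab == 11 selects the other rows
        raise ValueError("not the canonical two-byte form")
    significand = word & 0x3FF          # bits 9..0
    return significand * 10 ** exponent
```

For instance, the bytes `0x0C 0x05` carry exponent 3 and significand 5, which decodes to 5000.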
Four-byte/32-bit representation: up to 8 significant figures
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | Value | Notes |
1 | 1 | 1 | 1 | 1 | 1 | 0 | x | NaN | quiet NaN |
1 | 1 | 1 | 1 | 1 | 0 | x | x | Infinity | Or overflow/underflow. Deserialization error. |
1 | 0 | e | e | e | e | t | t | ((tt_2)*2^24 + L) * 10^(eeee_2) | 'L' is the unsigned big-endian value of the next 3 bytes. |
1 | 1 | 1 | a | b | e | e | t | ((10t_2)*2^24 + L) * 10^(abee_2) | Subject to ab != 11 (base 2). This is a non-canonical representation (the client never produces it). |
1 | 1 | 0 | e | e | e | e | t | ((10t_2)*2^24 + L) * 10^(eeee_2) | Canonical representation of values with large significands (>= 2^26 and less than 10^8). |
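The two canonical rows of the four-byte table (first byte `10eeeett` and `110eeeet`) can be decoded along the following lines. This is a sketch rather than the client's implementation: the function name is hypothetical, the NaN, infinity, and non-canonical rows are simply rejected, and range checks against 2^64 - 1 are omitted.

```python
def decode_four_byte_amount(data: bytes) -> int:
    """Decode the canonical 32-bit forms: significand * 10^exponent."""
    if len(data) != 4:
        raise ValueError("expected exactly four bytes")
    first = data[0]
    low = int.from_bytes(data[1:], "big")   # 'L' in the table
    if first >> 6 == 0b10:                  # 10eeeett
        exponent = (first >> 2) & 0x0F
        high = first & 0x03                 # tt
    elif first >> 5 == 0b110:               # 110eeeet
        exponent = (first >> 1) & 0x0F
        high = 0b100 | (first & 0x01)       # 10t
    else:
        raise ValueError("not a canonical four-byte amount")
    return ((high << 24) + low) * 10 ** exponent
```

With first byte `0x88` (exponent 2, `tt` = 00) and `L` = 12345, the result is 1,234,500.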
Eight-byte/64-bit representation: up to 2^56 - 1
Highest byte | Seven lower bytes | Value | Notes |
0x7E/0x7F | L | L | 'L' is a 7-byte unsigned big-endian value |
Nine-byte/72-bit representation: at least 2^56
Highest byte | Eight lower bytes | Value | Notes |
0xFE/0xFF | L | L | 'L' is an 8-byte unsigned big-endian value |
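The eight- and nine-byte forms are a marker byte followed by a plain big-endian integer, so a decoder is short. The sketch below uses a hypothetical function name and reads "0x7E/0x7F" and "0xFE/0xFF" as meaning that either marker value is accepted, which is an assumption about the tables above.

```python
def decode_wide_amount(data: bytes) -> int:
    """Decode the eight-byte (7-byte payload) or nine-byte (8-byte payload) form."""
    if not data:
        raise ValueError("empty input")
    marker = data[0]
    if marker in (0x7E, 0x7F):
        width = 7   # eight-byte form
    elif marker in (0xFE, 0xFF):
        width = 8   # nine-byte form
    else:
        raise ValueError("not an eight- or nine-byte amount")
    if len(data) != 1 + width:
        raise ValueError("wrong length for this marker")
    return int.from_bytes(data[1:], "big")
```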
Examples
Assets
Assets have a version, a contract hash and a subtype. The version and contract hash together form the contract id. The version may take any uint32 value, the contract hash any 32-byte array value, and the subtype any 32-byte array value.
As a special case, the Zen native token has version 0, contract hash with all bytes set to zero, and subtype with all bytes set to zero.
Asset serialization is optimized to represent the Zen native token compactly, and to keep assets with low version numbers or with subtypes that have many trailing zero bytes small. Because the subtype is under the control of the contract that generates the asset, contract writers can make their assets cheaper to serialize by giving them short subtypes.
Serialized assets use between one and 69 bytes. The first byte uses two bits for signalling and six to encode the version. Between zero and four more bytes encode the rest of the version, followed by either zero or 32 bytes to represent the contract hash, followed by between zero and 32 bytes to represent the subtype.
First byte
7 | 6 | Rest of byte |
Uncompressed representation? | Subtype present? | Version or upper part of version |
First byte (base 2) | Version | Contract Hash | Subtype | More bytes? | Notes |
00000000 | Zero | Zero | Zero | No | Zen native token |
10000000 | Zero | Non-zero | Zero | Yes, 32 more | Contract's default asset |
11000000 | Zero | Any | Non-zero | Yes, 64 more | Uncompressed subtype (32 bytes). Canonical iff the uncompressed subtype has 0 or 1 trailing zero bytes. |
01000000 | Zero | Any | Non-zero | Yes, between 34 and 63 more | Compressed subtype (< 32 bytes). Canonical iff the uncompressed subtype has at least two trailing zero bytes. |
xy0abcde | abcde (base 2) | As above | As above | Only if x != 0 or y != 0 | Represents versions between 0 and 31. |
xy1abcde | >=32 | As above | As above | Yes | Represents versions greater than or equal to 32. See below. |
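To make the first-byte layout concrete, the sketch below splits a first byte into its three fields: the two signalling bits, the flag for further version bytes, and the five version bits. The class, the function name, and the layout labels are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class FirstByte(NamedTuple):
    layout: str               # what follows the version bytes
    more_version_bytes: bool  # third most significant bit
    version_bits: int         # lower five bits


# Keyed by the two most significant bits of the first byte.
LAYOUTS = {
    0b00: "no contract hash, no subtype",
    0b10: "contract hash only",
    0b11: "contract hash + uncompressed subtype",
    0b01: "contract hash + compressed subtype",
}


def parse_first_byte(first: int) -> FirstByte:
    """Split the first byte of a serialized asset into its fields."""
    if not 0 <= first <= 0xFF:
        raise ValueError("expected a single byte value")
    return FirstByte(LAYOUTS[first >> 6], bool(first & 0x20), first & 0x1F)
```

A first byte of `10000000` parses as "contract hash only" with version 0, matching the row for a contract's default asset.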
Version bytes
If the version is between 0 and 31, inclusive, there are no additional version bytes. The third most significant bit of the first byte is set to 0, and the lower five bits represent the version.
Versions are serialized by an algorithm similar to that used for protocol buffers' varint type. Between one and five bytes are used to represent the 32-bit version, including the first byte described above.
Bit patterns
Version range | Big-endian uncompressed version | Bytes |
0 <= v < 32 | 000xxxxx | ??0xxxxx |
32 <= v < 2^12 | 0000xxxx xyyyyyyy | ??1xxxxx 0yyyyyyy |
2^12 <= v < 2^19 | 00000xxx xxyyyyyy yzzzzzzz | ??1xxxxx 1yyyyyyy 0zzzzzzz |
2^19 <= v < 2^26 | 000000xx xxxyyyyy yyzzzzzz zwwwwwww | ??1xxxxx 1yyyyyyy 1zzzzzzz 0wwwwwww |
2^26 <= v | xxxxyyyy yyyzzzzz zzwwwwww wvvvvvvv | ??10xxxx 1yyyyyyy 1zzzzzzz 1wwwwwww 1vvvvvvv |
The two most significant bits of the first byte, marked above as ??, signal the type of compression used for the contract hash and subtype.
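The bit patterns can be turned into a small encoder. The following is a sketch under stated assumptions, not the reference implementation: the function name and the `signal_bits` parameter are hypothetical; in the two-, three-, and four-byte forms the final byte is assumed to have its top bit clear so that a decoder knows where the version ends; and in the five-byte form every following byte has its top bit set, as the last table row shows.

```python
def encode_version(version: int, signal_bits: int = 0) -> bytes:
    """Encode a 32-bit version into one to five bytes."""
    if not 0 <= version < 2**32:
        raise ValueError("version must fit in 32 bits")
    if not 0 <= signal_bits < 4:
        raise ValueError("signal_bits is a two-bit value")
    top = signal_bits << 6                    # the '??' bits
    if version < 32:
        return bytes([top | version])         # ??0xxxxx
    if version < 2**12:
        groups = 1
    elif version < 2**19:
        groups = 2
    elif version < 2**26:
        groups = 3
    else:
        groups = 4
    out = [top | 0x20 | (version >> (7 * groups))]   # ??1xxxxx
    for i in range(groups - 1, -1, -1):
        group = (version >> (7 * i)) & 0x7F
        # Assumption: only the five-byte form sets the top bit on its last byte.
        continued = i > 0 or groups == 4
        out.append((0x80 if continued else 0x00) | group)
    return bytes(out)
```

With `signal_bits` left at zero, version 32 becomes the two bytes `0x20 0x20`, and any version of at least 2^26 takes five bytes.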
Remaining bytes
Zen native asset – contract hash and subtype both zero
There are no further bytes. The top two bits of the first byte are set to zero.
Default contract asset – subtype is zero
The first two bits of the first (version) byte are set to 10. After the version bytes, the next 32 bytes encode the contract hash.
Uncompressed subtype – subtype has at most one trailing zero byte
The first two bits of the first byte are set to 11. After the version bytes, the next 32 bytes encode the contract hash, and a further 32 bytes encode the subtype.
Compressed subtype – subtype has at least two trailing zero bytes
The first two bits of the first byte are set to 01. After the version bytes, the next 32 bytes encode the contract hash. One byte encodes the 'size' of the subtype, i.e., 32 minus the number of trailing zero bytes. A further size bytes encode the leading bytes of the subtype.
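The four cases above can be summarized in a helper that picks the signalling bits and builds the bytes that follow the version. This is a minimal sketch with hypothetical names; it covers only the part after the version bytes and chooses the canonical case for each input.

```python
from typing import Tuple

ZERO32 = bytes(32)


def encode_asset_tail(contract_hash: bytes, subtype: bytes) -> Tuple[int, bytes]:
    """Return (signalling bits, bytes that follow the version bytes)."""
    if len(contract_hash) != 32 or len(subtype) != 32:
        raise ValueError("contract hash and subtype are 32-byte arrays")
    if subtype == ZERO32:
        if contract_hash == ZERO32:
            return 0b00, b""                 # no further bytes
        return 0b10, contract_hash           # default contract asset
    size = len(subtype.rstrip(b"\x00"))      # 32 minus trailing zero bytes
    if size > 30:                            # zero or one trailing zero byte
        return 0b11, contract_hash + subtype
    # Compressed: contract hash, one size byte, then the leading subtype bytes.
    return 0b01, contract_hash + bytes([size]) + subtype[:size]
```

A subtype consisting of a single non-zero byte followed by 31 zero bytes yields a 34-byte tail, the smallest compressed case.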
Examples
Versions are written as integers (decimal). The contract hash and subtype are written as arrays of bytes, of length 32.