Serialization


Serialization of consensus-relevant data types

Amounts

Amounts may take any integral value between 0 and 2^64 - 1. Amount serialization takes into account the fact that assets are often manipulated in quantities that are multiples of powers of ten.

Amount serialization has the following properties:

  • Every value with 3 or fewer significant decimal figures has a two-byte representation.

  • Every value with 8 or fewer significant decimal figures has a four-byte representation.

  • Every value less than 2^56 has an eight-byte representation.

  • Every value up to 2^64 - 1 has a nine-byte representation.

This compares to an uncompressed uint64 occupying eight bytes.
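The size classes listed above can be sketched in Python as follows. The function name `shortest_guaranteed_length` is hypothetical, and the sketch assumes (without confirmation from the reference client) that the serializer always picks the shortest form the list guarantees.

```python
def shortest_guaranteed_length(value: int) -> int:
    """Return the shortest serialized length, in bytes, that the
    properties listed above guarantee for `value`.

    Assumption: significant figures are counted by dropping trailing
    decimal zeros, so 1500000 has two significant figures.
    """
    if not 0 <= value <= 2**64 - 1:
        raise ValueError("amount must fit in an unsigned 64-bit integer")

    # "0" strips to an empty string, which counts as zero figures.
    figures = len(str(value).rstrip("0"))

    if figures <= 3:
        return 2
    if figures <= 8:
        return 4
    if value < 2**56:
        return 8
    return 9
```

For instance, `shortest_guaranteed_length(1999)` returns 4, because 1999 has four significant figures.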

Two byte/16 bit representation: up to 3 significant figures

15 14 13 12 11 10 9  8  7  6  5  4  3  2  1  0  | Value | Notes
----------------------------------------------------------------------------------
0  1  1  1  1  1  0  x  x  x  x  x  x  x  x  x  | NaN | "quiet" NaN
0  1  1  1  1  0  x  x  x  x  x  x  x  x  x  x  | Infinity | Or overflow, underflow. Causes deserialization failure.
0  a  b  e  e  e  t  t  t  t  t  t  t  t  t  t  | tttttttttt_2 * 10^(abeee_2) | Subject to ab != 11 (base 2)
0  1  1  a  b  e  e  e  t  t  t  t  t  t  t  t  | 100tttttttt_2 * 10^(abeee_2) | Subject to ab != 11 (base 2). This is a non-canonical representation (the client will not make it).

Four byte/32 bit representation: up to 8 significant figures

31 30 29 28 27 26 25 24 | Value | Notes
----------------------------------------------------------------------------------
1  1  1  1  1  1  0  x  | NaN | quiet NaN
1  1  1  1  1  0  x  x  | Infinity | Or overflow/underflow. Deserialization error.
1  0  e  e  e  e  t  t  | ((tt_2)*2^24 + L) * 10^(eeee_2) | 'L' is the unsigned big-endian value of the next 3 bytes.
1  1  1  a  b  e  e  t  | ((10t_2)*2^24 + L) * 10^(abee_2) | Subject to ab != 11 (base 2). This is a non-canonical representation (the client will not make it).
1  1  0  e  e  e  e  t  | ((10t_2)*2^24 + L) * 10^(eeee_2) | Canonical representation of values with large significands (>= 2^26 and less than 10^8).

Eight byte/64 bit representation: up to 2^56 - 1

Highest byte | Seven lower bytes | Value | Notes
----------------------------------------------------------------------------------
0x7E/0x7F | L | L | 'L' is a 7-byte unsigned big-endian value

Nine byte/72 bit representation: at least 2^56

Highest byte | Eight lower bytes | Value | Notes
----------------------------------------------------------------------------------
0xFE/0xFF | L | L | 'L' is an 8-byte unsigned big-endian value

Examples

Amount | Serialization
----------------------------------------------
0 | 0x0000
1 | 0x0001
5 | 0x0005
10 | 0x0401
20 | 0x0402
100 | 0x0801
200 | 0x0802
1000 | 0x0C01
1001 | 0x800003E9
1999 | 0x800007CF
2000 | 0x0C02
1000000 | 0x1801
1000001 | 0x800F4241
1500000 | 0x140F
74230000 | 0x90001CFF
1000000000 | 0x2401
1000000001 | 0x7E0000003B9ACA01
1000000000000 | 0x3001
10760000000000000000UL | 0xB0A42F40
18446744073709551615UL | 0xFEFFFFFFFFFFFFFFFF
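
The bit layouts in the tables above are enough to write a decoder. The following Python sketch is one reading of those tables, not the reference implementation: the names `decode_amount` and `_take` are hypothetical, and rejecting results above 2^64 - 1 is an assumption based on the stated range of amounts.

```python
def _take(data: bytes, size: int) -> bytes:
    """Return the first `size` bytes or fail on truncated input."""
    if len(data) < size:
        raise ValueError(f"need {size} bytes, got {len(data)}")
    return data[:size]


def decode_amount(data: bytes) -> tuple[int, int]:
    """Decode one amount; return (value, bytes_consumed)."""
    first = _take(data, 1)[0]

    # Raw forms: a marker byte followed by a big-endian integer.
    if first in (0x7E, 0x7F):
        return int.from_bytes(_take(data, 8)[1:], "big"), 8
    if first in (0xFE, 0xFF):
        return int.from_bytes(_take(data, 9)[1:], "big"), 9

    if first & 0x80 == 0:
        # Two-byte form: bit 15 is clear.
        word = int.from_bytes(_take(data, 2), "big")
        if word >> 10 == 0b011110:
            raise ValueError("Infinity (overflow/underflow)")
        if word >> 9 == 0b0111110:
            raise ValueError("NaN")
        if (word >> 13) & 0b11 != 0b11:
            # 0 ab eee tttttttttt
            exponent = (word >> 10) & 0x1F
            significand = word & 0x3FF
        else:
            # 0 11 ab eee tttttttt (non-canonical, implicit 100 prefix)
            exponent = (word >> 8) & 0x1F
            significand = 0b100_0000_0000 | (word & 0xFF)
        size = 2
    else:
        # Four-byte form: bit 31 is set; L is the next three bytes.
        low = int.from_bytes(_take(data, 4)[1:], "big")
        if first >> 2 == 0b111110:
            raise ValueError("Infinity (overflow/underflow)")
        if first >> 1 == 0b1111110:
            raise ValueError("NaN")
        if first & 0x40 == 0:
            # 1 0 eeee tt
            exponent = (first >> 2) & 0xF
            top = first & 0b11
        else:
            # 1 1 0 eeee t  or  1 1 1 ab ee t: both use the 10t prefix
            exponent = (first >> 1) & 0xF
            top = 0b100 | (first & 1)
        significand = (top << 24) | low
        size = 4

    value = significand * 10 ** exponent
    if value > 2**64 - 1:
        # Assumption: out-of-range results are rejected.
        raise ValueError("amount exceeds 2^64 - 1")
    return value, size
```

For instance, `decode_amount(bytes.fromhex("0C02"))` returns `(2000, 2)`, matching the table of examples.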

Assets

Assets have a version, a contract hash and a subtype. The version and contract hash together form the contract id. The version may take any uint32 value, the contract hash any 32-byte array value, and the subtype any 32-byte array value.

As a special case, the Zen native token has version 0, contract hash with all bytes set to zero, and subtype with all bytes set to zero.

Asset serialization is optimized to represent the Zen native token efficiently, along with assets that have low version numbers or subtypes with many trailing zero bytes. Because the subtype is under the control of the contract generating the asset, contract writers can make their contracts' assets cheaper to serialize by giving them short subtypes.

Serialized assets use between one and 69 bytes (1 + 4 + 32 + 32 at most). The first byte uses two bits for signalling, and six to encode the version. Between zero and four more bytes encode the rest of the version, followed by either zero or 32 bytes to represent the contract hash, followed by between zero and 32 bytes to represent the subtype.

First byte

Bit 7 | Bit 6 | Rest of byte
----------------------------------------------------------------------------------
Uncompressed representation? | Subtype present? | Version or upper part of version

First byte (base 2) | Version | Contract Hash | Subtype | More bytes? | Notes
----------------------------------------------------------------------------------
00000000 | Zero | Zero | Zero | No | Zen native token
10000000 | Zero | Non-zero | Zero | Yes, 32 more | Contract's default asset
11000000 | Zero | Any | Non-zero | Yes, 64 more | Uncompressed subtype (32 bytes). Canonical iff the uncompressed subtype has 0 or 1 trailing zero bytes.
01000000 | Zero | Any | Non-zero | Yes, between 34 and 63 more | Compressed subtype (< 32 bytes). Canonical iff the uncompressed subtype has at least two trailing zero bytes.
xy0abcde | abcde (base 2) | As above | As above | Only if x <> 0 or y <> 0 | Represents versions between 0 and 31.
xy1abcde | >=32 | As above | As above | Yes | Represents versions greater than or equal to 32. See below.
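
To illustrate how the first byte is split, here is a small Python sketch that separates the two signalling bits, the "more version bytes" flag, and the five version bits. The names `FirstByte` and `parse_first_byte` and the string labels for the subtype forms are hypothetical, chosen only for this example.

```python
from typing import NamedTuple


class FirstByte(NamedTuple):
    has_contract_hash: bool   # 32 hash bytes follow the version bytes
    subtype_form: str         # "zero", "compressed" or "uncompressed"
    more_version_bytes: bool  # bit 5 set: the version is 32 or greater
    version_bits: int         # low five bits (the whole version if < 32)


# Top two bits (bit 7, bit 6) -> (hash follows?, subtype form)
_SIGNAL = {
    0b00: (False, "zero"),
    0b10: (True, "zero"),
    0b11: (True, "uncompressed"),
    0b01: (True, "compressed"),
}


def parse_first_byte(byte: int) -> FirstByte:
    """Split the first byte of a serialized asset into its fields."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte value")
    has_hash, subtype_form = _SIGNAL[byte >> 6]
    return FirstByte(has_hash, subtype_form, bool(byte & 0x20), byte & 0x1F)
```

For instance, `parse_first_byte(0x87)` reports a contract hash, a zero subtype, no further version bytes, and version bits equal to 7.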

Version bytes

If the version is between 0 and 31, inclusive, there are no additional version bytes. The third most significant bit of the first byte is set to 0, and the lower five bits represent the version.

Versions are serialized by an algorithm similar to that used for protocol buffers' varint type. Between one and five bytes are used to represent the 32-bit version, including the first byte described above.

Bit patterns

Version range | Big-endian uncompressed version | Bytes
----------------------------------------------------------------------------------
0 <= v < 32 | 000xxxxx | ??0xxxxx
32 <= v < 2^12 | 0000xxxx xyyyyyyy | ??1xxxxx 0yyyyyyy
2^12 <= v < 2^19 | 00000xxx xxyyyyyy yzzzzzzz | ??1xxxxx 1yyyyyyy 0zzzzzzz
2^19 <= v < 2^26 | 000000xx xxxyyyyy yyzzzzzz zwwwwwww | ??1xxxxx 1yyyyyyy 1zzzzzzz 1wwwwwww
2^26 <= v | xxxxyyyy yyyzzzzz zzwwwwww wvvvvvvv | ??10xxxx 1yyyyyyy 1zzzzzzz 1wwwwwww 1vvvvvvv

The two most significant bits of the first byte, marked above as ??, signal the type of compression used for the contract hash and subtype.
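
A decoder for the version bytes might look like the Python sketch below. It is a literal reading of the two- and three-byte bit patterns, not the reference implementation, and it has only been checked against the two-byte examples for versions 32 and 170. It assumes that the high bit of each byte after the first is a continuation flag that is clear on the final version byte, and that decoding stops after five bytes in any case. The name `decode_version` is hypothetical.

```python
def decode_version(data: bytes) -> tuple[int, int]:
    """Decode an asset version; return (version, bytes_consumed).

    The two signalling bits at the top of the first byte are ignored.
    """
    if not data:
        raise ValueError("empty input")

    first = data[0]
    version = first & 0x1F         # five version bits in the first byte
    more = bool(first & 0x20)      # bit 5: further version bytes follow
    used = 1

    # Assumption: at most five version bytes, so stop after the fifth
    # even if its high bit is set.
    while more and used < 5:
        if used >= len(data):
            raise ValueError("truncated version")
        byte = data[used]
        version = (version << 7) | (byte & 0x7F)  # seven bits per byte
        more = bool(byte & 0x80)                  # continuation flag
        used += 1

    if version > 2**32 - 1:
        raise ValueError("version does not fit in a uint32")
    return version, used
```

For instance, `decode_version(bytes.fromhex("A12A"))` returns `(170, 2)`.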

Remaining bytes

Zen native asset – contract hash and subtype both zero

There are no further bytes. The top two bits of the first byte are set to zero.

Default contract asset – subtype is zero

The first two bits of the first (version) byte are set to 10. After the version bytes, the next 32 bytes encode the contract hash.

Uncompressed subtype – subtype has at most one trailing zero byte

The first two bits of the first byte are set to 11. After the version bytes, the next 32 bytes encode the contract hash, and a further 32 bytes encode the subtype.

Compressed subtype – subtype has at least two trailing zero bytes

The first two bits of the first byte are set to 01. After the version bytes, the next 32 bytes encode the contract hash. One byte then encodes the 'size' of the subtype, i.e., 32 minus the number of trailing zero bytes. A further 'size' bytes encode the leading bytes of the subtype.
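
The four cases above can be sketched as a single Python encoder. This is an illustration under stated assumptions rather than the client's code: the name `encode_asset` is hypothetical, and the sketch is deliberately limited to versions 0 to 31, which fit in the first byte.

```python
ZERO32 = bytes(32)


def encode_asset(version: int, contract_hash: bytes, subtype: bytes) -> bytes:
    """Serialize an asset whose version fits in the first byte."""
    if not 0 <= version <= 31:
        # Assumption of this sketch: multi-byte versions are out of scope.
        raise ValueError("this sketch only covers versions 0 to 31")
    if len(contract_hash) != 32 or len(subtype) != 32:
        raise ValueError("contract hash and subtype must be 32 bytes each")

    if subtype == ZERO32:
        if contract_hash == ZERO32:
            # Signal bits 00: nothing follows the version.
            return bytes([version])
        # Signal bits 10: the contract hash follows.
        return bytes([0x80 | version]) + contract_hash

    # 'size' is 32 minus the number of trailing zero bytes.
    size = len(subtype.rstrip(b"\x00"))
    if size >= 31:
        # Signal bits 11: zero or one trailing zero byte, sent in full.
        return bytes([0xC0 | version]) + contract_hash + subtype
    # Signal bits 01: size byte, then the leading 'size' subtype bytes.
    return bytes([0x40 | version]) + contract_hash + bytes([size]) + subtype[:size]
```

Calling `encode_asset(0, bytes(32), bytes(32))` yields the single byte `0x00`, the Zen native token.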

Examples

Versions are written as integers (decimal). The contract hash and subtype are written as arrays of bytes, of length 32.

Version: 0u
Hash:0x0000000000000000000000000000000000000000000000000000000000000000
Subtype:0x0000000000000000000000000000000000000000000000000000000000000000
==> 0x00
Version: 0u
Hash:0x1BFA2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B9C
Subtype:0x0000000000000000000000000000000000000000000000000000000000000000
==> 0x801BFA2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B9C
Version: 0u
Hash:0x1BFA2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B9C
Subtype:0x0000000000000000000000000000000000000000000000000000000000000000
==> 0xC01BFA2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B9CA42B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2BCB2C
Version: 0u
Hash:0x1BFA2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B9C
Subtype:0x1B2A000000000000000000000000000000000000000000000000000000000000
==> 0x401BFA2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B9C021B2A
Version: 7u
Hash:0x1BFA2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B9C
Subtype:0x0000000000000000000000000000000000000000000000000000000000000000
==> 0x871BFA2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B9C
Version: 31u
Hash:0x1BFA2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B9C
Subtype:0x0000000000000000000000000000000000000000000000000000000000000000
==> 0x9F1BFA2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B9C
Version: 32u
Hash:0x1BFA2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B9C
Subtype:0x0000000000000000000000000000000000000000000000000000000000000000
==> 0xA0201BFA2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B9C
Version: 170u
Hash:0x1BFA2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B9C
Subtype:0x0000000000000000000000000000000000000000000000000000000000000000
==> 0xA12A1BFA2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B9C
Version: 4096u
Hash:0x1BFA2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B9C
Subtype:0x0000000000000000000000000000000000000000000000000000000000000000
==> 0xB4001BFA2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B9C
Version: 4096u
Hash:0x1BFA2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B9C
Subtype:0x1B2A000000000000000000000000000000000000000000000000000000000000
==> 0x74001BFA2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B2B9C021B2A