Serialize/Deserialize (SerDe)
A SerDe is a class that serializes the items stored in an items sketch to a binary bytes object and deserializes them back.
Several example SerDes are provided as references.
The use of binary-compatible SerDes in different languages is critical for cross-language compatibility.
Each implementation must extend the PyObjectSerDe class and override all three of its methods.
- class PyObjectSerDe(*args, **kwargs)
- An abstract base class for serde objects. All custom serdes must extend this class.
- get_size(self, item: object) -> int
- Returns the size in bytes of an item.
- Parameters:
- item (object) – The specified object 
- Returns:
- The size of the item in bytes 
- Return type:
- int 
 
- to_bytes(self, item: object) -> bytes
- Returns a bytes object with a serialized version of an item.
- Parameters:
- item (object) – The specified object 
- Returns:
- A bytes object with the serialized object
- Return type:
- bytes 
 
- from_bytes(self, data: bytes, offset: int) -> tuple
- Reads a bytes object starting from the given offset and returns a tuple of the reconstructed object and the number of bytes read.
- Parameters:
- data (bytes) – A bytes object from which to deserialize
- offset (int) – The offset, in bytes, at which to start reading 
 
- Returns:
- A tuple with the reconstructed object and the number of bytes read
- Return type:
- tuple(object, int) 
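
As an illustration of the three methods described above, here is a minimal sketch of a custom SerDe. It is self-contained, so the `PyObjectSerDe` class below is a stand-in that only mirrors the documented method signatures; with the library installed, the subclass would extend the library's own `PyObjectSerDe` instead. `PointSerDe` and its wire format (two 32-bit little-endian integers) are hypothetical and chosen only for the example.

```python
import struct
from abc import ABC, abstractmethod


class PyObjectSerDe(ABC):
    """Stand-in for the documented base class (assumption: NOT the real one)."""

    @abstractmethod
    def get_size(self, item: object) -> int: ...

    @abstractmethod
    def to_bytes(self, item: object) -> bytes: ...

    @abstractmethod
    def from_bytes(self, data: bytes, offset: int) -> tuple: ...


class PointSerDe(PyObjectSerDe):
    """Hypothetical serde for (x, y) integer pairs."""

    # Two signed 32-bit integers, little-endian: 8 bytes per item.
    _FORMAT = struct.Struct("<ii")

    def get_size(self, item: object) -> int:
        # Every item has the same fixed size in this format.
        return self._FORMAT.size

    def to_bytes(self, item: object) -> bytes:
        x, y = item
        return self._FORMAT.pack(x, y)

    def from_bytes(self, data: bytes, offset: int) -> tuple:
        # Return the reconstructed item and the number of bytes consumed.
        x, y = self._FORMAT.unpack_from(data, offset)
        return (x, y), self._FORMAT.size


serde = PointSerDe()
buffer = serde.to_bytes((3, -4)) + serde.to_bytes((7, 8))

# Walk the buffer item by item, advancing by the reported byte count.
first, used = serde.from_bytes(buffer, 0)
second, _ = serde.from_bytes(buffer, used)
```

Returning the byte count from `from_bytes` lets the caller advance the offset through a buffer that holds many items back to back, which is why the method returns a tuple rather than the item alone.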
 
 
The provided SerDes are:
- class PyStringsSerDe(*args, **kwargs)
- Bases: PyObjectSerDe
- Implements a simple string-encoding scheme where a string is written as <num_bytes> <string_contents>, with no null termination. This format allows pre-allocating each string, at the cost of additional storage. Using this format, the serialized string consumes 4 + len(item) bytes.
- class PyIntsSerDe(*args, **kwargs)
- Bases: PyObjectSerDe
- Implements an integer encoding scheme where each integer is written as a 32-bit (4 byte) little-endian value.
- class PyLongsSerDe(*args, **kwargs)
- Bases: PyObjectSerDe
- Implements an integer encoding scheme where each integer is written as a 64-bit (8 byte) little-endian value.
- class PyFloatsSerDe(*args, **kwargs)
- Bases: PyObjectSerDe
- Implements a floating point encoding scheme where each value is written as a 32-bit floating point value.
- class PyDoublesSerDe(*args, **kwargs)
- Bases: PyObjectSerDe
- Implements a floating point encoding scheme where each value is written as a 64-bit floating point value.
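
The byte layouts described for the provided SerDes can be reproduced with Python's standard `struct` module. The following is a minimal sketch, not the library's implementation. The helper names `encode_string` and `decode_string` are hypothetical. UTF-8 text, a little-endian length prefix, and little-endian floating point values are assumptions, since the descriptions above state the byte order only for integers.

```python
import struct


def encode_string(item: str) -> bytes:
    # Assumption: UTF-8 payload preceded by a 4-byte little-endian length.
    payload = item.encode("utf-8")
    return struct.pack("<i", len(payload)) + payload


def decode_string(data: bytes, offset: int = 0) -> tuple:
    # Read the length prefix, then exactly that many payload bytes.
    (num_bytes,) = struct.unpack_from("<i", data, offset)
    start = offset + 4
    text = data[start:start + num_bytes].decode("utf-8")
    return text, 4 + num_bytes


encoded = encode_string("sketch")
decoded, bytes_read = decode_string(encoded)

# Fixed-width numeric encodings.
int32_bytes = struct.pack("<i", 1)      # 32-bit little-endian integer
int64_bytes = struct.pack("<q", 1)      # 64-bit little-endian integer
float32_bytes = struct.pack("<f", 1.5)  # 32-bit float (byte order assumed)
float64_bytes = struct.pack("<d", 1.5)  # 64-bit float (byte order assumed)
```

For ASCII text the encoded length matches the `4 + len(item)` figure given above. With non-ASCII characters the UTF-8 payload is longer than `len(item)`, so the prefix in this sketch stores the byte count rather than the character count.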