class Cluster(BindableMixIn):
Known subclasses: pyzim.cluster.EmptyCluster, pyzim.cluster.ModifiableClusterWrapper, pyzim.cluster.OffsetRememberingCluster
Constructor: Cluster(zim, offset)
Implementation of a cluster in a ZIM file.
A cluster contains the blobs (i.e. the content) of content entries. Because these blobs are compressed together, higher compression rates are possible.
A cluster can be extended, which allows it to be larger than 4 GiB at the cost of a larger overhead.
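The benefit of compressing blobs together can be illustrated with a small sketch. It does not use pyzim; it relies on the standard library `zlib` module purely as a stand-in compressor, and the function name `compare_compression_sketch` and the sample blobs are made up for illustration.

```python
import zlib


def compare_compression_sketch(blobs):
    """Return (separate_total, together_total) compressed sizes in bytes."""
    # Compress every blob on its own and add up the individual results.
    separate_total = sum(len(zlib.compress(blob)) for blob in blobs)
    # Compress all blobs as one concatenated stream, similar to a cluster.
    together_total = len(zlib.compress(b"".join(blobs)))
    return separate_total, together_total


# Hypothetical sample data: many small, similar documents.
blobs = [
    f"<html><body><h1>Article {i}</h1></body></html>".encode("utf-8")
    for i in range(50)
]
separate_total, together_total = compare_compression_sketch(blobs)
print("separately:", separate_total, "bytes")
print("together:  ", together_total, "bytes")
```

For small, similar blobs the combined stream is typically much smaller, since the compressor can reuse patterns across blobs and pays its per-stream overhead only once.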
| Method | __init__ | The default constructor. |
| Method | generate | Generate the infobyte for this cluster. |
| Method | get | Get the size of a blob. |
| Method | get | Return the content size of this cluster. |
| Method | get | Return the number of blobs in this cluster. |
| Method | get | Return the number of offsets in this cluster. |
| Method | get | Return the offset with the specified index. |
| Method | get | Return the total compressed size of the cluster. |
| Method | get | Return the total decompressed size of this cluster. |
| Method | get | Return the total size of the offsets. |
| Method | iter | Read the blob offsets, yielding them as an iterator. |
| Method | iter | Iteratively read the specified blob. |
| Method | parse | Parse the cluster information byte, setting the attributes of this cluster as necessary. |
| Method | read | Read the entirety of the specified range in the specified blob and return the content. |
| Method | read | Read the cluster information byte, returning it. |
| Method | read | Read and parse the infobyte if this has not yet happened. |
| Method | reset | Reset all internal state except the cluster offset, causing said offset to be read again the next time it is required. |
| Instance Variable | compression | compression to use, None when unknown |
| Instance Variable | is | whether this cluster is extended, None if not set |
| Instance Variable | offset | absolute offset of the cluster |
| Property | did | True if the infobyte was already read and parsed. |
| Method | _get | Return a compressor suitable to compress this cluster. |
| Method | _get | Return a decompressing reader that can be used to decompress the content. |
| Method | _get | Return a decompressor suitable to decompress this cluster. |
| Method | _seek | Seek to the specified position (relative to the cluster start) in the file only if it is needed. |
| Instance Variable | _decompressing | Undocumented |
| Property | _pointer | The pointer format. |
Inherited from BindableMixIn:
| Method | bind | Bind this object to a ZIM file. |
| Method | unbind | Unbind this object. Can be called multiple times. |
| Property | bound | Whether this object is bound to a ZIM file or not. |
| Property | zim | The bound ZIM archive, if any is bound. Otherwise None. |
| Instance Variable | _zim | the bound ZIM archive or None |
Overrides pyzim.bindable.BindableMixIn.__init__
Overridden in pyzim.cluster.EmptyCluster, pyzim.cluster.ModifiableClusterWrapper, pyzim.cluster.OffsetRememberingCluster
The default constructor.
| Parameters | |
zim:pyzim.archive.Zim | if specified, bind to this ZIM archive immediately. |
offset:int or None | absolute offset of the cluster |
| Raises | |
ValueError | if offset was specified but zim was not specified. |
Overridden in pyzim.cluster.EmptyCluster, pyzim.cluster.ModifiableClusterWrapper
Get the size of a blob.
| Parameters | |
i:int | index of blob to get size for |
| Returns | |
int | the size of the uncompressed blob |
| Raises | |
pyzim.exceptions.BindRequired | if cluster is unbound |
pyzim.exceptions.BlobNotFound | if the specified blob does not exist |
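As a rough illustration of how a blob size can be derived, the sketch below computes it from two adjacent offsets. It is not pyzim's implementation: it assumes an already loaded offset list in which the final entry marks the end of the last blob, the name `blob_size_sketch` is hypothetical, and a plain `IndexError` stands in for `pyzim.exceptions.BlobNotFound`.

```python
def blob_size_sketch(offsets, i):
    """Return the uncompressed size of blob i, given the cluster offsets."""
    # Assumption: len(offsets) == number of blobs + 1, so blob i needs
    # both offsets[i] (its start) and offsets[i + 1] (its end).
    if i < 0 or i + 1 >= len(offsets):
        raise IndexError(f"blob {i} does not exist")
    return offsets[i + 1] - offsets[i]


# Hypothetical offsets describing two blobs of 5 and 3 bytes.
example_offsets = [12, 17, 20]
print(blob_size_sketch(example_offsets, 0))
print(blob_size_sketch(example_offsets, 1))
```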
Overridden in pyzim.cluster.EmptyCluster, pyzim.cluster.ModifiableClusterWrapper
Return the content size of this cluster.
This is the uncompressed size of the content of this cluster, not including the offsets and infobyte.
| Returns | |
int | the size of the content of this cluster |
| Raises | |
pyzim.exceptions.BindRequired | if cluster is unbound |
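Overridden in pyzim.cluster.EmptyCluster
Return the number of blobs in this cluster.
This value differs from the number of offsets in the cluster: in the ZIM format, the offset list contains one more entry than there are blobs, because the final offset marks the end of the last blob.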
| Returns | |
int | the number of blobs in this cluster |
| Raises | |
pyzim.exceptions.BindRequired | if cluster is unbound |
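Overridden in pyzim.cluster.EmptyCluster, pyzim.cluster.ModifiableClusterWrapper
Return the number of offsets in this cluster.
This value differs from the number of blobs in the cluster: in the ZIM format, the offset list contains one more entry than there are blobs, because the final offset marks the end of the last blob.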
| Returns | |
int | the number of offsets in this cluster. |
| Raises | |
pyzim.exceptions.BindRequired | if cluster is unbound |
Overridden in pyzim.cluster.EmptyCluster, pyzim.cluster.ModifiableClusterWrapper
Return the offset with the specified index.
| Parameters | |
i:int | index of blob to get offset for |
| Raises | |
IndexError | if i < 0 or i >= len(offsets) |
pyzim.exceptions.BindRequired | if cluster is unbound |
Overridden in pyzim.cluster.EmptyCluster
Return the total compressed size of the cluster.
This includes the entirety of the cluster, including the infobyte.
NOTE: this method is horribly inefficient, as it requires decompressing the entire cluster.
| Returns | |
int | the size of this cluster |
| Raises | |
pyzim.exceptions.BindRequired | if cluster is unbound |
Overridden in pyzim.cluster.EmptyCluster, pyzim.cluster.ModifiableClusterWrapper
Return the total decompressed size of this cluster.
This is the uncompressed size of the content of this cluster, including the offsets but not the infobyte.
| Returns | |
int | the size of the content of this cluster including offsets |
| Raises | |
pyzim.exceptions.BindRequired | if cluster is unbound |
Overridden in pyzim.cluster.EmptyCluster, pyzim.cluster.ModifiableClusterWrapper
Return the total size of the offsets.
| Returns | |
int | the total size of the offsets in bytes |
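The relationship between the size values described above can be sketched as follows. This is an illustration rather than pyzim code: it assumes 4-byte offsets for regular clusters and 8-byte offsets for extended ones (as in the ZIM format), an offset list whose last entry marks the end of the last blob, and the hypothetical helper name `size_summary_sketch`.

```python
def size_summary_sketch(offsets, is_extended=False):
    """Return the offsets size, content size and total decompressed size."""
    # Assumption: extended clusters use 8-byte offsets, others 4-byte ones.
    pointer_size = 8 if is_extended else 4
    offsets_size = len(offsets) * pointer_size
    # The content spans from the start of the first blob to the final offset.
    content_size = offsets[-1] - offsets[0] if offsets else 0
    return {
        "offsets": offsets_size,
        "content": content_size,
        # Offsets plus content, but not the one-byte infobyte.
        "total_decompressed": offsets_size + content_size,
    }


# Hypothetical offsets describing two blobs of 5 and 3 bytes.
summary = size_summary_sketch([12, 17, 20])
print(summary)
```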
Overridden in pyzim.cluster.EmptyCluster, pyzim.cluster.ModifiableClusterWrapper, pyzim.cluster.OffsetRememberingCluster
Read the blob offsets, yielding them as an iterator.
The order of blob_numbers does not matter; all offsets are always yielded in regular order (offset 1, offset 2, ...).
| Parameters | |
blob:None or list of int | if specified, load only these offsets |
| Yields | |
int | the offset of each blob in the decompressed body, relative to cluster start |
| Raises | |
pyzim.exceptions.BindRequired | if cluster is unbound |
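A minimal sketch of reading an offset table from an already decompressed cluster body is shown below. It is not the pyzim implementation. It assumes little-endian 4-byte offsets (8-byte when the cluster is extended), that offsets are measured from the start of the decompressed body, and that the first offset therefore reveals the length of the offset table. The names `iter_offsets_sketch` and `wanted` are hypothetical.

```python
import struct


def iter_offsets_sketch(body, is_extended=False, wanted=None):
    """Yield blob offsets from a decompressed cluster body, in table order."""
    fmt = "<Q" if is_extended else "<I"
    pointer_size = struct.calcsize(fmt)
    if len(body) < pointer_size:
        return
    # The first offset points just past the offset table, so dividing it
    # by the pointer size gives the number of offsets.
    (first,) = struct.unpack_from(fmt, body, 0)
    count = first // pointer_size
    selected = None if wanted is None else set(wanted)
    for index in range(count):
        if selected is not None and index not in selected:
            continue
        (offset,) = struct.unpack_from(fmt, body, index * pointer_size)
        yield offset


# Hypothetical body: three 4-byte offsets followed by two blobs.
example_body = struct.pack("<III", 12, 17, 20) + b"hello" + b"abc"
print(list(iter_offsets_sketch(example_body)))
print(list(iter_offsets_sketch(example_body, wanted=[2, 0])))
```

Because the loop always walks the table front to back, the order of the requested indices has no effect on the order of the yielded offsets.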
Overridden in pyzim.cluster.EmptyCluster, pyzim.cluster.InMemoryCluster, pyzim.cluster.ModifiableClusterWrapper
Iteratively read the specified blob.
The parameters 'start' and 'end' can be used to specify a range within the blob to read. In this case, both values are interpreted relative to the actual blob start. As with Python slices, the 'start' value is inclusive and the 'end' value exclusive. If start >= size of the blob, the return value will be b"". If the end lies outside the blob, reading stops at the end of the blob.
| Parameters | |
i:int | index of blob to read |
buffersize:int | number of bytes to read at once |
start:None or int | if specified, the offset relative to the start of the blob to start reading from |
end:None or int | if specified, the offset relative to the start of the blob to stop reading at |
| Yields | |
bytes | chunks of the blob content |
| Raises | |
pyzim.exceptions.BindRequired | if cluster is unbound |
pyzim.exceptions.BlobNotFound | if the blob index is out of range |
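The range and chunking behaviour described above can be sketched on an in-memory blob. This is an illustration of the documented semantics, not pyzim code: the function name `iter_blob_sketch` is hypothetical, the default buffer size is arbitrary, and non-negative `start` and `end` values are assumed.

```python
def iter_blob_sketch(blob, buffersize=4096, start=None, end=None):
    """Yield chunks of blob[start:end], at most buffersize bytes each."""
    if buffersize <= 0:
        raise ValueError("buffersize must be positive")
    size = len(blob)
    # 'start' is inclusive and defaults to the beginning of the blob.
    position = 0 if start is None else start
    # 'end' is exclusive and is clamped to the end of the blob.
    stop = size if end is None else min(end, size)
    while position < stop:
        chunk_end = min(position + buffersize, stop)
        yield blob[position:chunk_end]
        position = chunk_end


example_blob = b"abcdefghij"
print(list(iter_blob_sketch(example_blob, buffersize=4)))
print(list(iter_blob_sketch(example_blob, buffersize=3, start=2, end=7)))
```

Joining the yielded chunks with `b"".join(...)` gives the same bytes as reading the range in one call; a start at or beyond the blob size yields nothing, which joins to `b""`.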
Overridden in pyzim.cluster.ModifiableClusterWrapper
Parse the cluster information byte, setting the attributes of this cluster as necessary.
| Parameters | |
infobyte:bytes of length 1 | the cluster information byte |
| Raises | |
pyzim.exceptions.UnsupportedCompressionType | if the compression type is unknown. |
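To make the role of the information byte more concrete, here is a hedged sketch of decoding it. It follows the openZIM file format description as an assumption (low four bits hold the compression type, bit 0x10 marks an extended cluster) and is not taken from pyzim. The names `InfoByteSketch`, `COMPRESSION_NAMES` and `parse_infobyte_sketch` are hypothetical, and a plain `ValueError` stands in for `pyzim.exceptions.UnsupportedCompressionType`.

```python
from typing import NamedTuple

# Assumed compression type IDs, based on the openZIM format description.
COMPRESSION_NAMES = {
    0: "default (none)",
    1: "none",
    2: "zlib",
    3: "bzip2",
    4: "lzma2 (xz)",
    5: "zstd",
}


class InfoByteSketch(NamedTuple):
    compression: int
    is_extended: bool


def parse_infobyte_sketch(infobyte):
    """Decode a one-byte cluster information field."""
    if len(infobyte) != 1:
        raise ValueError("infobyte must be exactly one byte long")
    value = infobyte[0]
    compression = value & 0x0F         # low four bits: compression type
    is_extended = bool(value & 0x10)   # fifth bit: extended cluster flag
    if compression not in COMPRESSION_NAMES:
        raise ValueError(f"unsupported compression type: {compression}")
    return InfoByteSketch(compression, is_extended)


info = parse_infobyte_sketch(b"\x15")
print(COMPRESSION_NAMES[info.compression], info.is_extended)
```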
Overridden in pyzim.cluster.EmptyCluster, pyzim.cluster.InMemoryCluster, pyzim.cluster.ModifiableClusterWrapper
Read the entirety of the specified range in the specified blob and return the content.
The parameters 'start' and 'end' can be used to specify a range within the blob to read. In this case, both values are interpreted relative to the actual blob start. As with Python slices, the 'start' value is inclusive and the 'end' value exclusive. If start >= size of the blob, the return value will be b"". If the end lies outside the blob, reading stops at the end of the blob.
| Parameters | |
i:int | index of blob to read |
start:None or int | if specified, the offset relative to the start of the blob to start reading from |
end:None or int | if specified, the offset relative to the start of the blob to stop reading at |
| Returns | |
bytes | the content of the blob |
| Raises | |
pyzim.exceptions.BindRequired | if cluster is unbound |
pyzim.exceptions.BlobNotFound | if the blob index is out of range |
Overridden in pyzim.cluster.EmptyCluster, pyzim.cluster.ModifiableClusterWrapper
Read the cluster information byte, returning it.
| Returns | |
bytes | the byte containing cluster information |
| Raises | |
pyzim.exceptions.BindRequired | if cluster is unbound |
Overridden in pyzim.cluster.EmptyCluster, pyzim.cluster.ModifiableClusterWrapper, pyzim.cluster.OffsetRememberingCluster
Reset all internal state except the cluster offset, causing said offset to be read again the next time it is required.
Overridden in pyzim.cluster.EmptyCluster, pyzim.cluster.ModifiableClusterWrapper
compression to use, None when unknown
Overridden in pyzim.cluster.EmptyCluster, pyzim.cluster.ModifiableClusterWrapper
whether this cluster is extended, None if not set
Overridden in pyzim.cluster.ModifiableClusterWrapper
True if the infobyte was already read and parsed.
Return a compressor suitable to compress this cluster.
| Returns | |
a compressor-like object. See pyzim.compression.BaseCompressionInterface for more info. | a compressor suitable to compress this cluster. |
Return a decompressing reader that can be used to decompress the content.
If offset is specified, the decompressor will already have read up to that offset. This may reuse the decompressor, depending on the implementation and the offset.
| Parameters | |
offset:int | offset, relative to the start of the compressed data (cluster start + 1) |
| Raises | |
pyzim.exceptions.BindRequired | if cluster is unbound |
Return a decompressor suitable to decompress this cluster.
| Returns | |
a decompressor-like object. See pyzim.compression.BaseCompressionInterface for more info. | a decompressor suitable to decompress this cluster |
Seek to the specified position (relative to the cluster start) in the file only if it is needed.
Needs to be bound.
| Parameters | |
f:file-like object | file to seek |
offset:int | offset to seek, relative to the start of the cluster |