class documentation

A ZIM archive.

This object can be used to read, write and/or modify a ZIM file.

NOTE on modifying ZIM archives: to ensure optimal compression, some modifications are not written immediately. As a consequence, reading an entry right after modifying it may not yet reflect the change. You can force-write all outstanding changes by calling pyzim.archive.Zim.flush. This is also done automatically when the ZIM is closed.

Class Method open Open the Zim archive at the specified path.
Method __enter__ Called upon entering a with-statement. Provides self as object for the context.
Method __exit__ Called upon exiting a with-statement. Closes self.
Method __init__ The default constructor for opening a ZIM file.
Method acquire_file A context manager that locks the file access and provides the wrapped file object for the context.
Method add_full_url_redirect Add a redirect from the source (full) url to the target (full) url.
Method add_item Add an item to this archive.
Method add_redirect Add a redirect from the source (non-full) url to the target (non-full) url.
Method calculate_checksum Calculate the checksum of this ZIM file and return it.
Method close Close the ZIM file. Can be safely called multiple times.
Method entry_at_url_is_article Check if the entry at the specified full url is an article.
Method flush Write all changes to disk.
Method get_checksum Read the checksum of this ZIM file and return it.
Method get_cluster_at Return the cluster at the specified location (offset) in the ZIM file.
Method get_cluster_by_index Return the cluster for the specified index.
Method get_cluster_index_by_offset Return the cluster index for the cluster at the specified offset.
Method get_content_entry_by_url Return the entry at the specified (non-full) URL in the "C" namespace.
Method get_disk_size Calculate the size of this object when written to a file.
Method get_entry_at Return the entry at the specified location (offset) in the ZIM file.
Method get_entry_by_full_url Return the entry at the specified full URL.
Method get_entry_by_url Return the entry at the specified (non-full) URL.
Method get_entry_by_url_index Return the entry at the specified index in the URL pointer list.
Method get_mainpage_entry Return the entry for the mainpage.
Method get_metadata Read a metadata entry, returning its value.
Method get_metadata_dict Return a dict containing all metadata of this ZIM.
Method get_metadata_keys Read all metadata keys, returning them as a list.
Method get_mimetype_by_index Return the mimetype with the specified index.
Method get_mimetype_of_entry Return the mimetype of the specified entry.
Method get_search Return an object that can be used to search this ZIM.
Method has_entry_for_full_url Return True if this ZIM file contains an entry for the specified full URL.
Method install_processor Install a processor on this archive.
Method iter_articles Iterate over all article entries in this ZIM.
Method iter_clusters Iterate over all clusters in this ZIM.
Method iter_entries Iterate over all entries in this ZIM.
Method iter_entries_by_url Iterate over all entries in this ZIM, ordered by full URL.
Method iter_mimetypes Iterate over all mimetypes in this archive.
Method new_cluster Add a new cluster to this archive.
Method remove_cluster_by_index Remove the cluster with the specified index.
Method remove_entry_by_full_url Remove the entry at the specified url.
Method set_mainpage_url Set the mainpage url.
Method set_metadata Set metadata of the ZIM archive.
Method update_checksum Calculate and write the checksum.
Method write_cluster Update an existing cluster in this zim.
Method write_entry Write an entry to this archive.
Instance Variable cluster_cache internal cache for clusters, mapping the full location to each cluster
Instance Variable compression_strategy compression strategy for assigning new items to clusters
Instance Variable entry_cache internal cache for entries, mapping the full location to each entry
Instance Variable filelock a lock to ensure file access works with multiple threads. Acquire it whenever any work is done on the file.
Instance Variable header header of this ZIM file.
Instance Variable mimetypelist the mimetype list
Instance Variable mutable Undocumented
Instance Variable policy policy to use
Instance Variable spaceallocator an object responsible for managing storage space within the ZIM file, may be None if ZIM is read-only
Instance Variable uncompressed_compression_strategy compression strategy for assigning new items to clusters that are explicitly uncompressed
Property closed Return True if this archive has already been closed, False otherwise.
Property counter Return the counter used for counting mimetype occurrences.
Method _check_closed Check to ensure this ZIM file has not already been closed.
Method _get_full_url_for_entry_at Return the full URL for the entry at the specified location.
Method _get_namespace_title_for_entry_by_url_index Return the namespace+title for the entry at the specified index in the URL pointer list.
Method _get_title_for_entry_by_url_index Return the title for the entry at the specified index in the URL pointer list.
Method _init_caches Initializes internal caches according to policy.
Method _init_new Initialize as a new, empty archive.
Method _load_header Read the header.
Method _load_mimetypelist Load the mimetypelist.
Method _load_pointerlists Load the URL and title pointer lists.
Method _new_cluster_num Return the number of the next new cluster.
Method _on_cluster_cache_leave Called when a cluster leaves the cache.
Method _on_entry_cache_leave Called when an entry leaves the cache.
Method _update_url_pointers Update references to URL pointers.
Instance Variable _article_title_pointer_list a pointerlist to article entries ordered by title
Instance Variable _base_offset base offset of ZIM archive within the underlying file object
Instance Variable _closed a flag indicating whether this archive has already been closed
Instance Variable _cluster_num next cluster number to assign
Instance Variable _cluster_pointer_list a pointer list to the individual clusters
Instance Variable _counter the counter counting mimetype occurrences
Instance Variable _entry_title_pointer_list a pointerlist to entries ordered by title
Instance Variable _f the underlying file object
Instance Variable _mode the mode this archive has been opened in
Instance Variable _operation_buffer Undocumented
Instance Variable _operationbuffer buffer for not-yet-completable operations
Instance Variable _processors list of processors that have been installed on this zim
Instance Variable _url_pointer_list a pointer list to entries ordered by URL
Instance Variable _writable a flag indicating whether this archive can be written to.

Inherited from ModifiableMixIn:

Method add_submodifiable Add another modifiable object as a child of this one.
Method after_flush_or_read This method should be called after this object has been read and/or flushed to disk. In other words, it should be called at least once whenever this object matches the state of the object on the disk.
Method dirty.setter Setter for ModifiableMixIn.dirty
Method ensure_mutable If this object is non-mutable, raise an Exception.
Method get_initial_disk_size Return the size of this object on disk as it has been read.
Method get_unmodified_disk_size Return the size of this object when written to a file, before any modifications have been made since the last read/flush.
Method mark_dirty Convenience function to mark this object as dirty.
Method remove_submodifiable Remove a submodifiable from this object.
Instance Variable dirty True if this object or a sub-modifiable has been modified.
Instance Variable _dirty a boolean flag that's nonzero if this object has been modified
Instance Variable _old_disk_size the size of this object on disk before any modifications since the last flush/read
Instance Variable _submodifiables a list of child objects whose dirty state will affect this object's dirty state.
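
The dirty-state members listed above suggest that an object counts as dirty when it, or any of its children, has been modified. The following is a minimal sketch of that idea only. ToyModifiable is a hypothetical class written for illustration and is not the pyzim mix-in, whose real implementation may differ.

```python
class ToyModifiable:
    """Hypothetical miniature of the dirty-flag idea; not the pyzim mix-in."""

    def __init__(self):
        self._dirty = False
        self._submodifiables = []

    def add_submodifiable(self, child):
        self._submodifiables.append(child)

    def mark_dirty(self):
        self._dirty = True

    @property
    def dirty(self):
        # Dirty if this object or any child (recursively) was modified.
        return self._dirty or any(c.dirty for c in self._submodifiables)

    def after_flush_or_read(self):
        # In-memory state matches the disk again: clear the own flag.
        self._dirty = False


archive = ToyModifiable()
header = ToyModifiable()
archive.add_submodifiable(header)

clean_at_start = archive.dirty
header.mark_dirty()
dirty_via_child = archive.dirty
header.after_flush_or_read()
clean_after_flush = archive.dirty
```

In this sketch the parent never stores the child's state; it asks the children each time, so a child that has been flushed stops affecting the parent immediately.
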
def open(cls, path, mode='r', offset=0, policy=DEFAULT_POLICY):

Open the Zim archive at the specified path.

In addition to the modes listed in the documentation of pyzim.archive.Zim.__init__, the mode "x" is also supported. It behaves like mode "w", but raises an exception if the file already exists.

Parameters
path (str): path to open
mode (str): mode of the Zim archive; see pyzim.archive.Zim.__init__ for the supported modes, plus "x"
offset (int): offset of the ZIM archive within the file
policy (pyzim.policy.Policy): policy to use, defaults to pyzim.policy.DEFAULT_POLICY
Returns
pyzim.archive.Zim: the Zim archive opened from the file
Raises
FileExistsError: if mode == "x" and path already exists
ValueError: on invalid mode
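
The difference between modes "w" and "x" can be illustrated with Python's built-in open(), which offers the same exclusive-creation semantics. This is a standard-library sketch only and does not call pyzim. The helper name create_exclusive and the file name are made up for the example.

```python
import os
import tempfile


def create_exclusive(path):
    """Create a new, empty file; fail if the path already exists."""
    # Built-in mode "x" raises FileExistsError for an existing path,
    # which mirrors the behavior described for mode "x" above.
    with open(path, "xb"):
        pass


with tempfile.TemporaryDirectory() as tmpdir:
    target = os.path.join(tmpdir, "example.zim")
    create_exclusive(target)          # first call creates the file
    try:
        create_exclusive(target)      # second call must fail
        second_call_failed = False
    except FileExistsError:
        second_call_failed = True
```
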
def __enter__(self):

Called upon entering a with-statement. Provides self as object for the context.

def __exit__(self, exc_type, exc_value, exc_traceback):

Called upon exiting a with-statement. Closes self.
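
The with-statement protocol described here, together with a close that is safe to call repeatedly, can be sketched as follows. ToyArchive is a hypothetical stand-in and not the pyzim class. It only shows the shape of the pattern.

```python
class ToyArchive:
    """Hypothetical stand-in for an archive; not the pyzim class."""

    def __init__(self):
        self.closed = False
        self.effective_closes = 0

    def close(self):
        # Safe to call repeatedly: only the first call does any work.
        if self.closed:
            return
        self.closed = True
        self.effective_closes += 1

    def __enter__(self):
        # Provide the archive itself as the context object.
        return self

    def __exit__(self, exc_type, exc_value, exc_traceback):
        # Always close, and do not suppress exceptions from the block.
        self.close()
        return False


with ToyArchive() as archive:
    inside_closed = archive.closed

archive.close()  # a second close is harmless
```
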

def __init__(self, f, offset=0, mode='r', policy=DEFAULT_POLICY):

The default constructor for opening a ZIM file.

Multiple modes are supported:

  • "r": read-only
  • "w": create a new file for writing, truncating the old file
  • "u"/"a": modify the existing file
Parameters
f (file-like object): file-like object to read from (NOTE: must support reading)
offset (int): offset of the ZIM archive within the file
mode (str): mode in which to open the ZIM file (e.g. "r")
policy (pyzim.policy.Policy): policy to use, defaults to pyzim.policy.DEFAULT_POLICY
Raises
ValueError: on invalid value for a parameter
TypeError: on invalid type for a value
def acquire_file(self):

A context manager that locks the file access and provides the wrapped file object for the context.

Raises
pyzim.exceptions.ZimFileClosed: when the ZIM file is already closed
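
A lock-guarded context manager of this kind could look roughly like the sketch below. It assumes a plain threading.Lock and an in-memory file. LockedFile and ArchiveClosedError are hypothetical names used for illustration, not pyzim classes.

```python
import io
import threading
from contextlib import contextmanager


class ArchiveClosedError(Exception):
    """Hypothetical stand-in for pyzim.exceptions.ZimFileClosed."""


class LockedFile:
    """Toy wrapper that hands out its file object only under a lock."""

    def __init__(self, fileobj):
        self._f = fileobj
        self.filelock = threading.Lock()
        self.closed = False

    @contextmanager
    def acquire_file(self):
        if self.closed:
            raise ArchiveClosedError("archive is already closed")
        # Hold the lock for the whole block so seek + read stay atomic.
        with self.filelock:
            yield self._f


wrapper = LockedFile(io.BytesIO(b"ZIM payload"))
with wrapper.acquire_file() as f:
    f.seek(4)
    data = f.read(7)
    locked_inside = wrapper.filelock.locked()
```

Keeping the seek and the read inside one locked block is the point of the pattern: another thread cannot move the file position in between.
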
def add_full_url_redirect(self, source, target, title=None):

Add a redirect from the source (full) url to the target (full) url.

This method uses full urls. You'll likely want to use pyzim.archive.Zim.add_redirect if you want to work with non-full urls in the "C" namespace.

Be warned that a redirect that can not be resolved will be buffered. This will not only result in an increased memory usage, but may also cause an exception to be raised later on if the url redirect can not be resolved during the next flush.

Parameters
source (str): full url to redirect from
target (str): full url to redirect to
title (str or None): title for the redirect, defaulting to the target entry title
Raises
TypeError: on type error
ValueError: on invalid value
pyzim.exceptions.ZimFileClosed: if archive is already closed
pyzim.exceptions.NonMutable: if this zim file is not mutable
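
The buffering behavior described above (unresolvable redirects are kept in memory and retried on flush, failing if they still cannot be resolved) can be modeled with a toy class. RedirectBuffer and UnresolvedRedirect are hypothetical names. The sketch makes no claim about how pyzim stores or retries buffered operations internally.

```python
class UnresolvedRedirect(Exception):
    """Hypothetical error for a redirect whose target never appeared."""


class RedirectBuffer:
    """Toy model: redirects wait in a buffer until their target exists."""

    def __init__(self):
        self.entries = set()     # full URLs that already exist
        self.redirects = {}      # resolved redirects: source -> target
        self.pending = []        # buffered (source, target) pairs

    def add_entry(self, full_url):
        self.entries.add(full_url)

    def add_redirect(self, source, target):
        if target in self.entries:
            self.redirects[source] = target
            self.entries.add(source)
        else:
            # Target unknown so far: keep the operation in memory.
            self.pending.append((source, target))

    def flush(self):
        # Retry buffered redirects once; fail if a target is still missing.
        still_pending = []
        for source, target in self.pending:
            if target in self.entries:
                self.redirects[source] = target
                self.entries.add(source)
            else:
                still_pending.append((source, target))
        self.pending = still_pending
        if still_pending:
            raise UnresolvedRedirect(still_pending[0][1])


buf = RedirectBuffer()
buf.add_redirect("Cold", "Cnew")   # target missing: buffered
buffered = len(buf.pending)
buf.add_entry("Cnew")
buf.flush()                        # target exists now: resolved
```
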
def add_item(self, item, force_uncompressed=False):

Add an item to this archive.

The write may not happen immediately.

Parameters
item (pyzim.item.Item): item to write
force_uncompressed (bool): if nonzero, add the item to the compression strategy for uncompressed content, regardless of other options
Raises
TypeError: on type error
pyzim.exceptions.ZimFileClosed: if archive is already closed
pyzim.exceptions.NonMutable: if this zim file is not mutable
def add_redirect(self, source, target, title=None):

Add a redirect from the source (non-full) url to the target (non-full) url.

This method uses non-full urls and operates in the "C" namespace. Use pyzim.archive.Zim.add_full_url_redirect to work with full urls.

Parameters
source (str): non-full url to redirect from
target (str): non-full url to redirect to
title (str or None): title for the redirect, defaulting to the target entry title
Raises
TypeError: on type error
ValueError: on invalid value
pyzim.exceptions.EntryNotFound: if target url does not yet exist
pyzim.exceptions.ZimFileClosed: if archive is already closed
pyzim.exceptions.NonMutable: if this zim file is not mutable
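
The relation between full and non-full urls can be sketched as below. The sketch assumes that a full url is the one-character namespace directly followed by the non-full url. This layout is an assumption made for illustration and should be checked against the library. The helper names to_full_url and split_full_url are hypothetical.

```python
def to_full_url(namespace, url):
    """Join a one-character namespace and a non-full URL (assumed layout)."""
    if len(namespace) != 1:
        raise ValueError("namespace must be exactly one character")
    return namespace + url


def split_full_url(full_url):
    """Split an assumed '<namespace><url>' string into its two parts."""
    if not full_url:
        raise ValueError("full URL must not be empty")
    return full_url[0], full_url[1:]


full = to_full_url("C", "index.html")
namespace, url = split_full_url(full)
```
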
def calculate_checksum(self):

Calculate the checksum of this ZIM file and return it.

NOTE: this reads the entire ZIM file and calculates the checksum from its content. If you want to read the checksum listed in the ZIM file, use pyzim.archive.Zim.get_checksum instead.

Returns
bytes: the calculated (md5) checksum of this ZIM
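
Calculating such a checksum can be sketched with hashlib. The sketch assumes, following the openZIM file format as I understand it, that the MD5 digest is stored in the last 16 bytes of the archive and covers everything before it. It also ignores any base offset. The function name md5_without_trailer is hypothetical and this is not pyzim's implementation.

```python
import hashlib
import io


def md5_without_trailer(fileobj, chunk_size=8192):
    """Return the MD5 digest of all bytes except the final 16."""
    fileobj.seek(0, io.SEEK_END)
    remaining = fileobj.tell() - 16   # assumes at least 16 bytes of data
    fileobj.seek(0)
    digest = hashlib.md5()
    while remaining > 0:
        chunk = fileobj.read(min(chunk_size, remaining))
        if not chunk:
            break
        digest.update(chunk)
        remaining -= len(chunk)
    return digest.digest()


body = b"toy archive body" * 100
stored = hashlib.md5(body).digest()          # what a writer would append
archive = io.BytesIO(body + stored)
calculated = md5_without_trailer(archive)
```

Comparing the calculated value with the stored one is the usual way to detect a damaged file, which is why the two methods are kept separate.
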
def close(self):

Close the ZIM file. Can be safely called multiple times.

def entry_at_url_is_article(self, full_url):

Check if the entry at the specified full url is an article.

Articles are always in the "C" namespace, thus the full url must start with a C.

This method returns False if the entry does not exist at all.

Parameters
full_url (str): full url of entry to check
Returns
bool: whether the entry is an article or not
Raises
TypeError: on type error
ValueError: on value error
pyzim.exceptions.ZimFileClosed: if archive is already closed
def flush(self):

Write all changes to disk.

Raises
pyzim.exceptions.ZimFileClosed: when the ZIM file is already closed
pyzim.exceptions.NonMutable: if this ZIM file is set to be non-mutable
def get_checksum(self):

Read the checksum of this ZIM file and return it.

NOTE: this reads the checksum stored in the ZIM file; it does not calculate the actual checksum of the file. If you want to calculate the checksum of the ZIM, use pyzim.archive.Zim.calculate_checksum instead.

Returns
bytes: the (md5) checksum of this ZIM
def get_cluster_at(self, location):

Return the cluster at the specified location (offset) in the ZIM file.

If caching is configured, a previously created instance of the cluster may be returned. This cluster may already be modified and/or bound.

Parameters
location (int): location/offset of the cluster in the ZIM file
Returns
pyzim.cluster.Cluster: the cluster at the specified location
def get_cluster_by_index(self, i):

Return the cluster for the specified index.

Parameters
i (int): index of cluster to get
def get_cluster_index_by_offset(self, offset):

Return the cluster index for the cluster at the specified offset.

Note that the offset must exactly match the offset of the cluster. This is not the full offset (the base offset must be subtracted manually).

This method is mostly used as a helper by clusters to determine their own index.

Returns
int: the index of the cluster at the offset in the cluster pointer list
Raises
KeyError: if the offset does not refer to a cluster
def get_content_entry_by_url(self, url):

Return the entry at the specified (non-full) URL in the "C" namespace.

NOTE: "content" refers to an entry in the "C" namespace. This function may still return any type of pyzim.entry.BaseEntry and is NOT restricted to pyzim.entry.ContentEntry.

Parameters
url (str): url of entry to get
Returns
pyzim.entry.BaseEntry: the entry at the specified url
Raises
pyzim.exceptions.EntryNotFound: when no entry matches the specified URL
def get_disk_size(self):

Calculate the size of this object when written to a file.

NOTE: in this context, size refers to the direct size of the object. If this object contains references to other objects, their sizes will not be included. For example, a pyzim.entry.ContentEntry also links to a blob, but this function will only return the size of the entry itself, excluding the referenced blob.

Returns
int: the size, in bytes
def get_entry_at(self, location, bind=True, allow_cache_replacement=True):

Return the entry at the specified location (offset) in the ZIM file.

If caching is configured, an instance of a previous entry may be returned. This entry may already be modified and/or bound (even if bind=False).

Parameters
location (int): location/offset of the entry in the ZIM file
bind (bool): if nonzero (default), bind this entry
allow_cache_replacement (bool): if nonzero (default), allow cached entries to be replaced
Returns
pyzim.entry.BaseEntry: the entry at the specified location
def get_entry_by_full_url(self, full_url):

Return the entry at the specified full URL.

Parameters
full_url (str): full URL of entry to get
Returns
pyzim.entry.BaseEntry: the entry at the specified URL
Raises
pyzim.exceptions.EntryNotFound: when no entry matches the specified URL
def get_entry_by_url(self, namespace, url):

Return the entry at the specified (non-full) URL.

Parameters
namespace (str of length 1): namespace of entry to get
url (str): url of entry to get
Returns
pyzim.entry.BaseEntry: the entry at the specified url
Raises
pyzim.exceptions.EntryNotFound: when no entry matches the specified URL
def get_entry_by_url_index(self, i, allow_cache_replacement=True):

Return the entry at the specified index in the URL pointer list.

Parameters
i (int): index of entry in URL pointer list
allow_cache_replacement (bool): if nonzero (default), allow cached entries to be replaced
Returns
pyzim.entry.BaseEntry: the entry at the specified location
Raises
pyzim.exceptions.EntryNotFound: when no entry matching the index was found
def get_mainpage_entry(self):

Return the entry for the mainpage.

Returns
pyzim.entry.BaseEntry: the entry for the mainpage
Raises
pyzim.exceptions.EntryNotFound: when no mainpage exists
def get_metadata(self, key, as_unicode=True):

Read a metadata entry, returning its value.

See https://wiki.openzim.org/wiki/Metadata for metadata keys and values.

By default, this method returns unicode. You can set as_unicode=False to prevent this. If the key is not found, return None.

Parameters
key (str): key/URL of metadata
as_unicode (bool): whether to decode value or not
Returns
str or bytes (or None if not found): the metadata value
Raises
pyzim.exceptions.ZimFileClosed: if archive is already closed
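
The return behavior described above (decoded text by default, raw bytes with as_unicode=False, None for a missing key) can be sketched with a plain dict. The function read_metadata and the dict-backed store are hypothetical. UTF-8 is an assumption, since the text above only says "unicode".

```python
def read_metadata(store, key, as_unicode=True):
    """Look up a metadata value in a toy dict-backed store."""
    raw = store.get(key)
    if raw is None:
        return None               # key not found
    # UTF-8 is an assumption here; the source only says "unicode".
    return raw.decode("utf-8") if as_unicode else raw


toy_store = {"Title": "Caf\u00e9 wiki".encode("utf-8")}
title_text = read_metadata(toy_store, "Title")
title_raw = read_metadata(toy_store, "Title", as_unicode=False)
missing = read_metadata(toy_store, "Creator")
```
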
def get_metadata_dict(self, as_unicode=True):

Return a dict containing all metadata of this ZIM.

NOTE: values of certain metadata keys won't be decoded. This prevents the decoding of binary content of images.

Parameters
as_unicode (bool): whether to decode strings or not
Returns
dict of str or bytes -> bytes or str: a dict containing the metadata
def get_metadata_keys(self, as_unicode=True):

Read all metadata keys, returning them as a list.

By default, this method returns unicode. You can set as_unicode=False to prevent this.

Parameters
as_unicode (bool): whether to decode the keys or not
Returns
list of str or bytes: the metadata keys
def get_mimetype_by_index(self, i):

Return the mimetype with the specified index.

Parameters
i (int): index of mimetype to get
Returns
str: the mimetype with the specified index
Raises
IndexError: when the index is invalid
def get_mimetype_of_entry(self, entry):

Return the mimetype of the specified entry.

If the entry is a redirect, this will be pyzim.constants.MIMETYPE_REDIRECT.

Parameters
entry (pyzim.entry.BaseEntry): entry to get mimetype for
Returns
str: the mimetype of this entry
def get_search(self):

Return an object that can be used to search this ZIM.

There are various ways to search a ZIM, for which pyzim tries to provide a unified interface. This method will return any available search. Said search may, however, be more limited than other search implementations. It is therefore recommended not to use this method and to manually instantiate one of the child classes of pyzim.search.BaseSearch instead. Use this method only if you don't care about which search you get.

Currently, this method will try to provide you with a xapian fulltext search, falling back to a xapian title search and finally to a simple titlestart based search.

Returns
pyzim.search.BaseSearch: a search object that can be used to search this ZIM
def has_entry_for_full_url(self, full_url):

Return True if this ZIM file contains an entry for the specified full URL.

Parameters
full_url (str): full URL of entry to check existence of
Returns
bool: True if an entry for this full URL exists. It may be a redirect.
def install_processor(self, processor):

Install a processor on this archive.

See pyzim.processor for more details.

Parameters
processor (bool): processor to install
Raises
TypeError: on type error
def iter_articles(self, start=None, end=None):

Iterate over all article entries in this ZIM.

If start and end are specified, they reference the indexes of the first (inclusive) and last (exclusive) entry to return. In other words, this behavior matches the l[start:end] syntax.

This function does not guarantee any specific order of the entries yielded by this function, however it currently *should* be ordered by title.

Parameters
start (int): index of first entry to return (inclusive)
end (int): index of last entry to return (exclusive)
Yields
pyzim.entry.BaseEntry: the entries in the specified range
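
The start/end semantics shared by the iter_* methods match slicing, which can be sketched lazily with itertools.islice. The helper iter_range and the sample titles are made up for the example. The sketch covers non-negative indexes only and does not call pyzim.

```python
from itertools import islice


def iter_range(entries, start=None, end=None):
    """Yield entries[start:end] lazily, without building a list."""
    # islice treats None as "from the beginning" / "to the end",
    # the same way a slice does for non-negative indexes.
    return islice(entries, start, end)


titles = ["Alpha", "Beta", "Gamma", "Delta"]
middle = list(iter_range(titles, 1, 3))
everything = list(iter_range(titles))
tail = list(iter_range(titles, start=2))
```
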
def iter_clusters(self, start=None, end=None):

Iterate over all clusters in this ZIM.

If start and end are specified, they reference the indexes of the first (inclusive) and last (exclusive) clusters to return. In other words, this behavior matches the l[start:end] syntax.

Parameters
start (int): index of first cluster to return (inclusive)
end (int): index of last cluster to return (exclusive)
Yields
pyzim.cluster.Cluster: the clusters in the specified range
Raises
IndexError: on invalid/out of bound indexes
def iter_entries(self, start=None, end=None):

Iterate over all entries in this ZIM.

If start and end are specified, they reference the indexes of the first (inclusive) and last (exclusive) entry to return. In other words, this behavior matches the l[start:end] syntax.

This function does not guarantee any specific order of the entries yielded by this function, however it currently *should* be ordered by URL.

Before, this method iterated by title, but this has been changed following the removal of the v0 entry title index.

Parameters
start (int): index of first entry to return (inclusive)
end (int): index of last entry to return (exclusive)
Yields
pyzim.entry.BaseEntry: the entries in the specified range
def iter_entries_by_url(self, start=None, end=None):

Iterate over all entries in this ZIM, ordered by full URL.

If start and end are specified, they reference the indexes of the first (inclusive) and last (exclusive) entry to return. In other words, this behavior matches the l[start:end] syntax.

Parameters
start (int): index of first entry to return (inclusive)
end (int): index of last entry to return (exclusive)
Yields
pyzim.entry.BaseEntry: the entries in the specified range
def iter_mimetypes(self, as_unicode=False):

Iterate over all mimetypes in this archive.

Parameters
as_unicode (bool): if nonzero, decode mimetypes
Yields
bytes (or str if as_unicode is nonzero): the mimetypes in this mimetype list
def new_cluster(self):

Add a new cluster to this archive.

NOTE: the cluster will not be cached until it is written at least once. Consequently, the autoflush function will not work for the cluster until you've written it at least once.

Returns
pyzim.cluster.ModifiableClusterWrapper: a new cluster
Raises
pyzim.exceptions.ZimFileClosed: if archive is already closed
pyzim.exceptions.NonMutable: if this zim file is not mutable
def remove_cluster_by_index(self, i):

Remove the cluster with the specified index.

Parameters
i (int): index of cluster to remove
def remove_entry_by_full_url(self, full_url, blob='empty'):

Remove the entry at the specified url.

You can specify how the associated blob should be treated using the blob parameter:

If the entry has an associated blob, the cluster will be flushed.

Redirects pointing towards this url will also be removed. Buffered operations may interfere with this behavior, so be sure to call flush() beforehand.

Parameters
full_url (str): full url of entry to remove
blob (str): how to treat the associated blob
Raises
TypeError: on type error
ValueError: on value error
pyzim.exceptions.EntryNotFound: if the target entry does not exist
pyzim.exceptions.ZimFileClosed: if archive is already closed
pyzim.exceptions.NonMutable: if this zim file is not mutable
def set_mainpage_url(self, url):

Set the mainpage url.

An entry for the specified url must already exist.

Parameters
url (str or None): non-full url of the mainpage (the mainpage is always in the "C" namespace). Set to None to disable.
Raises
TypeError: on type error
pyzim.exceptions.ZimFileClosed: if archive is already closed
pyzim.exceptions.NonMutable: if this zim file is not mutable
def set_metadata(self, key, value, mimetype='text/plain'):

Set metadata of the ZIM archive.

Parameters
key (str): key of metadata to set
value (str or bytes): value of metadata to set
mimetype (str or bytes): mimetype of the associated blob
Raises
TypeError: on type error
ValueError: on invalid value
pyzim.exceptions.ZimFileClosed: if archive is already closed
pyzim.exceptions.NonMutable: if this zim file is not mutable
def update_checksum(self):

Calculate and write the checksum.

NOTE: prior to calling this, pyzim.header.Header.checksum_position should already be set to the new position and the header flushed. This method does not take care of this.

def write_cluster(self, cluster, cluster_num=None):

Update an existing cluster in this zim.

The cluster must already be part of this archive. Use Zim.new_cluster for creating new clusters.

Parameters
cluster (ModifiableClusterWrapper): cluster to write
cluster_num: the number/id of the cluster. Providing it speeds up the method.
Returns
int: the cluster number
Raises
TypeError: on type error
ValueError: on invalid values (e.g. negative cluster numbers)
pyzim.exceptions.ZimFileClosed: if archive is already closed
pyzim.exceptions.NonMutable: if this zim file is not mutable
pyzim.exceptions.BindingError: if cluster is not bound to self
def write_entry(self, entry, update_redirects=True, add_to_title_pointer_list=True):

Write an entry to this archive.

Parameters
entry (pyzim.entry.BaseEntry): entry to write
update_redirects (bool): if nonzero, update redirects to this article if necessary
add_to_title_pointer_list (bool): if nonzero (default), add the entry to the title pointer lists
Raises
TypeError: on type error
pyzim.exceptions.ZimFileClosed: if archive is already closed
pyzim.exceptions.NonMutable: if this zim file is not mutable
pyzim.exceptions.BindingError: if entry is not bound to self
cluster_cache: pyzim.cache.BaseCache

internal cache for clusters, mapping the full location to each cluster

compression_strategy

compression strategy for assigning new items to clusters

entry_cache

internal cache for entries, mapping the full location to each entry

filelock: threading.Lock

a lock to ensure file access works with multiple threads. Acquire it whenever any work is done on the file.

header

header of this ZIM file.

mimetypelist

the mimetype list

mutable

Undocumented

policy

policy to use

spaceallocator

an object responsible for managing storage space within the ZIM file, may be None if ZIM is read-only

uncompressed_compression_strategy: pyzim.compressionstrategy.BaseCompressionStrategy or None

compression strategy for assigning new items to clusters that are explicitly uncompressed

closed: bool

Return True if this archive has already been closed, False otherwise.

counter

Return the counter used for counting mimetype occurrences.

If no counter is available, return None instead.

def _check_closed(self):

Check to ensure this ZIM file has not already been closed.

Raises
pyzim.exceptions.ZimFileClosed: when the ZIM file is already closed
def _get_full_url_for_entry_at(self, location):

Return the full URL for the entry at the specified location.

This is used as the key function for the URL pointer list.

Parameters
location (int): location of the entry in the ZIM file
Returns
bytes: the full URL of the specified entry
def _get_namespace_title_for_entry_by_url_index(self, i):

Return the namespace+title for the entry at the specified index in the URL pointer list.

This is used as the key function for the entry title pointer list.

Parameters
i (int): index of the entry in the URL pointer list
Returns
str: the <namespace><title> of the entry
def _get_title_for_entry_by_url_index(self, i):

Return the title for the entry at the specified index in the URL pointer list.

This is used as the key function for the article pointer list.

Parameters
i (int): index of the entry in the URL pointer list
Returns
str: the title of the specified entry
def _init_caches(self):

Initializes internal caches according to policy.

def _init_new(self):

Initialize as a new, empty archive.

This instantiates the header, pointer lists, etc.

TODO: find a better name for this method.

def _load_header(self):

Read the header.

def _load_mimetypelist(self):

Load the mimetypelist.

def _load_pointerlists(self):

Load the URL and title pointer lists.

def _new_cluster_num(self):

Return the number of the next new cluster.

This also increments the internal counter.

Returns
int: the number of the next cluster
def _on_cluster_cache_leave(self, cluster_offset, cluster):

Called when a cluster leaves the cache.

If the archive is writable and autoflush is enabled, write the cluster if it is dirty.

Parameters
cluster_offset (int): total offset of cluster
cluster (pyzim.cluster.Cluster): the cluster leaving the cache
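
The write-on-eviction idea described above can be sketched with a small LRU cache that reports evicted items to a callback. LeaveNotifyingCache, ToyCluster and on_cluster_leave are hypothetical names. The cache classes in pyzim.cache and the policy handling in pyzim may work differently.

```python
from collections import OrderedDict


class LeaveNotifyingCache:
    """Tiny LRU cache that reports evicted items to a callback."""

    def __init__(self, max_size, on_leave):
        self._items = OrderedDict()
        self._max_size = max_size
        self._on_leave = on_leave

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        while len(self._items) > self._max_size:
            old_key, old_value = self._items.popitem(last=False)
            self._on_leave(old_key, old_value)


class ToyCluster:
    def __init__(self, dirty):
        self.dirty = dirty


written = []


def on_cluster_leave(offset, cluster, writable=True, autoflush=True):
    # Write only when allowed and only when there is something to write.
    if writable and autoflush and cluster.dirty:
        written.append(offset)
        cluster.dirty = False


cache = LeaveNotifyingCache(max_size=2, on_leave=on_cluster_leave)
cache.put(100, ToyCluster(dirty=True))
cache.put(200, ToyCluster(dirty=False))
cache.put(300, ToyCluster(dirty=True))   # evicts offset 100 (dirty)
cache.put(400, ToyCluster(dirty=True))   # evicts offset 200 (clean)
```

Only the dirty cluster at offset 100 ends up in written; the clean one is dropped without any write.
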
def _on_entry_cache_leave(self, full_location, entry):

Called when an entry leaves the cache.

If the archive is writable and autoflush is enabled, write the entry if it is dirty.

Parameters
full_location (int): the full offset of the entry
entry (pyzim.entry.BaseEntry): the entry leaving the cache
def _update_url_pointers(self, start, diff, edit_etpl=True, edit_atpl=True, update_redirects=True, skip=()):

Update references to URL pointers.

As several pointers point to the position of an entry within the URL pointer list, but said list is sorted, modifying it will likely cause said pointers to point to the wrong entries. This method takes care of updating said references.

Parameters
start (int): lowest URL pointer index that needs updating
diff (int): integer to update said references by (e.g. 1)
edit_etpl (bool): if nonzero (default), update the entry title pointer list
edit_atpl (bool): if nonzero (default), update the article title pointer list
update_redirects (bool): if nonzero (default), update redirects
skip (list or tuple of str): list or tuple of full urls not to update recursively
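
The index-shifting problem described above can be illustrated with plain lists: a title-ordered list stores indexes into a sorted URL list, so inserting a URL requires shifting every stored index at or above the insertion point. The helper shift_references and the sample data are hypothetical. The real method also handles redirects and the skip list, which this sketch leaves out.

```python
import bisect


def shift_references(references, start, diff):
    """Shift every stored URL-list index >= start by diff."""
    return [ref + diff if ref >= start else ref for ref in references]


urls = ["Capple", "Ccherry", "Cdate"]
# Title-ordered list holding indexes into `urls` (toy data).
title_refs = [2, 0, 1]

# Find where the new URL belongs, fix the existing references,
# then perform the insertion itself.
new_index = bisect.bisect_left(urls, "Cbanana")
title_refs = shift_references(title_refs, start=new_index, diff=1)
urls.insert(new_index, "Cbanana")

# Every old reference still resolves to the same URL as before.
resolved = [urls[ref] for ref in title_refs]
```
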
_article_title_pointer_list: pyzim.pointerlist.TitlePointerList

a pointerlist to article entries ordered by title

_base_offset: int

base offset of ZIM archive within the underlying file object

_closed: bool

a flag indicating whether this archive has already been closed

_cluster_num: int

next cluster number to assign

_cluster_pointer_list: pyzim.pointerlist.SimplePointerList

a pointer list to the individual clusters

_counter

the counter counting mimetype occurrences

_entry_title_pointer_list: pyzim.pointerlist.TitlePointerList

a pointerlist to entries ordered by title

_f: file-like object

the underlying file object

_mode: str

the mode this archive has been opened in

_operation_buffer

Undocumented

_operationbuffer

buffer for not-yet-completable operations

_processors

list of processors that have been installed on this zim

_url_pointer_list

a pointer list to entries ordered by URL

_writable: bool

a flag indicating whether this archive can be written to.