cache_modifiers

The cache_modifiers module holds methods and decorators related to caching.

Functions

class DiskCacheMethod(value)

Bases: IntEnum

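An enumeration of the caching methods accepted by the cache_method parameter of disk_cache: LRU, LFU, WUNC, CUSTOM, and AGE.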

disk_cache(target_dir: str | bytes | PathLike, maxsize: int = 16, allow_mutation: bool = False, case_matters: bool = False, in_bytes: bool = True, write_converter: Callable[[Any, BinaryIO | TextIO], None] | None = None, read_converter: Callable[[BinaryIO | TextIO], Any] | None = None, cache_method: DiskCacheMethod = DiskCacheMethod.LFU, cache_arg: Any | None = None, **override_name_converters: Callable[[...], str])

A decorator which caches the results of a function as files on disk. Serialization, eviction method, and mutation handling can each be customized through the parameters below.

Parameters:
  • target_dir (Union[str, bytes, os.PathLike]) – The directory where results of calling the decorated function should be stored.

  • maxsize (int) – The maximum number of results to cache. This must be an integer greater than or equal to 0. Defaults to 16.

  • allow_mutation (bool) –

    Whether mutation of the returned result should be allowed. If True, it is guaranteed that the same instance of the result is used throughout execution. If False, each call to the function returns a copy.deepcopy() of the result. Setting this to True also allows a previously saved object to be mutated on disk, either by using the function in a with statement or by calling the function’s mutate method.

    Example of effect:

    # Given the following function definition...
    @disk_cache('llamas/hares', allow_mutation=True)
    def func(a: str, b: int, c: list) -> dict:
        return dict(a=a, b=b, c=c)
    
    # and the following call being the first...
    a = func('d', 25, ['e', 'f'])
    
    # a should be {'a': 'd', 'b': 25, 'c': ['e', 'f']} and the object stored on the disk
    # should be {'a': 'd', 'b': 25, 'c': ['e', 'f']}.
    
    # If the function is called again with the same inputs and stored into a different
    # variable, it will still be the same object as a.
    b = func('d', 25, ['e', 'f'])
    
    # a is b should return True.
    # Also the following behavior should be expected if either is mutated...
    
    b['c'].append('g')
    
    # a is b and b == {'a': 'd', 'b': 25, 'c': ['e', 'f', 'g']} should be True. However, the
    # object stored on the disk should still be {'a': 'd', 'b': 25, 'c': ['e', 'f']}.
    
    # If it is desired to propagate the mutations to the saved object, the following
    # notation should be used...
    with func:
        c = func('d', 25, ['e', 'f'])
        c['c'].append('h')
    
    # more code...
    
    # The object stored on the disk after this should be
    # {'a': 'd', 'b': 25, 'c': ['e', 'f', 'g', 'h']}. c should also be equivalent to the
    # same object.
    
    # Alternatively, at the end of execution, the following can be called with the same
    # results...
    func.mutate()
    

  • case_matters (bool) – Whether upper/lower case in file names matters. If it does not matter, all file names will be stored in lower case.

  • in_bytes (bool) – Whether the storage method of the data requires reading/writing in bytes. Defaults to True since the default serialization method is pickling.

  • write_converter (Optional[Callable[[DiskCacheType, DiskCacheIO], None]]) – The converter that should be used to write objects to the disk. Should accept the expected type of object to be stored and the IO stream which will write to the disk.

  • read_converter (Optional[Callable[[DiskCacheIO], DiskCacheType]]) – The converter that should be used to read objects from the disk. Should accept the IO stream which will read from the disk.

  • cache_method (DiskCacheMethod) –

    The cache method to use for the disk-cached function. Available options are:

    1. DiskCacheMethod.LRU - Uses an LRU caching style, which equates to the least-recently-used result being removed from the cache when making space for a new result.

    2. DiskCacheMethod.LFU - Uses an LFU caching style, which equates to the least-frequently-used result being removed from the cache when making space for a new result.

    3. DiskCacheMethod.WUNC - Uses a weighted-use-and-neglect caching style, which is more complex than the other available methods. It uses both the number of hits and the number of misses on a result to estimate how likely the value is to be needed again. Misses are divided by maxsize (multiplied by cache_arg, if given), and the quotient is subtracted from hits to get the weight of a result. This can be represented as hits - (misses / (maxsize * cache_arg)).

    4. DiskCacheMethod.CUSTOM - Uses a custom user-defined method to determine caching style. When this is used, cache_arg must be specified, and should consist of an iterable with three functions. For each of these functions, the following parameters will be passed:

      maxsize - int: The maximum size of the cache.

      index - Dict[str, list]: The cache’s index, which should be used to store a list for each cached result. The true name of the file (determined by the DiskCache itself) is the first element of the list, and any info used for decision-making should be stored in further elements.

      These functions are used in the following way:

      Function 1 -

      (maxsize: int, index: Dict[str, List[Any]], filename: str)

      filename - str: The name of the result being requested. Can be used to search index.

      This function should be used to make any changes desired before the target file is determined. It does not need to return anything.

      Function 2 -

      (maxsize: int, index: Dict[str, List[Any]]) -> Tuple[str, str]

      This function should select the correct filename to be replaced when at maxsize. It should return this as well as the corresponding true filename from index in the order true_filename, filename.

      Note

      This function will only be called if the cache is definitely at maxsize.

      Function 3 -

      (maxsize: int, index: Dict[str, List[Any]], filename: str, true_filename: str)

      filename - str: The name of the result being requested. Can be used to search index.

      true_filename - str: The actual file name of the file, generated by the cache and meant to be used if filename is not already in the index.

      This function should be used to perform changes to the index after all other steps are completed. It does not need to return anything.

    5. DiskCacheMethod.AGE - Uses the time when each value is first cached to determine which is removed first. The oldest value will always be removed first (unless two are entered in the same microsecond, in which case behavior is uncertain).
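
The three functions expected by DiskCacheMethod.CUSTOM can be sketched as follows. This is only an illustration under stated assumptions, not the library's own code: the policy (evict the entry with the fewest recorded requests), the function names, the simulate_request driver, and the ".pkl" file names are all hypothetical. In particular, the driver assumes that the cache itself removes the evicted entry from index, which the description above does not confirm.

```python
from typing import Any, Dict, List, Tuple

Index = Dict[str, List[Any]]  # filename -> [true_filename, request_count]


def before_lookup(maxsize: int, index: Index, filename: str) -> None:
    """Function 1: runs before the target file is determined. Nothing to do."""
    return None


def choose_victim(maxsize: int, index: Index) -> Tuple[str, str]:
    """Function 2: pick the entry with the fewest recorded requests."""
    filename = min(index, key=lambda name: index[name][1])
    # Return order follows the description: true_filename, filename.
    return index[filename][0], filename


def after_lookup(maxsize: int, index: Index, filename: str, true_filename: str) -> None:
    """Function 3: register new entries and count the request."""
    entry = index.setdefault(filename, [true_filename, 0])
    entry[1] += 1


custom_policy = (before_lookup, choose_victim, after_lookup)


def simulate_request(maxsize: int, index: Index, filename: str) -> None:
    """Hypothetical stand-in for the call order used by the cache."""
    pre, pick, post = custom_policy
    pre(maxsize, index, filename)
    if filename not in index and len(index) >= maxsize:
        _true_name, victim = pick(maxsize, index)
        del index[victim]  # assumption: the cache drops the evicted entry
    post(maxsize, index, filename, filename + ".pkl")  # hypothetical true name


index: Index = {}
for name in ["a", "b", "a", "c"]:
    simulate_request(2, index, name)

print(sorted(index))
```

With the decorator, a tuple such as custom_policy would be supplied as cache_arg together with cache_method=DiskCacheMethod.CUSTOM.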

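  • cache_arg (Any) – An extra argument consumed by some cache methods. DiskCacheMethod.WUNC uses it as an optional multiplier on maxsize, and DiskCacheMethod.CUSTOM requires it to be an iterable of three functions.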

  • override_name_converters (Callable[[...], str]) – Custom converters, passed as keyword arguments, that override how parameter values are written to the string.
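
As an illustration of the write_converter and read_converter signatures described above, the following sketch stores results as JSON text instead of pickled bytes. The function names write_json and read_json are hypothetical, and an in-memory io.StringIO stands in for the file stream that the cache would supply.

```python
import io
import json
from typing import Any, TextIO


def write_json(obj: Any, stream: TextIO) -> None:
    """Write converter: receives the object to store and the output stream."""
    json.dump(obj, stream)


def read_json(stream: TextIO) -> Any:
    """Read converter: receives the input stream and returns the stored object."""
    return json.load(stream)


# Round trip through an in-memory text stream in place of a cache file.
buffer = io.StringIO()
write_json({'a': 'd', 'b': 25, 'c': ['e', 'f']}, buffer)
buffer.seek(0)
restored = read_json(buffer)
print(restored)

# Assumed usage with the decorator (not executed here):
# @disk_cache('llamas/hares', in_bytes=False,
#             write_converter=write_json, read_converter=read_json)
# def func(a: str, b: int, c: list) -> dict: ...
```

Because JSON is a text format, converters like these would be paired with in_bytes=False. Results that JSON cannot represent would still need the default pickling or a different converter.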