datazen.classes package
Submodules
datazen.classes.data_repository module
datazen - An interface for managing on-disk data by loading it and making discrete changes.
- class datazen.classes.data_repository.DataRepository(root_dir: str, out_type: str = 'yaml', logger: logging.Logger = <Logger datazen.classes.data_repository (WARNING)>)
Bases: object
A class for interacting with file-backed databases that are built with serialization formats supported by this package.
datazen.classes.file_info_cache module
datazen - A class for storing metadata about files that have been loaded.
- class datazen.classes.file_info_cache.FileInfoCache(cache_dir: str = None, logger: logging.Logger = <Logger datazen.classes.file_info_cache (WARNING)>)
Bases: object
Provides storage for file hashes and for lists of files that have been loaded.
- check_hit(sub_dir: str, path: str, also_cache: bool = True) -> bool
Determine whether a given file already exists in the cache with its current hash. If it does not, return False and optionally add it to the cache.
- get_data(name: str) -> LoadedFiles
Get the tuple version of cached data.
- get_hashes(sub_dir: str) -> Dict[str, Any]
Get the cached dictionary of file hashes for a certain key.
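The hit check described for check_hit can be sketched as follows. This is a minimal illustration, not the package's implementation: the function name `check_hit_sketch`, the in-memory `hashes` layout, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
from pathlib import Path
from typing import Dict


def file_digest(path: str) -> str:
    """Return a hex digest of a file's contents (SHA-256 is an assumption)."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def check_hit_sketch(
    hashes: Dict[str, Dict[str, str]],
    sub_dir: str,
    path: str,
    also_cache: bool = True,
) -> bool:
    """Hypothetical stand-in for the behavior documented for check_hit.

    `hashes` maps a sub-directory key to a {path: digest} dictionary.
    """
    current = file_digest(path)
    if hashes.get(sub_dir, {}).get(path) == current:
        return True  # same path and same contents: a cache hit
    if also_cache:
        # Remember the new digest so the next identical load is a hit.
        hashes.setdefault(sub_dir, {})[path] = current
    return False
```

With `also_cache` left at its default, the first call for a file is a miss that records its digest, and later calls are hits until the file's contents change.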
- datazen.classes.file_info_cache.cmp_loaded_count(cache_a: FileInfoCache, cache_b: FileInfoCache, name: str) -> int
Compute the total difference in file counts (for a named group) between two caches.
- datazen.classes.file_info_cache.cmp_loaded_count_from_set(cache_a: FileInfoCache, cache_b: FileInfoCache, name: str, files: List[str]) -> int
Count the number of files uniquely loaded to one cache but not the other.
- datazen.classes.file_info_cache.cmp_total_loaded(cache_a: FileInfoCache, cache_b: FileInfoCache, known_types: List[str], load_checks: Dict[str, List[str]] = None, logger: logging.Logger = <Logger datazen.classes.file_info_cache (WARNING)>) -> int
Compute the total difference in file counts for a provided set of named groups.
- datazen.classes.file_info_cache.copy(cache: FileInfoCache) -> FileInfoCache
Copy one cache into a new one.
- datazen.classes.file_info_cache.meld(cache_a: FileInfoCache, cache_b: FileInfoCache) -> None
Promote all updates from cache_b into cache_a.
- datazen.classes.file_info_cache.remove_missing_hashed_files(data: Dict[str, Any], removed_data: Dict[str, List[str]]) -> Dict[str, Any]
Assign new hash data based on the files that are still present.
- datazen.classes.file_info_cache.remove_missing_loaded_files(data: Dict[str, Any]) -> Dict[str, Any]
Recursively audit the list elements in a dictionary. The elements are assumed to be strings that name files; each list is replaced with a new list containing only the elements that can be located.
- datazen.classes.file_info_cache.sync_cache_data(cache_data: Dict[str, Any], removed_data: Dict[str, List[str]]) -> Dict[str, Any]
Before writing a cache to disk, de-duplicate the items in the loaded list and remove hash data for files that were removed, so that a file that comes back with the same hash is not considered already loaded.
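The cleanup steps described for remove_missing_loaded_files and sync_cache_data can be sketched roughly as below. The function names `prune_missing` and `sync_sketch`, and the `{"loaded": ..., "hashes": ...}` layout of the cache data, are hypothetical; the package's actual data layout may differ.

```python
import os
from typing import Any, Dict, List


def prune_missing(data: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively keep only the list entries that name existing files."""
    result: Dict[str, Any] = {}
    for key, value in data.items():
        if isinstance(value, dict):
            result[key] = prune_missing(value)
        elif isinstance(value, list):
            result[key] = [name for name in value if os.path.exists(name)]
        else:
            result[key] = value
    return result


def sync_sketch(
    cache_data: Dict[str, Any], removed: Dict[str, List[str]]
) -> Dict[str, Any]:
    """De-duplicate loaded lists and drop hashes of removed files.

    Assumed layout: {"loaded": {group: [paths]},
                     "hashes": {group: {path: digest}}}.
    """
    loaded = {
        # dict.fromkeys removes duplicates while preserving order.
        group: list(dict.fromkeys(files))
        for group, files in cache_data.get("loaded", {}).items()
    }
    hashes = {
        group: {
            path: digest
            for path, digest in entries.items()
            if path not in removed.get(group, [])
        }
        for group, entries in cache_data.get("hashes", {}).items()
    }
    return {"loaded": loaded, "hashes": hashes}
```

Dropping the hash of a removed file means that if the file reappears with identical contents, a later hash lookup misses and the file is loaded again instead of being treated as already loaded.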
datazen.classes.target_resolver module
datazen - Orchestrates the “parameterized target” capability.
- class datazen.classes.target_resolver.TargetResolver(logger: logging.Logger = <Logger datazen.classes.target_resolver (WARNING)>)
Bases: object
A class for managing resolution of literal and templated target definitions.
datazen.classes.task_data_cache module
datazen - A class for storing data from completed operations to disk.
datazen.classes.valid_dict module
datazen - A dict wrapper that enables simpler schema validation.
- class datazen.classes.valid_dict.ValidDict(name: str, data: typing.Dict[str, typing.Any], schema: vcorelib.schemas.base.Schema, logger: logging.Logger = <Logger datazen.classes.valid_dict (WARNING)>)
Bases: UserDict
An object that behaves like a dictionary but can have a provided schema enforced.
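The general pattern of a dictionary wrapper with an enforced schema can be sketched with `collections.UserDict`. This is an illustration only: the class `CheckedDict`, its `valid` attribute, and the plain callable used in place of a `vcorelib` schema object are assumptions, not the package's API.

```python
from collections import UserDict
from typing import Any, Callable, Dict

# Hypothetical stand-in for a schema: a callable that returns True
# when the data is acceptable.
Validator = Callable[[Dict[str, Any]], bool]


class CheckedDict(UserDict):
    """A dict-like object that records whether its data passed validation."""

    def __init__(
        self, name: str, data: Dict[str, Any], validator: Validator
    ) -> None:
        super().__init__(data)
        self.name = name
        # Validate once at construction time against the stored data.
        self.valid = validator(self.data)


def has_int_version(data: Dict[str, Any]) -> bool:
    """Example validator: require an integer 'version' key."""
    return isinstance(data.get("version"), int)


config = CheckedDict("config", {"version": 1}, has_int_version)
```

Because the wrapper subclasses `UserDict`, callers can index and iterate it like an ordinary dictionary and consult `valid` before trusting the contents.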