pylablib.core.fileio package

Submodules

pylablib.core.fileio.bin_transform module

class pylablib.core.fileio.bin_transform.ITransformer[source]

Bases: object

Generic binary data transformer

f2d(fbytes)[source]: Transform stored file bytes into data bytes

d2f(dbytes)[source]: Transform data bytes into stored file bytes

class pylablib.core.fileio.bin_transform.MaskTransformer(mask)[source]

Bases: object

XOR mask transformer

f2d(fbytes)[source]

d2f(dbytes)[source]
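The f2d/d2f pair of an XOR mask transformer can be illustrated with a minimal standalone sketch. The class name XorMaskSketch is hypothetical and is not pylablib code; the sketch also assumes that the mask is a non-empty bytes object repeated cyclically over the data, which the documentation above does not state.

```python
from itertools import cycle


class XorMaskSketch:
    """Hypothetical stand-in for an XOR mask transformer (not pylablib's code)."""

    def __init__(self, mask: bytes):
        if not mask:
            raise ValueError("mask must be non-empty")
        self.mask = bytes(mask)

    def _apply(self, data: bytes) -> bytes:
        # XOR every byte with the mask, repeating the mask as needed (assumed behavior).
        return bytes(b ^ m for b, m in zip(data, cycle(self.mask)))

    def f2d(self, fbytes: bytes) -> bytes:
        # Stored file bytes -> data bytes.
        return self._apply(fbytes)

    def d2f(self, dbytes: bytes) -> bytes:
        # Data bytes -> stored file bytes; XOR with the same mask is its own inverse.
        return self._apply(dbytes)


transformer = XorMaskSketch(b"\x5a\xa5")
stored = transformer.d2f(b"1.0, 2.0\n")
restored = transformer.f2d(stored)
```

Because XOR with a fixed mask is an involution, both directions share one helper, and applying d2f followed by f2d returns the original bytes.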

pylablib.core.fileio.datafile module

class pylablib.core.fileio.datafile.DataFile(data, filepath=None, filetype=None, creation_time=None, comments=None, props=None)[source]

Bases: object

Describes a single datafile.

Parameters:

data – the main content of the file (usually a numpy array, a pandas DataFrame or a Dictionary).
filepath (str) – absolute path from which the file was read
filetype (str) – a source type (e.g., "csv" or "bin")
creation_time (datetime.datetime) – File creation time
props (dict) – all the metainfo about the file (extracted from comments, filename etc.)
comments (list) – all the comments excluding the ones containing props

get(name, default=None)[source]: Get a property from the dictionary. Use default value if it’s not found

pylablib.core.fileio.dict_entry module

Classes for dealing with the Dictionary entries with special conversion rules when saved or loaded. Used to redefine how certain objects (e.g., tables) inside dictionaries are written into files and read from files.

pylablib.core.fileio.dict_entry.is_dict_entry_branch(branch)[source]: Check if the dictionary branch contains a dictionary entry which needs to be specially converted.

class pylablib.core.fileio.dict_entry.DictEntryBuilder(entry_cls, pred=None, **kwargs)[source]

Bases: object

Object for building dictionary entries from objects.

Parameters:

entry_cls – dictionary entry class
pred – method used to check if an object can be turned into the corresponding entry; if None, use the default entry class checker (entry_class.is_data_valid)
kwargs – keyword arguments passed to the entry constructor along with the data

is_data_valid(data)[source]: Check if a data object can be wrapped by the current entry class

from_data(data)[source]: Build a dictionary entry from the data

class pylablib.core.fileio.dict_entry.DictEntryParser(entry_cls, pred=None, **kwargs)[source]

Bases: object

Object for building dictionary entries from dictionary branches.

Parameters:

entry_cls – dictionary entry class
pred – method used to check if a dictionary branch can be turned into the corresponding entry; if None, use the default entry class checker (entry_class.is_branch_valid)
kwargs – keyword arguments passed to the entry from_dict class method along with the branch

is_branch_valid(branch)[source]: Check if a branch can be parsed by the current entry class

from_dict(dict_ptr, loc)[source]: Build a dictionary entry from the branch and the file location

pylablib.core.fileio.dict_entry.add_dict_entry_builder(builder)[source]: Add an entry builder to the global list of builders

pylablib.core.fileio.dict_entry.add_dict_entry_parser(parser)[source]: Add an entry parser to the global list of parsers

pylablib.core.fileio.dict_entry.add_dict_entry_class(cls)[source]

Add an entry class.

Automatically registers builder and parser, which take no additional arguments and use default class method to determine if an object/branch can be converted into an entry.

pylablib.core.fileio.dict_entry.from_data(data, builders=None)[source]

Build a dictionary entry from the data.

builders can contain an additional list of builders to try before using the default ones.

pylablib.core.fileio.dict_entry.from_dict(dict_ptr, loc, parsers=None)[source]

Build a dictionary entry from the dictionary branch and the file location.

parsers can contain an additional list of parsers to try before using the default ones.
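The lookup order described for from_data() (extra builders first, then the globally registered ones) can be sketched as a small registry. All names below (TupleEntry, BuilderSketch, add_builder_sketch, from_data_sketch) are hypothetical illustrations rather than the library's implementation, and returning None when nothing matches is an assumption of this sketch.

```python
class TupleEntry:
    """Hypothetical entry class that wraps tuples."""

    def __init__(self, data):
        self.data = data

    @classmethod
    def is_data_valid(cls, data):
        return isinstance(data, tuple)


class BuilderSketch:
    """Pairs an entry class with a predicate, loosely following DictEntryBuilder."""

    def __init__(self, entry_cls, pred=None, **kwargs):
        self.entry_cls = entry_cls
        # With no predicate, fall back to the entry class's own checker.
        self.pred = pred if pred is not None else entry_cls.is_data_valid
        self.kwargs = kwargs

    def is_data_valid(self, data):
        return self.pred(data)

    def from_data(self, data):
        return self.entry_cls(data, **self.kwargs)


_default_builders = []  # stands in for the global list of builders


def add_builder_sketch(builder):
    _default_builders.append(builder)


def from_data_sketch(data, builders=None):
    # Extra builders are tried first, then the registered defaults.
    for builder in list(builders or []) + _default_builders:
        if builder.is_data_valid(data):
            return builder.from_data(data)
    return None  # assumption: no matching builder means no entry


add_builder_sketch(BuilderSketch(TupleEntry))
entry = from_data_sketch((1, 2))
list_builder = BuilderSketch(TupleEntry, pred=lambda d: isinstance(d, list))
list_entry = from_data_sketch([1, 2], builders=[list_builder])
```

The same pattern would apply to parsers, with is_branch_valid and from_dict in place of is_data_valid and from_data.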

class pylablib.core.fileio.dict_entry.IDictionaryEntry(data)[source]

Bases: object

A generic Dictionary entry.

Contains data represented by the node, as well as the way to represent this data as a dictionary branch.

Parameters:

data – data to be wrapped

classmethod is_data_valid(data)[source]: Check if a data object can be wrapped by the current entry class

classmethod is_branch_valid(branch)[source]: Check if a branch can be parsed by the current entry class

classmethod from_dict(dict_ptr, loc)[source]

Convert a dictionary branch to a specific IDictionaryEntry object.

Parameters:

dict_ptr (dictionary.DictionaryPointer) – Pointer to the dictionary location for the entry.
loc – Location for the data to be loaded.

to_dict(dict_ptr, loc)[source]

Convert data to a dictionary branch on saving.

Parameters:

dict_ptr (dictionary.DictionaryPointer) – Pointer to the dictionary location for the entry.
loc – File location for the data to be saved.

pylablib.core.fileio.dict_entry.parse_stored_table_data(desc=None, data=None, out_type='pandas')[source]

Parse table data corresponding to the given description dictionary and data.

Parameters:

desc – description dictionary; can be None, if no description is given
data – separately loaded data; can be None, if no data is given (in this case assume that it is stored in the description dictionary); can be a tuple (column_data, column_names) (such as the one returned by parse_csv.read_table()), or an InlineTable object containing such a tuple.
out_type (str) – Output format of the data ('array' for numpy arrays or 'pandas' for pandas DataFrame objects).

Returns:

tuple (data, columns), where data is the data table in the specified format, and columns is the list of columns

class pylablib.core.fileio.dict_entry.ITableDictionaryEntry(data, columns=None)[source]

Bases: IDictionaryEntry

A generic table Dictionary entry.

Parameters:

data – Table data.
columns (list) – If not None, list of column names (if None and data is a pandas DataFrame object, get column names from that).

classmethod is_data_valid(data)[source]: Check if a data object can be wrapped by the current entry class

classmethod from_dict(dict_ptr, loc, out_type='pandas')[source]

Convert a dictionary branch to a specific DictionaryEntry object.

Parameters:

dict_ptr (dictionary.DictionaryPointer) – Pointer to the dictionary location for the entry.
loc – Location for the data to be loaded.
out_type (str) – Output format of the data ('array' for numpy arrays or 'pandas' for pandas DataFrame objects), used only if the dictionary doesn’t provide the format.

classmethod is_branch_valid(branch): Check if a branch can be parsed by the current entry class

to_dict(dict_ptr, loc)

Convert data to a dictionary branch on saving.

Parameters:

dict_ptr (dictionary.DictionaryPointer) – Pointer to the dictionary location for the entry.
loc – File location for the data to be saved.

class pylablib.core.fileio.dict_entry.InlineTableDictionaryEntry(data, columns=None)[source]

Bases: ITableDictionaryEntry

An inlined table Dictionary entry.

Parameters:

data – Table data.
columns (list) – If not None, a list of column names (if None and data is a pandas DataFrame object, get column names from that).

to_dict(dict_ptr, loc)[source]: Convert the data to a dictionary branch and write the table to the file.

classmethod from_dict(dict_ptr, loc, out_type='pandas')[source]

Build an InlineTableDictionaryEntry object from the dictionary and read the inlined data.

Parameters:

dict_ptr (dictionary.DictionaryPointer) – Pointer to the dictionary location for the entry.
loc – Location for the data to be loaded.
out_type (str) – Output format of the data ('array' for numpy arrays or 'pandas' for pandas DataFrame objects).

classmethod is_branch_valid(branch): Check if a branch can be parsed by the current entry class

classmethod is_data_valid(data): Check if a data object can be wrapped by the current entry class

class pylablib.core.fileio.dict_entry.IExternalTableDictionaryEntry(data, file_format, name, columns, force_name=True)[source]

Bases: ITableDictionaryEntry

classmethod from_dict(dict_ptr, loc, out_type='pandas')[source]

Convert a dictionary branch to a specific DictionaryEntry object.

Parameters:

dict_ptr (dictionary.DictionaryPointer) – Pointer to the dictionary location for the entry.
loc – Location for the data to be loaded.
out_type (str) – Output format of the data ('array' for numpy arrays or 'pandas' for pandas DataFrame objects), used only if the dictionary doesn’t provide the format.

classmethod is_branch_valid(branch): Check if a branch can be parsed by the current entry class

classmethod is_data_valid(data): Check if a data object can be wrapped by the current entry class

to_dict(dict_ptr, loc)

Convert data to a dictionary branch on saving.

Parameters:

dict_ptr (dictionary.DictionaryPointer) – Pointer to the dictionary location for the entry.
loc – File location for the data to be saved.

class pylablib.core.fileio.dict_entry.ExternalTextTableDictionaryEntry(data=None, file_format='csv', name='', columns=None, force_name=True)[source]

Bases: IExternalTableDictionaryEntry

An external text table Dictionary entry.

Parameters:

data – Table data.
file_format (str) – Output file format.
name (str) – Name template for the external file (default is the full path connected with "_" symbol).
columns (list) – If not None, a list of column names (if None and data is a pandas DataFrame object, get column names from that).
force_name (bool) – If False and the target file already exists, generate a new unique name; otherwise, overwrite the file.

to_dict(dict_ptr, loc)[source]: Convert the data to a dictionary branch and save the table to an external file.

classmethod from_dict(dict_ptr, loc, out_type='pandas')[source]

Build an ExternalTextTableDictionaryEntry object from the dictionary and load the external data.

Parameters:

dict_ptr (dictionary.DictionaryPointer) – Pointer to the dictionary location for the entry.
loc – Location for the data to be loaded.
out_type (str) – Output format of the data ('array' for numpy arrays or 'pandas' for pandas DataFrame objects).

classmethod is_branch_valid(branch): Check if a branch can be parsed by the current entry class

classmethod is_data_valid(data): Check if a data object can be wrapped by the current entry class

class pylablib.core.fileio.dict_entry.ExternalBinTableDictionaryEntry(data=None, file_format='bin', name='', columns=None, force_name=True)[source]

Bases: IExternalTableDictionaryEntry

An external binary table Dictionary entry.

Parameters:

data – Table data.
file_format (str) – Output file format.
name (str) – Name template for the external file (default is the full path connected with "_" symbol).
columns (list) – If not None, a list of column names (if None and data is a pandas DataFrame object, get column names from that).
force_name (bool) – If False and the target file already exists, generate a new unique name; otherwise, overwrite the file.

to_dict(dict_ptr, loc)[source]: Convert the data to a dictionary branch and save the table to an external file.

classmethod from_dict(dict_ptr, loc, out_type='pandas')[source]

Build an ExternalBinTableDictionaryEntry object from the dictionary and load the external data.

Parameters:

dict_ptr (dictionary.DictionaryPointer) – Pointer to the dictionary location for the entry.
loc – Location for the data to be loaded.
out_type (str) – Output format of the data ('array' for numpy arrays or 'pandas' for pandas DataFrame objects).

classmethod is_branch_valid(branch): Check if a branch can be parsed by the current entry class

classmethod is_data_valid(data): Check if a data object can be wrapped by the current entry class

pylablib.core.fileio.dict_entry.table_entry_builder(table_format='inline')[source]

Make an entry builder for tables depending on the table format.

Parameters:

table_format (str) – Default format for table (numpy arrays or pandas DataFrames) entries. Can be 'inline' (table is written inside the file), 'csv' (external CSV file) or 'bin' (external binary file).

class pylablib.core.fileio.dict_entry.IExternalFileDictionaryEntry(data, name='', force_name=True)[source]

Bases: IDictionaryEntry

Generic dictionary entry for data in an external file.

Parameters:

data – Stored data.
name (str) – Name template for the external file (default is the full path connected with "_" symbol).
force_name (bool) – If False and the target file already exists, generate a new unique name; otherwise, overwrite the file.

file_format = None

static add_file_format(subclass)[source]

Register an IExternalFileDictionaryEntry as a possible stored file format.

Used to automatically invoke a correct loader when loading the dictionary file. Only needs to be done once after the subclass declaration.

to_dict(dict_ptr, loc)[source]: Convert the data to a dictionary branch and save the data to an external file

classmethod from_dict(dict_ptr, loc)[source]

Build an IExternalFileDictionaryEntry object from the dictionary and load the external data.

Parameters:

dict_ptr (dictionary.DictionaryPointer) – Pointer to the dictionary location for the entry.
loc – Location for the data to be loaded.

get_preamble()[source]: Generate preamble (dictionary with supplementary data needed to load the data from the file)

save_file(location_file)[source]

Save stored data into the given location.

Virtual method; should be overridden in subclasses.

classmethod load_file(location_file, preamble)[source]

Load stored data from the given location, using the supplied preamble.

Virtual method; should be overridden in subclasses.

classmethod is_branch_valid(branch): Check if a branch can be parsed by the current entry class

classmethod is_data_valid(data): Check if a data object can be wrapped by the current entry class

class pylablib.core.fileio.dict_entry.ExternalNumpyDictionaryEntry(data, name='', force_name=True, dtype=None)[source]

Bases: IExternalFileDictionaryEntry

A dictionary entry which stores the numpy array data into an external file in binary format.

Parameters:

data – Numpy array data.
name (str) – Name template for the external file (default is the full path connected with "_" symbol).
force_name (bool) – If False and the target file already exists, generate a new unique name; otherwise, overwrite the file.
dtype – numpy dtype to load/save the data (by default, dtype of the supplied data).

file_format = 'numpy'

get_preamble()[source]: Generate preamble (dictionary with supplementary data needed to load the data from the file)

save_file(location_file)[source]: Save stored data into the given location

classmethod load_file(location_file, preamble)[source]: Load stored data from the given location, using the supplied preamble

static add_file_format(subclass)

Register an IExternalFileDictionaryEntry as a possible stored file format.

Used to automatically invoke a correct loader when loading the dictionary file. Only needs to be done once after the subclass declaration.

classmethod from_dict(dict_ptr, loc)

Build an IExternalFileDictionaryEntry object from the dictionary and load the external data.

Parameters:

dict_ptr (dictionary.DictionaryPointer) – Pointer to the dictionary location for the entry.
loc – Location for the data to be loaded.

classmethod is_branch_valid(branch): Check if a branch can be parsed by the current entry class

classmethod is_data_valid(data): Check if a data object can be wrapped by the current entry class

to_dict(dict_ptr, loc): Convert the data to a dictionary branch and save the data to an external file

class pylablib.core.fileio.dict_entry.ExpandedContainerDictionaryEntry(data)[source]

Bases: IDictionaryEntry

A dictionary entry which expands containers (lists, tuples, dictionaries) into subdictionaries.

Useful when the data in the containers is complex, so writing it into one line (as is default for lists and tuples) wouldn’t work.

Parameters:

data – Container data.

to_dict(dict_ptr, loc)[source]: Convert the stored container to a dictionary branch

classmethod from_dict(dict_ptr, loc)[source]: Build an ExpandedContainerDictionaryEntry object from the dictionary

classmethod is_branch_valid(branch): Check if a branch can be parsed by the current entry class

classmethod is_data_valid(data): Check if a data object can be wrapped by the current entry class

pylablib.core.fileio.loadfile module

Utilities for reading data files.

class pylablib.core.fileio.loadfile.IInputFileFormat[source]

Bases: object

Generic class for an input file format.

Based on file_format or autodetection, calls one of its subclasses to read the file.

Defines a single static method

static detect_file_format(location_file)[source]

read(location_file)[source]: Read a file at a given location

class pylablib.core.fileio.loadfile.ITextInputFileFormat[source]

Bases: IInputFileFormat

Generic class for a text input file format.

Based on file_format or autodetection, calls one of its subclasses to read the file.

static detect_file_format(location_file)[source]

read(location_file): Read a file at a given location

class pylablib.core.fileio.loadfile.CSVTableInputFileFormat(out_type='default', dtype='numeric', columns=None, delimiters=None, empty_entry_substitute=None, ignore_corrupted_lines=True, skip_lines=0)[source]

Bases: ITextInputFileFormat

Class for CSV input file format.

Parameters:

out_type (str) – type of the result: 'array' for numpy array, 'pandas' for pandas DataFrame, or 'default' (determined by the library default; 'pandas' by default)
dtype – dtype of entries; can be either a single type, or a list of types (one per column). Possible dtypes are: 'int', 'float', 'complex', 'numeric' (tries to coerce to minimal possible numeric type, raises error if data can’t be converted to complex), 'generic' (accept arbitrary types, including lists, dictionaries, escaped strings, etc.), 'raw' (keep raw string).
columns – either the number of columns, or a list of column names.
delimiters (str) – Regex string which matches entry delimiters (by default r"\s*,\s*|\s+", i.e., commas and whitespace).
empty_entry_substitute – Substitute for empty table entries. If None, all empty table entries are skipped.
ignore_corrupted_lines (bool) – If True, skip corrupted (e.g., non-numeric for numeric dtype, or with too few entries) lines; otherwise, raise ValueError.
skip_lines (int) – Number of lines to skip from the beginning of the file.

read(location_file)[source]: Read a file at a given location

static detect_file_format(location_file)
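The effect of the default delimiters regex and of empty_entry_substitute can be seen in a short standalone sketch built on re.split. The helper split_row_sketch is hypothetical and only mirrors the parameter descriptions above; it is not the parser used by the library.

```python
import re

# Default pattern quoted in the parameter description: commas (with optional
# surrounding whitespace) or runs of whitespace.
DEFAULT_DELIMITERS = r"\s*,\s*|\s+"


def split_row_sketch(line, delimiters=DEFAULT_DELIMITERS, empty_entry_substitute=None):
    entries = re.split(delimiters, line.strip())
    if empty_entry_substitute is None:
        # Mirror the documented rule: empty entries are skipped.
        return [e for e in entries if e != ""]
    return [e if e != "" else empty_entry_substitute for e in entries]


comma_row = split_row_sketch("1, 2,3")
mixed_row = split_row_sketch("4 5\t6")
skipped = split_row_sketch("1,,3")
filled = split_row_sketch("1,,3", empty_entry_substitute="nan")
```

Note that with the default pattern a row can mix commas, spaces and tabs freely, which is why the same reader handles both CSV and whitespace-separated tables.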

class pylablib.core.fileio.loadfile.DictionaryInputFileFormat(case_normalization=None, inline_dtype='generic', inline_out_type='default', entry_format='value', allow_duplicate_keys=False, skip_lines=0)[source]

Bases: ITextInputFileFormat

Class for Dictionary input file format.

Parameters:

location_file – Location of the data.
case_normalization (str) – If None, the dictionary paths are case-sensitive; otherwise, defines the way the entries are normalized ('lower' or 'upper').
inline_dtype (str) – dtype for inlined tables.
inline_out_type (str) – type of the result of the inline table: 'array' for numpy array, 'pandas' for pandas DataFrame, 'raw' for raw InlineTable data containing tuple (column_data, column_names), or 'default' (determined by the library default; 'pandas' by default).
entry_format (str) – Determines the way for dealing with dict_entry.IDictionaryEntry objects (objects transformed into dictionary branches with special recognition rules). Can be 'branch' (don’t attempt to recognize those objects, leave dictionary as in the file), 'dict_entry' (recognize and leave as dict_entry.IDictionaryEntry objects) or 'value' (recognize and keep the value).
allow_duplicate_keys (bool) – if False and the same key is mentioned twice in the file, raise an error
skip_lines (int) – Number of lines to skip from the beginning of the file.

read(location_file)[source]: Read a file at a given location

static detect_file_format(location_file)
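How case_normalization and allow_duplicate_keys interact can be sketched with plain key/value pairs. The function collect_entries_sketch is hypothetical; the KeyError exception type and the rule that a later duplicate overwrites an earlier one are assumptions of this sketch, since the description above only says that an error is raised.

```python
def collect_entries_sketch(pairs, case_normalization=None, allow_duplicate_keys=False):
    if case_normalization not in (None, "lower", "upper"):
        raise ValueError("case_normalization must be None, 'lower' or 'upper'")
    result = {}
    for key, value in pairs:
        # Normalize the key before the duplicate check, so "A/b" and "a/B"
        # collide once normalization is enabled.
        if case_normalization == "lower":
            key = key.lower()
        elif case_normalization == "upper":
            key = key.upper()
        if key in result and not allow_duplicate_keys:
            raise KeyError("duplicate key: {}".format(key))
        result[key] = value  # assumption: the later value wins
    return result


pairs = [("A/b", 1), ("a/B", 2)]
case_sensitive = collect_entries_sketch(pairs)
merged = collect_entries_sketch(pairs, case_normalization="lower", allow_duplicate_keys=True)
```

With case-sensitive paths the two keys stay distinct; with normalization they refer to the same entry, so the duplicate check becomes relevant.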

class pylablib.core.fileio.loadfile.BinaryTableInputFileFormatter(out_type='default', dtype='<f8', columns=None, packing='flatten', preamble=None, skip_bytes=0)[source]

Bases: IInputFileFormat

Class for binary input file format.

Parameters:

location_file – Location of the data.
out_type (str) – type of the result: 'array' for numpy array, 'pandas' for pandas DataFrame, or 'default' (determined by the library default; 'pandas' by default)
dtype – numpy.dtype describing the data.
columns – either the number of columns, or a list of column names.
packing (str) – The way the 2D array is packed. Can be either 'flatten' (data is stored row-wise) or 'transposed' (data is stored column-wise).
preamble (dict) – If not None, defines binary file parameters that supersede the parameters supplied to the function. The defined parameters are 'dtype', 'packing', 'ncols' (number of columns) and 'nrows' (number of rows).
skip_bytes (int) – Number of bytes to skip from the beginning of the file.

read(location_file)[source]: Read a file at a given location

static detect_file_format(location_file)
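The difference between the 'flatten' and 'transposed' packing options can be illustrated with the standard struct module. The helper unpack_table_sketch is hypothetical and returns nested lists rather than numpy arrays; the struct format "<d" corresponds to the default dtype '<f8' (little-endian 8-byte float).

```python
import struct


def unpack_table_sketch(raw, ncols, packing="flatten", fmt="<d"):
    # Decode the byte string into a flat list of values.
    values = [item[0] for item in struct.iter_unpack(fmt, raw)]
    if len(values) % ncols:
        raise ValueError("data size is not a multiple of the number of columns")
    nrows = len(values) // ncols
    if packing == "flatten":
        # Row-wise storage: consecutive values belong to the same row.
        return [values[r * ncols:(r + 1) * ncols] for r in range(nrows)]
    if packing == "transposed":
        # Column-wise storage: consecutive values belong to the same column.
        return [[values[c * nrows + r] for c in range(ncols)] for r in range(nrows)]
    raise ValueError("unknown packing: {}".format(packing))


# The same 2x3 table stored both ways.
row_wise = struct.pack("<6d", 1, 2, 3, 4, 5, 6)
column_wise = struct.pack("<6d", 1, 4, 2, 5, 3, 6)
from_rows = unpack_table_sketch(row_wise, ncols=3, packing="flatten")
from_columns = unpack_table_sketch(column_wise, ncols=3, packing="transposed")
```

Both calls recover the same table, which shows that packing only changes the order of values on disk, not the logical shape of the result.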

pylablib.core.fileio.loadfile.build_file_format(location_file, file_format='generic', **kwargs)[source]

Create file format (IInputFileFormat instance) for given parameters and file locations.

If file_format is already an instance of IInputFileFormat, return it unchanged. If file_format is generic (e.g., "generic" or "txt"), attempt to autodetect it from the file. **kwargs are passed to the file format constructor.

pylablib.core.fileio.loadfile.load_raw(path=None, loc='file', skip_bytes=0, nbytes=None, encoding=None, transformer=None)[source]

Load raw binary data from the file.

Parameters:

path (str) – path to the file or a file-like object
loc (str) – location type ("file" means the usual file location; see location.get_location() for details)
skip_bytes (int) – Number of bytes to skip from the beginning of the file.
nbytes (int) – Number of bytes to read from the file.
encoding – if a new file location is opened, this specifies the encoding
transformer – binary transformer applied to the file data (note that the whole file is read into memory and transformed if the transformer is specified)
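The interplay of skip_bytes, nbytes and transformer can be sketched on an in-memory stream. Both read_raw_sketch and InvertBytes are hypothetical; in particular, applying the skip and the length limit after the transformation is an assumption of this sketch, motivated only by the note that the whole file is read and transformed when a transformer is given.

```python
import io


class InvertBytes:
    """Hypothetical transformer exposing f2d (flips every bit)."""

    def f2d(self, fbytes):
        return bytes(b ^ 0xFF for b in fbytes)


def read_raw_sketch(stream, skip_bytes=0, nbytes=None, transformer=None):
    if transformer is not None:
        # The whole stream is read and transformed before slicing (assumed order).
        data = transformer.f2d(stream.read())
        end = None if nbytes is None else skip_bytes + nbytes
        return data[skip_bytes:end]
    # Without a transformer, only the requested window needs to be read.
    stream.seek(skip_bytes)
    return stream.read() if nbytes is None else stream.read(nbytes)


payload = b"HEADERpayload-bytes"
window = read_raw_sketch(io.BytesIO(payload), skip_bytes=6, nbytes=7)
masked = bytes(b ^ 0xFF for b in payload)
unmasked = read_raw_sketch(io.BytesIO(masked), skip_bytes=6, transformer=InvertBytes())
```

The practical consequence is the memory cost: without a transformer a small window of a large file can be read cheaply, while with one the full contents pass through memory.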

pylablib.core.fileio.loadfile.load_csv(path=None, out_type='default', dtype='numeric', columns=None, delimiters=None, empty_entry_substitute=None, ignore_corrupted_lines=True, skip_lines=0, loc='file', encoding=None, transformer=None, return_file=False)[source]

Load data table from a CSV/table file.

Parameters:

path (str) – path to the file or a file-like object
out_type (str) – type of the result: 'array' for numpy array, 'pandas' for pandas DataFrame, or 'default' (determined by the library default; 'pandas' by default)
dtype – dtype of entries; can be either a single type, or a list of types (one per column). Possible dtypes are: 'int', 'float', 'complex', 'numeric' (tries to coerce to minimal possible numeric type, raises error if data can’t be converted to complex), 'generic' (accept arbitrary types, including lists, dictionaries, escaped strings, etc.), 'raw' (keep raw string).
columns – either the number of columns, or a list of column names
delimiters (str) – regex string which matches entry delimiters (by default r"\s*,\s*|\s+", i.e., commas and whitespace)
empty_entry_substitute – substitute for empty table entries. If None, all empty table entries are skipped
ignore_corrupted_lines (bool) – if True, skip corrupted (e.g., non-numeric for numeric dtype, or with too few entries) lines; otherwise, raise ValueError
skip_lines (int) – number of lines to skip from the beginning of the file
loc (str) – location type ("file" means the usual file location; see location.get_location() for details)
encoding – if a new file location is opened, this specifies the encoding
transformer – binary transformer applied to the file data (note that the whole file is read into memory and transformed if the transformer is specified)
return_file (bool) – if True, return DataFile object (contains some metainfo); otherwise, return just the file data

pylablib.core.fileio.loadfile.load_csv_desc(path=None, loc='file', encoding=None, return_file=False)[source]

Load data from the extended CSV table file.

Analogous to load_dict(), but doesn’t allow any additional parameters (which don’t matter in this case).

Parameters:

path (str) – path to the file or a file-like object
loc (str) – location type ("file" means the usual file location; see location.get_location() for details)
encoding – if a new file location is opened, this specifies the encoding
return_file (bool) – if True, return DataFile object (contains some metainfo); otherwise, return just the file data

pylablib.core.fileio.loadfile.load_bin(path=None, out_type='default', dtype='<f8', columns=None, packing='flatten', preamble=None, skip_bytes=0, loc='file', encoding=None, transformer=None, return_file=False)[source]

Load data from the binary file.

Parameters:

path (str) – path to the file or a file-like object
out_type (str) – type of the result: 'array' for numpy array, 'pandas' for pandas DataFrame, or 'default' (determined by the library default; 'pandas' by default)
dtype – numpy.dtype describing the data.
columns – either the number of columns, or a list of column names.
packing (str) – The way the 2D array is packed. Can be either 'flatten' (data is stored row-wise) or 'transposed' (data is stored column-wise).
preamble (dict) – If not None, defines binary file parameters that supersede the parameters supplied to the function. The defined parameters are 'dtype', 'packing', 'ncols' (number of columns) and 'nrows' (number of rows).
skip_bytes (int) – Number of bytes to skip from the beginning of the file.
loc (str) – location type ("file" means the usual file location; see location.get_location() for details)
encoding – if a new file location is opened, this specifies the encoding
transformer – binary transformer applied to the file data (note that the whole file is read into memory and transformed if the transformer is specified)
return_file (bool) – if True, return DataFile object (contains some metainfo); otherwise, return just the file data

pylablib.core.fileio.loadfile.load_bin_desc(path=None, loc='file', encoding=None, return_file=False)[source]

Load data from the binary file with a description.

Analogous to load_dict(), but doesn’t allow any additional parameters (which don’t matter in this case).

Parameters:

path (str) – path to the file or a file-like object
loc (str) – location type ("file" means the usual file location; see location.get_location() for details)
encoding – if a new file location is opened, this specifies the encoding
return_file (bool) – if True, return DataFile object (contains some metainfo); otherwise, return just the file data

pylablib.core.fileio.loadfile.load_dict(path=None, case_normalization=None, inline_dtype='generic', entry_format='value', inline_out_type='default', skip_lines=0, allow_duplicate_keys=False, loc='file', encoding=None, transformer=None, return_file=False)[source]

Load data from the dictionary file.

Parameters:

path (str) – path to the file or a file-like object
case_normalization (str) – If None, the dictionary paths are case-sensitive; otherwise, defines the way the entries are normalized ('lower' or 'upper').
inline_dtype (str) – dtype for inlined tables.
inline_out_type (str) – type of the result of the inline table: 'array' for numpy array, 'pandas' for pandas DataFrame, 'raw' for raw InlineTable data containing tuple (column_data, column_names), or 'default' (determined by the library default; 'pandas' by default).
entry_format (str) – Determines the way for dealing with dict_entry.IDictionaryEntry objects (objects transformed into dictionary branches with special recognition rules). Can be 'branch' (don’t attempt to recognize those objects, leave dictionary as in the file), 'dict_entry' (recognize and leave as dict_entry.IDictionaryEntry objects) or 'value' (recognize and keep the value).
allow_duplicate_keys (bool) – if False and the same key is mentioned twice in the file, raise an error
skip_lines (int) – Number of lines to skip from the beginning of the file.
loc (str) – location type ("file" means the usual file location; see location.get_location() for details)
encoding – if a new file location is opened, this specifies the encoding
transformer – binary transformer applied to the file data (note that the whole file is read into memory and transformed if the transformer is specified)
return_file (bool) – if True, return DataFile object (contains some metainfo); otherwise, return just the file data

pylablib.core.fileio.loadfile.load_generic(path=None, file_format=None, loc='file', encoding=None, transformer=None, return_file=False, **kwargs)[source]

Load data from the file.

Parameters:

path (str) – path to the file or a file-like object
file_format (str) – input file format; if None, attempt to auto-detect file format (same as 'generic'); can also be an IInputFileFormat instance for specific reading method
loc (str) – location type ("file" means the usual file location; see location.get_location() for details)
encoding – if a new file location is opened, this specifies the encoding
transformer – binary transformer applied to the file data (note that the whole file is read into memory and transformed if the transformer is specified)
return_file (bool) – if True, return DataFile object (contains some metainfo); otherwise, return just the file data

**kwargs are passed to the file formatter used to read the data (see CSVTableInputFileFormat, DictionaryInputFileFormat and BinaryTableInputFileFormatter for the possible arguments). The default format names are:

'generic': Generic file format. Attempt to autodetect, raise IOError if unsuccessful;

'txt': Generic text file. Attempt to autodetect, raise IOError if unsuccessful;

'csv': CSV file, corresponds to CSVTableInputFileFormat;

'dict': Dictionary file, corresponds to DictionaryInputFileFormat;

'bin': Binary file, corresponds to BinaryTableInputFileFormatter.

pylablib.core.fileio.loadfile_utils module

Miscellaneous utilities for reading data files.

pylablib.core.fileio.loadfile_utils.is_unprintable_character(chn)[source]

pylablib.core.fileio.loadfile_utils.detect_binary_file(stream)[source]: Check if the opened file is binary

pylablib.core.fileio.loadfile_utils.test_row_type(line)[source]

Try to determine whether the line is a comment line, a numerical data row, a dictionary row or an unrecognized row.

Doesn’t distinguish between them with great accuracy; useful only for guessing the file format.

pylablib.core.fileio.loadfile_utils.detect_textfile_type(stream)[source]: Try to autodetect text file type: dictionary or table

pylablib.core.fileio.loadfile_utils.test_savetime_comment(line)[source]: Test if the comment resembles a savetime line

pylablib.core.fileio.loadfile_utils.find_savetime_comment(comments)[source]: Try to find savetime comment

pylablib.core.fileio.loadfile_utils.test_columns_line(line, cols_num)[source]: Test if the line looks like a list of columns for a given columns number

pylablib.core.fileio.loadfile_utils.find_columns_lines(corrupted, comments, cols_num)[source]: Try to find a column line (for a given columns number) among the comment and corrupted lines

class pylablib.core.fileio.loadfile_utils.InlineTable(table)[source]

Bases: object

Simple marker class that denotes that the wrapped numpy 2D array should be written inline

pylablib.core.fileio.loadfile_utils.parse_dict_line(line)[source]: Parse stripped dictionary file line

pylablib.core.fileio.loadfile_utils.read_dict_and_comments(f, case_normalization=None, inline_dtype='generic', allow_duplicate_keys=False)[source]

Load dictionary entries and comments from the file stream.

Parameters:

f – file stream
case_normalization – case normalization for the returned dictionary; None means that it’s case sensitive, "upper" and "lower" determine how they are normalized
inline_dtype – dtype for inline tables; by default, use the most generic type (can include Python objects such as lists or strings)
allow_duplicate_keys – if False and the same key is listed twice, raise an error

Returns a tuple (data, comment_lines), where data is a dictionary with parsed entries (tables are still represented as ‘raw’, i.e., as a tuple of columns list and column names list), and comment_lines is a list of comment lines.

pylablib.core.fileio.location module

Classes for describing a generic file location.

class pylablib.core.fileio.location.LocationName(path=None, ext=None)[source]

Bases: object

File name inside a location.

Parameters:

path – Path inside the location. Gets normalized according to the Dictionary rules (not case-sensitive; '/' and '\' are the delimiters).
ext (str) – Name extension (None is default).

get_path(default_path='', sep='/')[source]

Get the string path.

If the object’s path is None, use default_path instead. If sep is not None, use it to join the path entries; otherwise, return the path in a list form.

get_ext(default_ext='')[source]

Get the extension.

If the object’s ext is None, use default_ext instead.

to_string(default_path='', default_ext='', path_sep='/', ext_sep='|', add_empty_ext=True)[source]

Convert the path to a string representation.

Parameters:

default_path (str) – Use it as path if the object’s path is None.
default_ext (str) – Use it as extension if the object’s ext is None.
path_sep (str) – Use it to join the path entries.
ext_sep (str) – Use it to join path and extension.
add_empty_ext (bool) – If False and the extension is empty, don’t add ext_sep in the end.

to_path(default_path='', default_ext='', ext_sep='|', add_empty_ext=True)[source]

Convert the path to a list representation.

Extension is added with ext_sep to the last entry in the path.

Parameters:

default_path (str) – Use it as path if the object’s path is None.
default_ext (str) – Use it as extension if the object’s ext is None.
ext_sep (str) – Use it to join path and extension.
add_empty_ext (bool) – If False and the extension is empty, don’t add ext_sep in the end.

static from_string(expr, ext_sep='|')[source]

Create a LocationName object from a string representation.

ext_sep defines the extension separator; the path separators are '/' and '\'. An empty path or extension translates into None.
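As a rough illustration of the string form described above, the following standalone sketch splits a name into path entries and an extension. The helper split_location_name is hypothetical and not part of pylablib; it assumes that everything after the first ext_sep is the extension, and it omits the case normalization that LocationName applies to the path.

```python
import re

def split_location_name(expr, ext_sep="|"):
    """Hypothetical helper: split 'a/b|ext' into (path_entries, ext)."""
    # Assumption: everything after the first ext_sep is the extension.
    path_part, _, ext_part = expr.partition(ext_sep)
    # Both '/' and '\' act as path delimiters; empty entries are dropped.
    entries = [p for p in re.split(r"[/\\]", path_part) if p]
    # An empty path or extension translates into None.
    return (entries or None), (ext_part or None)

print(split_location_name("folder/sub|dat"))  # (['folder', 'sub'], 'dat')
print(split_location_name("folder\\sub"))     # (['folder', 'sub'], None)
print(split_location_name("|dat"))            # (None, 'dat')
```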

static from_object(obj)[source]

Create a LocationName object from an object.

obj can be a LocationName (returned unchanged), a tuple or list (used as constructor arguments), a string (treated as a string representation) or None (return an empty name).

copy()[source]

class pylablib.core.fileio.location.LocationFile(loc, name=None)[source]

Bases: object

A file at a location.

Combines information about the location and the name within this location. Can be opened for reading or writing.

Parameters:

loc – File location.
name – File’s name inside the location.

loc: File location.

name: File’s name inside the location.

opened: Whether the file is currently opened.

open(mode='read', data_type='text')[source]

Open the file.

Parameters:

mode (str) – Opening mode. Can be 'read', 'write' or 'append', as well as standard abbreviation (e.g., "r" or "wb").
data_type (str) – Either 'text' or 'binary'; if mode is an abbreviation, this parameter is ignored (i.e., open("r","binary") still opens file as text).
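The interplay of mode and data_type can be pictured as a mapping onto the mode strings of Python's built-in open(). The function resolve_mode below is a hypothetical sketch under that assumption and is not taken from the library.

```python
def resolve_mode(mode="read", data_type="text"):
    """Hypothetical sketch: map (mode, data_type) to a built-in open() mode string."""
    full_names = {"read": "r", "write": "w", "append": "a"}
    if mode in full_names:
        # Full mode names honor data_type.
        if data_type not in ("text", "binary"):
            raise ValueError("unrecognized data type: {}".format(data_type))
        return full_names[mode] + ("b" if data_type == "binary" else "")
    # Anything else is treated as a standard abbreviation and used as is,
    # so data_type is ignored.
    return mode

print(resolve_mode("write", "binary"))  # wb
print(resolve_mode("r", "binary"))      # r
```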

close()[source]: Close the file

class pylablib.core.fileio.location.IDataLocation[source]

Bases: object

Generic location.

is_free(name=None)[source]: Check if the name is unoccupied

generate_new_name(prefix_name, idx=0)[source]

Generate a new name inside the location using the given prefix and starting index.

If idx is None, check just the prefix_name first before starting to append indices.
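The search for a free name can be sketched as follows. This is a standalone illustration, not the library's code: the is_free callback stands in for the location's own check, and the naming pattern (index appended directly to the prefix) is an assumption.

```python
import itertools

def generate_new_name(prefix_name, idx=0, is_free=lambda name: True):
    """Hypothetical sketch of picking the first unoccupied name."""
    # With idx=None, the bare prefix is tried before any index is appended.
    if idx is None:
        if is_free(prefix_name):
            return prefix_name
        idx = 0
    for i in itertools.count(idx):
        # Assumed naming pattern: prefix immediately followed by the index.
        candidate = "{}{}".format(prefix_name, i)
        if is_free(candidate):
            return candidate

taken = {"scan", "scan0", "scan1"}
free = lambda n: n not in taken
print(generate_new_name("scan", idx=None, is_free=free))  # scan2
print(generate_new_name("log", idx=None, is_free=free))   # log
print(generate_new_name("scan", idx=1, is_free=free))     # scan2
```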

open(name=None, mode='read', data_type='text')[source]

Open a location file.

Parameters:

name – File name inside the location (None means ‘default’ location).
mode (str) – Opening mode. Can be 'read', 'write' or 'append', as well as standard abbreviation (e.g., "r" or "wb").
data_type (str) – Either 'text' or 'binary'; if mode is an abbreviation, this parameter is ignored (i.e., open("r","binary") still opens file as text).

close(name)[source]: Close a location file.

list_opened_files()[source]: Get a dictionary {string_name: location_file} of all files opened in this location

class pylablib.core.fileio.location.OpenedFileLocation(f, open_error=False, check_mode=False, check_data_type=True)[source]

Bases: object

File location which corresponds to an already opened file.

is_free(name=None)[source]

generate_new_name(prefix_name, idx=0)[source]

open(name=None, mode='read', data_type='text')[source]

close(name)[source]

list_opened_files()[source]

class pylablib.core.fileio.location.IFileSystemDataLocation(encoding=None, transformer=None)[source]

Bases: IDataLocation

A generic filesystem data location.

A single file name describes a single file in the filesystem.

get_filesystem_path(name=None, path_type='absolute')[source]

Get the filesystem path corresponding to a given name.

path_type can be 'absolute' (return absolute path), 'relative' (return relative path; level depends on the location) or 'name' (only return path inside the location).

is_free(name=None)[source]: Check if the name is unoccupied

open(name=None, mode='read', data_type='text')[source]

Open a location file.

Parameters:

name – File name inside the location (None means ‘default’ location).
mode (str) – Opening mode. Can be 'read', 'write' or 'append', as well as standard abbreviation (e.g., "r" or "wb").
data_type (str) – Either 'text' or 'binary'; if mode is an abbreviation, this parameter is ignored (i.e., open("r","binary") still opens file as text).

close(name)[source]: Close a location file

list_opened_files()[source]: Get a dictionary {string_name: location_file} of all files opened in this location

generate_new_name(prefix_name, idx=0)

Generate a new name inside the location using the given prefix and starting index.

If idx is None, check just the prefix_name first before starting to append indices.

class pylablib.core.fileio.location.SingleFileSystemDataLocation(file_path, encoding=None, transformer=None)[source]

Bases: IFileSystemDataLocation

A location describing a single file.

Any use of a non-default name raises ValueError.

Parameters: file_path (str) – The path to the file.

get_filesystem_path(name=None, path_type='absolute')[source]

Get the filesystem path corresponding to a given name.

path_type can be 'absolute' (return absolute path), 'relative' (return relative path; level depends on the location) or 'name' (only return path inside the location).

close(name): Close a location file

generate_new_name(prefix_name, idx=0)

Generate a new name inside the location using the given prefix and starting index.

If idx is None, check just the prefix_name first before starting to append indices.

is_free(name=None): Check if the name is unoccupied

list_opened_files(): Get a dictionary {string_name: location_file} of all files opened in this location

open(name=None, mode='read', data_type='text')

Open a location file.

Parameters:

name – File name inside the location (None means ‘default’ location).
mode (str) – Opening mode. Can be 'read', 'write' or 'append', as well as standard abbreviation (e.g., "r" or "wb").
data_type (str) – Either 'text' or 'binary'; if mode is an abbreviation, this parameter is ignored (i.e., open("r","binary") still opens file as text).

class pylablib.core.fileio.location.PrefixedFileSystemDataLocation(file_path, prefix_template='{0}_{1}', encoding=None, transformer=None)[source]

Bases: IFileSystemDataLocation

A location describing a set of prefixed files.

Parameters:

file_path (str) – A master path. Its name is used as a prefix, and its extension is used as a default.
prefix_template (str) – A str.format() string for generating prefixed files. Has two arguments: the first is the master name, the second is the sub_location.

Multi-level paths translate into nested folders (the top level folder is combined from the file_path prefix and the first path entry).
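One way to picture how a name maps onto a filesystem path in a prefixed location is the sketch below. It is an illustration under stated assumptions, not pylablib's implementation: the helper prefixed_path is hypothetical, the default name is assumed to map to the master path itself, and the master extension is assumed to be appended to every generated path.

```python
import posixpath

def prefixed_path(file_path, name_entries=None, prefix_template="{0}_{1}"):
    """Hypothetical sketch: build a path for a name inside a prefixed location."""
    folder, master = posixpath.split(file_path)
    master_name, default_ext = posixpath.splitext(master)
    if not name_entries:
        # Assumption: the default name corresponds to the master path itself.
        return file_path
    # The first path entry is combined with the master name via the template...
    first = prefix_template.format(master_name, name_entries[0])
    # ...and any further entries become nested levels below it.
    parts = [first] + list(name_entries[1:])
    return posixpath.join(folder, *parts) + default_ext

print(prefixed_path("data/run.dat", ["raw"]))         # data/run_raw.dat
print(prefixed_path("data/run.dat", ["raw", "ch1"]))  # data/run_raw/ch1.dat
```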

get_filesystem_path(name=None, path_type='absolute')[source]

Get the filesystem path corresponding to a given name.

path_type can be 'absolute' (return absolute path), 'relative' (return relative path; level depends on the location) or 'name' (only return path inside the location).

close(name): Close a location file

generate_new_name(prefix_name, idx=0)

Generate a new name inside the location using the given prefix and starting index.

If idx is None, check just the prefix_name first before starting to append indices.

is_free(name=None): Check if the name is unoccupied

list_opened_files(): Get a dictionary {string_name: location_file} of all files opened in this location

open(name=None, mode='read', data_type='text')

Open a location file.

Parameters:

name – File name inside the location (None means ‘default’ location).
mode (str) – Opening mode. Can be 'read', 'write' or 'append', as well as standard abbreviation (e.g., "r" or "wb").
data_type (str) – Either 'text' or 'binary'; if mode is an abbreviation, this parameter is ignored (i.e., open("r","binary") still opens file as text).

class pylablib.core.fileio.location.FolderFileSystemDataLocation(folder_path, default_name='content', default_ext='', encoding=None, transformer=None)[source]

Bases: IFileSystemDataLocation

A location describing a single folder.

Parameters:

folder_path (str) – A path to the folder. Can also have one or two '|' symbols in the end (e.g., 'folder|file|dat'), which separate default name and extension (overrides default_name and default_ext parameters)
default_name (str) – The default file name.
default_ext (str) – The default file extension.

Multi-level paths translate into nested subfolders.
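The 'folder|file|dat' form of folder_path can be illustrated with a small standalone parser. The helper parse_folder_spec is hypothetical and only mirrors the rule stated above (name and extension given in the path override the keyword defaults).

```python
def parse_folder_spec(folder_path, default_name="content", default_ext=""):
    """Hypothetical sketch: split 'folder|name|ext' into its three parts."""
    parts = folder_path.split("|")
    if len(parts) > 3:
        raise ValueError("at most two '|' symbols are expected")
    folder = parts[0]
    # Name and extension given in the path override the keyword defaults.
    if len(parts) > 1:
        default_name = parts[1]
    if len(parts) > 2:
        default_ext = parts[2]
    return folder, default_name, default_ext

print(parse_folder_spec("folder|file|dat"))  # ('folder', 'file', 'dat')
print(parse_folder_spec("folder"))           # ('folder', 'content', '')
```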

get_filesystem_path(name=None, path_type='absolute')[source]

Get the filesystem path corresponding to a given name.

path_type can be 'absolute' (return absolute path), 'relative' (return relative path; level depends on the location) or 'name' (only return path inside the location).

close(name): Close a location file

generate_new_name(prefix_name, idx=0)

Generate a new name inside the location using the given prefix and starting index.

If idx is None, check just the prefix_name first before starting to append indices.

is_free(name=None): Check if the name is unoccupied

list_opened_files(): Get a dictionary {string_name: location_file} of all files opened in this location

open(name=None, mode='read', data_type='text')

Open a location file.

Parameters:

name – File name inside the location (None means ‘default’ location).
mode (str) – Opening mode. Can be 'read', 'write' or 'append', as well as standard abbreviation (e.g., "r" or "wb").
data_type (str) – Either 'text' or 'binary'; if mode is an abbreviation, this parameter is ignored (i.e., open("r","binary") still opens file as text).

pylablib.core.fileio.location.get_location(path, loc, *args, **kwargs)[source]

Build a location.

If path or loc are instances of IDataLocation, return them unchanged. If loc is a string, it describes location kind:

'single_file': SingleFileSystemDataLocation with the given path.

'file' or 'prefixed_file': PrefixedFileSystemDataLocation with the given path as a master path.

'folder': FolderFileSystemDataLocation with the given folder path.

Any additional arguments are relayed to the constructors.

pylablib.core.fileio.parse_csv module

Utilities for parsing CSV files.

class pylablib.core.fileio.parse_csv.ChunksAccumulator(dtype='numeric', ignore_corrupted_lines=True, trim_rows=False)[source]

Bases: object

Class for accumulating data chunks into a single array.

Parameters:

dtype – dtype of entries; can be either a single type, or a list of types (one per column). Possible dtypes are: 'int', 'float', 'complex', 'numeric' (tries to coerce to minimal possible numeric type, raises error if data can’t be converted to complex), 'generic' (accept arbitrary types, including lists, dictionaries, escaped strings, etc.), 'raw' (keep raw string).
ignore_corrupted_lines – if True, skip corrupted (e.g., non-numeric for numeric dtype, or with too few entries) lines; otherwise, raise ValueError.
trim_rows – if True and the row length is larger than expected, drop extra entries; otherwise, treat the row as corrupted

corrupted_number()[source]

convert_columns(raw_columns)[source]: Convert raw columns into appropriate data structure (numpy array for numeric dtypes, list for generic and raw).

add_columns(columns)[source]: Append columns (lists or numpy arrays) to the existing data.

add_chunk(chunk)[source]: Add a chunk (2D list) to the pre-existing data.

pylablib.core.fileio.parse_csv.read_columns(f, dtype, delimiters='\\s*,\\s*|\\s+', empty_entry_substitute=None, ignore_corrupted_lines=True, trim_rows=False, stop_comment=None)[source]

Load columns from the file stream f.

Parameters:

dtype – dtype of entries; can be either a single type, or a list of types (one per column). Possible dtypes are: 'int', 'float', 'complex', 'numeric' (tries to coerce to minimal possible numeric type, raises error if data can’t be converted to complex), 'generic' (accept arbitrary types, including lists, dictionaries, escaped strings, etc.), 'raw' (keep raw string).
delimiters (str) – Regex string which recognizes delimiters (by default r"\s*,\s*|\s+", i.e., commas and whitespaces).
empty_entry_substitute – Substitute for empty table entries. If None, all empty table entries are skipped.
ignore_corrupted_lines – If True, skip corrupted (e.g., non-numeric for numeric dtype, or with too few entries) lines; otherwise, raise ValueError.
trim_rows – if True and the row length is larger than expected, drop extra entries; otherwise, treat the row as corrupted
stop_comment (str) – Regex string for the stopping comment. If not None, the function stops when a comment matching the stop_comment regex is encountered.

Returns:

(columns, comments, corrupted_lines).

columns is a list of columns with data.

comments is a list of comment strings.

corrupted_lines is a dict {'size':list, 'type':list} of corrupted lines (already split into entries), based on the corruption type ('size' means too small size, 'type' means it couldn’t be converted using provided dtype).

Return type:

tuple
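The default delimiters regex and the shape of the returned tuple can be illustrated with a simplified standalone sketch. It is not the library's parser: the function split_rows is hypothetical, it assumes '#' as the comment marker, converts every entry to float, and drops extra entries as with trim_rows=True.

```python
import re

DELIMITERS = r"\s*,\s*|\s+"  # default from the signature: commas and whitespace

def split_rows(lines, cols_num, comment_prefix="#"):
    """Hypothetical sketch: sort lines into columns, comments and corrupted lines."""
    rows, comments = [], []
    corrupted = {"size": [], "type": []}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(comment_prefix):
            comments.append(line[len(comment_prefix):].strip())
            continue
        entries = re.split(DELIMITERS, line)
        if len(entries) < cols_num:
            corrupted["size"].append(entries)  # too few entries
            continue
        try:
            rows.append([float(e) for e in entries[:cols_num]])
        except ValueError:
            corrupted["type"].append(entries)  # not convertible to float
    # Transpose the accepted rows into a list of columns.
    columns = [list(col) for col in zip(*rows)]
    return columns, comments, corrupted

columns, comments, corrupted = split_rows(["# x, y", "1, 2", "3 4", "5", "6, abc"], 2)
print(columns)    # [[1.0, 3.0], [2.0, 4.0]]
print(comments)   # ['x, y']
print(corrupted)  # {'size': [['5']], 'type': [['6', 'abc']]}
```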

pylablib.core.fileio.parse_csv.columns_to_table(data, columns=None, dtype='numeric', out_type='columns')[source]

Convert data (columns list) into a table.

Parameters:

columns – either the number of columns, or a list of column names.
out_type (str) – type of the result: 'array' for numpy array, 'pandas' for pandas DataFrame, 'columns' for tuple (data, columns)

pylablib.core.fileio.parse_csv.read_table(f, dtype='numeric', columns=None, out_type='columns', delimiters='\\s*,\\s*|\\s+', empty_entry_substitute=None, ignore_corrupted_lines=True, trim_rows=False, stop_comment=None)[source]

Load table from the file stream f.

Arguments are the same as in read_columns() and columns_to_table().

Returns:

(table, comments, corrupted_lines).

table is a table of the format out_type.

corrupted_lines is a dict {'size':list, 'type':list} of corrupted lines (already split into entries), based on the corruption type ('size' means too small size, 'type' means it couldn’t be converted using provided dtype).

comments is a list of comment strings.

Return type:

tuple

pylablib.core.fileio.savefile module

Utilities for writing data files.

class pylablib.core.fileio.savefile.IOutputFileFormat(format_name)[source]

Bases: object

Generic class for an output file format.

Parameters: format_name (str) – The name of the format (to be defined in subclasses).

write_file(location_file, to_save)[source]

write_data(location_file, data)[source]

write(location_file, data)[source]

class pylablib.core.fileio.savefile.ITextOutputFileFormat(format_name, save_props=True, save_comments=True, save_time=True, new_time=True)[source]

Bases: IOutputFileFormat

Generic class for a text output file format.

Parameters:

format_name (str) – The name of the format (to be defined in subclasses).
save_props (bool) – If True and saving datafile.DataFile object, save its props metainfo.
save_comments (bool) – If True and saving datafile.DataFile object, save its comments metainfo.
save_time (bool) – If True, append the file creation time in the end.
new_time (bool) – If saving datafile.DataFile object, determines if the time should be updated to the current time.

make_comment_line(comment)[source]

make_prop_line(name, value)[source]

make_savetime_line(time)[source]

static write_line(stream, line)[source]

write_comments(stream, comments)[source]

write_props(stream, props)[source]

write_savetime(stream, time)[source]

write_file(location_file, to_save)[source]

write(location_file, data)

write_data(location_file, data)

class pylablib.core.fileio.savefile.CSVTableOutputFileFormat(delimiters='\t', value_formats=None, use_rep_classes=False, save_columns=True, save_props=True, save_comments=True, save_time=True)[source]

Bases: ITextOutputFileFormat

Class for CSV output file format.

Parameters:

delimiters (str) – Used to separate entries in a row.
value_formats (str) – If not None, defines value formats to be passed to utils.string.to_string() function.
use_rep_classes (bool) – If True, use representation classes for Dictionary entries (e.g., numpy arrays will be represented as "array([1, 2, 3])" instead of just "[1, 2, 3]"); This improves storage fidelity, but makes result harder to parse (e.g., by external string parsers).
save_columns (bool) – If True, save column names as a comment line in the beginning of the file.
save_props (bool) – If True and saving datafile.DataFile object, save its props metainfo.
save_comments (bool) – If True and saving datafile.DataFile object, save its comments metainfo.
save_time (bool) – If True, append the file creation time in the end.

get_table_line(line)[source]

get_columns_line(columns)[source]

write_data(location_file, data)[source]

Write data to a CSV file.

Parameters:

location_file – Location of the destination.
data – Data to be saved. Can be a pandas DataFrame or an arbitrary 2D array (numpy array, 2D list, etc.); if the data is not DataFrame or numpy 2D array, it gets converted into a DataFrame using the standard constructor (i.e., 2D list is interpreted as a list of rows)

make_comment_line(comment)

make_prop_line(name, value)

make_savetime_line(time)

write(location_file, data)

write_comments(stream, comments)

write_file(location_file, to_save)

static write_line(stream, line)

write_props(stream, props)

write_savetime(stream, time)

class pylablib.core.fileio.savefile.DictionaryOutputFileFormat(param_formats=None, use_rep_classes=False, table_format='inline', inline_delimiters='\t', inline_formats=None, save_props=True, save_comments=True, save_time=True)[source]

Bases: ITextOutputFileFormat

Class for Dictionary output file format.

Parameters:

param_formats (str) – If not None, defines value formats to be passed to utils.string.to_string() function when writing Dictionary entries.
use_rep_classes (bool) – If True, use representation classes for Dictionary entries (e.g., numpy arrays will be represented as "array([1, 2, 3])" instead of just "[1, 2, 3]"); This improves storage fidelity, but makes result harder to parse (e.g., by external string parsers).
table_format (str) – Default format for table (numpy arrays or pandas DataFrames) entries. Can be 'inline' (table is written inside the file), 'csv' (external CSV file) or 'bin' (external binary file).
inline_delimiters (str) – Used to separate entries in a row for inline tables.
inline_formats (str) – If not None, defines value formats to be passed to utils.string.to_string() function when writing inline tables.
save_props (bool) – If True and saving datafile.DataFile object, save its props metainfo.
save_comments (bool) – If True and saving datafile.DataFile object, save its comments metainfo.
save_time (bool) – If True, append the file creation time in the end.

get_dictionary_line(path, value)[source]

write_data(location_file, data)[source]

Write data to a Dictionary file.

Parameters:

location_file – Location of the destination.
data – Data to be saved. Should be object of class Dictionary.

make_comment_line(comment)

make_prop_line(name, value)

make_savetime_line(time)

write(location_file, data)

write_comments(stream, comments)

write_file(location_file, to_save)

static write_line(stream, line)

write_props(stream, props)

write_savetime(stream, time)

class pylablib.core.fileio.savefile.IBinaryOutputFileFormat(format_name)[source]

Bases: IOutputFileFormat

get_preamble(location_file, data)[source]

write(location_file, data)

write_data(location_file, data)

write_file(location_file, to_save)

class pylablib.core.fileio.savefile.TableBinaryOutputFileFormat(dtype=None, transposed=False)[source]

Bases: IBinaryOutputFileFormat

Class for binary output file format.

Parameters:

dtype – a string with numpy dtype (e.g., "<f8") used to save the data. By default, use the little-endian ("<") variant of the supplied data array's dtype.
transposed (bool) – If False, write the data row-wise; otherwise, write it column-wise.

get_dtype(table)[source]

get_preamble(location_file, data)[source]

Generate a preamble (dictionary describing the file format).

The parameters are 'dtype', 'packing' ('transposed' or 'flatten', depending on the transposed attribute), 'ncol' (number of columns) and 'nrows' (number of rows).
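The difference between the two packings, and the kind of information the preamble carries, can be sketched with the standard struct module. This is an illustration only: the function pack_table is hypothetical, it supports just a few dtype strings, and the preamble keys simply follow the names listed above.

```python
import struct

def pack_table(rows, dtype="<f8", transposed=False):
    """Hypothetical sketch: serialize a 2D list and build a descriptive preamble."""
    codes = {"<f8": "<d", "<f4": "<f", "<i4": "<i"}  # small numpy-dtype -> struct map
    nrows, ncols = len(rows), len(rows[0])
    if transposed:
        # Column-wise: all values of column 0, then column 1, and so on.
        flat = [rows[r][c] for c in range(ncols) for r in range(nrows)]
    else:
        # Row-wise: all values of row 0, then row 1, and so on.
        flat = [v for row in rows for v in row]
    byte_order, type_code = codes[dtype]
    data = struct.pack("{}{}{}".format(byte_order, len(flat), type_code), *flat)
    preamble = {"dtype": dtype,
                "packing": "transposed" if transposed else "flatten",
                "ncol": ncols, "nrows": nrows}
    return data, preamble

table = [[1.0, 2.0], [3.0, 4.0]]
data, preamble = pack_table(table)
print(struct.unpack("<4d", data))  # (1.0, 2.0, 3.0, 4.0)
data_t, preamble_t = pack_table(table, transposed=True)
print(struct.unpack("<4d", data_t))  # (1.0, 3.0, 2.0, 4.0)
```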

write_data(location_file, data)[source]

Write data to a binary file.

Parameters:

location_file – Location of the destination.
data – Data to be saved. Can be a pandas DataFrame or an arbitrary 2D array (numpy array, 2D list, etc.). It is converted to a numpy array before saving.

write_file(location_file, to_save)[source]

write(location_file, data)

pylablib.core.fileio.savefile.get_output_format(data, output_format, **kwargs)[source]

pylablib.core.fileio.savefile.save_raw(data, path, loc='file', encoding=None, transformer=None)[source]

Save raw binary data to the file.

Parameters:

data – data to write
path (str) – path to the file or a file-like object
loc (str) – location type ("file" means the usual file location; see location.get_location() for details)
encoding – if a new file location is opened, this specifies the encoding
transformer – binary transformer applied to the file data

pylablib.core.fileio.savefile.save_csv(data, path, delimiters='\t', value_formats=None, use_rep_classes=False, save_columns=True, save_props=True, save_comments=True, save_time=True, loc='file', encoding=None, transformer=None)[source]

Save data to a CSV file.

Parameters:

data – Data to be saved (2D numpy array, pandas DataFrame, or a datafile.DataFile object containing this data).
path (str) – Path to the file or a file-like object.
delimiters (str) – Used to separate entries in a row.
value_formats (str) – If not None, defines value formats to be passed to utils.string.to_string() function.
use_rep_classes (bool) – If True, use representation classes for Dictionary entries (e.g., numpy arrays will be represented as "array([1, 2, 3])" instead of just "[1, 2, 3]"); This improves storage fidelity, but makes result harder to parse (e.g., by external string parsers).
save_columns (bool) – If True, save column names as a comment line in the beginning of the file.
save_props (bool) – If True and saving datafile.DataFile object, save its props metainfo.
save_comments (bool) – If True and saving datafile.DataFile object, save its comments metainfo.
save_time (bool) – If True, append the file creation time in the end.
loc (str) – Location type.
encoding – if a new file location is opened, this specifies the encoding.
transformer – binary transformer applied to the file data

pylablib.core.fileio.savefile.save_csv_desc(data, path, loc='file', encoding=None)[source]

Save data table to a dictionary file with an inlined table.

Compared to save_csv(), supports more pandas features (index, column multi-index), but can only be directly read by pylablib.

Parameters:

data – Data to be saved (2D numpy array, pandas DataFrame, or a datafile.DataFile object containing this data).
path (str) – Path to the file or a file-like object.
loc (str) – Location type.
encoding – if a new file location is opened, this specifies the encoding.

pylablib.core.fileio.savefile.save_bin(data, path, dtype=None, transposed=False, loc='file', encoding=None, transformer=None)[source]

Save data to a binary file.

Parameters:

data – Data to be saved (2D numpy array, pandas DataFrame, or a datafile.DataFile object containing this data).
path (str) – Path to the file or a file-like object.
dtype – numpy.dtype describing the data. By default, use the little-endian ("<") variant of the supplied data array's dtype.
transposed (bool) – If False, write the data row-wise; otherwise, write it column-wise.
loc (str) – Location type.
encoding – if a new file location is opened, this specifies the encoding.
transformer – binary transformer applied to the file data

pylablib.core.fileio.savefile.save_bin_desc(data, path, loc='file', encoding=None)[source]

Save data to a binary file with an additional description file, which contains all of the data related to loading (shape, dtype, columns, etc.)

Parameters:

data – Data to be saved (2D numpy array, pandas DataFrame, or a datafile.DataFile object containing this data).
path (str) – Path to the file or a file-like object.
loc (str) – Location type.
encoding – if a new file location is opened, this specifies the encoding.

pylablib.core.fileio.savefile.save_dict(data, path, param_formats=None, use_rep_classes=False, table_format='inline', inline_delimiters='\t', inline_formats=None, save_props=True, save_comments=True, save_time=True, loc='file', encoding=None, transformer=None)[source]

Save dictionary to a text file.

Parameters:

data – Data to be saved.
path (str) – Path to the file or a file-like object.
param_formats (str) – If not None, defines value formats to be passed to utils.string.to_string() function when writing Dictionary entries.
use_rep_classes (bool) – If True, use representation classes for Dictionary entries (e.g., numpy arrays will be represented as "array([1, 2, 3])" instead of just "[1, 2, 3]"); This improves storage fidelity, but makes result harder to parse (e.g., by external string parsers).
table_format (str) – Default format for table (numpy arrays or pandas DataFrames) entries. Can be 'inline' (table is written inside the file), 'csv' (external CSV file) or 'bin' (external binary file).
inline_delimiters (str) – Used to separate entries in a row for inline tables.
inline_formats (str) – If not None, defines value formats to be passed to utils.string.to_string() function when writing inline tables.
save_props (bool) – If True and saving datafile.DataFile object, save its props metainfo.
save_comments (bool) – If True and saving datafile.DataFile object, save its comments metainfo.
save_time (bool) – If True, append the file creation time in the end.
loc (str) – Location type.
encoding – if a new file location is opened, this specifies the encoding.
transformer – binary transformer applied to the file data

pylablib.core.fileio.savefile.save_generic(data, path, output_format=None, loc='file', encoding=None, transformer=None, **kwargs)[source]

Save data to a file.

Parameters:

data – Data to be saved.
path (str) – Path to the file or a file-like object.
output_format (str) – Output file format. Can be either None (defaults to 'csv' for table data and 'dict' for Dictionary data), a string with one of the default format names, or an already prepared IOutputFileFormat object.
loc (str) – Location type.
encoding – if a new file location is opened, this specifies the encoding.
transformer – binary transformer applied to the file data

**kwargs are passed to the file formatter constructor (see CSVTableOutputFileFormat, DictionaryOutputFileFormat and TableBinaryOutputFileFormat for the possible arguments). The default format names are:

'csv': CSV file, corresponds to CSVTableOutputFileFormat and save_csv();

'csv_desc': CSV file with an additional dictionary containing format description, corresponds to DictionaryOutputFileFormat and save_csv_desc();

'bin': Binary file, corresponds to TableBinaryOutputFileFormat and save_bin();

'bin_desc': Binary file with an additional dictionary containing format description, corresponds to DictionaryOutputFileFormat and save_bin_desc();

'dict': Dictionary file, corresponds to DictionaryOutputFileFormat and save_dict()

pylablib.core.fileio.table_stream module

class pylablib.core.fileio.table_stream.TableStreamFile(path, columns=None, delimiter='\t', fmt=None, add_timestamp=False, header_prepend='# ')[source]

Bases: object

Expanding table file.

Can define column names and formats for different columns, and repeatedly write data into the same file. Useful for, e.g., continuous log files.

Parameters:

path (str) – Path to the destination file.
columns (list) – If not None, it’s a list of column names to be added as a header on creation.
delimiter (str) – Values delimiter.
fmt (list) – If not None, it’s a list of format strings for the line entries (e.g., ".3f"); an entry can also be None instead of a format string, which means using the standard to_string() conversion function
add_timestamp (bool) – If True, add the UNIX timestamp in the beginning of each line (columns and format are expanded accordingly)
header_prepend – the string to prepend to the header line; by default, a comment symbol, which is most compatible with the loadfile.load_csv() function

write_text_lines(lines)[source]

Write several text lines into the file.

Create the file if it doesn’t exist (in which case the header is automatically added).

Parameters: lines ([str]) – List of lines to write.

write_row(row)[source]

Write a single data row into the file.

Create the file if it doesn’t exist (in which case the header is automatically added).

Parameters: row (list or numpy.ndarray) – Data row to be added.

write_multiple_rows(rows)[source]

Write multiple data rows into the file.

Create the file if it doesn’t exist (in which case the header is automatically added).

Parameters: rows ([list or numpy.ndarray]) – Data rows to be added.
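The behavior described for this class (header written once on file creation, per-column formats, optional leading timestamp) can be mimicked with a short standalone class. TableLog below is a hypothetical stand-in written for illustration, not pylablib code; the "Timestamp" column name and its ".3f" format are assumptions.

```python
import os
import tempfile
import time

class TableLog:
    """Hypothetical stand-in for an expanding table file (not pylablib code)."""

    def __init__(self, path, columns=None, delimiter="\t", fmt=None,
                 add_timestamp=False, header_prepend="# "):
        # With a timestamp, columns and formats gain one leading entry.
        # The "Timestamp" column name and ".3f" format are assumptions.
        if add_timestamp and columns is not None:
            columns = ["Timestamp"] + list(columns)
        if add_timestamp and fmt is not None:
            fmt = [".3f"] + list(fmt)
        self.path, self.columns, self.fmt = path, columns, fmt
        self.delimiter, self.add_timestamp = delimiter, add_timestamp
        self.header_prepend = header_prepend

    def write_row(self, row):
        row = ([time.time()] if self.add_timestamp else []) + list(row)
        fmts = self.fmt if self.fmt is not None else [None] * len(row)
        # A None format falls back to plain str() conversion.
        entries = [str(v) if f is None else format(v, f) for v, f in zip(row, fmts)]
        is_new = not os.path.exists(self.path)
        with open(self.path, "a") as f:
            if is_new and self.columns is not None:
                # The header is written once, when the file is created.
                f.write(self.header_prepend + self.delimiter.join(self.columns) + "\n")
            f.write(self.delimiter.join(entries) + "\n")

path = os.path.join(tempfile.mkdtemp(), "log.dat")
log = TableLog(path, columns=["x", "y"], fmt=[".3f", None])
log.write_row([1.0, 2])
log.write_row([0.5, 3])
with open(path) as f:
    content = f.read()
print(repr(content))  # '# x\ty\n1.000\t2\n0.500\t3\n'
```

Opening the file in append mode on every write keeps the sketch safe for long-running logs, since nothing is held open between rows.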
