pylablib.core.fileio package
Submodules
pylablib.core.fileio.datafile module
-
class
pylablib.core.fileio.datafile.
DataFile
(data, filepath=None, filetype=None, creation_time=None, comments=None, props=None)[source]¶ Bases:
object
Describes a single datafile.
Parameters:
- data – the main content of the file (usually a numpy array, a pandas DataFrame or a Dictionary)
- filepath (str) – absolute path from which the file was read
- filetype (str) – the source type (e.g., "csv" or "bin")
- creation_time (datetime.datetime) – file creation time
- props (dict) – all the metainfo about the file (extracted from comments, filename, etc.)
- comments (list) – all the comments excluding the ones containing props
pylablib.core.fileio.dict_entry module
Classes for dealing with the Dictionary entries with special conversion rules when saved or loaded.
Used to redefine how certain objects (e.g., tables) inside dictionaries are written into files and read from files.
-
pylablib.core.fileio.dict_entry.
is_dict_entry_branch
(branch)[source]¶ Check if the dictionary branch contains a dictionary entry which needs to be specially converted.
-
class
pylablib.core.fileio.dict_entry.
DictEntryBuilder
(entry_cls, pred=None, **kwargs)[source]¶ Bases:
object
Object for building dictionary entries from objects.
Parameters:
- entry_cls – dictionary entry class
- pred – method used to check if an object can be turned into the corresponding entry; if None, use the default entry class checker (entry_class.is_data_valid)
- kwargs – keyword arguments passed to the entry constructor along with the data
-
class
pylablib.core.fileio.dict_entry.
DictEntryParser
(entry_cls, pred=None, **kwargs)[source]¶ Bases:
object
Object for building dictionary entries from dictionary branches.
Parameters:
- entry_cls – dictionary entry class
- pred – method used to check if a dictionary branch can be turned into the corresponding entry; if None, use the default entry class checker (entry_class.is_branch_valid)
- kwargs – keyword arguments passed to the entry from_dict class method along with the branch
-
pylablib.core.fileio.dict_entry.
add_dict_entry_builder
(builder)[source]¶ Add an entry builder to the global list of builders
-
pylablib.core.fileio.dict_entry.
add_dict_entry_parser
(parser)[source]¶ Add an entry parser to the global list of parsers
-
pylablib.core.fileio.dict_entry.
add_dict_entry_class
(cls)[source]¶ Add an entry class.
Automatically registers builder and parser, which take no additional arguments and use default class method to determine if an object/branch can be converted into an entry.
-
pylablib.core.fileio.dict_entry.
from_data
(data, builders=None)[source]¶ Build a dictionary entry from the data.
builders can contain an additional list of builders to try before using the default ones.
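The builder lookup described above (a predicate per builder, a global list of defaults, and an optional extra list that is tried first) can be illustrated with a small standalone sketch. It does not use pylablib: the names Builder, UpperEntry, add_builder and build_entry are hypothetical stand-ins, and returning None when nothing matches is an assumption rather than documented behavior.

```python
class UpperEntry:
    """Hypothetical entry class that wraps string data."""

    def __init__(self, data, **kwargs):
        self.data = data
        self.kwargs = kwargs

    @classmethod
    def is_data_valid(cls, data):
        return isinstance(data, str)


class Builder:
    """Toy builder: an entry class, an optional predicate and extra keyword arguments."""

    def __init__(self, entry_cls, pred=None, **kwargs):
        self.entry_cls = entry_cls
        # With no predicate, fall back to the entry class's own checker.
        self.pred = pred if pred is not None else entry_cls.is_data_valid
        self.kwargs = kwargs

    def build(self, data):
        return self.entry_cls(data, **self.kwargs) if self.pred(data) else None


_default_builders = []  # stand-in for the global list of builders


def add_builder(builder):
    _default_builders.append(builder)


def build_entry(data, builders=None):
    """Try the extra builders first, then the defaults (assumption: None if nothing matches)."""
    for builder in list(builders or []) + _default_builders:
        entry = builder.build(data)
        if entry is not None:
            return entry
    return None


add_builder(Builder(UpperEntry))
custom = Builder(UpperEntry, pred=lambda d: isinstance(d, bytes), tag="custom")
```

Here build_entry(b"x", builders=[custom]) is handled by the extra builder, while build_entry("abc", builders=[custom]) falls through to the registered default.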
-
pylablib.core.fileio.dict_entry.
from_dict
(dict_ptr, loc, parsers=None)[source]¶ Build a dictionary entry from the dictionary branch and the file location.
parsers can contain an additional list of parsers to try before using the default ones.
-
class
pylablib.core.fileio.dict_entry.
IDictionaryEntry
(data)[source]¶ Bases:
object
A generic Dictionary entry.
Contains data represented by the node, as well as the way to represent this data as a dictionary branch.
Parameters: data – data to be wrapped -
is_data_valid
(class method)[source]¶ check if a data object can be wrapped by the current entry class
-
from_dict
(class method)[source]¶ create a dictionary entry of a given class from the dictionary branch
-
classmethod
is_data_valid
(data)[source] Check if a data object can be wrapped by the current entry class
-
classmethod
is_branch_valid
(branch)[source] Check if a branch can be parsed by the current entry class
-
classmethod
from_dict
(dict_ptr, loc)[source] Convert a dictionary branch to a specific IDictionaryEntry object.
Parameters:
- dict_ptr (dictionary.DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be loaded.
-
to_dict
(dict_ptr, loc)[source] Convert data to a dictionary branch on saving.
Parameters: - dict_ptr (dictionary.DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – File location for the data to be saved.
-
-
pylablib.core.fileio.dict_entry.
parse_stored_table_data
(desc=None, data=None, out_type='pandas')[source]¶ Parse table data corresponding to the given description dictionary and data.
Parameters:
- desc – description dictionary; can be None if no description is given
- data – separately loaded data; can be None if no data is given (in this case, assume that it is stored in the description dictionary); can be a tuple (column_data, column_names) (such as the one returned by parse_csv.read_table()), or an InlineTable object containing such a tuple
- out_type (str) – output format of the data ('array' for numpy arrays or 'pandas' for pandas DataFrame objects)
Returns: tuple (data, columns), where data is the data table in the specified format, and columns is the list of columns
-
class
pylablib.core.fileio.dict_entry.
ITableDictionaryEntry
(data, columns=None)[source]¶ Bases:
pylablib.core.fileio.dict_entry.IDictionaryEntry
A generic table Dictionary entry.
Parameters:
- data – Table data.
- columns (list) – If not None, a list of column names (if None and data is a pandas DataFrame object, get column names from that).
-
classmethod
is_data_valid
(data)[source]¶ Check if a data object can be wrapped by the current entry class
-
classmethod
from_dict
(dict_ptr, loc, out_type='pandas')[source]¶ Convert a dictionary branch to a specific DictionaryEntry object.
Parameters:
- dict_ptr (dictionary.DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be loaded.
- out_type (str) – Output format of the data ('array' for numpy arrays or 'pandas' for pandas DataFrame objects), used only if the dictionary doesn’t provide the format.
-
classmethod
is_branch_valid
(branch)¶ Check if a branch can be parsed by the current entry class
-
to_dict
(dict_ptr, loc)¶ Convert data to a dictionary branch on saving.
Parameters: - dict_ptr (dictionary.DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – File location for the data to be saved.
-
class
pylablib.core.fileio.dict_entry.
InlineTableDictionaryEntry
(data, columns=None)[source]¶ Bases:
pylablib.core.fileio.dict_entry.ITableDictionaryEntry
An inlined table Dictionary entry.
Parameters:
- data – Table data.
- columns (list) – If not None, a list of column names (if None and data is a pandas DataFrame object, get column names from that).
-
to_dict
(dict_ptr, loc)[source]¶ Convert the data to a dictionary branch and write the table to the file.
-
classmethod
from_dict
(dict_ptr, loc, out_type='pandas')[source]¶ Build an InlineTableDictionaryEntry object from the dictionary and read the inlined data.
Parameters:
- dict_ptr (dictionary.DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be loaded.
- out_type (str) – Output format of the data ('array' for numpy arrays or 'pandas' for pandas DataFrame objects).
-
classmethod
is_branch_valid
(branch)¶ Check if a branch can be parsed by the current entry class
-
classmethod
is_data_valid
(data)¶ Check if a data object can be wrapped by the current entry class
-
class
pylablib.core.fileio.dict_entry.
IExternalTableDictionaryEntry
(data, file_format, name, columns, force_name=True)[source]¶ Bases:
pylablib.core.fileio.dict_entry.ITableDictionaryEntry
-
classmethod
from_dict
(dict_ptr, loc, out_type='pandas')[source]¶ Convert a dictionary branch to a specific DictionaryEntry object.
Parameters:
- dict_ptr (dictionary.DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be loaded.
- out_type (str) – Output format of the data ('array' for numpy arrays or 'pandas' for pandas DataFrame objects), used only if the dictionary doesn’t provide the format.
-
classmethod
is_branch_valid
(branch)¶ Check if a branch can be parsed by the current entry class
-
classmethod
is_data_valid
(data)¶ Check if a data object can be wrapped by the current entry class
-
to_dict
(dict_ptr, loc)¶ Convert data to a dictionary branch on saving.
Parameters: - dict_ptr (dictionary.DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – File location for the data to be saved.
-
class
pylablib.core.fileio.dict_entry.
ExternalTextTableDictionaryEntry
(data=None, file_format='csv', name='', columns=None, force_name=True)[source]¶ Bases:
pylablib.core.fileio.dict_entry.IExternalTableDictionaryEntry
An external text table Dictionary entry.
Parameters:
- data – Table data.
- file_format (str) – Output file format.
- name (str) – Name template for the external file (default is the full path connected with the "_" symbol).
- columns (list) – If not None, a list of column names (if None and data is a pandas DataFrame object, get column names from that).
- force_name (bool) – If False and the target file already exists, generate a new unique name; otherwise, overwrite the file.
-
to_dict
(dict_ptr, loc)[source]¶ Convert the data to a dictionary branch and save the table to an external file.
-
classmethod
from_dict
(dict_ptr, loc, out_type='pandas')[source]¶ Build an ExternalTextTableDictionaryEntry object from the dictionary and load the external data.
Parameters:
- dict_ptr (dictionary.DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be loaded.
- out_type (str) – Output format of the data ('array' for numpy arrays or 'pandas' for pandas DataFrame objects).
-
classmethod
is_branch_valid
(branch)¶ Check if a branch can be parsed by the current entry class
-
classmethod
is_data_valid
(data)¶ Check if a data object can be wrapped by the current entry class
-
class
pylablib.core.fileio.dict_entry.
ExternalBinTableDictionaryEntry
(data=None, file_format='bin', name='', columns=None, force_name=True)[source]¶ Bases:
pylablib.core.fileio.dict_entry.IExternalTableDictionaryEntry
An external binary table Dictionary entry.
Parameters:
- data – Table data.
- file_format (str) – Output file format.
- name (str) – Name template for the external file (default is the full path connected with the "_" symbol).
- columns (list) – If not None, a list of column names (if None and data is a pandas DataFrame object, get column names from that).
- force_name (bool) – If False and the target file already exists, generate a new unique name; otherwise, overwrite the file.
-
to_dict
(dict_ptr, loc)[source]¶ Convert the data to a dictionary branch and save the table to an external file.
-
classmethod
from_dict
(dict_ptr, loc, out_type='pandas')[source]¶ Build an ExternalBinTableDictionaryEntry object from the dictionary and load the external data.
Parameters:
- dict_ptr (dictionary.DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be loaded.
- out_type (str) – Output format of the data ('array' for numpy arrays or 'pandas' for pandas DataFrame objects).
-
classmethod
is_branch_valid
(branch)¶ Check if a branch can be parsed by the current entry class
-
classmethod
is_data_valid
(data)¶ Check if a data object can be wrapped by the current entry class
-
pylablib.core.fileio.dict_entry.
table_entry_builder
(table_format='inline')[source]¶ Make an entry builder for tables depending on the table format.
Parameters: table_format (str) – Default format for table (numpy arrays or pandas DataFrames) entries. Can be 'inline' (table is written inside the file), 'csv' (external CSV file) or 'bin' (external binary file).
-
class
pylablib.core.fileio.dict_entry.
IExternalFileDictionaryEntry
(data, name='', force_name=True)[source]¶ Bases:
pylablib.core.fileio.dict_entry.IDictionaryEntry
Generic dictionary entry for data in an external file.
Parameters: -
file_format
= None¶
-
static
add_file_format
(subclass)[source]¶ Register an IExternalFileDictionaryEntry as a possible stored file format.
Used to automatically invoke the correct loader when loading the dictionary file. Only needs to be done once after the subclass declaration.
-
to_dict
(dict_ptr, loc)[source]¶ Convert the data to a dictionary branch and save the data to an external file
-
classmethod
from_dict
(dict_ptr, loc)[source]¶ Build an IExternalFileDictionaryEntry object from the dictionary and load the external data.
Parameters:
- dict_ptr (dictionary.DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be loaded.
-
get_preamble
()[source]¶ Generate preamble (a dictionary with supplementary data which allows the data to be loaded from the file)
-
save_file
(location_file)[source]¶ Save stored data into the given location.
Virtual method, should be overloaded in subclasses
-
classmethod
load_file
(location_file, preamble)[source]¶ Load stored data from the given location, using the supplied preamble.
Virtual method, should be overloaded in subclasses
-
classmethod
is_branch_valid
(branch)¶ Check if a branch can be parsed by the current entry class
-
classmethod
is_data_valid
(data)¶ Check if a data object can be wrapped by the current entry class
-
-
class
pylablib.core.fileio.dict_entry.
ExternalNumpyDictionaryEntry
(data, name='', force_name=True, dtype=None)[source]¶ Bases:
pylablib.core.fileio.dict_entry.IExternalFileDictionaryEntry
A dictionary entry which stores the numpy array data into an external file in binary format.
Parameters:
- data – Numpy array data.
- name (str) – Name template for the external file (default is the full path connected with the "_" symbol).
- force_name (bool) – If False and the target file already exists, generate a new unique name; otherwise, overwrite the file.
- dtype – numpy dtype to load/save the data (by default, dtype of the supplied data).
-
file_format
= 'numpy'¶
-
get_preamble
()[source]¶ Generate preamble (a dictionary with supplementary data which allows the data to be loaded from the file)
-
classmethod
load_file
(location_file, preamble)[source]¶ Load stored data from the given location, using the supplied preamble
-
static
add_file_format
(subclass)¶ Register an IExternalFileDictionaryEntry as a possible stored file format.
Used to automatically invoke the correct loader when loading the dictionary file. Only needs to be done once after the subclass declaration.
-
classmethod
from_dict
(dict_ptr, loc)¶ Build an IExternalFileDictionaryEntry object from the dictionary and load the external data.
Parameters:
- dict_ptr (dictionary.DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be loaded.
-
classmethod
is_branch_valid
(branch)¶ Check if a branch can be parsed by the current entry class
-
classmethod
is_data_valid
(data)¶ Check if a data object can be wrapped by the current entry class
-
to_dict
(dict_ptr, loc)¶ Convert the data to a dictionary branch and save the data to an external file
-
class
pylablib.core.fileio.dict_entry.
ExpandedContainerDictionaryEntry
(data)[source]¶ Bases:
pylablib.core.fileio.dict_entry.IDictionaryEntry
A dictionary entry which expands containers (lists, tuples, dictionaries) into subdictionaries.
Useful when the data in the containers is complex, so writing it into one line (as is default for lists and tuples) wouldn’t work.
Parameters: data – Container data. -
classmethod
from_dict
(dict_ptr, loc)[source]¶ Build an ExpandedContainerDictionaryEntry object from the dictionary
-
classmethod
is_branch_valid
(branch)¶ Check if a branch can be parsed by the current entry class
-
classmethod
is_data_valid
(data)¶ Check if a data object can be wrapped by the current entry class
pylablib.core.fileio.loadfile module
Utilities for reading data files.
-
class
pylablib.core.fileio.loadfile.
IInputFileFormat
[source]¶ Bases:
object
Generic class for an input file format.
Based on file_format or autodetection, calls one of its subclasses to read the file.
Defines a single static method
-
class
pylablib.core.fileio.loadfile.
ITextInputFileFormat
[source]¶ Bases:
pylablib.core.fileio.loadfile.IInputFileFormat
Generic class for a text input file format.
Based on file_format or autodetection, calls one of its subclasses to read the file.
-
read
(location_file)¶ Read a file at a given location
-
-
class
pylablib.core.fileio.loadfile.
CSVTableInputFileFormat
(out_type='default', dtype='numeric', columns=None, delimiters=None, empty_entry_substitute=None, ignore_corrupted_lines=True, skip_lines=0)[source]¶ Bases:
pylablib.core.fileio.loadfile.ITextInputFileFormat
Class for CSV input file format.
Parameters:
- out_type (str) – type of the result: 'array' for numpy array, 'pandas' for pandas DataFrame, or 'default' (determined by the library default; 'pandas' by default)
- dtype – dtype of entries; can be either a single type, or a list of types (one per column). Possible dtypes are: 'int', 'float', 'complex', 'numeric' (tries to coerce to minimal possible numeric type, raises error if data can’t be converted to complex), 'generic' (accept arbitrary types, including lists, dictionaries, escaped strings, etc.), 'raw' (keep raw string).
- columns – either the number of columns, or a list of column names.
- delimiters (str) – Regex string which recognizes entry delimiters (by default r"\s*,\s*|\s+", i.e., commas and whitespace).
- empty_entry_substitute – Substitute for empty table entries. If None, all empty table entries are skipped.
- ignore_corrupted_lines (bool) – If True, skip corrupted (e.g., non-numeric for numeric dtype, or with too few entries) lines; otherwise, raise ValueError.
- skip_lines (int) – Number of lines to skip from the beginning of the file.
-
static
detect_file_format
(location_file)¶
-
class
pylablib.core.fileio.loadfile.
DictionaryInputFileFormat
(case_normalization=None, inline_dtype='generic', inline_out_type='default', entry_format='value', allow_duplicate_keys=False, skip_lines=0)[source]¶ Bases:
pylablib.core.fileio.loadfile.ITextInputFileFormat
Class for Dictionary input file format.
Parameters:
- location_file – Location of the data.
- case_normalization (str) – If None, the dictionary paths are case-sensitive; otherwise, defines the way the entries are normalized ('lower' or 'upper').
- inline_dtype (str) – dtype for inlined tables.
- inline_out_type (str) – type of the result of the inline table: 'array' for numpy array, 'pandas' for pandas DataFrame, 'raw' for raw InlineTable data containing tuple (column_data, column_names), or 'default' (determined by the library default; 'pandas' by default).
- entry_format (str) – Determines the way for dealing with dict_entry.IDictionaryEntry objects (objects transformed into dictionary branches with special recognition rules). Can be 'branch' (don’t attempt to recognize those objects, leave the dictionary as in the file), 'dict_entry' (recognize and leave as dict_entry.IDictionaryEntry objects) or 'value' (recognize and keep the value).
- allow_duplicate_keys (bool) – if False and the same key is mentioned twice in the file, raise an error
- skip_lines (int) – Number of lines to skip from the beginning of the file.
-
static
detect_file_format
(location_file)¶
-
class
pylablib.core.fileio.loadfile.
BinaryTableInputFileFormatter
(out_type='default', dtype='<f8', columns=None, packing='flatten', preamble=None, skip_bytes=0)[source]¶ Bases:
pylablib.core.fileio.loadfile.IInputFileFormat
Class for binary input file format.
Parameters:
- location_file – Location of the data.
- out_type (str) – type of the result: 'array' for numpy array, 'pandas' for pandas DataFrame, or 'default' (determined by the library default; 'pandas' by default)
- dtype – numpy.dtype describing the data.
- columns – either the number of columns, or a list of column names.
- packing (str) – The way the 2D array is packed. Can be either 'flatten' (data is stored row-wise) or 'transposed' (data is stored column-wise).
- preamble (dict) – If not None, defines binary file parameters that supersede the parameters supplied to the function. The defined parameters are 'dtype', 'packing', 'ncols' (number of columns) and 'nrows' (number of rows).
- skip_bytes (int) – Number of bytes to skip from the beginning of the file.
-
static
detect_file_format
(location_file)¶
-
pylablib.core.fileio.loadfile.
build_file_format
(location_file, file_format='generic', **kwargs)[source]¶ Create file format (IInputFileFormat instance) for given parameters and file locations.
If file_format is already an instance of IInputFileFormat, return unchanged. If file_format is generic (e.g., "generic" or "test"), attempt to autodetect it from the file.
**kwargs are passed to the file format constructor.
-
pylablib.core.fileio.loadfile.
load_csv
(path=None, out_type='default', dtype='numeric', columns=None, delimiters=None, empty_entry_substitute=None, ignore_corrupted_lines=True, skip_lines=0, loc='file', encoding=None, return_file=False)[source]¶ Load data table from a CSV/table file.
Parameters:
- path (str) – path to the file or a file-like object
- out_type (str) – type of the result: 'array' for numpy array, 'pandas' for pandas DataFrame, or 'default' (determined by the library default; 'pandas' by default)
- dtype – dtype of entries; can be either a single type, or a list of types (one per column). Possible dtypes are: 'int', 'float', 'complex', 'numeric' (tries to coerce to minimal possible numeric type, raises error if data can’t be converted to complex), 'generic' (accept arbitrary types, including lists, dictionaries, escaped strings, etc.), 'raw' (keep raw string).
- columns – either the number of columns, or a list of column names
- delimiters (str) – regex string which recognizes entry delimiters (by default r"\s*,\s*|\s+", i.e., commas and whitespace)
- empty_entry_substitute – substitute for empty table entries. If None, all empty table entries are skipped
- ignore_corrupted_lines (bool) – if True, skip corrupted (e.g., non-numeric for numeric dtype, or with too few entries) lines; otherwise, raise ValueError
- skip_lines (int) – number of lines to skip from the beginning of the file
- loc (str) – location type ("file" means the usual file location; see location.get_location() for details)
- encoding – if a new file location is opened, this specifies the encoding
- return_file (bool) – if True, return DataFile object (contains some metainfo); otherwise, return just the file data
-
pylablib.core.fileio.loadfile.
load_csv_desc
(path=None, loc='file', encoding=None, return_file=False)[source]¶ Load data from the extended CSV table file.
Analogous to load_dict(), but doesn’t allow any additional parameters (which don’t matter in this case).
Parameters:
- path (str) – path to the file or a file-like object
- loc (str) – location type ("file" means the usual file location; see location.get_location() for details)
- encoding – if a new file location is opened, this specifies the encoding
- return_file (bool) – if True, return DataFile object (contains some metainfo); otherwise, return just the file data
-
pylablib.core.fileio.loadfile.
load_bin
(path=None, out_type='default', dtype='<f8', columns=None, packing='flatten', preamble=None, skip_bytes=0, loc='file', encoding=None, return_file=False)[source]¶ Load data from the binary file.
Parameters:
- path (str) – path to the file or a file-like object
- out_type (str) – type of the result: 'array' for numpy array, 'pandas' for pandas DataFrame, or 'default' (determined by the library default; 'pandas' by default)
- dtype – numpy.dtype describing the data.
- columns – either the number of columns, or a list of column names.
- packing (str) – The way the 2D array is packed. Can be either 'flatten' (data is stored row-wise) or 'transposed' (data is stored column-wise).
- preamble (dict) – If not None, defines binary file parameters that supersede the parameters supplied to the function. The defined parameters are 'dtype', 'packing', 'ncols' (number of columns) and 'nrows' (number of rows).
- skip_bytes (int) – Number of bytes to skip from the beginning of the file.
- loc (str) – location type ("file" means the usual file location; see location.get_location() for details)
- encoding – if a new file location is opened, this specifies the encoding
- return_file (bool) – if True, return DataFile object (contains some metainfo); otherwise, return just the file data
-
pylablib.core.fileio.loadfile.
load_bin_desc
(path=None, loc='file', encoding=None, return_file=False)[source]¶ Load data from the binary file with a description.
Analogous to load_dict(), but doesn’t allow any additional parameters (which don’t matter in this case).
Parameters:
- path (str) – path to the file or a file-like object
- loc (str) – location type ("file" means the usual file location; see location.get_location() for details)
- encoding – if a new file location is opened, this specifies the encoding
- return_file (bool) – if True, return DataFile object (contains some metainfo); otherwise, return just the file data
-
pylablib.core.fileio.loadfile.
load_dict
(path=None, case_normalization=None, inline_dtype='generic', entry_format='value', inline_out_type='default', skip_lines=0, allow_duplicate_keys=False, loc='file', encoding=None, return_file=False)[source]¶ Load data from the dictionary file.
Parameters:
- path (str) – path to the file or a file-like object
- case_normalization (str) – If None, the dictionary paths are case-sensitive; otherwise, defines the way the entries are normalized ('lower' or 'upper').
- inline_dtype (str) – dtype for inlined tables.
- inline_out_type (str) – type of the result of the inline table: 'array' for numpy array, 'pandas' for pandas DataFrame, 'raw' for raw InlineTable data containing tuple (column_data, column_names), or 'default' (determined by the library default; 'pandas' by default).
- entry_format (str) – Determines the way for dealing with dict_entry.IDictionaryEntry objects (objects transformed into dictionary branches with special recognition rules). Can be 'branch' (don’t attempt to recognize those objects, leave the dictionary as in the file), 'dict_entry' (recognize and leave as dict_entry.IDictionaryEntry objects) or 'value' (recognize and keep the value).
- allow_duplicate_keys (bool) – if False and the same key is mentioned twice in the file, raise an error
- skip_lines (int) – Number of lines to skip from the beginning of the file.
- loc (str) – location type ("file" means the usual file location; see location.get_location() for details)
- encoding – if a new file location is opened, this specifies the encoding
- return_file (bool) – if True, return DataFile object (contains some metainfo); otherwise, return just the file data
-
pylablib.core.fileio.loadfile.
load_generic
(path=None, file_format=None, loc='file', encoding=None, return_file=False, **kwargs)[source]¶ Load data from the file.
Parameters:
- path (str) – path to the file or a file-like object
- file_format (str) – input file format; if None, attempt to auto-detect file format (same as 'generic'); can also be an IInputFileFormat instance for specific reading method
- loc (str) – location type ("file" means the usual file location; see location.get_location() for details)
- encoding – if a new file location is opened, this specifies the encoding
- return_file (bool) – if True, return DataFile object (contains some metainfo); otherwise, return just the file data
**kwargs are passed to the file formatter used to read the data (see CSVTableInputFileFormat, DictionaryInputFileFormat and BinaryTableInputFileFormatter for the possible arguments).
The default format names are:
- 'generic': Generic file format. Attempt to autodetect, raise IOError if unsuccessful;
- 'txt': Generic text file. Attempt to autodetect, raise IOError if unsuccessful;
- 'csv': CSV file, corresponds to CSVTableInputFileFormat;
- 'dict': Dictionary file, corresponds to DictionaryInputFileFormat;
- 'bin': Binary file, corresponds to BinaryTableInputFileFormatter.
pylablib.core.fileio.loadfile_utils module
Miscellaneous utilities for reading data files.
-
pylablib.core.fileio.loadfile_utils.
detect_binary_file
(stream)[source]¶ Check if the opened file is binary
-
pylablib.core.fileio.loadfile_utils.
test_row_type
(line)[source]¶ Try to determine whether the line is a comment line, a numerical data row, a dictionary row or an unrecognized row.
Doesn’t distinguish with a great accuracy; useful only for trying to guess file format.
-
pylablib.core.fileio.loadfile_utils.
detect_textfile_type
(stream)[source]¶ Try to autodetect text file type: dictionary or table
-
pylablib.core.fileio.loadfile_utils.
test_savetime_comment
(line)[source]¶ Test if the comment resembles a savetime line
-
pylablib.core.fileio.loadfile_utils.
find_savetime_comment
(comments)[source]¶ Try to find savetime comment
-
pylablib.core.fileio.loadfile_utils.
test_columns_line
(line, cols_num)[source]¶ Test if the line looks like a list of columns for a given columns number
-
pylablib.core.fileio.loadfile_utils.
find_columns_lines
(corrupted, comments, cols_num)[source]¶ Try to find a column line (for a given columns number) among the comment and corrupted lines
-
class
pylablib.core.fileio.loadfile_utils.
InlineTable
(table)[source]¶ Bases:
object
Simple marker class that denotes that the wrapped numpy 2D array should be written inline
-
pylablib.core.fileio.loadfile_utils.
parse_dict_line
(line)[source]¶ Parse stripped dictionary file line
-
pylablib.core.fileio.loadfile_utils.
read_dict_and_comments
(f, case_normalization=None, inline_dtype='generic', allow_duplicate_keys=False)[source]¶ Load dictionary entries and comments from the file stream.
Parameters:
- f – file stream
- case_normalization – case normalization for the returned dictionary; None means that it’s case-sensitive, "upper" and "lower" determine how the entries are normalized
- inline_dtype – dtype for inline tables; by default, use the most generic type (can include Python objects such as lists or strings)
- allow_duplicate_keys – if False and the same key is listed twice, raise an error
Returns a tuple (data, comment_lines), where data is a dictionary with parsed entries (tables are still represented as ‘raw’, i.e., as a tuple of columns list and column names list), and comment_lines is a list of comment lines.
pylablib.core.fileio.location module
Classes for describing a generic file location.
-
class
pylablib.core.fileio.location.
LocationName
(path=None, ext=None)[source]¶ Bases:
object
File name inside a location.
Parameters:
- path – Path inside the location. Gets normalized according to the Dictionary rules (not case-sensitive; '/' and '\' are the delimiters).
- ext (str) – Name extension (None is default).
-
get_path
(default_path='', sep='/')[source]¶ Get the string path.
If the object’s path is None, use default_path instead. If sep is not None, use it to join the path entries; otherwise, return the path in a list form.
-
get_ext
(default_ext='')[source]¶ Get the extension.
If the object’s ext is None, use default_ext instead.
-
to_string
(default_path='', default_ext='', path_sep='/', ext_sep='|', add_empty_ext=True)[source]¶ Convert the path to a string representation.
Parameters:
- default_path (str) – Use it as the path if the object’s path is None.
- default_ext (str) – Use it as the extension if the object’s ext is None.
- path_sep (str) – Use it to join the path entries.
- ext_sep (str) – Use it to join path and extension.
- add_empty_ext (bool) – If False and the extension is empty, don’t add ext_sep at the end.
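The string conversion rules listed above can be traced with a small standalone function. This is a sketch of the documented behavior only: name_to_string is a hypothetical helper, and representing the path as a list of entries is inferred from get_path(), which can return the path in a list form.

```python
def name_to_string(path, ext, default_path="", default_ext="",
                   path_sep="/", ext_sep="|", add_empty_ext=True):
    """Join a path (list of entries or None) and an extension (str or None)."""
    path_str = default_path if path is None else path_sep.join(path)
    ext_str = default_ext if ext is None else ext
    if ext_str == "" and not add_empty_ext:
        # Empty extension and add_empty_ext=False: leave out the separator.
        return path_str
    return path_str + ext_sep + ext_str


print(name_to_string(["data", "run1"], "csv"))
print(name_to_string(["data"], None))
print(name_to_string(["data"], None, add_empty_ext=False))
```

Keeping the separator even for an empty extension (the default) makes the string form unambiguous about where the path ends, at the cost of a trailing ext_sep.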
-
to_path
(default_path='', default_ext='', ext_sep='|', add_empty_ext=True)[source]¶ Convert the path to a list representation.
Extension is added with ext_sep to the last entry in the path.
Parameters:
-
static
from_string
(expr, ext_sep='|')[source]¶ Create a LocationName object from a string representation.
ext_sep defines extension separator; the path separators are '/' and '\'. Empty path or extension translate into None.
-
static
from_object
(obj)[source]¶ Create a LocationName object from an object.
obj can be a LocationName (return unchanged), tuple or list (use as constructor arguments), string (treat as a string representation) or None (return empty name).
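The string form handled by to_string and from_string can be illustrated with a small stand-alone sketch. It is not the pylablib implementation: parse_name and name_to_string are hypothetical helpers, and the case normalization mentioned above is left out.

```python
import re

def parse_name(expr, ext_sep="|"):
    """Split 'a/b|ext' into (path_entries_or_None, ext_or_None)."""
    path_part, _, ext_part = expr.partition(ext_sep)
    # Both '/' and '\' act as path separators; empty entries are dropped.
    entries = [e for e in re.split(r"[/\\]", path_part) if e]
    return (entries or None), (ext_part or None)

def name_to_string(path, ext, default_path="", default_ext="",
                   path_sep="/", ext_sep="|", add_empty_ext=True):
    """Join path entries and extension back into a single string."""
    path_str = path_sep.join(path) if path is not None else default_path
    ext_str = ext if ext is not None else default_ext
    if not ext_str and not add_empty_ext:
        # Empty extension and add_empty_ext=False: no trailing separator.
        return path_str
    return path_str + ext_sep + ext_str

print(parse_name(r"scans\day1/trace|dat"))
print(name_to_string(["scans", "day1", "trace"], "dat"))
print(name_to_string(["scans"], None, add_empty_ext=False))
```

An empty path or extension maps to None on parsing, and the defaults are substituted for None when converting back.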
-
class
pylablib.core.fileio.location.
LocationFile
(loc, name=None)[source]¶ Bases:
object
A file at a location.
Combines information about the location and the name within this location. Can be opened for reading or writing.
Parameters: - loc – File location.
- name – File’s name inside the location.
-
loc
¶ File location.
-
name
¶ File’s name inside the location.
-
opened
¶ Whether the file is currently opened.
-
class
pylablib.core.fileio.location.
IDataLocation
[source]¶ Bases:
object
Generic location.
-
generate_new_name
(prefix_name, idx=0)[source]¶ Generate a new name inside the location using the given prefix and starting index.
If idx is None, check just the prefix_name first before starting to append indices.
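A minimal sketch of the name search described here, assuming a hypothetical is_free callable and a plain "prefix + index" naming pattern (the actual pattern used by the library is not stated in this section):

```python
def generate_new_name(is_free, prefix_name, idx=0):
    """Return the first unoccupied name built from prefix_name and an index.

    is_free is a callable reporting whether a candidate name is unoccupied.
    The "{prefix}{index}" pattern is an assumption of this sketch.
    """
    if idx is None:
        # Try the bare prefix before appending any index.
        if is_free(prefix_name):
            return prefix_name
        idx = 0
    while not is_free("{}{}".format(prefix_name, idx)):
        idx += 1
    return "{}{}".format(prefix_name, idx)

occupied = {"trace", "trace0", "trace1"}
free = lambda name: name not in occupied
print(generate_new_name(free, "trace"))           # trace2
print(generate_new_name(free, "scan", idx=None))  # scan
```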
-
open
(name=None, mode='read', data_type='text')[source]¶ Open a location file.
Parameters: - name – File name inside the location (None means ‘default’ location).
- mode (str) – Opening mode. Can be 'read', 'write' or 'append', as well as standard abbreviation (e.g., "r" or "wb").
- data_type (str) – Either 'text' or 'binary'; if mode is an abbreviation, this parameter is ignored (i.e., open("r","binary") still opens file as text).
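The interplay between mode and data_type can be sketched as follows. resolve_open_mode is a hypothetical helper written for illustration; it only mirrors the rule stated above and is not taken from the library.

```python
def resolve_open_mode(mode="read", data_type="text"):
    """Translate (mode, data_type) into a standard open() mode string."""
    long_modes = {"read": "r", "write": "w", "append": "a"}
    if mode in long_modes:
        if data_type not in ("text", "binary"):
            raise ValueError("unrecognized data type: {}".format(data_type))
        return long_modes[mode] + ("b" if data_type == "binary" else "")
    # Anything else is treated as a standard abbreviation and passed through,
    # so data_type has no effect in this branch.
    return mode

print(resolve_open_mode("read", "binary"))  # rb
print(resolve_open_mode("r", "binary"))     # r
```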
-
-
class
pylablib.core.fileio.location.
OpenedFileLocation
(f, open_error=False, check_mode=False, check_data_type=True)[source]¶ Bases:
object
File location which corresponds to an already opened file.
-
class
pylablib.core.fileio.location.
IFileSystemDataLocation
(encoding=None)[source]¶ Bases:
pylablib.core.fileio.location.IDataLocation
A generic filesystem data location.
A single file name describes a single file in the filesystem.
-
get_filesystem_path
(name=None, path_type='absolute')[source]¶ Get the filesystem path corresponding to a given name.
path_type can be 'absolute' (return absolute path), 'relative' (return relative path; level depends on the location) or 'name' (only return path inside the location).
-
open
(name=None, mode='read', data_type='text')[source]¶ Open a location file.
Parameters: - name – File name inside the location (None means ‘default’ location).
- mode (str) – Opening mode. Can be 'read', 'write' or 'append', as well as standard abbreviation (e.g., "r" or "wb").
- data_type (str) – Either 'text' or 'binary'; if mode is an abbreviation, this parameter is ignored (i.e., open("r","binary") still opens file as text).
-
list_opened_files
()[source]¶ Get a dictionary
{string_name: location_file}
of all files opened in this location
-
generate_new_name
(prefix_name, idx=0)¶ Generate a new name inside the location using the given prefix and starting index.
If idx is None, check just the prefix_name first before starting to append indices.
-
-
class
pylablib.core.fileio.location.
SingleFileSystemDataLocation
(file_path, encoding=None)[source]¶ Bases:
pylablib.core.fileio.location.IFileSystemDataLocation
A location describing a single file.
Any use of a non-default name raises ValueError.
Parameters: file_path (str) – The path to the file.
-
get_filesystem_path
(name=None, path_type='absolute')[source]¶ Get the filesystem path corresponding to a given name.
path_type can be 'absolute' (return absolute path), 'relative' (return relative path; level depends on the location) or 'name' (only return path inside the location).
-
close
(name)¶ Close a location file
-
generate_new_name
(prefix_name, idx=0)¶ Generate a new name inside the location using the given prefix and starting index.
If idx is None, check just the prefix_name first before starting to append indices.
-
is_free
(name=None)¶ Check if the name is unoccupied
-
list_opened_files
()¶ Get a dictionary
{string_name: location_file}
of all files opened in this location
-
open
(name=None, mode='read', data_type='text')¶ Open a location file.
Parameters: - name – File name inside the location (None means ‘default’ location).
- mode (str) – Opening mode. Can be 'read', 'write' or 'append', as well as standard abbreviation (e.g., "r" or "wb").
- data_type (str) – Either 'text' or 'binary'; if mode is an abbreviation, this parameter is ignored (i.e., open("r","binary") still opens file as text).
-
-
class
pylablib.core.fileio.location.
PrefixedFileSystemDataLocation
(file_path, prefix_template='{0}_{1}', encoding=None)[source]¶ Bases:
pylablib.core.fileio.location.IFileSystemDataLocation
A location describing a set of prefixed files.
Parameters: - file_path (str) – A master path. Its name is used as a prefix, and its extension is used as a default.
- prefix_template (str) – A str.format() string for generating prefixed files. Has two arguments: the first is the master name, the second is the sub_location.
Multi-level paths translate into nested folders (the top level folder is combined from the file_path prefix and the first path entry).
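One possible reading of this naming scheme is sketched below. prefixed_path is a hypothetical function, and both the handling of the default name and the placement of the extension are assumptions rather than documented behavior.

```python
import posixpath

def prefixed_path(file_path, name=None, prefix_template="{0}_{1}"):
    """Map a name inside the location to a file path (illustrative only)."""
    folder, base = posixpath.split(file_path)
    master, ext = posixpath.splitext(base)
    entries = [e for e in (name or "").split("/") if e]
    if not entries:
        # Default name: assumed to be the master file itself.
        return file_path
    # Only the first entry is combined with the master name;
    # the remaining entries become nested folders and the file name.
    entries[0] = prefix_template.format(master, entries[0])
    return posixpath.join(folder, *entries) + ext

print(prefixed_path("data/run.dat", "trace"))      # data/run_trace.dat
print(prefixed_path("data/run.dat", "cam/frame"))  # data/run_cam/frame.dat
```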
-
get_filesystem_path
(name=None, path_type='absolute')[source]¶ Get the filesystem path corresponding to a given name.
path_type can be 'absolute' (return absolute path), 'relative' (return relative path; level depends on the location) or 'name' (only return path inside the location).
-
close
(name)¶ Close a location file
-
generate_new_name
(prefix_name, idx=0)¶ Generate a new name inside the location using the given prefix and starting index.
If idx is None, check just the prefix_name first before starting to append indices.
-
is_free
(name=None)¶ Check if the name is unoccupied
-
list_opened_files
()¶ Get a dictionary
{string_name: location_file}
of all files opened in this location
-
open
(name=None, mode='read', data_type='text')¶ Open a location file.
Parameters: - name – File name inside the location (None means ‘default’ location).
- mode (str) – Opening mode. Can be 'read', 'write' or 'append', as well as standard abbreviation (e.g., "r" or "wb").
- data_type (str) – Either 'text' or 'binary'; if mode is an abbreviation, this parameter is ignored (i.e., open("r","binary") still opens file as text).
-
class
pylablib.core.fileio.location.
FolderFileSystemDataLocation
(folder_path, default_name='content', default_ext='', encoding=None)[source]¶ Bases:
pylablib.core.fileio.location.IFileSystemDataLocation
A location describing a single folder.
Parameters: - folder_path (str) – A path to the folder. Can also have one or two '|' symbols in the end (e.g., 'folder|file|dat'), which separate default name and extension (overrides default_name and default_ext parameters).
- default_name (str) – The default file name.
- default_ext (str) – The default file extension.
Multi-level paths translate into nested subfolders.
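The 'folder|file|dat' shorthand can be unpacked along these lines. split_folder_spec is a hypothetical helper; treating a single '|' as supplying only the default name is an assumption.

```python
def split_folder_spec(folder_path, default_name="content", default_ext=""):
    """Split 'folder|name|ext' into (folder, default_name, default_ext)."""
    parts = folder_path.split("|")
    if len(parts) > 3:
        raise ValueError("at most two '|' symbols are expected")
    folder = parts[0]
    if len(parts) > 1:
        default_name = parts[1]  # overrides the default_name argument
    if len(parts) > 2:
        default_ext = parts[2]   # overrides the default_ext argument
    return folder, default_name, default_ext

print(split_folder_spec("folder|file|dat"))  # ('folder', 'file', 'dat')
print(split_folder_spec("folder"))           # ('folder', 'content', '')
```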
-
get_filesystem_path
(name=None, path_type='absolute')[source]¶ Get the filesystem path corresponding to a given name.
path_type can be 'absolute' (return absolute path), 'relative' (return relative path; level depends on the location) or 'name' (only return path inside the location).
-
close
(name)¶ Close a location file
-
generate_new_name
(prefix_name, idx=0)¶ Generate a new name inside the location using the given prefix and starting index.
If idx is None, check just the prefix_name first before starting to append indices.
-
is_free
(name=None)¶ Check if the name is unoccupied
-
list_opened_files
()¶ Get a dictionary
{string_name: location_file}
of all files opened in this location
-
open
(name=None, mode='read', data_type='text')¶ Open a location file.
Parameters: - name – File name inside the location (None means ‘default’ location).
- mode (str) – Opening mode. Can be 'read', 'write' or 'append', as well as standard abbreviation (e.g., "r" or "wb").
- data_type (str) – Either 'text' or 'binary'; if mode is an abbreviation, this parameter is ignored (i.e., open("r","binary") still opens file as text).
-
pylablib.core.fileio.location.
get_location
(path, loc, *args, **kwargs)[source]¶ Build a location.
If path or loc are instances of IDataLocation, return them unchanged. If loc is a string, it describes location kind:
- 'single_file': SingleFileSystemDataLocation with the given path.
- 'file' or 'prefixed_file': PrefixedFileSystemDataLocation with the given path as a master path.
- 'folder': FolderFileSystemDataLocation with the given folder path.
Any additional arguments are relayed to the constructors.
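The kind-to-class mapping listed above amounts to a simple lookup. The sketch below only returns class names as strings so that it runs without pylablib installed; location_class_name and the error raised for unknown kinds are assumptions.

```python
# Location kind strings and the classes they select, as listed above.
LOCATION_KINDS = {
    "single_file": "SingleFileSystemDataLocation",
    "file": "PrefixedFileSystemDataLocation",
    "prefixed_file": "PrefixedFileSystemDataLocation",
    "folder": "FolderFileSystemDataLocation",
}

def location_class_name(loc):
    """Return the name of the location class selected by the kind string."""
    try:
        return LOCATION_KINDS[loc]
    except KeyError:
        raise ValueError("unrecognized location kind: {!r}".format(loc))

print(location_class_name("file"))
```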
pylablib.core.fileio.parse_csv module¶
Utilities for parsing CSV files.
-
class
pylablib.core.fileio.parse_csv.
ChunksAccumulator
(dtype='numeric', ignore_corrupted_lines=True, trim_rows=False)[source]¶ Bases:
object
Class for accumulating data chunks into a single array.
Parameters: - dtype – dtype of entries; can be either a single type, or a list of types (one per column).
Possible dtypes are: 'int', 'float', 'complex', 'numeric' (tries to coerce to minimal possible numeric type, raises error if data can’t be converted to complex), 'generic' (accept arbitrary types, including lists, dictionaries, escaped strings, etc.), 'raw' (keep raw string).
- ignore_corrupted_lines – if True, skip corrupted (e.g., non-numeric for numeric dtype, or with too few entries) lines; otherwise, raise ValueError.
- trim_rows – if True and the row length is larger than expected, drop extra entries; otherwise, treat the row as corrupted.
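The 'numeric' dtype rule (coerce to the minimal possible numeric type, fail if even complex does not work) can be sketched for a single entry as below. coerce_numeric is a hypothetical helper; the library applies the rule to whole columns and its exact conversion order is not documented here.

```python
def coerce_numeric(entry):
    """Convert a string to int, else float, else complex; raise if none fits."""
    for convert in (int, float, complex):
        try:
            return convert(entry)
        except ValueError:
            continue
    raise ValueError("can't convert {!r} to a number".format(entry))

print(coerce_numeric("3"), coerce_numeric("3.5"), coerce_numeric("1+2j"))
```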
-
pylablib.core.fileio.parse_csv.
read_columns
(f, dtype, delimiters='\\s*, \\s*|\\s+', empty_entry_substitute=None, ignore_corrupted_lines=True, trim_rows=False, stop_comment=None)[source]¶ Load columns from the file stream f.
Parameters: - dtype – dtype of entries; can be either a single type, or a list of types (one per column).
Possible dtypes are: 'int', 'float', 'complex', 'numeric' (tries to coerce to minimal possible numeric type, raises error if data can’t be converted to complex), 'generic' (accept arbitrary types, including lists, dictionaries, escaped strings, etc.), 'raw' (keep raw string).
- delimiters (str) – Regex string which recognizes delimiters (by default r"\s*,\s*|\s+", i.e., commas and whitespaces).
- empty_entry_substitute – Substitute for empty table entries. If None, all empty table entries are skipped.
- ignore_corrupted_lines – If True, skip corrupted (e.g., non-numeric for numeric dtype, or with too few entries) lines; otherwise, raise ValueError.
- trim_rows – if True and the row length is larger than expected, drop extra entries; otherwise, treat the row as corrupted.
- stop_comment (str) – Regex string for the stopping comment. If not None, the function will stop if a comment satisfying the stop_comment regex is encountered.
Returns: (columns, comments, corrupted_lines).
columns is a list of columns with data.
comments is a list of comment strings.
corrupted_lines is a dict {'size':list, 'type':list} of corrupted lines (already split into entries), based on the corruption type ('size' means too small size, 'type' means it couldn’t be converted using provided dtype).
Return type: tuple
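A simplified stand-alone sketch of this parsing scheme, restricted to float columns. read_float_columns is a hypothetical function: the '#' comment marker is an assumption, empty entries are dropped (as with empty_entry_substitute=None), and extra entries are trimmed (as with trim_rows=True).

```python
import re

def read_float_columns(lines, ncols, delimiters=r"\s*,\s*|\s+", comment="#"):
    """Parse text lines into float columns, collecting comments and bad lines."""
    columns = [[] for _ in range(ncols)]
    comments = []
    corrupted = {"size": [], "type": []}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(comment):
            comments.append(line[len(comment):].strip())
            continue
        # Split on the delimiter regex and skip empty entries.
        entries = [e for e in re.split(delimiters, line) if e]
        if len(entries) < ncols:
            corrupted["size"].append(entries)  # too few entries
            continue
        try:
            values = [float(e) for e in entries[:ncols]]  # trim extra entries
        except ValueError:
            corrupted["type"].append(entries)  # not convertible to float
            continue
        for col, val in zip(columns, values):
            col.append(val)
    return columns, comments, corrupted

text = ["# x, y", "1, 2", "3\t4", "5", "6, oops"]
print(read_float_columns(text, ncols=2))
```

Corrupted lines are reported already split into entries, sorted by the reason they were rejected.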
-
pylablib.core.fileio.parse_csv.
columns_to_table
(data, columns=None, dtype='numeric', out_type='columns')[source]¶ Convert data (columns list) into a table.
Parameters: - columns – either the number of columns, or a list of column names.
- out_type (str) – type of the result: 'array' for numpy array, 'pandas' for pandas DataFrame, 'columns' for tuple (data, columns)
-
pylablib.core.fileio.parse_csv.
read_table
(f, dtype='numeric', columns=None, out_type='columns', delimiters='\\s*, \\s*|\\s+', empty_entry_substitute=None, ignore_corrupted_lines=True, trim_rows=False, stop_comment=None)[source]¶ Load table from the file stream f.
Arguments are the same as in read_columns() and columns_to_table().
Returns: (table, comments, corrupted_lines).
table is a table of the format out_type.
comments is a list of comment strings.
corrupted_lines is a dict {'size':list, 'type':list} of corrupted lines (already split into entries), based on the corruption type ('size' means too small size, 'type' means it couldn’t be converted using provided dtype).
Return type: tuple
pylablib.core.fileio.savefile module¶
Utilities for writing data files.
-
class
pylablib.core.fileio.savefile.
IOutputFileFormat
(format_name)[source]¶ Bases:
object
Generic class for an output file format.
Parameters: format_name (str) – The name of the format (to be defined in subclasses).
-
class
pylablib.core.fileio.savefile.
ITextOutputFileFormat
(format_name, save_props=True, save_comments=True, save_time=True, new_time=True)[source]¶ Bases:
pylablib.core.fileio.savefile.IOutputFileFormat
Generic class for a text output file format.
Parameters: - format_name (str) – The name of the format (to be defined in subclasses).
- save_props (bool) – If True and saving datafile.DataFile object, save its props metainfo.
- save_comments (bool) – If True and saving datafile.DataFile object, save its comments metainfo.
- save_time (bool) – If True, append the file creation time in the end.
- new_time (bool) – If saving datafile.DataFile object, determines if the time should be updated to the current time.
-
write
(location_file, data)¶
-
write_data
(location_file, data)¶
-
class
pylablib.core.fileio.savefile.
CSVTableOutputFileFormat
(delimiters='\t', value_formats=None, use_rep_classes=False, save_columns=True, save_props=True, save_comments=True, save_time=True)[source]¶ Bases:
pylablib.core.fileio.savefile.ITextOutputFileFormat
Class for CSV output file format.
Parameters: - delimiters (str) – Used to separate entries in a row.
- value_formats (str) – If not None, defines value formats to be passed to utils.string.to_string() function.
- use_rep_classes (bool) – If True, use representation classes for Dictionary entries (e.g., numpy arrays will be represented as "array([1, 2, 3])" instead of just "[1, 2, 3]"). This improves storage fidelity, but makes the result harder to parse (e.g., by external string parsers).
- save_columns (bool) – If True, save column names as a comment line in the beginning of the file.
- save_props (bool) – If True and saving datafile.DataFile object, save its props metainfo.
- save_comments (bool) – If True and saving datafile.DataFile object, save its comments metainfo.
- save_time (bool) – If True, append the file creation time in the end.
-
write_data
(location_file, data)[source]¶ Write data to a CSV file.
Parameters: - location_file – Location of the destination.
- data – Data to be saved. Can be a pandas DataFrame or an arbitrary 2D array (numpy array, 2D list, etc.); if the data is not DataFrame or numpy 2D array, it gets converted into a DataFrame using the standard constructor (i.e., 2D list is interpreted as a list of rows)
-
make_comment_line
(comment)¶
-
make_prop_line
(name, value)¶
-
make_savetime_line
(time)¶
-
write
(location_file, data)¶
-
write_comments
(stream, comments)¶
-
write_file
(location_file, to_save)¶
-
static
write_line
(stream, line)¶
-
write_props
(stream, props)¶
-
write_savetime
(stream, time)¶
-
class
pylablib.core.fileio.savefile.
DictionaryOutputFileFormat
(param_formats=None, use_rep_classes=False, table_format='inline', inline_delimiters='\t', inline_formats=None, save_props=True, save_comments=True, save_time=True)[source]¶ Bases:
pylablib.core.fileio.savefile.ITextOutputFileFormat
Class for Dictionary output file format.
Parameters: - param_formats (str) – If not None, defines value formats to be passed to utils.string.to_string() function when writing Dictionary entries.
- use_rep_classes (bool) – If True, use representation classes for Dictionary entries (e.g., numpy arrays will be represented as "array([1, 2, 3])" instead of just "[1, 2, 3]"). This improves storage fidelity, but makes the result harder to parse (e.g., by external string parsers).
- table_format (str) – Default format for table (numpy arrays or pandas DataFrames) entries. Can be 'inline' (table is written inside the file), 'csv' (external CSV file) or 'bin' (external binary file).
- inline_delimiters (str) – Used to separate entries in a row for inline tables.
- inline_formats (str) – If not None, defines value formats to be passed to utils.string.to_string() function when writing inline tables.
- save_props (bool) – If True and saving datafile.DataFile object, save its props metainfo.
- save_comments (bool) – If True and saving datafile.DataFile object, save its comments metainfo.
- save_time (bool) – If True, append the file creation time in the end.
-
write_data
(location_file, data)[source]¶ Write data to a Dictionary file.
Parameters: - location_file – Location of the destination.
- data – Data to be saved. Should be an object of class Dictionary.
-
make_comment_line
(comment)¶
-
make_prop_line
(name, value)¶
-
make_savetime_line
(time)¶
-
write
(location_file, data)¶
-
write_comments
(stream, comments)¶
-
write_file
(location_file, to_save)¶
-
static
write_line
(stream, line)¶
-
write_props
(stream, props)¶
-
write_savetime
(stream, time)¶
-
class
pylablib.core.fileio.savefile.
IBinaryOutputFileFormat
(format_name)[source]¶ Bases:
pylablib.core.fileio.savefile.IOutputFileFormat
-
write
(location_file, data)¶
-
write_data
(location_file, data)¶
-
write_file
(location_file, to_save)¶
-
-
class
pylablib.core.fileio.savefile.
TableBinaryOutputFileFormat
(dtype=None, transposed=False)[source]¶ Bases:
pylablib.core.fileio.savefile.IBinaryOutputFileFormat
Class for binary output file format.
Parameters: - dtype – a string with numpy dtype (e.g., "<f8") used to save the data. By default, use the little-endian ("<") variant of the supplied data array’s dtype.
- transposed (bool) – If False, write the data row-wise; otherwise, write it column-wise.
-
get_preamble
(location_file, data)[source]¶ Generate a preamble (dictionary describing the file format).
The parameters are 'dtype', 'packing' ('transposed' or 'flatten', depending on the transposed attribute), 'ncol' (number of columns) and 'nrows' (number of rows).
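The difference between row-wise and column-wise packing, together with a preamble using the keys named above, can be shown with the standard struct module. pack_table is a hypothetical helper that always writes little-endian 8-byte floats ("<f8"); it is a sketch and not how the library serializes numpy arrays.

```python
import struct

def pack_table(rows, transposed=False):
    """Pack a 2D list as little-endian float64 and describe it in a preamble."""
    nrows = len(rows)
    ncols = len(rows[0]) if rows else 0
    if transposed:
        # Column-wise: all of column 0, then all of column 1, and so on.
        flat = [rows[r][c] for c in range(ncols) for r in range(nrows)]
    else:
        # Row-wise: rows are written one after another.
        flat = [value for row in rows for value in row]
    payload = struct.pack("<{}d".format(len(flat)), *flat)
    preamble = {
        "dtype": "<f8",
        "packing": "transposed" if transposed else "flatten",
        "ncol": ncols,
        "nrows": nrows,
    }
    return payload, preamble

data = [[1.0, 2.0], [3.0, 4.0]]
print(struct.unpack("<4d", pack_table(data)[0]))                   # (1.0, 2.0, 3.0, 4.0)
print(struct.unpack("<4d", pack_table(data, transposed=True)[0]))  # (1.0, 3.0, 2.0, 4.0)
```

A reader needs the preamble (or the same information supplied by hand) to recover the table shape, since the raw byte stream carries no structure of its own.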
-
write_data
(location_file, data)[source]¶ Write data to a binary file.
Parameters: - location_file – Location of the destination.
- data – Data to be saved. Can be a pandas DataFrame or an arbitrary 2D array (numpy array, 2D list, etc.) Converted to numpy array before saving.
-
write
(location_file, data)¶
-
pylablib.core.fileio.savefile.
save_csv
(data, path, delimiters='\t', value_formats=None, use_rep_classes=False, save_columns=True, save_props=True, save_comments=True, save_time=True, loc='file', encoding=None)[source]¶ Save data to a CSV file.
Parameters: - data – Data to be saved (2D numpy array, pandas DataFrame, or a datafile.DataFile object containing this data).
- path (str) – Path to the file or a file-like object.
- delimiters (str) – Used to separate entries in a row.
- value_formats (str) – If not None, defines value formats to be passed to utils.string.to_string() function.
- use_rep_classes (bool) – If True, use representation classes for Dictionary entries (e.g., numpy arrays will be represented as "array([1, 2, 3])" instead of just "[1, 2, 3]"). This improves storage fidelity, but makes the result harder to parse (e.g., by external string parsers).
- save_columns (bool) – If True, save column names as a comment line in the beginning of the file.
- save_props (bool) – If True and saving datafile.DataFile object, save its props metainfo.
- save_comments (bool) – If True and saving datafile.DataFile object, save its comments metainfo.
- save_time (bool) – If True, append the file creation time in the end.
- loc (str) – Location type.
- encoding – if a new file location is opened, this specifies the encoding.
-
pylablib.core.fileio.savefile.
save_csv_desc
(data, path, loc='file', encoding=None)[source]¶ Save data table to a dictionary file with an inlined table.
Compared to save_csv(), supports more pandas features (index, column multi-index), but can only be directly read by pylablib.
Parameters: - data – Data to be saved (2D numpy array, pandas DataFrame, or a datafile.DataFile object containing this data).
- path (str) – Path to the file or a file-like object.
- loc (str) – Location type.
- encoding – if a new file location is opened, this specifies the encoding.
-
pylablib.core.fileio.savefile.
save_bin
(data, path, dtype=None, transposed=False, loc='file', encoding=None)[source]¶ Save data to a binary file.
Parameters: - data – Data to be saved (2D numpy array, pandas DataFrame, or a datafile.DataFile object containing this data).
- path (str) – Path to the file or a file-like object.
- dtype – numpy.dtype describing the data. By default, use the little-endian ("<") variant of the supplied data array’s dtype.
- transposed (bool) – If False, write the data row-wise; otherwise, write it column-wise.
- loc (str) – Location type.
- encoding – if a new file location is opened, this specifies the encoding.
-
pylablib.core.fileio.savefile.
save_bin_desc
(data, path, loc='file', encoding=None)[source]¶ Save data to a binary file with an additional description file, which contains all of the data related to loading (shape, dtype, columns, etc.)
Parameters: - data – Data to be saved (2D numpy array, pandas DataFrame, or a datafile.DataFile object containing this data).
- path (str) – Path to the file or a file-like object.
- loc (str) – Location type.
- encoding – if a new file location is opened, this specifies the encoding.
-
pylablib.core.fileio.savefile.
save_dict
(data, path, param_formats=None, use_rep_classes=False, table_format='inline', inline_delimiters='\t', inline_formats=None, save_props=True, save_comments=True, save_time=True, loc='file', encoding=None)[source]¶ Save dictionary to a text file.
Parameters: - data – Data to be saved.
- path (str) – Path to the file or a file-like object.
- param_formats (str) – If not None, defines value formats to be passed to utils.string.to_string() function when writing Dictionary entries.
- use_rep_classes (bool) – If True, use representation classes for Dictionary entries (e.g., numpy arrays will be represented as "array([1, 2, 3])" instead of just "[1, 2, 3]"). This improves storage fidelity, but makes the result harder to parse (e.g., by external string parsers).
- table_format (str) – Default format for table (numpy arrays or pandas DataFrames) entries. Can be 'inline' (table is written inside the file), 'csv' (external CSV file) or 'bin' (external binary file).
- inline_delimiters (str) – Used to separate entries in a row for inline tables.
- inline_formats (str) – If not None, defines value formats to be passed to utils.string.to_string() function when writing inline tables.
- save_props (bool) – If True and saving datafile.DataFile object, save its props metainfo.
- save_comments (bool) – If True and saving datafile.DataFile object, save its comments metainfo.
- save_time (bool) – If True, append the file creation time in the end.
- loc (str) – Location type.
- encoding – if a new file location is opened, this specifies the encoding.
-
pylablib.core.fileio.savefile.
save_generic
(data, path, output_format=None, loc='file', encoding=None, **kwargs)[source]¶ Save data to a file.
Parameters: - data – Data to be saved.
- path (str) – Path to the file or a file-like object.
- output_format (str) – Output file format. Can be either None (defaults to 'csv' for table data and 'dict' for Dictionary data), a string with one of the default format names, or an already prepared IOutputFileFormat object.
- loc (str) – Location type.
- encoding – if a new file location is opened, this specifies the encoding.
**kwargs are passed to the file formatter constructor (see CSVTableOutputFileFormat, DictionaryOutputFileFormat and TableBinaryOutputFileFormat for the possible arguments). The default format names are:
- 'csv': CSV file, corresponds to CSVTableOutputFileFormat and save_csv();
- 'csv_desc': CSV file with an additional dictionary containing format description, corresponds to DictionaryOutputFileFormat and save_csv_desc();
- 'bin': Binary file, corresponds to TableBinaryOutputFileFormat and save_bin();
- 'bin_desc': Binary file with an additional dictionary containing format description, corresponds to DictionaryOutputFileFormat and save_bin_desc();
- 'dict': Dictionary file, corresponds to DictionaryOutputFileFormat and save_dict().
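The format selection can be pictured as a default choice followed by a lookup. In this sketch pick_format is hypothetical, a plain dict stands in for the Dictionary class, classes are referred to by name only, and the format name 'csv_desc' is assumed by analogy with 'bin_desc' and save_csv_desc().

```python
# Default format names and the formatter classes they correspond to.
FORMATTERS = {
    "csv": "CSVTableOutputFileFormat",
    "csv_desc": "DictionaryOutputFileFormat",  # name assumed, see lead-in
    "bin": "TableBinaryOutputFileFormat",
    "bin_desc": "DictionaryOutputFileFormat",
    "dict": "DictionaryOutputFileFormat",
}

def pick_format(data, output_format=None):
    """Return (format_name, formatter_class_name) for the given data."""
    if output_format is None:
        # Dictionary-like data defaults to 'dict', anything else to 'csv'.
        output_format = "dict" if isinstance(data, dict) else "csv"
    if output_format not in FORMATTERS:
        raise ValueError("unknown output format: {!r}".format(output_format))
    return output_format, FORMATTERS[output_format]

print(pick_format([[1, 2], [3, 4]]))
print(pick_format({"a": 1}))
print(pick_format([[1, 2]], "bin"))
```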
pylablib.core.fileio.table_stream module¶
-
class
pylablib.core.fileio.table_stream.
TableStreamFile
(path, columns=None, delimiter='\t', fmt=None, add_timestamp=False, header_prepend='# ')[source]¶ Bases:
object
Expanding table file.
Can define column names and formats for different columns, and repeatedly write data into the same file. Useful for, e.g., continuous log files.
Parameters: - path (str) – Path to the destination file.
- columns (list) – If not None, it’s a list of column names to be added as a header on creation.
- delimiter (str) – Values delimiter.
- fmt (str) – If not None, it’s a list of format strings for the line entries (e.g., ".3f"); instead of a format string, an entry can also be None, which means using the standard to_string() conversion function.
- add_timestamp (bool) – If True, add the UNIX timestamp in the beginning of each line (columns and format are expanded accordingly).
- header_prepend – the string to prepend to the header line; by default, a comment symbol, which is most compatible with the loadfile.load_csv() function.
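The behavior described above (header written once on file creation, rows appended afterwards) can be approximated with plain file operations. append_row is a hypothetical stand-in: str() replaces the library's to_string() conversion, and the "Timestamp" column name and ".3f" timestamp format are assumptions.

```python
import os
import tempfile
import time

def append_row(path, row, columns=None, delimiter="\t", fmt=None,
               add_timestamp=False, header_prepend="# "):
    """Append one formatted row; write the header first if the file is new."""
    row = list(row)
    fmt = list(fmt) if fmt is not None else [None] * len(row)
    if add_timestamp:
        # Timestamp goes first; columns and formats are expanded to match.
        row = [time.time()] + row
        fmt = [".3f"] + fmt
        if columns is not None:
            columns = ["Timestamp"] + list(columns)
    new_file = not os.path.exists(path)
    with open(path, "a") as f:
        if new_file and columns is not None:
            f.write(header_prepend + delimiter.join(columns) + "\n")
        entries = [str(v) if spec is None else format(v, spec)
                   for v, spec in zip(row, fmt)]
        f.write(delimiter.join(entries) + "\n")

log_path = os.path.join(tempfile.mkdtemp(), "log.dat")
append_row(log_path, [1, 0.5], columns=["idx", "value"], fmt=[None, ".3f"])
append_row(log_path, [2, 0.25], columns=["idx", "value"], fmt=[None, ".3f"])
with open(log_path) as f:
    print(f.read())
```

Opening the file in append mode on every call keeps the log consistent even if the writing process is restarted between rows.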
-
write_text_lines
(lines)[source]¶ Write several text lines into the file.
Create the file if it doesn’t exist (in which case the header is automatically added).
Parameters: lines ([str]) – List of lines to write.
-
write_row
(row)[source]¶ Write a single data row into the file.
Create the file if it doesn’t exist (in which case the header is automatically added).
Parameters: row (list or numpy.ndarray) – Data row to be added.
-
write_multiple_rows
(rows)[source]¶ Write multiple data rows into the file.
Create the file if it doesn’t exist (in which case the header is automatically added).
Parameters: rows ([list or numpy.ndarray]) – Data rows to be added.