exporters

class exporters.FileResultProcessor(dataset_name: str, output_filename=None, output_directory=None, format: str = 'feather', concatenate: bool = False, raw: bool = False, categorical_columns: set[str] = {}, numerical_columns: dict[str, str] | set[str] = {}, *args, **kwargs)[source]

Bases: YAMLObject

Export the given dataset as [feather/arrow](https://arrow.apache.org/docs/python/feather.html) or JSON.

Parameters:
dataset_name: str

the name of the dataset to export

output_filename: str

the name of the output file

output_directory: str

the path of the output directory

format: str

the output file format, either 'feather' or 'json'

concatenate: bool

whether to concatenate the input data before exporting it; if False, the names of the output files are derived from the input file names and the aliases in the data

raw: bool

whether to save the raw input or convert the columns of the input pandas.DataFrame to categories before saving
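Since the class derives from YAMLObject and carries the tag `!FileResultProcessor`, a processor can presumably be declared in a YAML configuration file. A hypothetical snippet (the surrounding key `exporter` and all values are illustrative assumptions; only the tag and parameter names come from the signature above):

```yaml
# hypothetical configuration entry, not taken from the project's docs
exporter: !FileResultProcessor
  dataset_name: my_dataset
  output_filename: my_dataset.feather
  output_directory: /tmp/results
  format: feather
  concatenate: true
```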

Attributes:
yaml_flow_style

Methods

from_yaml(loader, node)

Convert a representation node to a Python object.

to_yaml(dumper, data)

Convert a Python object to a representation node.

get_data

prepare

prepare_concatenated

prepare_separated

save_to_disk

set_data_repo

yaml_dumper

get_data(dataset_name: str)[source]
prepare()[source]
prepare_concatenated(data_list, job_list)[source]
prepare_separated(data_list, job_list)[source]
save_to_disk(df, filename, file_format='feather', compression='lz4', hdf_key='data')[source]
set_data_repo(data_repo)[source]
yaml_tag = '!FileResultProcessor'
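The signature of `save_to_disk` suggests a thin dispatch over the pandas writers; a minimal sketch under that assumption (the branch logic and the `hdf` branch are illustrative, not the actual implementation):

```python
import pandas as pd

def save_to_disk(df: pd.DataFrame, filename: str,
                 file_format: str = 'feather', compression: str = 'lz4',
                 hdf_key: str = 'data') -> None:
    # illustrative sketch only: dispatch on the declared file format
    if file_format == 'feather':
        # requires pyarrow; compression may be 'lz4', 'zstd' or 'uncompressed'
        df.to_feather(filename, compression=compression)
    elif file_format == 'json':
        df.to_json(filename)
    elif file_format == 'hdf':
        # requires the pytables optional dependency
        df.to_hdf(filename, key=hdf_key)
    else:
        raise ValueError(f'unknown file format: {file_format}')
```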
exporters.register_constructors()[source]

Register YAML constructors for all exporters.
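This presumably wires each exporter's `yaml_tag` into PyYAML so tagged nodes deserialize into exporter instances. A self-contained sketch of the mechanism (the miniature class here is a stand-in, not the real FileResultProcessor):

```python
import yaml

class FileResultProcessor(yaml.YAMLObject):
    # miniature stand-in for the real exporter class
    yaml_tag = '!FileResultProcessor'
    yaml_loader = yaml.SafeLoader

def register_constructors():
    # register the from_yaml classmethod for every exporter tag
    for cls in (FileResultProcessor,):
        yaml.SafeLoader.add_constructor(cls.yaml_tag, cls.from_yaml)

register_constructors()
obj = yaml.safe_load('!FileResultProcessor {dataset_name: my_dataset}')
```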

exporters.register_jsonpickle_handlers()[source]

Register the jsonpickle handlers for pickling pandas objects to JSON.