extractors

class extractors.BaseExtractor(input_files: list[str], categorical_columns: list[str] | None = None, numerical_columns: dict[str, str] | list[str] | None = None, *args, **kwargs)[source]

Bases: Extractor

Attributes:
yaml_flow_style

Methods

convert_columns_dtype(data[, ...])

Convert the data in the specified columns of the given DataFrame to either a categorical data type or a numerical data type.

from_yaml(loader, node)

Convert a representation node to a Python object.

prepare()

Prepare and return a list or a single dask.Delayed task.

read_sql_from_file(db_file, query[, ...])

Extract the data from a SQLite database into a pandas.DataFrame.

to_yaml(dumper, data)

Convert a Python object to a representation node.

set_tag_maps

yaml_dumper

static convert_columns_dtype(data: DataFrame, categorical_columns: set[str] | None = None, numerical_columns: dict[str, str] | set[str] | None = None)[source]

Convert the data in the specified columns of the given DataFrame to either a categorical data type or a numerical data type.

Parameters:
data: pd.DataFrame

The input DataFrame whose columns are to be converted.

categorical_columns: Optional[set[str]]

The set of names of the columns to convert to a categorical data type.

numerical_columns: Optional[Union[dict[str, str], set[str]]]

The set of names of the columns to convert to a numerical data type. The target data type can also be given explicitly as a dictionary with the column names as keys and the data types as values.
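The conversion this method performs can be sketched with plain pandas. The helper below is an illustrative stand-in for the documented behaviour, not the library implementation:

```python
import pandas as pd

def convert_columns_dtype_sketch(data, categorical_columns=None, numerical_columns=None):
    # Columns named in categorical_columns become pandas categoricals.
    for column in (categorical_columns or set()):
        data[column] = data[column].astype("category")
    # numerical_columns may be a set (let pandas choose a numeric dtype)
    # or a dict mapping column names to explicit dtypes such as "float32".
    if isinstance(numerical_columns, dict):
        for column, dtype in numerical_columns.items():
            data[column] = data[column].astype(dtype)
    else:
        for column in (numerical_columns or set()):
            data[column] = pd.to_numeric(data[column])
    return data

df = convert_columns_dtype_sketch(
    pd.DataFrame({"config": ["a", "b"], "value": ["1", "2"]}),
    categorical_columns={"config"},
    numerical_columns={"value": "float32"},
)
```

Passing a dict for numerical_columns pins each column to an explicit dtype; passing a set leaves the choice of numeric dtype to pandas.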

static read_sql_from_file(db_file: str, query: str, includeFilename: bool = False) → DataFrame[source]

Extract the data from a SQLite database into a pandas.DataFrame.

Parameters:
db_file: str

The input file name from which data is to be extracted.

query: str

The SQL query to extract data from the input file.

includeFilename: bool

Whether to include the input file name in the column filename of the result DataFrame.
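A minimal sketch of this extraction step, using only sqlite3 and pandas; the table layout and file name below are made up for the demonstration:

```python
import sqlite3
import pandas as pd

def read_sql_sketch(db_file, query, includeFilename=False):
    # Open the SQLite database and run the query into a DataFrame.
    with sqlite3.connect(db_file) as connection:
        data = pd.read_sql_query(query, connection)
    # Optionally record the source file in a 'filename' column,
    # mirroring the includeFilename flag described above.
    if includeFilename:
        data["filename"] = db_file
    return data

# Build a small throwaway database to demonstrate the call.
with sqlite3.connect("demo.db") as conn:
    conn.execute("CREATE TABLE IF NOT EXISTS vector (vectorName TEXT, value REAL)")
    conn.execute("DELETE FROM vector")
    conn.execute("INSERT INTO vector VALUES ('speed', 13.9)")

df = read_sql_sketch("demo.db", "SELECT * FROM vector", includeFilename=True)
```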

yaml_tag = '!BaseExtractor'
class extractors.DataAttributes(**kwargs)[source]

Bases: YAMLObject

A class for assigning arbitrary attributes to a dataset. The constructor accepts an arbitrary number of keyword arguments and turns them into object attributes.

Parameters:
source_file: str

The file name the dataset was extracted from.

source_files: List[str]

The list of file names the dataset was extracted from.

common_root: str

The common root directory of the input file search paths, i.e. the leading path component that contains no regular expression.

alias: List[str]

The alias given to the data in the dataset.

aliases: List[str]

The aliases given to the data in the dataset.
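The kwargs-to-attributes behaviour described above can be sketched in a few lines; this is an illustrative stand-in, not the library class:

```python
class AttributesSketch:
    """Turn arbitrary keyword arguments into object attributes."""

    def __init__(self, **kwargs):
        # Every keyword argument becomes an attribute of the instance.
        for name, value in kwargs.items():
            setattr(self, name, value)

attrs = AttributesSketch(source_files=["run0.db", "run1.db"], aliases=["speed"])
```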

Attributes:
yaml_flow_style
yaml_tag

Methods

from_yaml(loader, node)

Convert a representation node to a Python object.

to_yaml(dumper, data)

Convert a Python object to a representation node.

add_alias

add_source_file

add_source_files

common_root

get_aliases

get_source_files

remove_source_file

yaml_dumper

add_alias(alias: str)[source]
add_source_file(source_file: str)[source]
add_source_files(source_files: Set[str])[source]
common_root() → Set[str][source]
get_aliases() → Set[str][source]
get_source_files() → Set[str][source]
remove_source_file(source_file: str)[source]
class extractors.Extractor[source]

Bases: YAMLObject

A class for extracting and preprocessing data from a SQLite database. This is the abstract base class.

Attributes:
yaml_flow_style

Methods

from_yaml(loader, node)

Convert a representation node to a Python object.

prepare()

Prepare and return a list or a single dask.Delayed task.

to_yaml(dumper, data)

Convert a Python object to a representation node.

set_tag_maps

yaml_dumper

prepare()[source]

Prepare and return a list or a single dask.Delayed task.

set_tag_maps(attributes_regex_map, iterationvars_regex_map, parameters_regex_map)[source]
yaml_tag = '!Extractor'
class extractors.MatchingExtractor(input_files: list, pattern: str, alias_pattern: str, alias: str, *args, **kwargs)[source]

Bases: OmnetExtractor

Extract the data for multiple signals matching a regular expression, with the associated positions, from the input files specified.

Parameters:
input_files: List[str]

the list of paths to the input files, as literal path or as a regular expression

pattern: str

the regular expression used for matching possible signal names

alias_pattern: str

the template string for naming the extracted signal

alias: str

the name given to the column with the extracted signal data
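How pattern and alias_pattern cooperate can be illustrated with the standard re module; the signal names below are hypothetical, and the exact template syntax used by the library may differ:

```python
import re

# Hypothetical vector names as they might appear in an OMNeT++ result database.
names = ["camReceived:vector", "cpmReceived:vector", "positionX:vector"]

# pattern selects the signals of interest; the named group feeds the template.
pattern = r"(?P<kind>cam|cpm)Received:vector"
alias_pattern = r"\g<kind>_received"  # template expanded per match

aliases = [m.expand(alias_pattern) for n in names if (m := re.fullmatch(pattern, n))]
```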

Attributes:
yaml_flow_style

Methods

convert_columns_dtype(data[, ...])

from_yaml(loader, node)

Convert a representation node to a Python object.

prepare()

Prepare and return a list or a single dask.Delayed task.

read_sql_from_file(db_file, query[, ...])

to_yaml(dumper, data)

Convert a Python object to a representation node.

apply_tags

extract_all_signals

get_matching_signals

read_pattern_matched_scalars_from_file

read_pattern_matched_signals_from_file

read_query_from_file

read_scalars_from_file

read_signals_from_file

read_statistic_from_file

set_tag_maps

yaml_dumper

static extract_all_signals(db_file, signals, categorical_columns: set[str] | None = None, numerical_columns: dict[str, str] | set[str] | None = None, base_tags=None, additional_tags=None, minimal_tags=True, attributes_regex_map=..., iterationvars_regex_map=..., parameters_regex_map=..., moduleName: bool = True, simtimeRaw: bool = True, eventNumber: bool = False)[source]

(The three regex-map parameters default to predefined tag regex maps for run attributes, iteration variables and run parameters; the full default dictionaries are elided for readability.)
static get_matching_signals(db_file, pattern, alias_pattern)[source]
prepare()[source]

Prepare and return a list or a single dask.Delayed task.

yaml_tag = '!MatchingExtractor'
class extractors.OmnetExtractor(input_files: list, categorical_columns: set[str] | None = None, numerical_columns: dict[str, str] | set[str] | None = None, base_tags: List | None = None, additional_tags: list[str] | None = None, minimal_tags: bool = True, simtimeRaw: bool = True, moduleName: bool = True, eventNumber: bool = True, *args, **kwargs)[source]

Bases: BaseExtractor

A class for extracting and preprocessing data from a SQLite database. This is the base class.

Attributes:
yaml_flow_style

Methods

convert_columns_dtype(data[, ...])

from_yaml(loader, node)

Convert a representation node to a Python object.

prepare()

Prepare and return a list or a single dask.Delayed task.

read_sql_from_file(db_file, query[, ...])

to_yaml(dumper, data)

Convert a Python object to a representation node.

apply_tags

read_pattern_matched_scalars_from_file

read_pattern_matched_signals_from_file

read_query_from_file

read_scalars_from_file

read_signals_from_file

read_statistic_from_file

set_tag_maps

yaml_dumper

static apply_tags(data, tags, base_tags=None, additional_tags=[], minimal=True)[source]
static read_pattern_matched_scalars_from_file(db_file, pattern, alias, scalarName: bool = True, moduleName: bool = True, scalarId: bool = False, runId: bool = False, **kwargs)[source]
static read_pattern_matched_signals_from_file(db_file, pattern, alias, vectorName: bool = True, simtimeRaw: bool = True, moduleName: bool = True, eventNumber: bool = True, **kwargs)[source]
static read_query_from_file(db_file, query, alias, categorical_columns: set[str] | None = None, numerical_columns: dict[str, str] | set[str] | None = None, base_tags=None, additional_tags=None, minimal_tags=True, simtimeRaw=True, moduleName=True, eventNumber=True, includeFilename=False, attributes_regex_map=..., iterationvars_regex_map=..., parameters_regex_map=...)[source]

(The three regex-map parameters default to predefined tag regex maps for run attributes, iteration variables and run parameters; the full default dictionaries are elided for readability.)
static read_scalars_from_file(db_file, scalar, alias, runId: bool = True, moduleName: bool = True, scalarName: bool = False, scalarId: bool = False, **kwargs)[source]
static read_signals_from_file(db_file, signal, alias, simtimeRaw=True, moduleName=True, eventNumber=True, **kwargs)[source]
static read_statistic_from_file(db_file, scalar, alias, runId: bool = True, moduleName: bool = True, statName: bool = False, statId: bool = False, **kwargs)[source]
yaml_tag = '!OmnetExtractor'
class extractors.PatternMatchingBulkExtractor(input_files: list, pattern: str, alias: str, alias_match_pattern: str, alias_pattern: str, *args, **kwargs)[source]

Bases: OmnetExtractor

Extract the data for multiple signals matching a SQL LIKE pattern expression from the input files specified.

Parameters:
input_files: list[str]

The list of paths to the input files, as literal path or as a regular expression.

pattern: str

The SQL LIKE pattern used for matching on the vectorName column. See the SQLite expression-syntax documentation for the syntax rules. Note that LIKE is (by default) case-insensitive for ASCII characters, but case-sensitive for Unicode characters outside the ASCII range.

alias: str

The name given to the column with the extracted signal data.

alias_match_pattern: str

The regular expression used for extracting substrings from the vectorName column of the extracted data and binding them to variables that can be used in the parameter alias_pattern. In regular-expression terminology, these substrings are called named capture groups; for syntax details, see the Python re module documentation on named groups.

alias_pattern: str

The format string used for naming the extracted signals. It is formatted by str.format with the variables extracted via alias_match_pattern passed as arguments; for syntax details, see the Python documentation on format strings.

The resulting name is placed into the variable column.
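The interplay of alias_match_pattern and alias_pattern can be sketched with the standard re module and str.format; the vectorName values below are hypothetical:

```python
import re

# Hypothetical vectorName values, already prefiltered by the SQL LIKE pattern.
vector_names = ["reception:lteLayer:vector", "reception:dccLayer:vector"]

# Named capture group 'layer' is bound per name and fed to str.format.
alias_match_pattern = r"reception:(?P<layer>\w+):vector"
alias_pattern = "reception_{layer}"

variables = [
    alias_pattern.format(**re.match(alias_match_pattern, name).groupdict())
    for name in vector_names
]
```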

Attributes:
yaml_flow_style

Methods

convert_columns_dtype(data[, ...])

from_yaml(loader, node)

Convert a representation node to a Python object.

prepare()

Prepare and return a list or a single dask.Delayed task.

read_sql_from_file(db_file, query[, ...])

to_yaml(dumper, data)

Convert a Python object to a representation node.

apply_tags

extract_all_signals

read_pattern_matched_scalars_from_file

read_pattern_matched_signals_from_file

read_query_from_file

read_scalars_from_file

read_signals_from_file

read_statistic_from_file

set_tag_maps

yaml_dumper

static extract_all_signals(db_file, pattern, alias, alias_match_pattern: str, alias_pattern: str, categorical_columns: set[str] | None = None, numerical_columns: dict[str, str] | set[str] | None = None, base_tags=None, additional_tags=None, minimal_tags=True, attributes_regex_map=..., iterationvars_regex_map=..., parameters_regex_map=..., vectorName: bool = True, moduleName: bool = True, simtimeRaw: bool = True, eventNumber: bool = False)[source]

(The three regex-map parameters default to predefined tag regex maps for run attributes, iteration variables and run parameters; the full default dictionaries are elided for readability.)
prepare()[source]

Prepare and return a list or a single dask.Delayed task.

yaml_tag = '!PatternMatchingBulkExtractor'
class extractors.PatternMatchingBulkScalarExtractor(input_files: list, pattern: str, alias: str, alias_match_pattern: str, alias_pattern: str, scalarName: bool = True, scalarId: bool = False, runId: bool = False, *args, **kwargs)[source]

Bases: OmnetExtractor

Extract the data for multiple scalars matching a SQL LIKE pattern expression from the input files specified. Equivalent to:

SELECT * FROM scalar WHERE scalarName LIKE <pattern>;

Parameters:
input_files: List[str]

the list of paths to the input files, as literal path or as a regular expression

pattern: str

the SQL LIKE pattern expression used for matching on possible scalar names

alias: str

the name given to the column with the extracted signal data

alias_match_pattern: str

the regular expression used for extracting named capture groups from the matched signal names

alias_pattern: str

the template string for naming the extracted signal from the named capture groups matched by alias_match_pattern

runId: bool

whether to extract the runId column

scalarId: bool

whether to extract the scalarId column

scalarName: bool

whether to extract the scalarName column
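The equivalent query shown above, and the ASCII case-insensitivity of LIKE, can be demonstrated with an in-memory SQLite database; the scalar names are made up:

```python
import sqlite3

# In-memory stand-in for the scalar table of an OMNeT++ result database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scalar (scalarName TEXT, scalarValue REAL)")
conn.executemany(
    "INSERT INTO scalar VALUES (?, ?)",
    [("packetsSent:count", 120.0), ("PacketsSent:count", 3.0), ("delay:mean", 0.02)],
)

# LIKE is case-insensitive for ASCII, so both spellings match the pattern.
rows = conn.execute(
    "SELECT scalarName, scalarValue FROM scalar WHERE scalarName LIKE 'packetsSent%'"
).fetchall()
```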

Attributes:
yaml_flow_style

Methods

convert_columns_dtype(data[, ...])

from_yaml(loader, node)

Convert a representation node to a Python object.

prepare()

Prepare and return a list or a single dask.Delayed task.

read_sql_from_file(db_file, query[, ...])

to_yaml(dumper, data)

Convert a Python object to a representation node.

apply_tags

extract_all_scalars

read_pattern_matched_scalars_from_file

read_pattern_matched_signals_from_file

read_query_from_file

read_scalars_from_file

read_signals_from_file

read_statistic_from_file

set_tag_maps

yaml_dumper

static extract_all_scalars(db_file, pattern, alias, alias_match_pattern: str, alias_pattern: str, categorical_columns: set[str] | None = None, numerical_columns: dict[str, str] | set[str] | None = None, base_tags=None, additional_tags=None, minimal_tags=True, attributes_regex_map=..., iterationvars_regex_map=..., parameters_regex_map=..., scalarName: bool = True, scalarId: bool = True, moduleName: bool = True, runId: bool = False)[source]

(The three regex-map parameters default to predefined tag regex maps for run attributes, iteration variables and run parameters; the full default dictionaries are elided for readability.)
prepare()[source]

Prepare and return a list or a single dask.Delayed task.

yaml_tag = '!PatternMatchingBulkScalarExtractor'
class extractors.PositionExtractor(input_files: list, x_signal: str, x_alias: str, y_signal: str, y_alias: str, signal: str, alias: str, restriction: Tuple[float] | str | None = None, *args, **kwargs)[source]

Bases: OmnetExtractor

Extract the data for a signal, with the associated positions, from the input files specified.

Parameters:
input_files: List[str]

the list of paths to the input files, as literal path or as a regular expression

x_signal: str

the name of the signal with the x-axis coordinates

x_alias: str

the name given to the column with the extracted x-axis position data

y_signal: str

the name of the signal with the y-axis coordinates

y_alias: str

the name given to the column with the extracted y-axis position data

signal: str

the name of the signal to extract

alias: str

the name given to the column with the extracted signal data

restriction: Optional[Union[Tuple[float], str]]

This defines an area restriction on the positions from which the signal data is extracted; the tuple (x0, y0, x1, y1) defines the corners of a rectangle.
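The effect of a rectangular restriction can be sketched with pandas; the column names and sample values below are hypothetical:

```python
import pandas as pd

# Hypothetical positioned samples; columns follow x_alias, y_alias and alias.
data = pd.DataFrame({
    "x": [10.0, 250.0, 400.0],
    "y": [20.0, 300.0, 900.0],
    "cbr": [0.31, 0.42, 0.55],
})

# restriction = (x0, y0, x1, y1): keep only samples inside the rectangle.
x0, y0, x1, y1 = 0.0, 0.0, 500.0, 500.0
inside = data[data.x.between(x0, x1) & data.y.between(y0, y1)]
```

The third sample falls outside the rectangle (y = 900 > y1) and is dropped.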

Attributes:
yaml_flow_style

Methods

convert_columns_dtype(data[, ...])

from_yaml(loader, node)

Convert a representation node to a Python object.

prepare()

Prepare and return a list or a single dask.Delayed task.

read_sql_from_file(db_file, query[, ...])

to_yaml(dumper, data)

Convert a Python object to a representation node.

apply_tags

read_pattern_matched_scalars_from_file

read_pattern_matched_signals_from_file

read_position_and_signal_from_file

read_query_from_file

read_scalars_from_file

read_signals_from_file

read_statistic_from_file

set_tag_maps

yaml_dumper

prepare()[source]

Prepare and return a list or a single dask.Delayed task.

static read_position_and_signal_from_file(db_file, x_signal: str, y_signal: str, x_alias: str, y_alias: str, signal: str, alias: str, restriction: tuple | None = None, moduleName: bool = True, simtimeRaw: bool = True, eventNumber: bool = False, categorical_columns: set[str] | None = None, numerical_columns: dict[str, str] | set[str] | None = None, base_tags=None, additional_tags=None, minimal_tags=True, attributes_regex_map=..., iterationvars_regex_map=..., parameters_regex_map=...)[source]

(The three regex-map parameters default to predefined tag regex maps for run attributes, iteration variables and run parameters; the full default dictionaries are elided for readability.)
yaml_tag = '!PositionExtractor'
class extractors.RawExtractor(input_files: list, signal: str, alias: str, *args, **kwargs)[source]

Bases: OmnetExtractor

Extract the data for a signal from the input files specified.

Parameters:
input_files: List[str]

The list of paths to the input files, as a literal path or as a regular expression.

signal: str

The name of the signal to extract.

alias: str

The name given to the column containing the extracted signal data.

Attributes:
yaml_flow_style

Methods

convert_columns_dtype(data[, ...])

from_yaml(loader, node)

Convert a representation node to a Python object.

prepare()

Prepare and return a list or a single dask.Delayed task.

read_sql_from_file(db_file, query[, ...])

to_yaml(dumper, data)

Convert a Python object to a representation node.

apply_tags

read_pattern_matched_scalars_from_file

read_pattern_matched_signals_from_file

read_query_from_file

read_scalars_from_file

read_signals_from_file

read_statistic_from_file

set_tag_maps

yaml_dumper

prepare()[source]

Prepare and return a list or a single dask.Delayed task.

yaml_tag = '!RawExtractor'
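Based on the constructor signature and the `yaml_tag` above, a `RawExtractor` could plausibly be declared in a YAML recipe like this; the surrounding `extractor:` key, the file glob, and the signal name are illustrative placeholders rather than documented API:

```yaml
# Hypothetical fragment: pull the vector data recorded for one signal
# from every matching SQLite result file into a column named `delay`.
extractor: !RawExtractor
  input_files:
    - 'results/*/vehicle-*.vec.sqlite3'
  signal: 'endToEndDelay:vector'
  alias: 'delay'
```

Calling `prepare()` on the constructed object then yields the `dask.Delayed` task (or list of tasks) that performs the actual extraction.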
class extractors.RawScalarExtractor(input_files: list, signal: str, alias: str, runId: bool = True, scalarName: bool = False, scalarId: bool = False, *args, **kwargs)[source]

Bases: OmnetExtractor

Extract the data for a signal from the scalar table of the input files specified.

Parameters:
input_files: List[str]

The list of paths to the input files, as a literal path or as a regular expression.

signal: str

The name of the signal to extract.

alias: str

The name given to the column containing the extracted signal data.

runId: bool

Whether to also extract the runId column.

scalarName: bool

Whether to also extract the scalarName column.

scalarId: bool

Whether to also extract the scalarId column.

Attributes:
yaml_flow_style

Methods

convert_columns_dtype(data[, ...])

from_yaml(loader, node)

Convert a representation node to a Python object.

prepare()

Prepare and return a list or a single dask.Delayed task.

read_sql_from_file(db_file, query[, ...])

to_yaml(dumper, data)

Convert a Python object to a representation node.

apply_tags

read_pattern_matched_scalars_from_file

read_pattern_matched_signals_from_file

read_query_from_file

read_scalars_from_file

read_signals_from_file

read_statistic_from_file

set_tag_maps

yaml_dumper

prepare()[source]

Prepare and return a list or a single dask.Delayed task.

yaml_tag = '!RawScalarExtractor'
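For illustration, the query behind a scalar extraction can be sketched with the standard `sqlite3` module alone. The `scalar` table layout (`runId`, `moduleName`, `scalarName`, `scalarValue`) is assumed from the OMNeT++ SQLite result schema, and `read_scalar` is a hypothetical helper, not part of this module:

```python
import sqlite3

# Sketch, not the library's implementation: select one named scalar from
# the `scalar` table of an OMNeT++ SQLite result file.  The column names
# are assumptions based on the OMNeT++ result schema.
def read_scalar(db_file: str, scalar_name: str, alias: str,
                run_id: bool = True) -> list[dict]:
    columns = ["moduleName", f"scalarValue AS {alias}"]
    if run_id:
        columns.append("runId")
    query = f"SELECT {', '.join(columns)} FROM scalar WHERE scalarName = ?"
    con = sqlite3.connect(db_file)
    try:
        con.row_factory = sqlite3.Row
        return [dict(row) for row in con.execute(query, (scalar_name,))]
    finally:
        con.close()
```

The `run_id` flag mirrors the extractor's `runId` parameter: when set, each row carries the run it originated from, which keeps results from repeated runs distinguishable after concatenation.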
class extractors.RawStatisticExtractor(input_files: list, signal: str, alias: str, runId: bool = True, statName: bool = False, statId: bool = False, *args, **kwargs)[source]

Bases: OmnetExtractor

Extract the data for a signal from the statistic table of the input files specified.

Parameters:
input_files: List[str]

The list of paths to the input files, as a literal path or as a regular expression.

signal: str

The name of the signal to extract.

alias: str

The name given to the column containing the extracted signal data.

runId: bool

Whether to also extract the runId column.

statName: bool

Whether to also extract the statName column.

statId: bool

Whether to also extract the statId column.

Attributes:
yaml_flow_style

Methods

convert_columns_dtype(data[, ...])

from_yaml(loader, node)

Convert a representation node to a Python object.

prepare()

Prepare and return a list or a single dask.Delayed task.

read_sql_from_file(db_file, query[, ...])

to_yaml(dumper, data)

Convert a Python object to a representation node.

apply_tags

read_pattern_matched_scalars_from_file

read_pattern_matched_signals_from_file

read_query_from_file

read_scalars_from_file

read_signals_from_file

read_statistic_from_file

set_tag_maps

yaml_dumper

prepare()[source]

Prepare and return a list or a single dask.Delayed task.

yaml_tag = '!RawStatisticExtractor'
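Analogously, judging from the signature above, a `RawStatisticExtractor` might be configured in YAML as follows, here also requesting the optional `statName` column; all file and signal names shown are placeholders:

```yaml
# Hypothetical fragment: extract a recorded statistic and keep the
# statName column alongside the extracted values.
extractor: !RawStatisticExtractor
  input_files:
    - 'results/*/vehicle-*.sca.sqlite3'
  signal: 'channelBusyTime:stats'
  alias: 'busy_time'
  statName: true
```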
class extractors.SqlExtractor(input_files: list, query: str, includeFilename: bool = False, *args, **kwargs)[source]

Bases: BaseExtractor

Extract the data from files using a SQL statement.

Parameters:
input_files: List[str]

The list of paths to the input files, as a literal path or as a regular expression.

query: str

The SQL query used to extract data from the input files.

includeFilename: bool

Whether to include the input file name in the filename column of the result DataFrame.

Attributes:
yaml_flow_style

Methods

convert_columns_dtype(data[, ...])

from_yaml(loader, node)

Convert a representation node to a Python object.

prepare()

Prepare and return a list or a single dask.Delayed task.

read_sql_from_file(db_file, query[, ...])

to_yaml(dumper, data)

Convert a Python object to a representation node.

read_query_from_file

set_tag_maps

yaml_dumper

prepare()[source]

Prepare and return a list or a single dask.Delayed task.

static read_query_from_file(db_file, query, includeFilename=False, categorical_columns: set[str] | None = None, numerical_columns: dict[str, str] | set[str] | None = None)[source]
yaml_tag = '!SqlExtractor'
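The behaviour documented for `read_query_from_file` can be approximated with the standard library; `run_query` below is a hypothetical stand-in that returns plain dicts instead of a pandas.DataFrame:

```python
import sqlite3

# Sketch of running an arbitrary SQL query against one SQLite file and,
# like the documented `includeFilename` option, tagging every row with
# the file it came from in a `filename` column.
def run_query(db_file: str, query: str,
              include_filename: bool = False) -> list[dict]:
    con = sqlite3.connect(db_file)
    try:
        con.row_factory = sqlite3.Row
        rows = [dict(row) for row in con.execute(query)]
    finally:
        con.close()
    if include_filename:
        for row in rows:
            row["filename"] = db_file
    return rows
```

Recording the source file per row is what allows data pooled from many input files to be traced back to, or grouped by, the file it came from.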
class extractors.SqlLiteReader(db_file)[source]

Bases: object

A utility class to run a query over a SQLite3 database or to extract the parameters and attributes for a run from a database.

Parameters:
db_file: str

The path to the SQLite3 database file.

Methods

extract_tags(attributes_regex_map, ...)

attribute_extractor

config_extractor

connect

disconnect

execute_sql_query

parameter_extractor

attribute_extractor()[source]
config_extractor()[source]
connect()[source]
disconnect()[source]
execute_sql_query(query)[source]
extract_tags(attributes_regex_map, iterationvars_regex_map, parameters_regex_map)[source]
Extract all tags defined in the given mappings from the `runAttr` and `runParam` tables and parse the value of the `iterationvars` attribute.
See the module `tag_regular_expressions` for the expected structure of the mappings.

Parameters:
attributes_regex_map: dict

The dictionary containing the definitions for the tags to extract from the runAttr table.

iterationvars_regex_map: dict

The dictionary containing the definitions for the tags to extract from the iterationvars attribute.

parameters_regex_map: dict

The dictionary containing the definitions for the tags to extract from the runParam table.
parameter_extractor()[source]
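The connect/disconnect/execute_sql_query surface listed above can be mimicked in a few lines of `sqlite3`; `MiniSqlLiteReader` is a simplified stand-in whose internals are assumptions, with only the method names taken from the documentation:

```python
import sqlite3

# Simplified analogue of SqlLiteReader's documented lifecycle: hold one
# connection to a SQLite file, open and close it explicitly, and run
# ad-hoc queries against it.
class MiniSqlLiteReader:
    def __init__(self, db_file: str):
        self.db_file = db_file
        self.connection = None

    def connect(self):
        self.connection = sqlite3.connect(self.db_file)

    def disconnect(self):
        if self.connection is not None:
            self.connection.close()
            self.connection = None

    def execute_sql_query(self, query: str):
        # Connect lazily so a query can run without an explicit connect().
        if self.connection is None:
            self.connect()
        return self.connection.execute(query).fetchall()
```

Helpers such as `attribute_extractor` or `parameter_extractor` would then be thin wrappers that issue fixed queries against the `runAttr` and `runParam` tables through this connection.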
extractors.register_constructors()[source]

Register YAML constructors for all extractors.