API Reference
This page is generated with sphinx.ext.autodoc and sphinx.ext.autosummary directly from the source
docstrings.
Core pipeline
Core pipeline orchestrator for Sensor2EventLog framework
- class core.pipeline.Sensor2EventLogPipeline(config_module=None)
Bases: object
Main pipeline class for transforming sensor data to event logs.
This class orchestrates the entire Machine Teaching process:
1. Feature extraction
2. Model training (HMM with supervised/unsupervised modes)
3. Diagnostic analysis
4. Event log generation
Example
>>> pipeline = Sensor2EventLogPipeline(config)
>>> result = pipeline.run(
...     data_path="sensor_data.csv",
...     feature_plan=feature_plan,
...     mode="unsupervised"
... )
>>> event_log = result['event_log']
>>> event_log.to_xes("output.xes")
- __init__(config_module=None)
Initialize the pipeline with configuration.
Parameters:
- config_module : module, optional
Configuration module with all parameters. If None, uses the default config.
- run(data_path: str, feature_plan: Dict[str, list], mode: str = 'unsupervised', use_cip: bool = False, n_unsup: int | None = None, random_seed: int = 42, min_duration_seconds: float = 2.0, return_intermediate: bool = False) -> Dict[str, Any]
Run the complete pipeline.
Parameters:
- data_path : str
Path to CSV data file
- feature_plan : dict
Feature extraction plan with families and signals
- mode : str
“supervised” or “unsupervised”
- use_cip : bool
Whether to include CIP states
- n_unsup : int
Number of states for unsupervised mode
- random_seed : int
Random seed for reproducibility
- min_duration_seconds : float
Minimum duration for filtering brief states
- return_intermediate : bool
If True, returns intermediate results (features, diagnostics)
Returns:
- dict with keys:
event_log: EventLog object
model: trained HMM model
predictions: predicted state sequences
features (if return_intermediate): extracted features
diagnostics (if return_intermediate): diagnostic results
Machine Teaching loop
Machine Teaching loop for Sensor2EventLog framework.
Event log
Event log object with PM4Py compatibility
- class contextualization.event_log.Event(case_id: str, activity: str, start_time: datetime, end_time: datetime, duration: float | None = None, **kwargs)
Bases: object
Single event in an event log.
Attributes:
- case_id : str
Identifier for the process case
- activity : str
Name of the activity/state
- start_time : datetime
Start timestamp of the event
- end_time : datetime
End timestamp of the event
- duration : float
Duration in seconds
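The attributes above can be pictured with a minimal stand-in. This is a sketch rather than the library's class: `EventSketch` is a hypothetical name, and deriving `duration` from `end_time - start_time` when it is omitted is an assumption suggested by the `duration: float | None = None` default.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class EventSketch:
    """Hypothetical stand-in mirroring the documented Event attributes."""
    case_id: str
    activity: str
    start_time: datetime
    end_time: datetime
    duration: Optional[float] = None  # seconds

    def __post_init__(self) -> None:
        # Assumption: a missing duration is derived from the two timestamps.
        if self.duration is None:
            self.duration = (self.end_time - self.start_time).total_seconds()


event = EventSketch(
    case_id="batch_1",
    activity="heating",
    start_time=datetime(2023, 1, 1, 8, 0, 0),
    end_time=datetime(2023, 1, 1, 8, 0, 30),
)
print(event.duration)
```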
- class contextualization.event_log.EventLog(data: DataFrame | List[Event])
Bases: object
Event log container with PM4Py compatibility.
This class provides a standardized interface for event logs that can be exported to various formats (CSV, XES) and used with process mining tools like PM4Py.
Example
>>> log = EventLog(df)
>>> log.to_csv("event_log.csv")
>>> log.to_xes("event_log.xes")
>>> pm4py_log = log.to_pm4py()  # Use with PM4Py
- __init__(data: DataFrame | List[Event])
Initialize event log from DataFrame or list of Events.
Parameters:
- data : pd.DataFrame or List[Event]
Input event log data
- to_csv(path: str, filtered: bool = False) -> None
Export event log to CSV.
Parameters:
- path : str
Output file path
- filtered : bool
If True, saves the filtered version (if available)
- to_xes(path: str, case_id_key: str = 'case:concept:name', timestamp_key: str = 'time:timestamp') -> None
Export event log to XES format using PM4Py.
Parameters:
- path : str
Output file path
- case_id_key : str
Column name to use as case identifier in XES
- timestamp_key : str
Column name to use as timestamp in XES
- to_pm4py(case_id_key: str = 'case:concept:name', timestamp_key: str = 'time:timestamp') -> pm4py.objects.log.obj.EventLog
Convert to PM4Py EventLog object for further analysis.
Parameters:
- case_id_key : str
Column name to use as case identifier
- timestamp_key : str
Column name to use as timestamp
Returns:
- pm4py.objects.log.obj.EventLog
PM4Py event log object
- filter_duration(min_seconds: float = 0, max_seconds: float = inf) -> EventLog
Filter events by duration.
Parameters:
- min_seconds : float
Minimum duration in seconds
- max_seconds : float
Maximum duration in seconds
Returns:
- EventLog
Filtered event log
- get_statistics() -> Dict[str, Any]
Compute basic statistics about the event log.
Returns:
- dict with:
total_cases: number of cases
total_events: number of events
unique_activities: number of distinct activities
avg_case_duration: average case duration in seconds
activity_frequencies: frequency of each activity
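To make the duration filter and the statistics keys concrete, here is a small standard-library sketch on plain tuples. It does not use `EventLog`. The helper names, the inclusive bounds, and the definition of a case's duration as the sum of its event durations are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import inf

# Hypothetical events: (case_id, activity, duration in seconds).
events = [
    ("batch_1", "heating", 30.0),
    ("batch_1", "mixing", 1.0),
    ("batch_1", "cooling", 60.0),
    ("batch_2", "heating", 45.0),
    ("batch_2", "cooling", 15.0),
]


def filter_by_duration(rows, min_seconds=0.0, max_seconds=inf):
    """Keep events whose duration lies in [min_seconds, max_seconds] (assumed inclusive)."""
    return [row for row in rows if min_seconds <= row[2] <= max_seconds]


def basic_statistics(rows):
    """Compute the five documented keys from plain tuples."""
    per_case = defaultdict(float)
    for case_id, _activity, duration in rows:
        # Assumption: a case's duration is the sum of its event durations.
        per_case[case_id] += duration
    frequencies = Counter(activity for _case, activity, _duration in rows)
    average = sum(per_case.values()) / len(per_case) if per_case else 0.0
    return {
        "total_cases": len(per_case),
        "total_events": len(rows),
        "unique_activities": len(frequencies),
        "avg_case_duration": average,
        "activity_frequencies": dict(frequencies),
    }


filtered = filter_by_duration(events, min_seconds=2.0)  # drops the 1-second "mixing" event
stats = basic_statistics(filtered)
print(stats)
```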
Feature library
Modular feature extraction library with diagnostic capabilities
- class features.feature_library.ModularFeatureLibrary(window_sizes=None, stability_eps=1, peak_threshold=0.1)
Bases: object
Modular feature extraction library supporting multiple feature families with integrated rule diagnostics.
- compute_features(df, feature_plan: Dict[str, List[str]])
Compute features based on a feature plan.
- analyze_rule_performance(df: DataFrame, feature_plan: Dict[str, List[str]]) -> Dict
Compute features and analyze rule performance.
Parameters:
df : Input data with sensor signals and state labels
feature_plan : Feature plan including event rules
Returns:
Dict with features and diagnostic results
Rule diagnostics
Rule diagnostic analyzer for evaluating rule performance metrics
- class evaluation.rule_analyzer.RuleDiagnosticAnalyzer(coverage_threshold: float = 0.6, precision_threshold: float = 0.7, explainability_threshold: float = 0.3)
Bases: object
Analyzes rule performance using coverage, precision, and explainability metrics.
- compute_rule_metrics(df: DataFrame, event_features: DataFrame, state_column: str = 'state') -> Dict
Compute coverage, precision, and explainability metrics for all event features.
Parameters:
df : DataFrame with state labels
event_features : DataFrame containing event rule features (binary columns)
state_column : Column name containing state labels
Returns:
Dict with comprehensive diagnostic results
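This page does not define the metrics, so the sketch below only illustrates one common reading of coverage and precision for a binary rule against a single state. The definitions in the docstring, the function name, and the sample data are assumptions and may differ from what `RuleDiagnosticAnalyzer` computes. Explainability is left out because nothing here indicates how it is measured.

```python
def coverage_and_precision(rule_fires, states, target_state):
    """Return (coverage, precision) of one binary rule for one state.

    Assumed definitions, not taken from the library:
      coverage  = samples in target_state where the rule fires / samples in target_state
      precision = samples in target_state where the rule fires / samples where the rule fires
    """
    in_state = [state == target_state for state in states]
    hits = sum(1 for fired, inside in zip(rule_fires, in_state) if fired and inside)
    n_state = sum(in_state)
    n_fired = sum(1 for fired in rule_fires if fired)
    coverage = hits / n_state if n_state else 0.0
    precision = hits / n_fired if n_fired else 0.0
    return coverage, precision


states = ["idle", "heating", "heating", "heating", "heating", "cooling"]
rule_fires = [0, 1, 1, 1, 0, 1]  # hypothetical binary event feature

coverage, precision = coverage_and_precision(rule_fires, states, "heating")
print(coverage, precision)
# Compare against the documented default thresholds (0.6 and 0.7).
print(coverage >= 0.6, precision >= 0.7)
```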
Models
Base model interface for pluggable models
- class models.base_model.BaseModel
Bases: ABC
Abstract base class for all models in Sensor2EventLog.
This interface ensures that all models can be used interchangeably in the Machine Teaching loop.
- abstract fit(X: ndarray, lengths: List[int], y: ndarray | None = None) -> BaseModel
Fit the model to training data.
Parameters:
- X : np.ndarray
Feature matrix (n_samples, n_features)
- lengths : List[int]
Lengths of each sequence
- y : np.ndarray, optional
Labels for supervised learning
Returns:
- self : BaseModel
Fitted model
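The `X` and `lengths` pair suggests the convention, also used by hmmlearn, in which several sequences are stacked row-wise into one matrix and `lengths` records how many rows belong to each. Assuming that convention holds here, the following sketch shows how the pair maps back to individual sequences. `split_by_lengths` is a hypothetical helper, and nested lists stand in for the NumPy array.

```python
from typing import List, Sequence


def split_by_lengths(X: Sequence[Sequence[float]], lengths: List[int]) -> list:
    """Split a stacked feature matrix back into per-sequence blocks."""
    if sum(lengths) != len(X):
        raise ValueError("lengths must sum to the number of rows in X")
    blocks, start = [], 0
    for n in lengths:
        blocks.append(X[start:start + n])
        start += n
    return blocks


# Two sequences (3 rows and 2 rows) stacked into one 5 x 2 matrix.
X = [[0.1, 1.0], [0.2, 1.1], [0.3, 1.2], [5.0, 9.0], [5.1, 9.1]]
sequences = split_by_lengths(X, lengths=[3, 2])
print([len(seq) for seq in sequences])
```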
Hidden Markov Model implementation
- class models.hmm_model.HMMModel(config=None)
Bases: BaseModel
Gaussian Hidden Markov Model for process state discovery.
Supports both supervised and unsupervised learning modes.
- __init__(config=None)
Initialize HMM model.
Parameters:
- config : module
Configuration module with HMM parameters
- fit(X: ndarray, lengths: List[int], y: ndarray | None = None) -> HMMModel
Fit HMM to data.
If y is provided, uses supervised initialization. Otherwise, uses unsupervised learning.
- predict(X: ndarray, lengths: List[int]) -> ndarray
Predict state sequence using Viterbi algorithm.
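For readers unfamiliar with Viterbi decoding, here is a compact log-space sketch in pure Python. It is a generic textbook formulation, not the code behind `HMMModel.predict`. To stay independent of the Gaussian emission model, it takes precomputed per-state log-likelihoods as input, and all names and numbers are illustrative.

```python
import math


def viterbi(log_start, log_trans, log_emit):
    """Return the most likely state path.

    log_start[i]    : log P(first state = i)
    log_trans[i][j] : log P(next state = j | current state = i)
    log_emit[t][i]  : log-likelihood of observation t under state i
    """
    n_states = len(log_start)
    score = [log_start[i] + log_emit[0][i] for i in range(n_states)]
    backpointers = []
    for t in range(1, len(log_emit)):
        new_score, pointers = [], []
        for j in range(n_states):
            # Best predecessor of state j at time t.
            best_i = max(range(n_states), key=lambda i: score[i] + log_trans[i][j])
            pointers.append(best_i)
            new_score.append(score[best_i] + log_trans[best_i][j] + log_emit[t][j])
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


log = math.log
log_start = [log(0.5), log(0.5)]
log_trans = [[log(0.9), log(0.1)], [log(0.1), log(0.9)]]  # "sticky" states
log_emit = [
    [log(0.9), log(0.1)],
    [log(0.9), log(0.1)],
    [log(0.1), log(0.9)],
    [log(0.1), log(0.9)],
]
path = viterbi(log_start, log_trans, log_emit)
print(path)
```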
Utilities
HMM utility functions for training, evaluation, and event log generation
- utils.hmm_utils.empirical_start_trans(labels, lengths, n_states)
Estimate startprob_ and transmat_ from labeled sequences.
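One plausible way to obtain such estimates is simple counting: tally the first label of each sequence and the label-to-label transitions within each sequence, then normalize. The sketch below illustrates that idea and is not the body of `empirical_start_trans`. The function name, the integer label encoding, and the uniform fallback for empty rows are assumptions.

```python
def start_and_transitions(labels, lengths, n_states):
    """Count-based estimates of start probabilities and the transition matrix."""
    start_counts = [0.0] * n_states
    trans_counts = [[0.0] * n_states for _ in range(n_states)]
    offset = 0
    for n in lengths:
        seq = labels[offset:offset + n]
        offset += n
        if not seq:
            continue
        start_counts[seq[0]] += 1
        # Transitions are counted within a sequence, never across a boundary.
        for current, following in zip(seq, seq[1:]):
            trans_counts[current][following] += 1

    def normalize(row):
        total = sum(row)
        # Assumption: a row with no observations falls back to uniform.
        return [c / total for c in row] if total else [1.0 / len(row)] * len(row)

    return normalize(start_counts), [normalize(row) for row in trans_counts]


labels = [0, 0, 1, 1, 0, 1, 1, 1]  # two labeled sequences of length 4
lengths = [4, 4]
startprob, transmat = start_and_transitions(labels, lengths, n_states=2)
print(startprob)
print(transmat)
```

No smoothing is applied here, so a transition that never occurs in the labels gets probability zero. Whether the library adds pseudo-counts is not stated on this page.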
- utils.hmm_utils.emissions_from_labels(X_np, labels_np, n_states)
Compute means and covariances per labeled state.
- utils.hmm_utils.print_evaluation(y_true_idx, y_pred_idx, idx_to_state, state_list, title='')
Print classification report and confusion matrix.
- utils.hmm_utils.normalize_timestamps(df, timestamp_col='timestamp', case_id_col='batch_id', base_date='2023-01-01')
Normalize timestamps by handling different time units properly.
- utils.hmm_utils.create_interval_event_log_normalized(df, y_pred, state_mapping, case_id_col='batch_id', timestamp_col='timestamp')
Create interval-based event log using normalized timestamps.
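An interval-based event log can be pictured as run-length encoding of the per-sample predictions: consecutive samples with the same state collapse into one event with a start and an end. The sketch below shows that idea for a single case. It is not the library function. `to_intervals` is a hypothetical name, and taking the last sample of a run as the interval end is an assumption.

```python
from datetime import datetime, timedelta
from itertools import groupby


def to_intervals(timestamps, states):
    """Collapse consecutive identical states into (state, start, end) intervals."""
    intervals = []
    index = 0
    for state, run in groupby(states):
        run_length = len(list(run))
        start = timestamps[index]
        # Assumption: the interval ends at the last sample of the run.
        end = timestamps[index + run_length - 1]
        intervals.append((state, start, end))
        index += run_length
    return intervals


t0 = datetime(2023, 1, 1)
timestamps = [t0 + timedelta(seconds=s) for s in range(6)]
states = ["idle", "idle", "heating", "heating", "heating", "idle"]

intervals = to_intervals(timestamps, states)
for state, start, end in intervals:
    print(state, (end - start).total_seconds())
```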
Configuration
Configuration parameters for HMM process analyzer