# Streaming Feature Extraction

This notebook demonstrates how to use the `interpreTS` library for feature extraction in a streaming context, where time series data is processed as it arrives rather than loaded all at once.

## Step 1: Import Libraries

In [1]:
from interpreTS.core.feature_extractor import FeatureExtractor, Features

import random
import time
from datetime import datetime, timedelta

## Step 2: Define a Data Stream Generator

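We'll simulate a time series data stream of 200 points whose timestamps are spaced 30 seconds apart. To keep the demo fast, the generator does not actually wait 30 seconds between points; it pauses for only 10 ms (`time.sleep(0.01)`) before yielding the next one.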

In [2]:
def generate_stream():
    current_time = datetime.now()
    for i in range(200):    # Generate 200 data points
        yield {
            'id': 'series_1',   # Identifier for the time series
            'time': current_time + timedelta(seconds=30 * i),   # Timestamp
            'value': random.random()    # Randomly generated value
        }
        time.sleep(0.01)    # Brief pause to mimic arrival delay (not the 30 s timestamp gap)
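
When experimenting with a stream like this, it can help to make it reproducible and to look at only the first few records. The following is a minimal sketch, not part of the original notebook: `generate_seeded_stream` and its `n_points`, `seed`, and `start` parameters are hypothetical names introduced here, and the fixed start date is an arbitrary assumption.

```python
import random
from datetime import datetime, timedelta
from itertools import islice


def generate_seeded_stream(n_points=200, seed=42, start=None):
    """Hypothetical reproducible variant of generate_stream (no sleeping)."""
    rng = random.Random(seed)  # private RNG, leaves the global random state alone
    start = start or datetime(2024, 1, 1)  # arbitrary fixed start (assumption)
    for i in range(n_points):
        yield {
            'id': 'series_1',                            # same identifier as above
            'time': start + timedelta(seconds=30 * i),   # 30-second spacing
            'value': rng.random(),                       # seeded pseudo-random value
        }


# Peek at the first three records without consuming the whole stream
first_three = list(islice(generate_seeded_stream(), 3))
for record in first_three:
    print(record['time'], round(record['value'], 3))
```

Because the generator is lazy, `islice` stops after three records and the remaining 197 are never produced.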

## Step 3: Initialize the FeatureExtractor for Streaming

We'll set up the `FeatureExtractor` to calculate the mean and variance of the `value` column over windows of 5 data points. The `id_column` argument names the column that identifies which time series each record belongs to.

In [3]:
feature_extractor_stream = FeatureExtractor(
    features=[Features.MEAN, Features.VARIANCE],    # Features to extract
    feature_column="value", # Data column from which to extract features
    window_size=5,  # Number of points per window
    id_column="id"  # Group by the 'id' column
)
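
To make the idea of windowed extraction concrete, here is a rough pure-Python sketch of computing the mean and variance over fixed-size windows of a stream. It is not the `interpreTS` implementation. The function `windowed_stats` and its `stride` parameter are hypothetical, the sketch assumes a single series (it does not group by `id`), and it uses population variance. The notebook does not say whether the library's windows overlap or which variance definition it uses.

```python
from collections import deque
from statistics import fmean, pvariance


def windowed_stats(stream, window_size=5, stride=1, column="value"):
    """Yield mean/variance dicts over a stream of records (illustrative only)."""
    window = deque(maxlen=window_size)  # holds only the latest window_size values
    until_emit = window_size            # records still needed before the next result
    for record in stream:
        window.append(record[column])
        until_emit -= 1
        if until_emit == 0:
            until_emit = stride         # wait `stride` more records before emitting again
            yield {
                f"mean_{column}": fmean(window),
                f"variance_{column}": pvariance(window),  # population variance (assumption)
                "id": record["id"],
            }


# Ten records with values 1.0 .. 10.0, split into two non-overlapping windows
records = [{"id": "series_1", "value": float(v)} for v in range(1, 11)]
results = list(windowed_stats(records, window_size=5, stride=5))
print(results)
```

With `stride=5` the windows do not overlap, so ten records give two results. With `stride=1` the window slides one record at a time and the same input gives six results.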

## Step 4: Extract Features in a Streaming Context

Using the `extract_features_stream` method, we'll process the data stream and print the extracted features for each window of 5 data points.

In [4]:
for features in feature_extractor_stream.extract_features_stream(generate_stream()):
    print(features)

{'mean_value': 0.5302559014050979, 'variance_value': 0.05253585541039717, 'id': 'series_1'}
{'mean_value': 0.5763003255145184, 'variance_value': 0.056443627047807934, 'id': 'series_1'}
{'mean_value': 0.5002972150764308, 'variance_value': 0.0263593570015553, 'id': 'series_1'}
{'mean_value': 0.5603844489807113, 'variance_value': 0.012360516245759023, 'id': 'series_1'}
{'mean_value': 0.5715475273281434, 'variance_value': 0.008600746357379363, 'id': 'series_1'}
{'mean_value': 0.5941913312223666, 'variance_value': 0.01244589462280257, 'id': 'series_1'}
{'mean_value': 0.5599190670382057, 'variance_value': 0.009149717681769174, 'id': 'series_1'}
{'mean_value': 0.6225206242724437, 'variance_value': 0.02310858422510515, 'id': 'series_1'}
{'mean_value': 0.674171340523812, 'variance_value': 0.02774186833184048, 'id': 'series_1'}
{'mean_value': 0.7055771874159391, 'variance_value': 0.016065929277289052, 'id': 'series_1'}
{'mean_value': 0.6704200527203796, 'variance_value': 0.019687636663485837, 'i