Column Propagation

Analysis of data column propagation timing across the 128 column subnets in PeerDAS.

target_date = None  # Set via papermill, or auto-detect from manifest

import numpy as np
import plotly.graph_objects as go

from loaders import load_parquet

# Number of data columns in PeerDAS
NUM_COLUMNS = 128

# Load column propagation data
df_col_first_seen = load_parquet("col_first_seen", target_date)

print(f"Slots with column data: {len(df_col_first_seen)}")

Slots with column data: 6412

Column first seen

Heatmap showing when each of the 128 data columns was first observed, in milliseconds from slot start. The color scale is clipped to the 1500 to 4000 ms range. Consistent timing across columns indicates healthy propagation; outliers may signal network issues.

# Panel 1: Column first seen (ms into slot start) - 128 columns heatmap

# Reshape for heatmap: one row per data column (c0-c127), one x position per slot start time
col_names = [f"c{i}" for i in range(NUM_COLUMNS)]
df_cols = df_col_first_seen[col_names].T
df_cols.columns = df_col_first_seen["time"]

fig = go.Figure(
    data=go.Heatmap(
        z=df_cols.values,
        x=df_cols.columns,
        y=[str(i) for i in range(NUM_COLUMNS)],
        zmin=1500,
        zmax=4000,
        colorbar=dict(title="ms"),
    )
)
fig.update_layout(
    title="Column first seen (ms into slot start)",
    xaxis_title="Slot Start Time",
    yaxis_title="Column Index",
    yaxis=dict(autorange="reversed"),
    height=800,
)
fig.show()
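
To make the "outliers" reading of this heatmap concrete, here is a minimal sketch, not part of the original notebook, that lists individual (slot, column) cells arriving much later than their slot's median. The helper name `flag_late_cells` and the 500 ms threshold are assumptions chosen for illustration. It assumes the same layout as `df_col_first_seen` (a `time` column plus one `c{i}` column per data column) and runs on a small synthetic frame with 4 columns instead of 128.

```python
import pandas as pd


def flag_late_cells(df, col_names, threshold_ms=500.0):
    """List cells that arrived more than threshold_ms after their slot's median.

    Hypothetical helper; the threshold is an illustrative assumption.
    """
    values = df[col_names]
    # Per-slot median first-seen time across all data columns
    slot_median = values.median(axis=1)
    # How far each cell sits above its slot's median (ms)
    excess = values.sub(slot_median, axis=0)
    # Long form: one row per (slot, column) pair
    long = (
        excess.assign(time=df["time"].values)
        .melt(id_vars="time", var_name="column", value_name="late_by_ms")
    )
    return long[long["late_by_ms"] > threshold_ms].reset_index(drop=True)


# Synthetic data, for illustration only
demo_cols = [f"c{i}" for i in range(4)]
demo = pd.DataFrame(
    {
        "time": pd.to_datetime(["2025-01-01 00:00:00", "2025-01-01 00:00:12"]),
        "c0": [2000.0, 2100.0],
        "c1": [2050.0, 2150.0],
        "c2": [2100.0, 3500.0],  # second slot: c2 arrives late
        "c3": [2020.0, 2120.0],
    }
)
late = flag_late_cells(demo, demo_cols)
print(late)
```

On the real data the call would presumably be `flag_late_cells(df_col_first_seen, col_names)`; a sensible threshold depends on the network and would need tuning.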

Delta from fastest column (intraslot, ms)

Shows how much later each column arrived than the fastest column in the same slot, with the color scale clipped to the 0 to 250 ms range. Columns that consistently lag behind may indicate propagation bottlenecks.

# Compute delta from min value per slot for each column
col_names = [f"c{i}" for i in range(NUM_COLUMNS)]
df_delta = df_col_first_seen.copy()

# Calculate row-wise minimum and subtract it from every column at once
row_mins = df_delta[col_names].min(axis=1)
df_delta[col_names] = df_delta[col_names].sub(row_mins, axis=0)

# Reshape for heatmap
df_delta_cols = df_delta[col_names].T
df_delta_cols.columns = df_delta["time"]

fig = go.Figure(
    data=go.Heatmap(
        z=df_delta_cols.values,
        x=df_delta_cols.columns,
        y=[str(i) for i in range(NUM_COLUMNS)],
        colorscale="Inferno",
        reversescale=False,
        zmin=0,
        zmax=250,
        colorbar=dict(title="ms"),
    )
)
fig.update_layout(
    title="Delta from fastest column (intraslot, ms)",
    xaxis_title="Slot Start Time",
    yaxis_title="Column Index",
    yaxis=dict(autorange="reversed"),
    height=800,
)
fig.show()
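
The heatmap shows lagging columns visually; a ranking can make the same point numerically. The following is a minimal sketch under stated assumptions: the helper name `rank_lagging_columns` is hypothetical, the median is one possible choice of summary statistic, and the data is synthetic with 3 columns instead of 128.

```python
import pandas as pd


def rank_lagging_columns(df, col_names, top_n=3):
    """Return the top_n columns with the largest median delta (hypothetical helper)."""
    values = df[col_names]
    # Delta from the fastest column in each slot (ms)
    delta = values.sub(values.min(axis=1), axis=0)
    # The median is less sensitive than the mean to a single bad slot
    return delta.median(axis=0).sort_values(ascending=False).head(top_n)


# Synthetic data, for illustration only
demo_cols = ["c0", "c1", "c2"]
demo = pd.DataFrame(
    {
        "c0": [2000.0, 2100.0, 1900.0],
        "c1": [2010.0, 2130.0, 1905.0],
        "c2": [2200.0, 2250.0, 2150.0],  # consistently slowest
    }
)
laggards = rank_lagging_columns(demo, demo_cols, top_n=2)
print(laggards)
```

A column that appears near the top of this ranking across several days would be a stronger candidate for a propagation bottleneck than one that lags on a single day.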

Delta normalized (0-1)

Same delta data normalized to a 0–1 scale per slot, making it easier to compare relative propagation order regardless of absolute timing. Columns closer to 0 arrived first; those near 1 arrived last. Slots in which every column was first seen at the same time have zero range and are left as NaN.

# Normalize delta values to 0-1 range per slot
col_names = [f"c{i}" for i in range(NUM_COLUMNS)]
df_normalized = df_col_first_seen.copy()

# Calculate row-wise min and range; zero ranges become NaN to avoid division by zero
row_mins = df_normalized[col_names].min(axis=1)
row_maxs = df_normalized[col_names].max(axis=1)
row_ranges = (row_maxs - row_mins).replace(0, np.nan)

# Normalize every column at once
df_normalized[col_names] = (
    df_normalized[col_names].sub(row_mins, axis=0).div(row_ranges, axis=0)
)

# Reshape for heatmap
df_norm_cols = df_normalized[col_names].T
df_norm_cols.columns = df_normalized["time"]

fig = go.Figure(
    data=go.Heatmap(
        z=df_norm_cols.values,
        x=df_norm_cols.columns,
        y=[str(i) for i in range(NUM_COLUMNS)],
        colorscale="YlGnBu",
        reversescale=True,
        zmin=0,
        zmax=1,
        colorbar=dict(title="Normalized"),
    )
)
fig.update_layout(
    title="Delta normalized (0-1)",
    xaxis_title="Slot Start Time",
    yaxis_title="Column Index",
    yaxis=dict(autorange="reversed"),
    height=800,
)
fig.show()
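
As a small worked example of the per-slot normalization, the sketch below applies the same min-max formula to a synthetic frame with 3 columns, including one slot where all columns tie. The function name `normalize_per_slot` is hypothetical, and the per-column mean at the end is an assumed follow-up summary, not something the notebook computes.

```python
import numpy as np
import pandas as pd


def normalize_per_slot(df, col_names):
    """Min-max normalize each row (slot) across col_names; zero-range slots become NaN."""
    values = df[col_names]
    row_mins = values.min(axis=1)
    # Replace zero ranges with NaN so tied slots do not divide by zero
    row_ranges = (values.max(axis=1) - row_mins).replace(0, np.nan)
    return values.sub(row_mins, axis=0).div(row_ranges, axis=0)


# Synthetic data, for illustration only
demo_cols = ["c0", "c1", "c2"]
demo = pd.DataFrame(
    {
        "c0": [2000.0, 1800.0],
        "c1": [2100.0, 1800.0],
        "c2": [2400.0, 1800.0],  # second slot: all columns tie
    }
)
norm = normalize_per_slot(demo, demo_cols)
print(norm)

# Mean normalized position per column; pandas skips NaN slots by default
mean_position = norm.mean(axis=0)
print(mean_position)
```

In the first slot the range is 400 ms, so `c1` lands at (2100 - 2000) / 400 = 0.25. The second slot has zero range and is NaN for every column, so it does not contribute to the mean.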