The validation module uses Pydantic to ensure data integrity and catch malformed responses from the CDN.

Overview

JSON data from the CDN can be validated using Pydantic models before conversion to DataFrames. This catches:
  • Missing required fields
  • Incorrect data types
  • Inconsistent array lengths
  • Invalid enum values
  • Null-like string values ("", "none", "null", "nan")
Validation is disabled by default for optimal performance. Enable it during development or when data quality is uncertain.
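
As a rough illustration of the last check in the list above, the sketch below normalizes null-like strings using only the standard library. It is not tif1's implementation: the helper name `normalize_nulls` is hypothetical, and matching case-insensitively while ignoring surrounding whitespace is an assumption.

```python
# Hypothetical helper, not part of tif1: illustrates null-like string handling.
NULL_LIKE = {"", "none", "null", "nan"}


def normalize_nulls(values):
    """Return a copy of values with null-like strings replaced by None."""
    return [
        # Assumption: comparison ignores case and surrounding whitespace.
        None if isinstance(v, str) and v.strip().lower() in NULL_LIKE else v
        for v in values
    ]


compounds = normalize_nulls(["SOFT", "", "none", "NaN", "HARD"])
print(compounds)
```

Non-string values such as floats and booleans pass through unchanged, so the helper can be applied to every list in a payload.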

Validation Functions

### `validate_laps`

Validate lap timing data structure.
```python
def validate_laps(data: dict) -> LapData
```

**Parameters:**
- `data`: Raw JSON dictionary from CDN

**Returns:**
- Validated `LapData` Pydantic model

**Raises:**
- `InvalidDataError`: If validation fails

**Example:**
```python
from tif1.validation import validate_laps

raw_data = {
    "time": [90.123, 89.456, 88.789],
    "lap": [1.0, 2.0, 3.0],
    "s1": [30.1, 29.8, 29.5],
    "s2": [30.0, 29.7, 29.4],
    "s3": [30.0, 29.9, 29.9],
    "compound": ["SOFT", "SOFT", "SOFT"],
    "stint": [1, 1, 1],
    "life": [1, 2, 3],
    "pos": [1, 1, 1],
    "status": ["1", "1", "1"],
    "pb": [False, True, True],
}

validated = validate_laps(raw_data)
print(f"Validated {len(validated.lap)} laps")
```

### `validate_telemetry`

Validate high-frequency telemetry data.
```python
def validate_telemetry(data: dict) -> TelemetryData
```

**Parameters:**
- `data`: Raw JSON dictionary from CDN

**Returns:**
- Validated `TelemetryData` Pydantic model

**Raises:**
- `InvalidDataError`: If validation fails

**Example:**
```python
from tif1.validation import validate_telemetry

raw_data = {
    "time": [0.0, 0.1, 0.2],
    "speed": [250.5, 251.2, 252.0],
    "rpm": [12000, 12100, 12200],
    "gear": [7, 7, 7],
    "throttle": [100.0, 100.0, 99.5],
    "brake": [False, False, False],
    "drs": [True, True, True],
}

validated = validate_telemetry(raw_data)
print(f"Validated {len(validated.time)} samples")
```

### `validate_drivers`

Validate driver information data.
```python
def validate_drivers(data: dict) -> DriversData
```

**Parameters:**
- `data`: Raw JSON dictionary from CDN

**Returns:**
- Validated `DriversData` Pydantic model

**Example:**
```python
from tif1.validation import validate_drivers

raw_data = {
    "drivers": [
        {
            "driver": "VER",
            "team": "Red Bull Racing",
            "dn": "33",
            "fn": "Max",
            "ln": "Verstappen",
            "tc": "3671C6",
            "url": "https://example.com/verstappen.png"
        }
    ]
}

validated = validate_drivers(raw_data)
print(f"Validated {len(validated.drivers)} drivers")
```

### `validate_weather`

Validate weather data structure.
```python
def validate_weather(data: dict) -> WeatherData
```

**Parameters:**
- `data`: Raw JSON dictionary from CDN

**Returns:**
- Validated `WeatherData` Pydantic model

**Example:**
```python
from tif1.validation import validate_weather

raw_data = {
    "wT": [0.0, 60.0, 120.0],
    "wAT": [25.5, 25.7, 25.9],
    "wTT": [35.2, 35.5, 35.8],
    "wH": [60.0, 61.0, 62.0],
    "wP": [1013.0, 1013.2, 1013.5],
    "wR": [False, False, False],
    "wWD": [180.0, 185.0, 190.0],
    "wWS": [5.0, 5.5, 6.0],
}

validated = validate_weather(raw_data)
print(f"Validated {len(validated.time)} weather samples")
```

---

### `validate_race_control`

Validate race control messages.

```python
def validate_race_control(data: dict) -> RaceControlData
```

**Parameters:**
- `data`: Raw JSON dictionary from CDN

**Returns:**
- Validated `RaceControlData` Pydantic model

**Example:**
```python
from tif1.validation import validate_race_control

raw_data = {
    "time": [0.0, 300.0],
    "cat": ["Flag", "SafetyCar"],
    "msg": ["GREEN FLAG", "SAFETY CAR DEPLOYED"],
    "status": ["1", "4"],
    "flag": ["GREEN", "YELLOW"],
    "scope": ["Track", "Track"],
}

validated = validate_race_control(raw_data)
print(f"Validated {len(validated.time)} messages")
```

Pydantic Models

LapData

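Model for lap timing data. Each field holds a list with one entry per lap (the type shown is the element type), and all non-empty lists must share the same length.
Required Fields: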
  • time: Lap time in seconds (float | None)
  • lap: Lap number (float | None)
  • s1, s2, s3: Sector times in seconds (float | None)
  • compound: Tire compound (str | None)
  • stint: Stint number (int | None)
  • life: Tire age in laps (int | None)
  • pos: Position (int | None)
  • status: Track status code (str | None)
  • pb: Personal best indicator (bool | None)
Optional Fields (with aliases):
  • session_time (alias: sesT): Session time (float | None)
  • source_driver (alias: drv): Driver code (str | None)
  • driver_number (alias: dNum): Driver number (str | None)
  • pit_out_time (alias: pout): Pit out time (float | None)
  • pit_in_time (alias: pin): Pit in time (float | None)
  • sector1_session_time (alias: s1T): Sector 1 session time (float | None)
  • sector2_session_time (alias: s2T): Sector 2 session time (float | None)
  • sector3_session_time (alias: s3T): Sector 3 session time (float | None)
  • speed_i1 (alias: vi1): Speed trap I1 (float | None)
  • speed_i2 (alias: vi2): Speed trap I2 (float | None)
  • speed_fl (alias: vfl): Speed trap finish line (float | None)
  • speed_st (alias: vst): Speed trap straight (float | None)
  • fresh_tyre (alias: fresh): Fresh tire indicator (bool | None)
  • deleted (alias: del): Deleted lap indicator (bool | None)
  • Weather fields: air_temp, track_temp, humidity, pressure, rainfall, wind_direction, wind_speed
Validation:
  • All non-empty lists must have same length
  • Stint numbers must be >= 1
  • Tire life must be >= 0
  • Null-like strings ("", "none", "null", "nan") are converted to None
Example:
from tif1.validation import LapData

lap_data = LapData(
    time=[90.1, 89.5, 88.9],
    lap=[1.0, 2.0, 3.0],
    s1=[30.0, 29.8, 29.5],
    s2=[30.1, 29.9, 29.6],
    s3=[30.0, 29.8, 29.8],
    compound=["SOFT", "SOFT", "SOFT"],
    stint=[1, 1, 1],
    life=[1, 2, 3],
    pos=[1, 1, 1],
    status=["1", "1", "1"],
    pb=[False, True, True],
)

print(f"Valid lap data with {len(lap_data.lap)} laps")
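
The range rules listed under Validation for LapData (stint numbers >= 1, tire life >= 0) can be sketched in plain Python as follows. This is only an illustration of the rules, not the model's actual validator code. The function name `check_lap_rules` is hypothetical, and skipping None entries is an assumption based on the `int | None` field types.

```python
# Hypothetical helper, not part of tif1: mirrors the documented LapData range rules.
def check_lap_rules(stint, life):
    """Return a list of human-readable rule violations (empty if none)."""
    problems = []
    for i, value in enumerate(stint):
        # Assumption: None entries are allowed and skipped.
        if value is not None and value < 1:
            problems.append(f"stint[{i}]={value} must be >= 1")
    for i, value in enumerate(life):
        if value is not None and value < 0:
            problems.append(f"life[{i}]={value} must be >= 0")
    return problems


print(check_lap_rules(stint=[1, 0, 2], life=[3, -1, None]))
```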

TelemetryData

Model for high-frequency telemetry data.
Required Fields:
  • time: Time from lap start in seconds (float | None)
  • speed: Speed in km/h (float | None)
Optional Fields:
  • rpm: Engine RPM (float | None)
  • gear: Gear number 0-8 (int | None)
  • throttle: Throttle position 0-100% (float | None)
  • brake: Brake status (bool | None)
  • drs: DRS status (bool | None)
  • distance: Distance from lap start in meters (float | None)
  • rel_distance: Relative distance 0-1 (float | None)
  • driver_ahead (alias: DriverAhead): Driver ahead code (str | None)
  • distance_to_driver_ahead (alias: DistanceToDriverAhead): Gap in meters (float | None)
  • x, y, z: 3D coordinates (float | None)
  • acc_x, acc_y, acc_z: Acceleration components (float | None)
  • data_key (alias: dataKey): Data source key (str | None)
Special Handling:
  • Supports a nested tel object, which is unwrapped during validation
  • Boolean coercion for brake and drs fields
  • Null-like strings converted to None
Validation:
  • All non-empty lists must have same length
  • Empty lists are allowed for optional fields
Example:
from tif1.validation import TelemetryData

tel_data = TelemetryData(
    time=[0.0, 0.1, 0.2],
    speed=[250.0, 251.0, 252.0],
    rpm=[12000, 12100, 12200],
    gear=[7, 7, 7],
    throttle=[100.0, 100.0, 99.5],
    brake=[False, False, True],
    drs=[True, True, False],
    distance=[0.0, 25.0, 50.0],
    x=[0.0, 1.0, 2.0],
    y=[0.0, 0.0, 0.0],
    z=[0.0, 0.0, 0.0],
)

print(f"Valid telemetry with {len(tel_data.time)} samples")
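
To make the Special Handling notes for TelemetryData more concrete, here is a minimal sketch of unwrapping a nested `tel` object and coercing values to booleans. The helper names and the exact set of accepted truthy/falsy encodings are assumptions; the library's real coercion rules may differ.

```python
# Hypothetical helpers, not part of tif1.
TRUE_STRINGS = {"true", "1", "yes"}    # assumed truthy encodings
FALSE_STRINGS = {"false", "0", "no"}   # assumed falsy encodings


def unwrap_tel(payload):
    """Return the nested "tel" dict if present, otherwise the payload itself."""
    inner = payload.get("tel")
    return inner if isinstance(inner, dict) else payload


def coerce_bool(value):
    """Map common boolean encodings to bool; unrecognized values become None."""
    if value is None or isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        # NaN is the only float that is not equal to itself.
        return None if value != value else bool(value)
    text = str(value).strip().lower()
    if text in TRUE_STRINGS:
        return True
    if text in FALSE_STRINGS:
        return False
    return None


payload = {"tel": {"time": [0.0, 0.1], "brake": [0, "true"]}}
tel = unwrap_tel(payload)
tel["brake"] = [coerce_bool(v) for v in tel["brake"]]
print(tel)
```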

WeatherData

Model for weather information.
Required Field:
  • time (alias: wT): Timestamp in seconds (float | None)
Optional Fields (all with aliases):
  • air_temp (alias: wAT): Air temperature in °C (float | None)
  • track_temp (alias: wTT): Track temperature in °C (float | None)
  • humidity (alias: wH): Relative humidity % (float | None)
  • pressure (alias: wP): Atmospheric pressure in mbar (float | None)
  • rainfall (alias: wR): Rainfall indicator (bool | None)
  • wind_direction (alias: wWD): Wind direction in degrees (float | None)
  • wind_speed (alias: wWS): Wind speed in km/h (float | None)
Special Handling:
  • Accepts both PascalCase (Time, AirTemp) and aliased keys (wT, wAT)
  • PascalCase keys automatically normalized to snake_case
Validation:
  • All non-empty lists must have same length
  • Null-like strings converted to None
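
The PascalCase normalization described under Special Handling can be approximated with a small regular expression. This is a sketch under assumptions: `pascal_to_snake` and `normalize_keys` are hypothetical names, and leaving keys that start with a lowercase letter (such as the `wT` aliases) untouched is a guess at how the two key styles coexist.

```python
import re


def pascal_to_snake(name):
    """Insert "_" before each interior capital letter, then lowercase."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def normalize_keys(payload):
    """Convert PascalCase keys to snake_case; leave other keys as they are."""
    return {
        # Assumption: only keys starting with an uppercase letter are PascalCase.
        (pascal_to_snake(key) if key[:1].isupper() else key): value
        for key, value in payload.items()
    }


print(normalize_keys({"Time": [0.0], "AirTemp": [25.5], "wWS": [5.0]}))
```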

RaceControlData

Model for race control messages.
Required Field:
  • time: Message timestamp in seconds (float | None)
Optional Fields:
  • category (alias: cat): Message category (str | None)
  • message (alias: msg): Message text (str | None)
  • status: Track status code (str | None)
  • flag: Flag type (str | None)
  • scope: Message scope (str | None)
  • sector: Affected sector (int | str | None)
  • racing_number (alias: dNum): Affected driver number (str | None)
  • lap: Lap number (int | None)
Validation:
  • All non-empty lists must have same length
  • Null-like strings converted to None

DriversData

Model for driver information.
Fields:
  • drivers: List of DriverInfo objects

DriverInfo

Model for individual driver information.
Fields:
  • driver: 3-letter code, must match pattern ^[A-Z]{3}$ (str)
  • team: Team name, 1-100 characters (str)
  • dn: Driver number (str)
  • fn: First name (str)
  • ln: Last name (str)
  • tc: Team color hex code (str)
  • url: Headshot photo URL (str)
Validation:
  • Driver code must be exactly 3 uppercase letters
  • Team name must be 1-100 characters
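
The two DriverInfo rules above can be checked by hand with the standard library, which is handy for pre-screening a payload before it reaches the model. The function name `driver_info_errors` is hypothetical, and the sketch assumes both fields are strings.

```python
import re

# Same pattern as the documented ^[A-Z]{3}$; fullmatch anchors both ends.
DRIVER_CODE = re.compile(r"[A-Z]{3}")


def driver_info_errors(info):
    """Return a list of rule violations for one driver dictionary."""
    errors = []
    if not DRIVER_CODE.fullmatch(info.get("driver", "")):
        errors.append("driver must be exactly 3 uppercase letters")
    if not 1 <= len(info.get("team", "")) <= 100:
        errors.append("team must be 1-100 characters")
    return errors


print(driver_info_errors({"driver": "VER", "team": "Red Bull Racing"}))
print(driver_info_errors({"driver": "ver", "team": ""}))
```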

Enums

TireCompound

Valid tire compound values.
class TireCompound(str, Enum):
    SOFT = "SOFT"
    MEDIUM = "MEDIUM"
    HARD = "HARD"
    INTERMEDIATE = "INTERMEDIATE"
    WET = "WET"
    UNKNOWN = "UNKNOWN"
    TEST_UNKNOWN = "TEST-UNKNOWN"
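
Because the enum mixes in `str`, a raw string can be checked against it by calling the enum class. The sketch below repeats the class definition so that it runs on its own. `parse_compound` is a hypothetical helper, and returning None for unknown values is an assumption rather than documented library behavior.

```python
from enum import Enum


class TireCompound(str, Enum):  # repeated from the definition above
    SOFT = "SOFT"
    MEDIUM = "MEDIUM"
    HARD = "HARD"
    INTERMEDIATE = "INTERMEDIATE"
    WET = "WET"
    UNKNOWN = "UNKNOWN"
    TEST_UNKNOWN = "TEST-UNKNOWN"


def parse_compound(value):
    """Return the matching TireCompound member, or None if the value is not valid."""
    try:
        return TireCompound(value)  # lookup by value raises ValueError on a miss
    except ValueError:
        return None


print(parse_compound("SOFT"))
print(parse_compound("ULTRASOFT"))
```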

SessionType

Valid session type values.
class SessionType(str, Enum):
    PRACTICE_1 = "Practice 1"
    PRACTICE_2 = "Practice 2"
    PRACTICE_3 = "Practice 3"
    QUALIFYING = "Qualifying"
    SPRINT = "Sprint"
    SPRINT_QUALIFYING = "Sprint Qualifying"
    SPRINT_SHOOTOUT = "Sprint Shootout"
    RACE = "Race"

LapStatus

Valid lap status values.
class LapStatus(str, Enum):
    VALID = "VALID"
    INVALID = "INVALID"
    OUTLAP = "OUTLAP"
    INLAP = "INLAP"

Anomaly Detection

detect_lap_anomalies

Detect anomalies in lap data: missing laps, duplicate laps, and outlier lap times.
```python
def detect_lap_anomalies(laps: list[dict]) -> list[Anomaly]
```

**Parameters:**
- `laps`: List of lap dictionaries

**Returns:**
- List of detected `Anomaly` objects

**Example:**
```python
from tif1.validation import detect_lap_anomalies

laps = [
    {"lap": 1, "time": 90.0},
    {"lap": 2, "time": 89.5},
    {"lap": 4, "time": 89.2},  # Missing lap 3
    {"lap": 5, "time": 270.0},  # Outlier (about 3x the other lap times)
]

anomalies = detect_lap_anomalies(laps)
for anomaly in anomalies:
    print(f"[{anomaly.severity}] {anomaly.type}: {anomaly.description}")
    print(f"  Details: {anomaly.details}")
```

Anomaly

Model for detected anomalies. Fields:
  • type: Anomaly type (AnomalyType enum)
  • severity: Severity level, one of "low", "medium", or "high" (str)
  • description: Human-readable description (str)
  • details: Additional context dictionary (dict[str, Any])

AnomalyType

Types of anomalies that can be detected.
class AnomalyType(str, Enum):
    MISSING_LAPS = "missing_laps"
    DUPLICATE_LAPS = "duplicate_laps"
    OUTLIER_TIMES = "outlier_times"
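
For intuition about what each anomaly type means, the following standard-library sketch flags the same three categories. It is not how `detect_lap_anomalies` is implemented. The function name, the tuple-based return value, and the outlier rule (a lap slower than `outlier_factor` times the median) are all assumptions chosen for the illustration.

```python
from collections import Counter
from statistics import median


def find_lap_anomalies(laps, outlier_factor=2.0):
    """Return (kind, details) tuples for missing, duplicate, and outlier laps."""
    findings = []
    numbers = [int(lap["lap"]) for lap in laps if lap.get("lap") is not None]
    if numbers:
        # Missing laps: gaps in the range between the first and last lap number.
        expected = set(range(min(numbers), max(numbers) + 1))
        missing = sorted(expected - set(numbers))
        if missing:
            findings.append(("missing_laps", {"laps": missing}))
        # Duplicate laps: lap numbers that occur more than once.
        duplicates = sorted(n for n, count in Counter(numbers).items() if count > 1)
        if duplicates:
            findings.append(("duplicate_laps", {"laps": duplicates}))
    times = [lap["time"] for lap in laps if lap.get("time") is not None]
    if times:
        typical = median(times)
        # Outliers: laps much slower than the median (assumed threshold).
        slow = [
            lap["lap"] for lap in laps
            if lap.get("time") is not None and lap["time"] > outlier_factor * typical
        ]
        if slow:
            findings.append(("outlier_times", {"laps": slow, "median": typical}))
    return findings


laps = [
    {"lap": 1, "time": 90.0},
    {"lap": 2, "time": 89.5},
    {"lap": 4, "time": 89.2},
    {"lap": 5, "time": 270.0},
]
for kind, details in find_lap_anomalies(laps):
    print(kind, details)
```

The median is used instead of the mean so that a single very slow lap does not inflate the baseline it is compared against.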

Configuration

Disable Validation

For production environments where data quality is trusted:
```python
import tif1

config = tif1.get_config()
config.set("validate_data", False)

# Validation is now skipped (~10-15% faster)
session = tif1.get_session(2025, "Monaco", "Race")
```

---

### Strict Validation

Enable strict validation for development:

```python
from tif1.validation import validate_lap_data

# Strict mode raises on warnings
validated = validate_lap_data(raw_data, strict=True)
```

---

## Configuration

Validation in tif1 is controlled through the `validate_data` configuration option, which is disabled by default for optimal performance.

### Enable Validation

```python
import tif1

config = tif1.get_config()
config.set("validate_data", True)

# Now validation will be applied to drivers, weather, and race control data
session = tif1.get_session(2021, "Belgian Grand Prix", "Race")
```

Validation Behavior

  • validate_data=True: Validates drivers.json, weather.json, and rcm.json payloads
  • validate_data=False (default): Skips validation for maximum performance
  • Validation is automatically disabled in ultra-cold start mode regardless of config

Strict Mode

Individual validation functions support strict mode:
from tif1.validation import validate_lap_data

# Strict mode raises on validation errors
validated = validate_lap_data(raw_data, strict=True)

# Non-strict mode returns original data on validation failure
validated = validate_lap_data(raw_data, strict=False)
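
The difference between the two modes can be pictured with a small wrapper. This sketch does not use tif1 at all: `run_validator` and `require_time` are hypothetical stand-ins, and `ValueError` substitutes for the library's own exception type.

```python
# Hypothetical stand-ins, not part of tif1.
def run_validator(validator, data, strict=True):
    """Apply validator(data); on failure, re-raise (strict) or return data unchanged."""
    try:
        return validator(data)
    except ValueError:
        if strict:
            raise
        return data


def require_time(data):
    """Toy validator: the payload must contain a "time" field."""
    if "time" not in data:
        raise ValueError("missing required field: time")
    return data


bad = {"lap": [1, 2, 3]}
print(run_validator(require_time, bad, strict=False))  # original data comes back
try:
    run_validator(require_time, bad, strict=True)
except ValueError as exc:
    print(f"strict mode raised: {exc}")
```

Non-strict mode trades safety for availability: the caller always gets data back, but has no guarantee that it passed validation.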

Complete Examples

Custom Validation

from tif1.exceptions import InvalidDataError
from tif1.validation import LapData, validate_laps

def clean_lap_data(data: dict) -> dict:
    """Drop laps whose time is missing, non-numeric, or not positive."""
    times = data.get("time", [])
    valid_indices = [
        i for i, time in enumerate(times)
        if isinstance(time, (int, float)) and time > 0
    ]

    cleaned = {}
    for key, values in data.items():
        # Only filter per-lap lists; empty optional lists and scalars pass through,
        # which avoids an IndexError on lists shorter than "time".
        if isinstance(values, list) and len(values) == len(times):
            cleaned[key] = [values[i] for i in valid_indices]
        else:
            cleaned[key] = values

    return cleaned

def validate_and_clean_laps(raw_data: dict) -> LapData:
    """Validate laps; on failure, clean the data and retry once."""
    try:
        return validate_laps(raw_data)
    except InvalidDataError as e:
        print(f"Validation error: {e}")

        # Retry on cleaned data; raises InvalidDataError again if cleaning did not help
        return validate_laps(clean_lap_data(raw_data))

# Usage (load_raw_lap_data is a placeholder for your own loader)
raw_data = load_raw_lap_data()
validated = validate_and_clean_laps(raw_data)

Anomaly Detection Workflow

from tif1.validation import detect_lap_anomalies, AnomalyType
import tif1

def analyze_session_quality(year, gp, session_name):
    """Analyze data quality for a session."""
    session = tif1.get_session(year, gp, session_name)

    # Get all laps
    laps = session.laps

    # Convert to list of dicts
    lap_dicts = laps.to_dict('records')

    # Detect anomalies
    anomalies = detect_lap_anomalies(lap_dicts)

    # Categorize anomalies
    missing = [a for a in anomalies if a.type == AnomalyType.MISSING_LAPS]
    duplicates = [a for a in anomalies if a.type == AnomalyType.DUPLICATE_LAPS]
    outliers = [a for a in anomalies if a.type == AnomalyType.OUTLIER_TIMES]

    print("Data Quality Report:")
    print(f"  Total laps: {len(lap_dicts)}")
    print(f"  Missing laps: {len(missing)}")
    print(f"  Duplicate laps: {len(duplicates)}")
    print(f"  Outlier times: {len(outliers)}")

    # Show details
    for anomaly in anomalies:
        print(f"  [{anomaly.severity}] {anomaly.description}")
        if anomaly.details:
            print(f"    {anomaly.details}")

    return anomalies

# Usage with 2021 Belgian Grand Prix Race
anomalies = analyze_session_quality(2021, "Belgian Grand Prix", "Race")

Validation with Logging

import logging

from tif1.exceptions import InvalidDataError
from tif1.validation import validate_laps, validate_telemetry

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Map each supported data type to its validation function
VALIDATORS = {"laps": validate_laps, "telemetry": validate_telemetry}

def validate_with_logging(data: dict, data_type: str):
    """Validate data with detailed logging."""
    # Reject unknown types up front so they are not logged as validation failures
    if data_type not in VALIDATORS:
        raise ValueError(f"Unknown data type: {data_type}")

    logger.info("Validating %s data...", data_type)

    try:
        validated = VALIDATORS[data_type](data)
    except InvalidDataError as e:
        logger.error("Validation failed: %s", e)
        raise

    logger.info("Validation successful")
    return validated

# Usage (raw_data is a raw JSON dictionary from the CDN)
validated = validate_with_logging(raw_data, "laps")

Best Practices

  1. Use strict mode during development: Catches data issues early.
     validated = validate_lap_data(data, strict=True)
  2. Handle validation errors gracefully: Don't crash on bad data.
     import logging

     from tif1.exceptions import InvalidDataError

     logger = logging.getLogger(__name__)

     try:
         validated = validate_laps(data)
     except InvalidDataError as e:
         # Log and use a fallback
         logger.warning("Validation failed: %s", e)
         validated = None
  3. Run anomaly detection periodically: Monitor data quality over time.
  4. Clean data before validation: Remove obvious errors first using normalization functions.
  5. Leverage null-like string conversion: The validation module automatically converts "", "none", "null", "nan" to None.

Troubleshooting

Validation Errors

from tif1.exceptions import InvalidDataError
from tif1.validation import validate_laps

try:
    validated = validate_laps(data)
except InvalidDataError as e:
    print(f"Error: {e}")

    # Check specific fields by inspecting the error message
    message = str(e).lower()
    if "stint" in message:
        print("Issue with stint numbers")
    elif "life" in message:
        print("Issue with tire life values")

Inconsistent Lengths

# Check array lengths before validation
lengths = {key: len(val) for key, val in data.items() if isinstance(val, list)}
print(f"Array lengths: {lengths}")

# All non-empty arrays should be equal
non_empty_lengths = [n for n in lengths.values() if n > 0]
if len(set(non_empty_lengths)) > 1:
    print("Inconsistent array lengths detected")
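
When the check above reports a mismatch, it helps to know which keys are the odd ones out. The following sketch compares each list against the most common length. The helper name `odd_length_keys` is hypothetical, and treating the most common length as the expected one is an assumption.

```python
from collections import Counter


def odd_length_keys(data):
    """Return {key: length} for non-empty lists whose length differs from the majority."""
    lengths = {k: len(v) for k, v in data.items() if isinstance(v, list) and v}
    if not lengths:
        return {}
    # Assumption: the most common length is the intended one.
    expected, _ = Counter(lengths.values()).most_common(1)[0]
    return {k: n for k, n in lengths.items() if n != expected}


sample = {"time": [90.1, 89.5, 88.9], "lap": [1, 2, 3], "s1": [30.0, 29.8], "pb": []}
print(odd_length_keys(sample))
```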

Performance Issues

# Validation is already optimized and disabled by default.
# For manual validation, non-strict mode returns the original data on failure
# instead of raising.
from tif1.validation import validate_lap_data

validated = validate_lap_data(data, strict=False)

# Or skip validation entirely by not calling the validation functions;
# the library itself only validates when validate_data is enabled.
Last modified on March 5, 2026