The core_utils package provides internal utilities used throughout tif1. While these are primarily internal APIs, they can be useful for advanced use cases.

## Lib Conversion

Functions for converting DataFrames between pandas and polars backends.

### `pandas_to_polars`

```python
def pandas_to_polars(
    df: pd.DataFrame,
    *,
    rechunk: bool = False
) -> pl.DataFrame
```

Convert a pandas DataFrame to a polars DataFrame.

**Parameters:**
- `df`: pandas DataFrame to convert
- `rechunk`: If `True`, rechunk the DataFrame for optimal memory layout

**Returns:**
- polars DataFrame

**Example:**
```python
from tif1.core_utils.backend_conversion import pandas_to_polars
import pandas as pd

df_pandas = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
df_polars = pandas_to_polars(df_pandas)
```

---

### `polars_to_pandas`

```python
def polars_to_pandas(
    df: pl.DataFrame,
    *,
    use_pyarrow: bool = True
) -> pd.DataFrame
```

Convert a polars DataFrame to a pandas DataFrame.

**Parameters:**
- `df`: polars DataFrame to convert
- `use_pyarrow`: If `True`, use PyArrow for zero-copy conversion

**Returns:**
- pandas DataFrame

**Example:**
```python
from tif1.core_utils.backend_conversion import polars_to_pandas
import polars as pl

df_polars = pl.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
df_pandas = polars_to_pandas(df_polars)
```

---

### `convert_backend`

```python
def convert_backend(
    df: DataFrame,
    target_backend: str
) -> DataFrame
```

Convert a DataFrame to the target lib (pandas or polars).

**Parameters:**
- `df`: DataFrame to convert (pandas or polars)
- `target_backend`: Target lib ("pandas" or "polars")

**Returns:**
- DataFrame in the target lib

**Example:**
```python
from tif1.core_utils.backend_conversion import convert_backend
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3]})

# Convert to polars
df_polars = convert_backend(df, "polars")

# Convert back to pandas
df_pandas = convert_backend(df_polars, "pandas")
```

---

## JSON Utilities

High-performance JSON parsing using orjson.

### `json_loads`

```python
def json_loads(payload: str | bytes | bytearray | memoryview) -> Any
```

Parse a JSON string or bytes into a Python object using orjson.

**Parameters:**
- `payload`: JSON string or bytes to parse

**Returns:**
- Parsed Python object (dict, list, etc.)

**Example:**
```python
from tif1.core_utils.json_utils import json_loads

json_str = '{"driver": "VER", "team": "Red Bull Racing"}'
data = json_loads(json_str)
print(data["driver"])  # "VER"
```

<Note>
  `json_loads` uses orjson for 2-3x faster parsing than stdlib json. It automatically handles both strings and bytes.
</Note>
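
A wrapper of this kind might look like the sketch below. The name `json_loads_sketch` is hypothetical, and the fallback to the standard library is an assumption added so the example runs without orjson; the page does not say whether tif1 has such a fallback.

```python
import json
from typing import Any

try:
    import orjson  # optional fast path
except ImportError:  # assumption: fall back to the standard library
    orjson = None


def json_loads_sketch(payload) -> Any:
    """Hypothetical wrapper; the tif1 implementation may differ."""
    if orjson is not None:
        # orjson.loads accepts str, bytes, bytearray, and memoryview.
        return orjson.loads(payload)
    if isinstance(payload, memoryview):
        # The stdlib parser does not accept memoryview, so copy to bytes.
        payload = payload.tobytes()
    return json.loads(payload)


raw = b'{"driver": "VER", "team": "Red Bull Racing"}'
from_bytes = json_loads_sketch(raw)
from_view = json_loads_sketch(memoryview(raw))
```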

---

### `json_dumps`

```python
def json_dumps(data: Any) -> str
```

Serialize a Python object to a JSON string using orjson.

**Parameters:**
- `data`: Python object to serialize

**Returns:**
- JSON string

**Example:**
```python
from tif1.core_utils.json_utils import json_dumps

data = {"driver": "VER", "team": "Red Bull Racing"}
json_str = json_dumps(data)
```
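
Because `orjson.dumps` returns `bytes`, a wrapper that returns `str` has to decode the result. The sketch below shows one way to do that; `json_dumps_sketch` is a hypothetical name, and the stdlib fallback is an assumption added so the example runs without orjson.

```python
import json
from typing import Any

try:
    import orjson  # optional fast path
except ImportError:  # assumption: fall back to the standard library
    orjson = None


def json_dumps_sketch(data: Any) -> str:
    """Hypothetical wrapper; the tif1 implementation may differ."""
    if orjson is not None:
        # orjson.dumps returns UTF-8 bytes, so decode to get a str.
        return orjson.dumps(data).decode("utf-8")
    # Compact separators mirror orjson's whitespace-free output.
    return json.dumps(data, separators=(",", ":"), ensure_ascii=False)


text = json_dumps_sketch({"driver": "VER", "team": "Red Bull Racing"})
```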

---

## Validation Helpers

Internal validation functions for data integrity.

### `_validate_year`

```python
def _validate_year(year: int, min_year: int, max_year: int) -> None
```

Validate that a year is within the supported range.

**Parameters:**
- `year`: Year to validate
- `min_year`: Minimum supported year
- `max_year`: Maximum supported year

**Raises:**
- `ValueError`: If year is out of range
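
A range check with this signature could be written as in the sketch below. The function name, the inclusive bounds, the error message, and the years used in the call are all assumptions for illustration; the page does not state the real supported range.

```python
def validate_year_sketch(year: int, min_year: int, max_year: int) -> None:
    """Hypothetical range check; bounds are treated as inclusive (an assumption)."""
    if not min_year <= year <= max_year:
        raise ValueError(
            f"year {year} is outside the supported range {min_year}-{max_year}"
        )


# Illustrative bounds only, not the library's actual limits.
in_range = validate_year_sketch(2025, 2018, 2025)

try:
    validate_year_sketch(1990, 2018, 2025)
except ValueError as exc:
    message = str(exc)
```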

---

### `_validate_drivers_list`

```python
def _validate_drivers_list(drivers: list[str] | None) -> None
```

Validate that a drivers list contains valid 3-letter codes.

**Parameters:**
- `drivers`: List of driver codes to validate

**Raises:**
- `ValueError`: If any driver code is invalid
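
One plausible reading of "valid 3-letter codes" is exactly three uppercase ASCII letters, with `None` meaning "no filter". The sketch below encodes that reading; the function name, the pattern, and the treatment of `None` are assumptions, and the real helper may be stricter or may normalize case.

```python
import re

# Assumption: a valid code is exactly three uppercase ASCII letters.
_CODE_PATTERN = re.compile(r"[A-Z]{3}")


def validate_drivers_list_sketch(drivers) -> None:
    """Hypothetical validator; not the tif1 implementation."""
    if drivers is None:
        return  # assumption: None means no driver filter was given
    bad = [
        code
        for code in drivers
        if not isinstance(code, str) or not _CODE_PATTERN.fullmatch(code)
    ]
    if bad:
        raise ValueError(f"Invalid driver codes: {bad}")


accepted = validate_drivers_list_sketch(["VER", "HAM"])

try:
    validate_drivers_list_sketch(["VER", "hamilton"])
    rejected = False
except ValueError:
    rejected = True
```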

---

### `_validate_lap_number`

```python
def _validate_lap_number(lap_number: int) -> None
```

Validate that a lap number is positive.

**Parameters:**
- `lap_number`: Lap number to validate

**Raises:**
- `ValueError`: If lap number is not positive

---

### `_validate_string_param`

```python
def _validate_string_param(param: str, param_name: str) -> None
```

Validate that a string parameter is non-empty.

**Parameters:**
- `param`: String parameter to validate
- `param_name`: Parameter name for error messages

**Raises:**
- `ValueError`: If parameter is empty
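
The role of `param_name` is easiest to see in the error message. In the sketch below, the function name, the message wording, and the choice to treat whitespace-only strings as empty are assumptions.

```python
def validate_string_param_sketch(param: str, param_name: str) -> None:
    """Hypothetical validator; not the tif1 implementation."""
    # Assumption: whitespace-only strings count as empty.
    if not isinstance(param, str) or not param.strip():
        raise ValueError(f"{param_name} must be a non-empty string")


valid_result = validate_string_param_sketch("Monaco", "gp")

try:
    validate_string_param_sketch("", "gp")
except ValueError as exc:
    error_text = str(exc)
```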

---

### `_encode_url_component`

```python
def _encode_url_component(component: str) -> str
```

URL-encode a component for use in CDN URLs.

**Parameters:**
- `component`: String to encode

**Returns:**
- URL-encoded string
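
The example for this helper shows `"Monaco Grand Prix"` becoming `"Monaco_Grand_Prix"`, which differs from plain percent-encoding (that would yield `%20` for each space). One way to reproduce that output is sketched below, assuming spaces are replaced with underscores before the remaining characters are percent-encoded. The function name and that ordering are guesses, not the library's documented behavior.

```python
from urllib.parse import quote


def encode_url_component_sketch(component: str) -> str:
    """Hypothetical encoder; not the tif1 implementation."""
    # Assumption: spaces become underscores, then everything else
    # outside the unreserved set is percent-encoded as UTF-8.
    return quote(component.replace(" ", "_"), safe="")


encoded = encode_url_component_sketch("Monaco Grand Prix")
accented = encode_url_component_sketch("São Paulo")
```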

**Example:**
```python
from tif1.core_utils.helpers import _encode_url_component

encoded = _encode_url_component("Monaco Grand Prix")
# "Monaco_Grand_Prix"
```

---

## Constants

Column name mappings and constants used throughout the library.

### Column rename maps

The constants module defines mappings for renaming columns from CDN format to user-facing format:

```python
# Lap data column renames
LAP_COLUMN_RENAMES = {
    "lap_number": "LapNumber",
    "lap_time": "LapTime",
    "sector_1_time": "Sector1Time",
    # ... more mappings
}

# Telemetry column renames
TELEMETRY_COLUMN_RENAMES = {
    "time": "Time",
    "speed": "Speed",
    "rpm": "RPM",
    # ... more mappings
}
```
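
To illustrate how such a map is applied, the sketch below renames the keys of a single raw record using only the three lap mappings shown above. `RENAMES_SKETCH` and `rename_keys` are hypothetical names, and the sample record is made up. With a DataFrame, the same dictionary could be passed to the `rename` method of pandas (`columns=`) or polars.

```python
# Only the entries shown on this page; the real map has more.
RENAMES_SKETCH = {
    "lap_number": "LapNumber",
    "lap_time": "LapTime",
    "sector_1_time": "Sector1Time",
}


def rename_keys(record: dict, renames: dict) -> dict:
    """Return a copy of record with keys translated through renames."""
    # Keys without a mapping pass through unchanged.
    return {renames.get(key, key): value for key, value in record.items()}


raw_lap = {"lap_number": 1, "lap_time": 90.5, "driver": "VER"}
renamed = rename_keys(raw_lap, RENAMES_SKETCH)
```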

### Standard column names

```python
# Standard lap columns
LAP_COLUMNS = [
    "LapNumber",
    "LapTime",
    "Sector1Time",
    "Sector2Time",
    "Sector3Time",
    "Compound",
    "TyreLife",
    "Stint",
    "Driver",
    "Team",
    # ... more columns
]

# Standard telemetry columns
TELEMETRY_COLUMNS = [
    "Time",
    "Distance",
    "Speed",
    "RPM",
    "nGear",
    "Throttle",
    "Brake",
    "DRS",
    # ... more columns
]
```
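
A list of standard names makes it straightforward to check a frame's columns before analysis. The sketch below is one possible check; `missing_columns` and the shortened `EXPECTED_SKETCH` list are hypothetical, and in practice the expected list would come from the constants module.

```python
# Shortened stand-in for a standard column list.
EXPECTED_SKETCH = ["LapNumber", "LapTime", "Driver"]


def missing_columns(columns, expected):
    """Return expected names absent from columns, in expected order."""
    present = set(columns)
    return [name for name in expected if name not in present]


# With a DataFrame, pass df.columns as the first argument.
missing = missing_columns(["LapNumber", "Driver", "Team"], EXPECTED_SKETCH)
```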

---

## Resource Manager

Context manager for automatic resource cleanup.

### `ResourceManager`

```python
class ResourceManager:
    def __enter__(self) -> ResourceManager:
        ...

    def __exit__(self, exc_type, exc_val, exc_tb):
        ...
```

Manages resources like HTTP sessions, cache connections, and file handles with automatic cleanup.

**Example:**
```python
from tif1.core_utils.resource_manager import ResourceManager

with ResourceManager() as manager:
    # Resources are automatically cleaned up on exit
    pass
```

<Note>
  ResourceManager is used internally by the library. Most users don't need to interact with it directly.
</Note>
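
For readers curious how a cleanup-on-exit manager can be built, the sketch below uses `contextlib.ExitStack` from the standard library. `ResourceManagerSketch` and its `register` method are hypothetical; the page does not document how the real `ResourceManager` tracks its resources.

```python
from contextlib import ExitStack


class ResourceManagerSketch:
    """Hypothetical illustration; not the tif1 ResourceManager."""

    def __init__(self):
        self._stack = ExitStack()

    def register(self, cleanup):
        # Hypothetical method: queue a zero-argument cleanup callback.
        self._stack.callback(cleanup)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Runs callbacks in reverse registration order, even after an error.
        self._stack.close()
        return False  # do not suppress exceptions


closed = []
with ResourceManagerSketch() as manager:
    manager.register(lambda: closed.append("session"))
    manager.register(lambda: closed.append("cache"))
```

Cleanup runs last-in, first-out, so resources registered later are released before the ones they may depend on.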

---

## Performance Considerations

### JSON Parsing

The library uses orjson for JSON parsing, which provides:
- 2-3x faster parsing than stdlib json
- Lower memory usage
- Native support for bytes input
- Serialization of numpy types (in orjson itself this requires the `OPT_SERIALIZE_NUMPY` option)

### Lib Conversion

When converting between backends:
- pandas → polars: Uses PyArrow for zero-copy when possible
- polars → pandas: Uses PyArrow by default for efficiency
- Rechunking: Optional for polars to optimize memory layout

**Benchmark results:**
```python
# pandas → polars conversion
# 1M rows: ~50ms (zero-copy via PyArrow)

# polars → pandas conversion
# 1M rows: ~100ms (with PyArrow)
```
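
Timings like these depend on hardware, column dtypes, and library versions, so it is worth measuring on your own data. The helper below is a minimal, hypothetical timing sketch built on `time.perf_counter`; it is not part of tif1, and the summing workload is a placeholder for a real conversion call.

```python
import time


def best_time_ms(func, repeats: int = 5) -> float:
    """Run func several times and return the fastest run in milliseconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append((time.perf_counter() - start) * 1000.0)
    # The minimum is less sensitive to background noise than the mean.
    return min(timings)


# Placeholder workload; with tif1 installed you could instead time
# something like: best_time_ms(lambda: pandas_to_polars(df))
elapsed = best_time_ms(lambda: sum(range(100_000)))
```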

---

## Advanced Usage

### Custom lib conversion

```python
from tif1.core_utils.backend_conversion import convert_backend
import polars as pl  # needed for pl.col below
import tif1

# Load with pandas
session = tif1.get_session(2025, "Monaco", "Race", lib="pandas")
laps_pandas = session.laps

# Convert to polars for analysis
laps_polars = convert_backend(laps_pandas, "polars")

# Perform polars operations
fast_laps = laps_polars.filter(pl.col("LapTime") < 80.0)

# Convert back to pandas if needed
fast_laps_pandas = convert_backend(fast_laps, "pandas")
```

### Custom JSON Processing

```python
from tif1.core_utils.json_utils import json_loads, json_dumps

# Parse JSON from CDN
json_data = '{"laps": [{"lap": 1, "time": 90.5}]}'
data = json_loads(json_data)

# Modify data
data["laps"][0]["time"] = 89.5

# Serialize back
modified_json = json_dumps(data)
```

---

## Best Practices

1. **Use orjson for JSON**: Always use `json_loads`/`json_dumps` for performance
2. **Prefer PyArrow conversion**: Keep `use_pyarrow=True` for lib conversion
3. **Validate early**: Use validation helpers to catch errors early
4. **Let the library handle resources**: ResourceManager is automatic
5. **Use constants for column names**: Reference standard column names from constants

---

## Summary

The core_utils package provides:
- High-performance JSON parsing with orjson
- Efficient lib conversion (pandas ↔ polars)
- Data validation utilities
- Column name standardization
- Resource management
- Internal helpers for DataFrame operations

These utilities enable the library's focus on performance and reliability.