Dataset

OceanTACODataset

class ocean_taco.dataset.dataset.OceanTACODataset(taco_path, queries, input_variables, target_variables, target_resolution=None, temporal_agg='mean', default_patch_size=(128, 128), patch_sizes=None)

Bases: Dataset

Query-based PyTorch Dataset for OceanTACO data.

Files are pre-indexed via SQL at initialization, which makes the dataset safe to use with a DataLoader that has num_workers > 0.
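The pre-indexing idea can be illustrated with a minimal sketch. Everything here is hypothetical: the class `PreIndexedDataset`, the `files` table, and its columns are invented for illustration and are not taken from the OceanTACO source. The point is only the pattern: run the SQL query once in `__init__`, keep the result as plain Python data, and hold no database handle on the instance.

```python
import sqlite3


class PreIndexedDataset:
    """Hypothetical stand-in; not the OceanTACODataset implementation."""

    def __init__(self, rows, variable):
        # Build a throwaway in-memory catalogue and query it once, up front.
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE files (path TEXT, variable TEXT)")
        conn.executemany("INSERT INTO files VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT path FROM files WHERE variable = ? ORDER BY path",
            (variable,),
        )
        # Keep only plain Python data; the connection is closed, never stored.
        self.index = [path for (path,) in cursor]
        conn.close()

    def __len__(self):
        return len(self.index)

    def __getitem__(self, i):
        # A real dataset would open and read the file here, inside the worker.
        return self.index[i]


dataset = PreIndexedDataset(
    [("b.nc", "sst"), ("a.nc", "sst"), ("c.nc", "ssh")], variable="sst"
)
```

Because the instance carries only a list of strings, each worker process can receive its own copy without sharing an open connection, which is one plausible reading of why pre-indexing helps with num_workers > 0.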

Parameters:
  • taco_path (str)

  • queries (list[Query])

  • input_variables (list[str])

  • target_variables (list[str])

  • target_resolution (float | None)

  • temporal_agg (Literal['first', 'last', 'mean', 'stack'])

  • default_patch_size (tuple[int, int])

  • patch_sizes (dict[str, tuple[int, int]] | None)
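The four `temporal_agg` options are not described further on this page. Going only by their names, they might behave roughly like the sketch below, where plain nested lists stand in for tensors. The function `aggregate_time` is hypothetical and the semantics are an assumption, not the library's documented behavior.

```python
def aggregate_time(frames, mode="mean"):
    """Reduce a list of equally shaped 2D grids (one per time step)."""
    if mode == "first":
        return frames[0]
    if mode == "last":
        return frames[-1]
    if mode == "stack":
        # Keep every time step; the list index acts as the time axis.
        return list(frames)
    if mode == "mean":
        n = len(frames)
        # Pair up matching rows across time, then matching cells within rows.
        return [
            [sum(values) / n for values in zip(*rows)]
            for rows in zip(*frames)
        ]
    raise ValueError(f"unknown temporal_agg mode: {mode!r}")


frames = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[3.0, 4.0], [5.0, 6.0]],
]
mean_grid = aggregate_time(frames, "mean")
stacked = aggregate_time(frames, "stack")
```

Under this reading, 'first', 'last', and 'mean' return a single grid, while 'stack' keeps an extra leading time dimension.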

visualize_sample(sample, figsize=None, save_path=None, title='', max_cols=3)

Visualize all variables in a sample.

Parameters:
  • sample (dict) – Output from __getitem__ or _execute_query

  • figsize (tuple[int, int] | None) – Figure size (width, height)

  • save_path (str | Path | None) – Path to save figure (None = display)

  • title (str) – Optional title prefix

  • max_cols (int) – Maximum columns in subplot grid
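As an illustration of how `max_cols` could bound the subplot grid, the sketch below computes a grid shape for a given number of panels. It assumes one panel per variable and a row-major layout; the helper `grid_shape` is hypothetical and the method's actual layout logic may differ.

```python
import math


def grid_shape(n_panels, max_cols=3):
    """Return (rows, cols) for a grid with at most `max_cols` columns."""
    if n_panels < 1:
        raise ValueError("need at least one panel")
    cols = min(n_panels, max_cols)
    rows = math.ceil(n_panels / cols)
    return rows, cols


shape_for_five = grid_shape(5)  # five variables, default max_cols
shape_for_two = grid_shape(2)   # fewer variables than max_cols
```

With the default `max_cols=3`, five panels would occupy two rows, and two panels would fit in a single row of two columns.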

collate_ocean_samples

ocean_taco.dataset.dataset.collate_ocean_samples(batch)

Collate function for DataLoader.

Handles None values and variable-size tensors by padding.

Return type:
  dict

Parameters:
  • batch (list[dict])
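The padding behavior can be pictured with a small sketch that uses nested lists in place of tensors. The helpers `pad_patch` and `collate_padded` are hypothetical, and several details are assumptions rather than documented behavior: the fill value of 0.0, padding on the bottom and right edges, and replacing a None entry with an all-fill patch.

```python
def pad_patch(patch, height, width, fill=0.0):
    """Pad a 2D list to (height, width); a None patch becomes all fill."""
    rows = patch if patch is not None else []
    padded = [list(row) + [fill] * (width - len(row)) for row in rows]
    padded += [[fill] * width for _ in range(height - len(rows))]
    return padded


def collate_padded(batch, fill=0.0):
    """Group patches by key and pad each group to a common size."""
    keys = sorted({key for sample in batch for key in sample})
    out = {}
    for key in keys:
        patches = [sample.get(key) for sample in batch]
        present = [p for p in patches if p is not None]
        # Target size is the largest height and width seen for this key.
        height = max((len(p) for p in present), default=0)
        width = max((len(p[0]) for p in present if p), default=0)
        out[key] = [pad_patch(p, height, width, fill) for p in patches]
    return out


batch = [
    {"sst": [[1.0, 2.0]]},       # 1 x 2 patch
    {"sst": [[3.0], [4.0]]},     # 2 x 1 patch
    {"sst": None},               # missing value
]
collated = collate_padded(batch)
```

In practice the documented function would be passed to a PyTorch DataLoader through its `collate_fn` argument, so that samples of different sizes can be batched together.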