Plugin Overview
ADL (Automated Data Loader) helps you pull weather/observational data from many sources, normalize it, and store it in one database for dashboards, APIs, and downstream systems.
Under the hood it uses:
Django for models, admin UI (via Wagtail), and orchestration
TimescaleDB/PostgreSQL for time-series storage
Redis + Celery for scheduled/background jobs
Docker Compose for a reproducible local stack
Why Plugins?
Each data provider (TAHMO, PulsoWeb, ADCON, Campbell, National AWS networks, CSV/FTP endpoints, etc.) exposes data differently. A plugin is a small, provider-specific adapter that knows:
How to talk to that provider (API auth, pagination, retries), and
How to translate the provider’s fields/units into ADL’s common format.
ADL then handles date windows, timezones, unit conversion, upserts (safe re-ingest), logging, and scheduling.
Core Concepts (Mental Model)
Network: A group of stations (e.g., “National AWS Network”).
Station: One observing site with metadata (location, WIGOS ID, heights…).
Network Connection: A configured “pipe” to a data source. You select the plugin type here.
Station Link: Binds a Station to a Connection and stores per-station settings (provider station code, timezone, variable mappings).
Parameter & Unit: ADL’s canonical variables and their units (e.g., air_temperature in °C), which enable consistent unit conversion.
ObservationRecord: A single saved datapoint: (time, station, connection, parameter, value).
What Does a Plugin Actually Do?
One thing: fetch station records for a time window and return them as dictionaries.
ADL calls:
get_station_data(station_link, start_date, end_date) -> Iterable[Dict[str, Any]]
You return rows like:
{
"observation_time": datetime(...), # timestamp (aware or naive; see timezone rules)
"temp_K": 293.15, # any number of source fields
"rh": 75.0,
}
ADL will:
Match source fields to Station Link variable mappings (e.g., temp_K → air_temperature).
Convert units when needed (e.g., K → °C).
Upsert into the database (safe to re-fetch overlaps).
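A minimal sketch of that contract, written as a standalone function rather than a method on a real plugin class. The in-memory FAKE_PROVIDER_READINGS list and its "ts" field are hypothetical stand-ins for a provider API, and treating the window as half-open (start inclusive, end exclusive) is an assumption, not documented ADL behaviour.

```python
from datetime import datetime
from typing import Any, Dict, Iterator

# Hypothetical stand-in for what a provider API might return.
FAKE_PROVIDER_READINGS = [
    {"ts": "2024-01-01T09:30:00", "temp_K": 292.65, "rh": 80.0},
    {"ts": "2024-01-01T10:15:00", "temp_K": 293.15, "rh": 75.0},
    {"ts": "2024-01-01T10:45:00", "temp_K": 293.65, "rh": None},
    {"ts": "2024-01-01T11:20:00", "temp_K": 294.15, "rh": 70.0},
]


def get_station_data(
    station_link: Any, start_date: datetime, end_date: datetime
) -> Iterator[Dict[str, Any]]:
    """Yield one dict per reading that falls inside the requested window."""
    for reading in FAKE_PROVIDER_READINGS:
        observed = datetime.fromisoformat(reading["ts"])  # naive timestamp
        # Assumption: start is inclusive, end is exclusive.
        if start_date <= observed < end_date:
            yield {
                "observation_time": observed,
                "temp_K": reading["temp_K"],  # source field names are kept as-is
                "rh": reading["rh"],
            }


# station_link is unused in this sketch, so None is passed.
rows = list(
    get_station_data(None, datetime(2024, 1, 1, 10), datetime(2024, 1, 1, 11))
)
print(len(rows))
```

The function only shapes rows; mapping, unit conversion, and saving stay on the ADL side.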
The Ingestion Flow
Scheduling
Celery triggers a Network Connection on its interval.
Station Loop
For each enabled Station Link, ADL computes a station-local time window:
Prefer latest saved timestamp (resume from last ingest),
Else Station’s first collection date (if provided),
Else a default 1-hour window ending at the next hour.
Window is aligned to hour boundaries (e.g., 10:00–11:00 local time).
Plugin Fetch
ADL calls your get_station_data(...) with that window.
Normalize & Save
ADL normalizes timestamps, applies mappings and unit conversions, then upserts ObservationRecords.
Dispatch (Optional)
Separate “dispatch channels” (e.g., WIS 2.0 uploader) can push stored data onward.
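The window rules in the Station Loop step can be sketched as follows. compute_window and floor_to_hour are hypothetical helpers, not ADL functions. Flooring the resume point to the hour and ending every window at the next full hour are assumptions drawn from the description above, and a fixed UTC+3 offset stands in for a real station timezone.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

# Assumption: a fixed UTC+3 offset stands in for the station's timezone.
STATION_TZ = timezone(timedelta(hours=3))


def floor_to_hour(moment: datetime) -> datetime:
    """Drop minutes, seconds, and microseconds."""
    return moment.replace(minute=0, second=0, microsecond=0)


def compute_window(
    now_local: datetime,
    latest_saved: Optional[datetime] = None,
    first_collection: Optional[datetime] = None,
) -> Tuple[datetime, datetime]:
    """Pick an hour-aligned (start, end) window in station-local time."""
    end = floor_to_hour(now_local) + timedelta(hours=1)  # the next full hour
    if latest_saved is not None:
        start = floor_to_hour(latest_saved)  # resume from the last ingest
    elif first_collection is not None:
        start = floor_to_hour(first_collection)  # backfill from the first date
    else:
        start = end - timedelta(hours=1)  # default one-hour window
    return start, end


now = datetime(2024, 1, 1, 10, 20, tzinfo=STATION_TZ)
default_window = compute_window(now)
resume_window = compute_window(
    now, latest_saved=datetime(2024, 1, 1, 7, 45, tzinfo=STATION_TZ)
)
print(default_window[0].isoformat(), default_window[1].isoformat())
```

With no saved data and no first collection date, a call at 10:20 local time gives the 10:00 to 11:00 window used in the example above.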
Timezones & Timestamps (Rules)
ADL prefers aware datetimes (with tzinfo).
If your record timestamp is:
Aware: ADL converts it correctly to the station’s timezone.
Naive: ADL assumes it’s station-local.
The fetch window you receive is already station-local and hour-aligned.
Storage is consistent so comparisons and queries are unambiguous.
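The aware/naive rule can be illustrated with the standard library alone. normalize_observation_time is a hypothetical helper written for this sketch, not ADL's actual implementation, and the fixed UTC+3 offset is an assumed station timezone.

```python
from datetime import datetime, timedelta, timezone, tzinfo

# Assumption: a fixed UTC+3 offset stands in for the Station Link timezone.
STATION_TZ = timezone(timedelta(hours=3))


def normalize_observation_time(value: datetime, station_tz: tzinfo) -> datetime:
    """Return an aware datetime expressed in the station's timezone."""
    if value.tzinfo is None or value.utcoffset() is None:
        # Naive: treat the wall-clock value as station-local.
        return value.replace(tzinfo=station_tz)
    # Aware: same instant, converted to the station's timezone.
    return value.astimezone(station_tz)


naive_local = datetime(2024, 1, 1, 10, 0)
aware_utc = datetime(2024, 1, 1, 7, 0, tzinfo=timezone.utc)

from_naive = normalize_observation_time(naive_local, STATION_TZ)
from_aware = normalize_observation_time(aware_utc, STATION_TZ)
print(from_naive.isoformat(), from_aware.isoformat())
```

Both inputs describe the same instant here, which is why returning aware datetimes from a plugin removes any dependence on the Station Link timezone being set correctly.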
Units & Variable Mappings
In the Station Link UI, you define rows like:
air_temperature (°C) ← temp_K (source unit K)
On save, ADL uses DataParameter.convert_value_from_units() to convert (e.g., m/s → km/h, with custom contexts for tricky conversions like precipitation mass/height).
If a value is missing for a source field, ADL skips that parameter for that record and continues.
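To make the mapping step concrete, here is a rough standalone sketch. It does not call DataParameter.convert_value_from_units(); the CONVERTERS table, the MAPPINGS rows, the relative_humidity parameter name, and apply_mappings are all hypothetical and only mimic the behaviour described above, including skipping missing values.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical converters keyed by (source unit, target unit).
CONVERTERS: Dict[Tuple[str, str], Callable[[float], float]] = {
    ("K", "°C"): lambda value: value - 273.15,
    ("m/s", "km/h"): lambda value: value * 3.6,
    ("%", "%"): lambda value: value,
}

# Hypothetical mapping rows, shaped like the Station Link UI rows.
MAPPINGS: List[Dict[str, str]] = [
    {"parameter": "air_temperature", "unit": "°C",
     "source_field": "temp_K", "source_unit": "K"},
    {"parameter": "relative_humidity", "unit": "%",
     "source_field": "rh", "source_unit": "%"},
]


def apply_mappings(
    row: Dict[str, Any], mappings: List[Dict[str, str]]
) -> Dict[str, float]:
    """Translate one plugin row into canonical parameter values."""
    converted: Dict[str, float] = {}
    for mapping in mappings:
        raw = row.get(mapping["source_field"])
        if raw is None:
            continue  # missing value: skip this parameter, keep the rest
        convert = CONVERTERS[(mapping["source_unit"], mapping["unit"])]
        converted[mapping["parameter"]] = convert(raw)
    return converted


result = apply_mappings({"temp_K": 293.15, "rh": None}, MAPPINGS)
print(result)
```

Only air_temperature appears in the result (at roughly 20 °C), because the rh value is missing and that mapping row is skipped.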
Where Does It Run?
Docker Compose typically runs:
TimescaleDB (PostgreSQL + time-series features)
Redis
ADL Web (Django + your plugin code, hot reload)
Celery Worker/Beat (background jobs)
You control everything with docker compose up, and configure via the Wagtail admin.
Day-One Checklist for a New Plugin
Scaffold from the cookiecutter boilerplate.
Add models for your provider:
NetworkConnection subclass (credentials, base URL, etc.)
StationLink subclass (provider station code, per-station mappings)
Implement get_station_data():
Fetch between start_date and end_date
Return records with "observation_time" and source fields named to match your mappings
Register your plugin in apps.py using plugin_registry.register(...).
Run the stack, create a Connection (select your plugin type), add Station Links, set mappings, watch data land in the DB.
Common Questions
Do I write database code?
No. Just return records; ADL saves them (with upsert).
What about pagination/rate limits?
Handle those inside get_station_data(). ADL only cares about the result shape.
How do I map fields?
Ensure your Station Link mappings use the exact source field names you return.
Different units per station?
Fine. Each mapping row stores its source unit; ADL converts per row.
Can I re-fetch the same window?
Yes. Upsert prevents duplicates from crashing ingestion.
Timestamps look off…
Return aware datetimes when possible. If naive, ADL interprets them as station-local. Also verify the Station Link timezone.
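On the pagination question above, one possible pattern is a small generator that keeps requesting pages until the provider stops returning a cursor. Cursor-based paging is an assumption here (a given provider may use page numbers or offsets instead), and iter_paginated_rows, fake_fetch_page, and the "rows"/"next_cursor" keys are hypothetical names invented for this sketch.

```python
from datetime import datetime
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]


def iter_paginated_rows(
    fetch_page: Callable[[Optional[str]], Page]
) -> Iterator[Dict[str, Any]]:
    """Yield rows page by page until no further cursor is returned."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page["rows"]
        cursor = page.get("next_cursor")
        if cursor is None:
            break  # last page reached


# Hypothetical two-page response standing in for a provider API.
FAKE_PAGES: Dict[Optional[str], Page] = {
    None: {
        "rows": [
            {"observation_time": datetime(2024, 1, 1, 10, 0), "temp_K": 293.15},
            {"observation_time": datetime(2024, 1, 1, 10, 15), "temp_K": 293.35},
        ],
        "next_cursor": "page-2",
    },
    "page-2": {
        "rows": [
            {"observation_time": datetime(2024, 1, 1, 10, 30), "temp_K": 293.55},
        ],
        "next_cursor": None,
    },
}


def fake_fetch_page(cursor: Optional[str]) -> Page:
    """Return the canned page for a cursor (no network access)."""
    return FAKE_PAGES[cursor]


rows = list(iter_paginated_rows(fake_fetch_page))
print(len(rows))
```

Inside a real plugin, get_station_data() could yield from such a generator, with fetch_page wrapping the provider's HTTP call along with any retry or rate-limit handling.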
Note
TL;DR: Your plugin fetches raw rows for a time window. ADL handles the windowing, timezone normalization, unit conversion, and saving.