Observation dataset¶
“IMERG Late Run PPS Near Real-time” → 3B-HHR-L
Documentation: https://
Location on HPC¶
/public/data/sat/nasa/gpm/imerg
Example file:
251650600.3B-HHR-L.MS.MRG.3IMERG.20250614-S060000-E062959.0360.V07B.RT-H5
Field: precipitation (rate)
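The pieces of the example file name above can be pulled apart with a regular expression. This is a minimal sketch, not the workflow's own code: the function name `parse_imerg_name` is hypothetical, the leading `251650600.` is treated as a site-specific prefix and ignored, and the four-digit field (`0360`) is assumed to be minutes since 00:00 UTC, which is consistent with the 06:00 start time in the example.

```python
import re
from datetime import datetime

# Matches the standard part of an Early/Late half-hourly IMERG file name.
# re.search is used below so any local prefix before "3B-HHR" is skipped.
IMERG_NAME = re.compile(
    r"3B-HHR-(?P<run>[EL])\.MS\.MRG\.3IMERG\."
    r"(?P<date>\d{8})-S(?P<start>\d{6})-E(?P<end>\d{6})\."
    r"(?P<minutes>\d{4})\.(?P<version>V\d{2}[A-Z])"
)

def parse_imerg_name(filename):
    """Return run type, start/end times, minutes of day, and version (hypothetical helper)."""
    m = IMERG_NAME.search(filename)
    if m is None:
        raise ValueError(f"not a recognized IMERG half-hourly file name: {filename}")
    start = datetime.strptime(m["date"] + m["start"], "%Y%m%d%H%M%S")
    end = datetime.strptime(m["date"] + m["end"], "%Y%m%d%H%M%S")
    return {
        "run": m["run"],                        # "E" (Early) or "L" (Late)
        "start": start,                         # start of the half-hour window
        "end": end,                             # end of the half-hour window
        "minutes_of_day": int(m["minutes"]),    # assumed: minutes since 00:00 UTC
        "version": m["version"],
    }

info = parse_imerg_name(
    "251650600.3B-HHR-L.MS.MRG.3IMERG.20250614-S060000-E062959.0360.V07B.RT-H5"
)
print(info["start"], info["version"])
```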
Grid¶
0.1 deg global lat/lon grid
Deriving hourly accumulations¶
APCP for hh = 0.5 * [(hh-1):00-(hh-1):30 precip] + 0.5 * [(hh-1):30-hh:00 precip]
Each half-hourly precip value is a snapshot of the rate (mm/hr) within that half hour, so the 0.5 factor converts each rate to a half-hour accumulation
Not instantaneous: the observation can fall anywhere within the half hour (closer to the midpoint, when possible), and calibration/post-processing may use data from before/after
Has been found to be representative of a ~20-min average rate occurring ~20 min after the start time
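The APCP formula above can be sketched as a small function. This is only an illustration, assuming both inputs are rates in mm/hr and each rate holds for its full half-hour window; the name `hourly_apcp` is hypothetical and is not taken from IMERG_interp_1hr.py.

```python
def hourly_apcp(rate_first_half, rate_second_half):
    """Return the 1-hour accumulation (mm) from two half-hourly rates (mm/hr)."""
    # rate (mm/hr) * 0.5 hr = accumulation (mm) for each half-hour window
    return 0.5 * rate_first_half + 0.5 * rate_second_half

# APCP valid at 07:00 from the 06:00-06:30 and 06:30-07:00 rates
apcp_07 = hourly_apcp(2.0, 4.0)
print(apcp_07)  # 3.0
```

Because the function only multiplies and adds, it should also work element-wise on whole grids if the inputs are NumPy arrays of the same shape.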
Latency¶
Late run product is generated ~14 hours after obs time (but can often be closer to 12)
Format¶
File format is HDF5, which can be read as netCDF
Quality notes, limitations (V07)*¶
Reduced skill over frozen surfaces, esp at high latitudes
Reduced skill over mountainous terrain
Reduced skill with frozen precip (can use the probability frozen field to mask/separate, or use model diagnostic)
Doesn’t do great with light precip
Reduced skill over coastal areas
Atmospheric rivers: do OK over water, some issues over land due to orography
Does better with oceanic rain (good news for our primary need)
V07 retrievals tend to be biased for warm/moist conditions and tend to overestimate precip in colder, continental systems
Looks at ice cloud tops → if not representative of surface precip, will have inaccuracies (depends on atmospheric profiles)
Limited obs → interpolation, propagation → smoothing → if you don’t have obs of peak, you’ll miss that peak (i.e. likely to underestimate extremes); in practice, it seems to overestimate in some scenarios
Temporal resolution: only have obs within 3 hours most of the time (may miss or mis-represent short-lived events)
Spatial resolution: on order of 80 km, depending on sensor type
Precision: rates are reported to 0.01 mm/hr; generally use 0.1 mm/hr
“Random error” field should not be used as indicator of quality/accuracy
Data is good for averages, not good for trend analysis
*Many of these notes came out of conversation with George Huffman & Jackson Tan of NASA in January 2026.
Notes from docs:
The Late Run employs both forward and backward morphing with later data, and is appropriate for daily and longer applications, such as crop forecasting
The Final Run introduces monthly precipitation gauge analyses, providing more accurate results in regions with gauge information. The Final Run is considered the research-grade product. → not available until ~3.5 months after obs
the half-hourly Quality Index (QI), which is based on the correlation to the reference PMW estimate (GMI or TMI), is reduced by a factor depending on whether it is an imager or sounder and whether the grid box is over sea ice or frozen land (as indicated by the daily NOAA Autosnow product). Specifically, these factors are 0.112 for imagers and 0.275 for sounders over frozen land, and 0.495 for imagers and 0.282 for sounders over sea ice. These four factors are derived by comparing a sample of the Kalman correlations of instantaneous PMW estimates with those obtained from evaluation against KuPR by You et al. (2023).
Data Sources & Analysis Overview (for “late” product)¶
Primary sensors (GPM core observatory):
Dual-frequency Precipitation Radar (DPR): 1-2 snapshots/3 days
GPM Microwave Imager (GMI): multi-channel passive MW radiometer, 1-2 snapshots/day
Non-polar orbit
Additional contributions from 11 passive microwave (PMW) sensors (GPM constellation), calibrated to core sensors
Conical-scanning microwave imagers + cross-track scanning humidity sounders
Propagate (forward + backward) between snapshots
Fill additional gaps with global geostationary infrared (IR) obs
Complex algorithms for deriving precip rate from PMW sensors, calibrating and combining data from multiple different sensors, propagation/interpolation, filling gaps, and overcoming known limitations/issues of the observations and algorithms
Other¶
V08 in the works, release planned for 2026, expecting improvements
Support¶
NASA is funded to do user support of this product (so take advantage!)
George Huffman: george
Jackson Tan: jackson
Helpful references¶
Workflow¶
Start with 1hr precip
Then incorporate into variable-accum-length, when that’s ready
Obs data processing¶
Schedule¶
Run hourly, process obs from 24 hours ago
~/VERIF/bin/IMERG_interp_1hr.py (modified from NSSL_interp_precip.py)
Hourly APCP derived as above and saved to imerg_1hr_apcp.nc
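As an illustration of the schedule above, the sketch below works out which valid hour an hourly run would target and which two half-hour windows feed it. It assumes "24 hours ago" means the current hour (truncated to :00) minus 24 hours; the function name `half_hour_starts` and the `lag_hours` argument are hypothetical and do not come from IMERG_interp_1hr.py.

```python
from datetime import datetime, timedelta, timezone

def half_hour_starts(now, lag_hours=24):
    """Return (valid_time, [start of first half hour, start of second half hour])."""
    # Truncate to the top of the hour, then step back by the processing lag.
    valid = now.replace(minute=0, second=0, microsecond=0) - timedelta(hours=lag_hours)
    # APCP valid at hh uses the (hh-1):00 and (hh-1):30 half-hourly files.
    first = valid - timedelta(hours=1)
    second = valid - timedelta(minutes=30)
    return valid, [first, second]

now = datetime(2025, 6, 15, 7, 12, tzinfo=timezone.utc)
valid, starts = half_hour_starts(now)
# Date/start-time stamps as they appear in the IMERG file names
stamps = [t.strftime("%Y%m%d-S%H%M%S") for t in starts]
print(valid.isoformat(), stamps)
```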
Missing values¶
Obs files use -9999.9 to denote missing values
In deriving 1-hour values from 30-minute values, missing values are not explicitly ignored/masked, so a 1-hour value built from one missing half hour is ~-4999.95 plus half of the valid rate (and -9999.9 when both half hours are missing)
The Fortran code that reads the obs files for interpolation treats all values < -999 as missing, so this approach is sufficient (but care is needed if/when we start summing multiple hours together)
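The arithmetic behind the missing-value behavior described above, plus one possible way to make the handling explicit before summing multiple hours, can be sketched as follows. All function names here are hypothetical; the -9999.9 flag and the < -999 threshold come from the notes above, and the masked variants are a suggestion rather than what the current scripts do.

```python
MISSING = -9999.9       # missing-value flag used in the obs files
MISSING_BELOW = -999.0  # reader threshold: values below this count as missing

def is_missing(value):
    return value < MISSING_BELOW

def hourly_apcp_unmasked(r1, r2):
    """Current behavior as described: the missing flag is averaged in like any value."""
    return 0.5 * r1 + 0.5 * r2

def hourly_apcp_masked(r1, r2):
    """Alternative: propagate the exact flag if either half hour is missing."""
    if is_missing(r1) or is_missing(r2):
        return MISSING
    return 0.5 * r1 + 0.5 * r2

def sum_hours(values):
    """Sum 1-hour values, returning MISSING if any input hour is missing."""
    if any(is_missing(v) for v in values):
        return MISSING
    return sum(values)

half_missing = hourly_apcp_unmasked(MISSING, 3.0)
print(half_missing)              # about -4998.45, still below the -999 threshold
print(is_missing(half_missing))  # True
print(sum_hours([1.0, 2.0, MISSING]))
```

Checking each input against the threshold before summing keeps the result independent of how many hours are combined or how large the valid amounts are.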
Mask¶
Not applying any mask to obs data (use a dummy var in the namelist files for mask_file arg)
Need to look into using data quality flags
Interpolation¶
Calls Fortran subroutines in … verif_mod_new2.f90
Grid variable is ‘10kmCE_global’, which corresponds to ‘global10_gds’ params in the Fortran code
Calls read_precip_grib2_obs_nc_file() subroutine for reading obs file
This includes units conversion from mm to inches
Interpolation grids:
For now, interpolating to 13kmLC, 20kmLC, 40kmLC, 80kmLC
No downscaling → consider whether this is the best approach
Model data processing¶
Using model data processed via precip_1hr MRMS workflow. If that is not run, then verification vs IMERG won’t happen.
Verification¶
Schedule¶
Run hourly, run verif on obs from 24 hours ago
~/VERIF/bin/mrms_precip_verif.py
Using same verif script as used for MRMS
Added an “imerg_obs” flag based on obs_dir, which is used to differentiate input obs file name (and can be used elsewhere if ever needed)
Database¶
Schedule¶
Run hourly, write results from verif valid 25 hours ago
~/VERIF/bin/pop_new_precip_sql_tables.py
Same script used for other 1hr precip, no changes made