Competing forecast verification: Using the power-divergence statistic for testing the frequency of "better"

When testing hypotheses about which of two competing models, say A and B, is better, the difference is often not significant. An alternative, complementary approach is to measure how often model A is better than model B, regardless of how slight or large the difference. The hypothesis concerns whether or not the percentage of time that model A is better than model B is larger than 50%. One generalized test statistic that can be used is the power-divergence test, which encompasses many familiar goodness-of-fit test statistics, such as the log-likelihood ratio and Pearson X² tests. Theoretical results justify using the χ² distribution with k − 1 degrees of freedom for the entire family of test statistics, where k is the number of categories. However, these results assume that the underlying data are independent and identically distributed, which is often violated. Empirical results demonstrate that the reduction to two categories (i.e., model A is better than model B versus model B is better than model A) results in a test that is reasonably robust to even severe departures from temporal independence, as well as contemporaneous correlation. The test is demonstrated on two different example verification sets: 6-h forecasts of eddy dissipation rate (m^(2/3) s^(-1)) from two versions of the Graphical Turbulence Guidance model, and 12-h forecasts of 2-m temperature (°C) and 10-m wind speed (m s^(-1)) from two versions of the High-Resolution Rapid Refresh model. The novelty of this paper is in demonstrating the utility of the power-divergence statistic in the face of temporally dependent data, as well as the emphasis on testing for the "frequency-of-better" alongside more traditional measures.

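As a concrete illustration of the two-category "frequency-of-better" test described in the abstract, the sketch below (not code from the paper; the error series, sample size, and gamma distributions are made up purely for illustration) counts how often a hypothetical model A has smaller error than model B and applies SciPy's power-divergence statistic against an even 50/50 split. Setting lambda_ to 1, 0, or 2/3 recovers the Pearson X², log-likelihood ratio, and Cressie-Read members of the family, each compared to a χ² distribution with k − 1 = 1 degree of freedom. As the abstract notes, that reference distribution formally assumes independent data; the robustness result reported there concerns this two-category reduction.

    import numpy as np
    from scipy.stats import power_divergence

    # Hypothetical forecast errors for two competing models (smaller is better).
    # The gamma draws stand in for, e.g., absolute errors at 500 verification times.
    rng = np.random.default_rng(0)
    err_A = rng.gamma(shape=2.0, scale=1.0, size=500)
    err_B = rng.gamma(shape=2.2, scale=1.0, size=500)

    # Two categories: times when A is better vs. times when B is better
    # (exact ties, if any, are simply dropped here).
    better_A = int(np.sum(err_A < err_B))
    better_B = int(np.sum(err_B < err_A))
    observed = np.array([better_A, better_B])
    expected = np.full(2, observed.sum() / 2.0)  # H0: A is better 50% of the time

    # Members of the power-divergence family; the p-value uses the chi-squared
    # distribution with k - 1 = 1 degree of freedom.
    for lam, name in [(1.0, "Pearson X^2"),
                      (0.0, "log-likelihood ratio"),
                      (2.0 / 3.0, "Cressie-Read")]:
        stat, pval = power_divergence(observed, f_exp=expected, lambda_=lam)
        print(f"{name:>20s}: statistic = {stat:.3f}, p-value = {pval:.4f}")

With only two categories the statistic is essentially a test that the proportion of "A better" cases equals 0.5, which is the reduction the abstract argues remains reasonably robust to temporal dependence and contemporaneous correlation.
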
Resource Type publication
Temporal Range Begin N/A
Temporal Range End N/A
Temporal Resolution N/A
Bounding Box North Lat N/A
Bounding Box South Lat N/A
Bounding Box West Long N/A
Bounding Box East Long N/A
Spatial Representation N/A
Spatial Resolution N/A
Related Links N/A
Additional Information N/A
Resource Format PDF
Standardized Resource Format PDF
Asset Size N/A
Legal Constraints Copyright author(s). This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Access Constraints None
Software Implementation Language N/A

Resource Support Name N/A
Resource Support Email opensky@ucar.edu
Resource Support Organization UCAR/NCAR - Library
Distributor N/A
Metadata Contact Name N/A
Metadata Contact Email opensky@ucar.edu
Metadata Contact Organization UCAR/NCAR - Library

Author Gilleland, Eric
Muñoz-Esparza, Domingo
Turner, D. D.
Publisher UCAR/NCAR - Library
Publication Date 2023-09-01T00:00:00
Digital Object Identifier (DOI) Not Assigned
Alternate Identifier N/A
Resource Version N/A
Topic Category geoscientificInformation
Progress N/A
Metadata Date 2025-07-11T15:14:53.050490
Metadata Record Identifier edu.ucar.opensky::articles:26603
Metadata Language eng; USA
Suggested Citation Gilleland, Eric, Muñoz-Esparza, Domingo, Turner, D. D. (2023). Competing forecast verification: Using the power-divergence statistic for testing the frequency of "better". UCAR/NCAR - Library. https://n2t.org/ark:/85065/d7gb283b. Accessed 03 August 2025.
