Identification

Title

Causal drivers of land‐atmosphere carbon fluxes from machine learning models and data

Abstract

Interactions among atmospheric, root‐soil, and vegetation processes drive carbon dioxide fluxes (Fc) from land to atmosphere. Eddy covariance measurements are commonly used to measure Fc at sub‐daily timescales and validate process‐based and data‐driven models. However, these validations do not reveal process interactions, thresholds, and key differences in how models replicate them. We use information theory‐based measures to explore multivariate information flow pathways from forcing data to observed and modeled hourly Fc, using flux tower data sets in the Midwestern U.S. in intensively managed corn‐soybean landscapes. We compare multiple linear regressions, long short‐term memory networks, and random forests (RF), and evaluate how different model structures use information from combinations of sources to predict Fc. We extend a framework for model predictive and functional performance, which examines a suite of dependencies from all forcing variables to the observed or modeled target. Of the three model types, RF exhibited the highest functional and predictive performance, correctly capturing strong dependencies between radiation and temperature variables with Fc. Regionally trained models demonstrate lower predictive but higher functional performance compared to site‐specific models, suggesting superior reproduction of observed relationships at the expense of predictive accuracy. This study shows that some metrics of predictive performance encapsulate functional behaviors better than others, highlighting the need for multiple metrics of both types. This study improves our understanding of carbon fluxes in an intensively managed landscape, and more generally provides insight into how model structures and forcing variables translate to interactions that are well versus poorly captured in models.

Resource type

document

Resource locator

Unique resource identifier

code

https://n2t.net/ark:/85065/d70z77qr

codeSpace

Dataset language

eng

Spatial reference system

code identifying the spatial reference system

Classification of spatial data and services

Topic category

geoscientificInformation

Keywords

Keyword set

keyword value

Text

originating controlled vocabulary

title

Resource Type

reference date

date type

publication

effective date

2016-01-01T00:00:00Z

Geographic location

West bounding longitude

East bounding longitude

North bounding latitude

South bounding latitude

Temporal reference

Temporal extent

Begin position

End position

Dataset reference date

date type

publication

effective date

2024-06-01T00:00:00Z

Frequency of update

Quality and validity

Lineage

Conformity

Data format

name of format

version of format

Constraints related to access and use

Constraint set

Use constraints

Copyright author(s). This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Limitations on public access

None

Responsible organisations

Responsible party

contact position

OpenSky Support

organisation name

UCAR/NCAR - Library

full postal address

PO Box 3000

Boulder

80307-3000

email address

opensky@ucar.edu

web address

http://opensky.ucar.edu/

name: homepage

responsible party role

pointOfContact

Metadata on metadata

Metadata point of contact

contact position

OpenSky Support

organisation name

UCAR/NCAR - Library

full postal address

PO Box 3000

Boulder

80307-3000

email address

opensky@ucar.edu

web address

http://opensky.ucar.edu/

name: homepage

responsible party role

pointOfContact

Metadata date

2025-07-10T20:01:33.522762

Metadata language

eng; USA