Metabolism fundamentally shapes biogeochemical cycling in streams and rivers, yet estimating it remains challenging in many environments. Conventional reach-scale metabolism models require continuous measurements of dissolved oxygen, discharge, depth, and other physical parameters, and rely on assumptions about stationarity and mixing. These requirements limit the spatial and temporal extent of metabolism estimates, particularly in data-poor or physically complex systems (e.g., those with steep gradients or groundwater contributions). There is therefore a need for approaches that can indicate whether oxygen dynamics are biologically or physically driven without detailed knowledge of reach-scale properties.
We present a knowledge-guided machine learning framework that infers the dominant controls on stream oxygen dynamics using only high-frequency measurements of dissolved oxygen and temperature. Rather than explicitly estimating metabolism, our approach leverages characteristic patterns and covariation in these two signals to identify whether oxygen variability at a site is primarily structured by biological processes or by physical drivers. To develop this framework, we evaluated a range of approaches on large public datasets of coupled sensor measurements and metabolism estimates. The framework has the potential to expand the range of systems where metabolism-relevant dynamics can be meaningfully interpreted, providing a rapid screening tool to guide when and where more intensive modeling efforts are warranted. We hope it offers a new lens for understanding how physical and biological processes jointly structure oxygen dynamics and associated biogeochemical functions in stream ecosystems.
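
As an illustration of what "covariation" between the two signals can carry, the sketch below computes one very simple feature: the correlation between dissolved oxygen and water temperature. It is not the framework described above, only a toy heuristic under stated assumptions. Oxygen solubility decreases as water warms, so a purely physically controlled signal would tend to vary inversely with temperature, whereas daytime photosynthesis tends to raise oxygen while the water is also warming. The function names, the zero threshold, the labels, and the synthetic data are all hypothetical.

```python
import numpy as np


def do_temp_correlation(do_mg_l, temp_c):
    """Pearson correlation between dissolved oxygen and water temperature."""
    do_arr = np.asarray(do_mg_l, dtype=float)
    temp_arr = np.asarray(temp_c, dtype=float)
    # Drop time steps where either sensor has a gap (NaN or inf).
    ok = np.isfinite(do_arr) & np.isfinite(temp_arr)
    return float(np.corrcoef(do_arr[ok], temp_arr[ok])[0, 1])


def screen_site(do_mg_l, temp_c):
    """Toy screening rule (an assumption, not the authors' classifier).

    Positive DO-temperature correlation is labeled "biological-like",
    anything else "physical-like".
    """
    r = do_temp_correlation(do_mg_l, temp_c)
    return "biological-like" if r > 0 else "physical-like"


# Synthetic three-day record at 15-minute resolution (time in days).
t = np.arange(0, 3, 1 / 96)
temp = 15 + 3 * np.sin(2 * np.pi * t)

# Toy "biological" site: oxygen rises and falls in phase with temperature.
do_bio = 9 + 1.5 * np.sin(2 * np.pi * t)

# Toy "physical" site: oxygen falls linearly as water warms. The linear
# form is a stand-in for a real solubility curve, chosen for simplicity.
do_phys = 10 - 0.2 * (temp - 15)

print(screen_site(do_bio, temp))
print(screen_site(do_phys, temp))
```

A single correlation is far too coarse for real sites, where phase lags, reaeration, and groundwater inputs blur the picture. It is shown only to make concrete the kind of signal that a richer, learned set of features could exploit.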