In the heart of Texas and Oklahoma, where the Red River carves its path through rolling plains, water managers face a silent but growing challenge: salinity. For industries like oil and gas, agriculture, and municipal water supply, the concentration of salts in streams can mean the difference between smooth operations and costly disruptions. Yet, salinity data is unevenly distributed across the Upper Red River Basin, leaving gaps that complicate decision-making. Now, a new study led by Kasra Khodkar of Oklahoma State University’s Department of Biosystems and Agricultural Engineering offers a way forward—using machine learning to predict stream salinity with greater accuracy and reliability than ever before.
Khodkar and his team developed a topology-aware transfer learning (TL) framework that leverages the connectivity of the stream network and publicly available salinity data, even when records are sparse or irregular. The approach isn’t just about filling in missing data; it’s about doing so with confidence. By incorporating the Lower Upper Bound Estimation (LUBE) method to quantify uncertainty, the models generate prediction intervals that account for potential variability, ensuring that water managers aren’t flying blind.
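To make the idea of a prediction interval concrete, here is a minimal sketch of two scores commonly used to judge intervals in the LUBE literature: coverage (the share of observations that fall inside the interval) and normalized average width. The article does not say which scoring formula the study used, so the function names `picp` and `pinaw` and the sample values below are illustrative assumptions, not the authors' code or data.

```python
def picp(observed, lower, upper):
    """Prediction Interval Coverage Probability: fraction of observations
    that fall inside their [lower, upper] interval."""
    if not observed or not (len(observed) == len(lower) == len(upper)):
        raise ValueError("inputs must be non-empty and of equal length")
    covered = sum(1 for y, lo, hi in zip(observed, lower, upper) if lo <= y <= hi)
    return covered / len(observed)


def pinaw(observed, lower, upper):
    """Prediction Interval Normalized Average Width: mean interval width
    divided by the range of the observations."""
    if not observed or not (len(observed) == len(lower) == len(upper)):
        raise ValueError("inputs must be non-empty and of equal length")
    data_range = max(observed) - min(observed)
    if data_range == 0:
        raise ValueError("observations have zero range; width cannot be normalized")
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / len(observed)
    return mean_width / data_range


# Hypothetical salinity observations and interval bounds (arbitrary units).
observed = [1.0, 2.0, 3.0, 5.0]
lower = [0.5, 1.5, 3.5, 4.0]
upper = [1.5, 2.5, 4.5, 6.0]

print(picp(observed, lower, upper))   # 0.75: three of four observations are covered
print(pinaw(observed, lower, upper))  # 0.3125: mean width 1.25 over a data range of 4
```

A useful interval balances the two: high coverage is easy to reach with very wide bounds, so narrow intervals that still cover most observations are the goal.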
The results are striking. For the best-fitted model, the framework reduced the Root Mean Square Error (RMSE) by 6%. The Nash-Sutcliffe Efficiency (NSE), a standard measure of model performance in hydrology for which 1 indicates a perfect fit, improved by an average of 0.25 across 600 models for a single site. The gains were particularly pronounced downstream, where local models often struggle. Perhaps most impressively, the TL framework enabled accurate modeling (NSE > 0.8) for a site with just 26 data points, a result that local models trained on that site alone could not achieve.
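For readers less familiar with these two metrics, the sketch below computes RMSE and NSE from their standard definitions. The sample values are hypothetical and are not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root Mean Square Error: typical size of the prediction error,
    in the same units as the data. Lower is better."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def nse(observed, predicted):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 means the model
    is no better than always predicting the observed mean."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("observations are constant; NSE is undefined")
    return 1 - residual / variance


# Hypothetical observed and predicted salinity values (arbitrary units).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]

print(rmse(observed, predicted))  # 0.5
print(nse(observed, predicted))   # 0.8
```

Because NSE compares model error with the variance of the observations, it is unitless, which makes it convenient for comparing performance across sites with different salinity levels.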
“This isn’t just about predicting salinity; it’s about making those predictions actionable,” Khodkar said. “For industries that rely on water quality data, like energy or agriculture, having reliable, high-frequency salinity estimates can mean avoiding corrosion in pipelines, optimizing treatment processes, or even preventing crop losses.”
The team validated the framework against continuous sub-daily in-situ measurements, which supports its real-world applicability. The results indicate that the framework generalizes across space and time and adapts to varying data conditions across the basin. For the energy sector, where water is both a resource and a risk, this could translate into smarter water management strategies, reduced downtime, and lower operational costs.
Published in the *Journal of Hydrology: Regional Studies*, the research highlights how machine learning, when paired with hydrological insight, can turn fragmented data into a powerful tool for resilience. As climate variability and water scarcity intensify, tools like these may well become indispensable for industries navigating an uncertain water future.

