Computer models can never give perfect predictions. First, the natural systems themselves are inherently unpredictable; second, the mathematical models are imperfect representations of nature; and finally, the data used to drive and validate the models can never be perfectly or completely measured. The fact that computer analysis cannot improve on the quality of the input information is expressed by the "GIGO Law" ("Garbage In, Garbage Out"). As high-powered computers become more accessible and model use more widespread, the GIGO Law is in danger of being transformed into "Garbage In, Gospel Out" (de Wolf, date unknown). Quick results, presented through animated, detailed, full-color graphical interfaces, may give users a false sense of security in the model's "answers." My research goal is to improve the usefulness of hydrologic and environmental models by (1) reducing uncertainty in the models, (2) reducing uncertainty in the data, and (3) providing tools to quantify and understand the inevitable uncertainty in model results. I wish to help provide the modeling community with high-quality, appropriate input data and with tools for realistic assessment of model outputs.
Reducing the uncertainty in models by improving their physical realism requires mathematical and computational tools to incorporate new Earth Science knowledge into predictive models. Water is inextricably bound up with the energy balance of the planet; the phase changes and transport of water in the Earth system are a major control on global climate. Yet the storage of water in its various phases, and the linked exchanges of water and energy across the land/atmosphere interface, are still not well represented in models. Increasingly, we understand that natural systems (such as the soil and the atmosphere) cannot be considered in isolation from each other. Traditionally, these natural subsystems have been the domain of their respective disciplines, each treating the other as a boundary condition to be prescribed or assumed. The complexity and scale of our problems demand that models of the atmosphere, oceans, rivers, soils, and vegetation be designed to connect and interact, mathematically altering each other's "boundary conditions." With graduate training and research experience at the interface between traditional hydrology and traditional atmospheric science, I am in a unique position to address these issues by developing new models and incorporating existing models into broader modeling systems. This is the thrust of my present research program.
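The coupling idea can be sketched in a few lines of Python. This toy example (with invented state variables, units, and coefficients, not drawn from any real land-surface or atmospheric model) shows two subsystem models advancing together, each time step handing its fluxes to the other instead of treating the other as a fixed boundary condition:

```python
# Toy sketch of two subsystem models exchanging "boundary conditions" at
# every time step, rather than each treating the other as fixed.
# All state variables, units, and coefficients are illustrative only.

def land_step(soil_moisture, precip, evap_demand, dt):
    """Update soil moisture given precipitation and atmospheric evaporative demand."""
    evap = min(evap_demand, soil_moisture / dt)   # cannot evaporate more than is stored
    return soil_moisture + (precip - evap) * dt, evap

def atmos_step(humidity, evap, dt):
    """Update near-surface humidity given the land's evaporative flux."""
    precip = 0.1 * humidity                       # crude precipitation closure
    return humidity + (evap - precip) * dt, precip

soil, humid, precip, dt = 50.0, 10.0, 1.0, 1.0
for _ in range(24):
    evap_demand = 0.2 * humid                     # atmosphere sets the land's boundary condition
    soil, evap = land_step(soil, precip, evap_demand, dt)
    humid, precip = atmos_step(humid, evap, dt)   # the land's flux alters the atmosphere in turn

print(f"soil moisture after 24 steps: {soil:.1f}; near-surface humidity: {humid:.1f}")
```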
All models require data. The properties of physical systems and the processes that drive them must be quantified. Moreover, we cannot have any confidence in models unless we can compare their predictions to observations. More detailed, physically realistic models require more physical data, both as inputs and as a check on their predictions. Satellites now provide observations on the continental to global scale; however, a major obstacle in using remotely sensed data to drive and validate models is the issue of scale or resolution. The typical time and space scales over which physical processes operate, the space-time resolution of models, and the scale and dimension of measurements (whether on the ground or by satellite) are seldom in agreement.
The coming generation of Earth-observing satellites promises to deliver data products that are more finely resolved and more frequent, and that cover a greater range of the electromagnetic spectrum than ever before. Although this information is vital to environmental and hydrologic forecasts and predictions, our systems and our models are not prepared to handle such an embarrassment of riches. A number of large-scale experiments have been conducted, and others are underway, with the goal of making satellite data more directly useful to hydrologic and environmental modeling; their locations include southern France, the African Sahel, the central U.S., and the U.S. desert southwest (Koster et al. 1999).
One of the most vital theoretical issues in modeling, affecting the models, the data, and the relationship between them, is the question of scale. Using a mathematical model that was developed for one physical scale (such as a well-mixed lake; a small, uniform field; or a soil profile) at a different scale (such as an estuary, a variable landscape, or a General Circulation Model grid cell) is usually unsatisfactory because most physical processes are described by nonlinear equations. For a nonlinear process, operating on the average value of the input quantities gives a biased estimate of the average output: the model applied to the mean input is not the mean of the model applied to the variable inputs. The same is true of temporal scales; for example, applying an instantaneous model to daily-averaged input data, or vice versa, gives biased results. To address these difficulties, the modeling community needs accurate and efficient techniques for "down-scaling" and "up-scaling" observations and model input/output. Information is disaggregated ("down-scaled") from a coarser resolution to one that is sufficiently fine to operate the physically based model. The finer-scale model results are re-aggregated ("up-scaled"), if necessary, to compare with observed quantities or to pass to a different model with a coarser scale. Understanding how physical measurements change across scales is one of the goals of the large-scale field experiments.
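A minimal numerical illustration of this aggregation bias, using a hypothetical convex point-scale runoff law (the functional form and parameter values below are invented purely for demonstration):

```python
import numpy as np

# Hypothetical nonlinear point-scale model: runoff increases as the square
# of relative soil moisture (illustrative only).
def runoff(soil_moisture):
    return soil_moisture ** 2

rng = np.random.default_rng(42)

# Fine-scale soil moisture across a heterogeneous grid cell (fraction of saturation).
theta = rng.uniform(0.1, 0.9, size=10_000)

# "Lumped" approach: run the model once on the cell-average moisture.
lumped = runoff(theta.mean())

# "Distributed" approach: run the model at every point, then average the outputs.
distributed = runoff(theta).mean()

print(f"runoff(mean input)  = {lumped:.4f}")
print(f"mean(runoff(input)) = {distributed:.4f}")
# Because runoff() is convex, the lumped estimate is biased low
# (Jensen's inequality): runoff(E[theta]) < E[runoff(theta)].
```

Running the model once on the cell-average input underestimates the average of the fine-scale outputs, and the size of the bias grows with the subgrid variability.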
My theoretical work uses probabilistic analysis, particularly Spatiotemporal Random Field (STRF) theory (Vanmarcke 1988, Christakos 1992), to address the "up/downscaling" problem. Using available data at a number of scales, system variables and parameters (such as snow depth, temperature, reflectivity, and leaf area index) and forcing quantities (such as net radiation, wind speed, and near-surface air temperature) are analyzed as STRFs. The analyzed random fields are propagated through the process model equations in an effort to improve the predictive models by exploiting information on higher moments and on the covariance between variables. Such space-time models would decrease the dimension of parameter space: a few probabilistic parameters could be used in place of many fine-scale parameters (similar to the work of Entekhabi and Eagleson, 1989). The results would allow parameter estimation and model validation using coarse-scale data, such as satellite images.
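A minimal sketch of how subgrid-moment information enters, reduced to a single variable and a hypothetical nonlinear flux law (neither the function nor the numbers come from my actual models): carrying the subgrid variance alongside the mean already corrects most of the bias in the coarse-scale estimate.

```python
import numpy as np

# Hypothetical nonlinear flux law f(x); in practice x would be a
# spatiotemporal random field (e.g., soil moisture) rather than a scalar.
def f(x):
    return np.exp(-2.0 * x)   # illustrative nonlinearity only

mu, sigma = 0.4, 0.15         # assumed coarse-scale mean and subgrid std. dev.

# "Naive" coarse-scale estimate: evaluate the model at the mean.
naive = f(mu)

# Second-order (moment-based) estimate using the subgrid variance:
# E[f(X)] ~= f(mu) + 0.5 * f''(mu) * sigma**2, with f''(x) = 4*exp(-2x).
second_order = f(mu) + 0.5 * (4.0 * np.exp(-2.0 * mu)) * sigma**2

# Reference value from Monte Carlo sampling of the subgrid distribution.
rng = np.random.default_rng(0)
samples = rng.normal(mu, sigma, size=200_000)
monte_carlo = f(samples).mean()

print(f"f(mean)            = {naive:.4f}")
print(f"2nd-order estimate = {second_order:.4f}")
print(f"Monte Carlo mean   = {monte_carlo:.4f}")
```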
Finally, the issue of making model uncertainty understandable and useful is a practical challenge extending beyond data analysis and theory. This question has to do with how decision-makers understand and deal with uncertainty. At a very basic level, one might ask how the average citizen interprets and acts on the probabilistic weather forecast, "forty percent chance of rain." On a more sophisticated level, under its modernization program, the National Weather Service is developing enhanced river forecast models that incorporate uncertainties in initial conditions (such as snow depth) and forcing functions (weather forecasts) to produce ensemble hydrologic forecasts that give a "spread" or distribution in the forecast, as well as a mean value or single estimate (Ingram et al. 1996). Water resources operators and emergency planners are eager for demonstrations that such information can improve their forecasts, reduce risk, and enhance benefits. I am pursuing avenues for work with operational modeling communities, such as the National Weather Service and the U.S. Bureau of Reclamation, to learn how they use model information, how they understand model uncertainty, and what tools are best for their use. We could cooperate to incorporate this information into model design and application.
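As a toy illustration of what such an ensemble forecast delivers (the runoff relation and the uncertainty magnitudes below are invented and are not an operational NWS formulation), perturbing the uncertain initial snowpack and the forecast precipitation produces a distribution of outcomes rather than a single number:

```python
import numpy as np

rng = np.random.default_rng(1)
n_members = 500

# Hypothetical uncertain inputs: initial snow water equivalent (mm) and
# forecast melt-season precipitation (mm) with multiplicative error.
swe0   = rng.normal(300.0, 40.0, n_members)           # initial-condition spread
precip = 150.0 * rng.lognormal(0.0, 0.25, n_members)  # forcing (forecast) spread

# Toy runoff model: a fixed fraction of the snowpack plus rainfall becomes
# seasonal runoff volume (illustrative only).
runoff = 0.7 * swe0 + 0.5 * precip

# An ensemble forecast reports a distribution, not just a single value.
p10, p50, p90 = np.percentile(runoff, [10, 50, 90])
print(f"median forecast: {p50:.0f} mm")
print(f"80% interval:    {p10:.0f} to {p90:.0f} mm")
```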
References
Christakos, G., 1992. Random Field Models in Earth Sciences. Academic Press, 474 pp.
De Wolf, Hans, date unknown. "The Jargon File: The World Wide Web Version," http://www.comedia.com/Hot/jargon_3.0/JARGON_G/GIGO.HTML (accessed 5/10/1999)
Entekhabi, D. and P.S. Eagleson, 1989. Land surface hydrology parameterization for atmospheric General Circulation Models including subgrid scale spatial variability. Journal of Climate, 2(8), 816-831.
Ingram, J.J., E. Welles and D.T. Braatz, 1996. Advanced products and services for flood and drought mitigation activities, presented at AMS Conference, Atlanta, Ga. Jan.-Feb. 1996, electronic document (http://hsp.nws.noaa.gov/hrl/papers/amsbah.htm).
Koster, R.D., P.R. Houser, and E.T. Engman, 1999. "Remote Sensing May Provide Unprecedented Hydrological Data," EOS Transactions of the American Geophysical Union, 80:156, electronic version at http://www.agu.org/eos_elec/97035e.html (accessed 5/10/1999).
Vanmarcke, E., 1988. Random Fields: Analysis and Synthesis. The MIT Press, Cambridge, MA.