Results Aggregation

A river health assessment will generate a substantial amount of information, data and condition assessment grades. As discussed previously, functional condition grades are provided at the lowest level of each branch on the CoRHAF organizational hierarchy. This creates the potential for grades to exist at the Metric level for one Driver, the Component level for another, and at the Driver level for a third. Grades may also be differentiated spatially to reflect the constraints of a given method or match the assessment strategy used to produce Grading Guidelines. While providing a wealth of information about the river system, reporting functional condition ratings in the fully expanded CoRHAF organizational structure can complicate the development of simple visualizations or narratives that support the intended purpose of the assessment effort.

Aggregating functional condition ratings helps stakeholders arrive at a common understanding of overall conditions and can support identification of the issues and locations that constrain river health to the greatest degree. The CoRHAF structure provides a pathway for “drilling down” on any subject matter of interest and, conversely, for “zooming out” to consider aggregated river health conditions. In this way, the hierarchical arrangement that relates Metrics, Components and Drivers allows you to control the level of detail that is highlighted at any given time, so that the complexity and detail of results presentation reflect the needs and expectations of your audience. It is often desirable to focus attention at the Driver level or combine Driver conditions to provide a characterization of overall health on a reach, a river segment, or even across an entire watershed. The ability to “roll up” river health scores to one or more levels of the CoRHAF hierarchy and across space is one of the features that makes CoRHAF so flexible in application and so effective at communicating results to diverse audiences.

Roll-up Framing

A roll-up should reflect the relative importance of Metrics, Components and Drivers in controlling river health. Composite functional condition scores produced from a roll-up should not be based on stakeholder preferences or community values. Those considerations can be accounted for during subsequent planning phases where river health assessment outcomes are used to guide management actions or develop restoration strategies.

Prior to rolling up assessment results, it is important to consider the methods and consequences of combining scores along each of the two potential axes: one traversing the Driver → Component → Metric hierarchy, the other traversing multiple stream reaches within an assessment area. Most stakeholder groups will elect to aggregate functional condition results for Metrics and Components assessed at a given location into aggregated scores for Drivers or overall river health. Aggregated scores developed in this manner provide a composite view of river health at a given location. These scores may then be combined across adjacent stream reaches, providing a coarser spatial characterization of river health condition. This approach allows users to generate a composite characterization of river health at the scale of long river segments or entire stream networks.

Figure: Conceptual representation of rolling up functional condition scores at the Component and Driver levels, then performing spatial aggregation on overall condition scores for various reaches in the study area to create an impression of overall watershed health.

Aggregation Strategies

Rolling up functional condition scores requires application of one or more simple algorithms that combine scores for individual Drivers, Components, and Metrics and weight their relative contribution to overall river health. The most common roll-up strategies include simple numerical averages, weighted numerical averages, limiting factors analysis, and weight-of-evidence approaches. This is a non-exclusive list and many other approaches are possible.

There are no hard and fast rules about how to roll up functional condition scores. Whichever method you choose, it is critical to document your strategy and process. Failure to do so will undermine the reliability of your results and may erode trust with stakeholders.

Figure: Example results aggregation template from the CoRHAF Workbook. The Overall Grade is supported by the collection of Driver Grades, which are, in turn, supported by the Components of each Driver. A narrative explanation helps provide context and support for the assigned grades.

Simple Numerical Averages

Numerical averages are a familiar statistic to many stakeholders and can be successfully employed to roll up functional condition scores. If this approach, or any other numerical or statistical approach, is selected for a roll-up, grades assigned at the Driver, Component, and Metric level need to be converted into numerical values. These values will often correspond to the typical academic grade point average scale (e.g., A=4, B=3, C=2, D=1, F=0) but other scales may be employed if they are preferred by a stakeholder group or Technical Team.

Once grades are converted to numerical form, numerical averages can be employed to evaluate a Component condition as the combined scores for two or more Metrics. In a similar manner, the condition of a Driver can be assessed as the averaged scores from two or more Components. An overall grade for a reach can be determined by averaging scores across the entire suite of Drivers. This approach weights each Driver equally in the determination of overall river health.
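The averaging steps described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the CoRHAF Workbook: the function name roll_up, the nested-dictionary layout, and the Component and Metric names are all hypothetical, and the grade scale is the example GPA-style scale mentioned above. The sketch handles grades that sit at different levels of the hierarchy by treating any letter grade as a leaf and any group of children as a simple average.

```python
from statistics import mean

# Assumed grade-to-number conversion (the example GPA-style scale).
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}


def roll_up(node):
    """Return the unweighted average score for a node in the hierarchy.

    A leaf is a letter grade (str); a branch is a dict of child nodes.
    Every child contributes equally to its parent.
    """
    if isinstance(node, str):
        return GRADE_POINTS[node]
    return mean(roll_up(child) for child in node.values())


# Hypothetical reach. Component and Metric names are illustrative only.
reach = {
    "Flow regime": "B",  # graded directly at the Driver level
    "Water quality": {"Temperature": "A", "Nutrients": "C"},  # Component level
    "Physical structure": {
        "Bed and banks": {"Bank stability": "C", "Substrate": "D"},  # Metric level
        "Habitat complexity": "B",
    },
}

driver_scores = {name: roll_up(node) for name, node in reach.items()}
overall = roll_up(reach)  # equal weight for each Driver
```

With these made-up grades, the Driver scores work out to 3.0, 3.0 and 2.25, and the overall reach score is 2.75. Note that equal weighting applies at every level, so a Driver with many Metrics does not count for more than a Driver graded directly.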

A note on spatial aggregation

Reaches can be combined at any level of the assessment, depending on the level of detail or the specific content you would like to display. Reaches may also be combined or split to make their lengths equal, which greatly simplifies spatial aggregation because a simple average across reaches can then be used. In cases where the functional condition of Drivers, Components, and Metrics is assessed on reaches of unequal length, aggregated conditions can be assessed by weighting functional condition scores by reach length.
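As a rough illustration of length weighting, the sketch below combines scores from reaches of unequal length. The function name, the tuple layout, and the reach lengths and scores are hypothetical values chosen only to show the arithmetic.

```python
def length_weighted_score(reaches):
    """Combine reach scores, weighting each score by its reach length.

    reaches: iterable of (length, score) pairs. Any length unit works
    as long as it is the same for every reach.
    """
    reaches = list(reaches)
    total_length = sum(length for length, _ in reaches)
    if total_length <= 0:
        raise ValueError("total reach length must be positive")
    return sum(length * score for length, score in reaches) / total_length


# Hypothetical segment: three reaches as (length in km, overall score).
segment = [(2.0, 3.0), (6.0, 2.0), (2.0, 4.0)]

weighted = length_weighted_score(segment)                        # 26 / 10
unweighted = sum(score for _, score in segment) / len(segment)   # 9 / 3
```

Here the length-weighted score is 2.6, while the unweighted average of the same three scores is 3.0, because the long middle reach has the lowest score.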

Weighted Averages

Weighted averages may be employed to give special consideration to specific Drivers, Components, or Metrics during a roll-up exercise. They may also be used to reflect differences in assessment reach lengths or highlight geographically-constrained critical habitat areas (or other unique characteristics).

A Technical Team can develop a system of weights to reflect some known or expected unequal effect among individual Drivers, Components, or Metrics on overall river health. Higher weights indicate a greater expected effect. The process of assigning weights can be collaborative (e.g., by soliciting input from individual members of a stakeholder group, seeking consensus, or implementing a voting system) or can be wholly directed by a Technical Team that relies on its collective knowledge about riverine processes to create an optimal weighting system. Once a weighting system is created, rolling up functional condition scores for a single location is relatively straightforward. See this link for further discussion of the method.

Applying weighted averaging schemes across multiple assessment reaches can be more complicated. In the simplest case, weights can be assigned to reflect reach length and used to combine overall condition scores from multiple reaches. In other cases, weights assigned to reaches may reflect specific knowledge about the influence of conditions at one location on overall river health. Alternatively, weights can be developed to reflect specific knowledge about the disproportionate impact that one or more Drivers, Components, or Metrics have on the overall health of the river system. Finally, in the most complicated circumstance, the above strategies can be used in combination with one another. An example weighting scheme is included in the table below.

Table: Example Driver roll-up weights. Weights presented here vary according to valley confinement mainly to account for the diminishing role of the riparian area in river health with increasing levels of confinement.
Indicator	Unconfined River Morphology	Partially Confined River Morphology	Confined River Morphology
Flow regime	25	25	30
Sediment regime	10	10	10
Water quality	10	10	10
Wood Regime	5	5	10
Riparian Floodplain Condition	20	15	5
Channel Dynamics	15	15	10
Physical structure	10	15	20
Aquatic life	5	5	5
Total Weight	100	100	100
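The example weights in the table above could be applied to a single reach along the following lines. This is a sketch under stated assumptions: the function name weighted_overall, the dictionary keys for the confinement classes, and the Driver scores are hypothetical, and renormalizing the weights when a Driver was not assessed is one possible choice rather than a CoRHAF requirement.

```python
# Example Driver weights copied from the table above (each column sums to 100).
WEIGHTS = {
    "unconfined": {
        "Flow regime": 25, "Sediment regime": 10, "Water quality": 10,
        "Wood Regime": 5, "Riparian Floodplain Condition": 20,
        "Channel Dynamics": 15, "Physical structure": 10, "Aquatic life": 5,
    },
    "partially confined": {
        "Flow regime": 25, "Sediment regime": 10, "Water quality": 10,
        "Wood Regime": 5, "Riparian Floodplain Condition": 15,
        "Channel Dynamics": 15, "Physical structure": 15, "Aquatic life": 5,
    },
    "confined": {
        "Flow regime": 30, "Sediment regime": 10, "Water quality": 10,
        "Wood Regime": 10, "Riparian Floodplain Condition": 5,
        "Channel Dynamics": 10, "Physical structure": 20, "Aquatic life": 5,
    },
}


def weighted_overall(driver_scores, confinement):
    """Weighted average of Driver scores for one reach.

    Assumption: Drivers without a score are skipped and the remaining
    weights are renormalized so they still sum to 100%.
    """
    weights = WEIGHTS[confinement]
    used = {d: w for d, w in weights.items() if d in driver_scores}
    total = sum(used.values())
    if total == 0:
        raise ValueError("no scored Drivers match the weight table")
    return sum(driver_scores[d] * w for d, w in used.items()) / total


# Hypothetical Driver scores on a 0-4 scale for one reach.
scores = {
    "Flow regime": 3.0, "Sediment regime": 2.0, "Water quality": 3.0,
    "Wood Regime": 1.0, "Riparian Floodplain Condition": 1.0,
    "Channel Dynamics": 2.0, "Physical structure": 2.0, "Aquatic life": 3.0,
}

unconfined_score = weighted_overall(scores, "unconfined")
confined_score = weighted_overall(scores, "confined")
```

For these made-up scores the overall result is 2.15 with the unconfined weights and 2.30 with the confined weights, since the poorly scoring riparian Driver carries less weight in a confined setting.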

Deciding how to implement a weighted average roll-up across space requires deliberate and careful decision-making. In some cases, stakeholders may prefer to complete weighted roll-up scores on individual reaches first, then average these weighted overall scores from multiple reaches to provide a coarser spatial characterization of overall river health. In other cases, stakeholders may prefer to compute weighted averages for individual Drivers and/or Components across reaches first, then perform the weighted spatial roll-up of scores across reaches. Each approach is acceptable but can deliver drastically different outcomes. Stakeholders and Technical Teams must think carefully about the consequences of any selected roll-up method, prior to its application.
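A small numerical sketch can show why the order of operations matters. When the same Driver weights are used on every reach, the two orders are algebraically identical, because both are linear averages. Differences appear when weights vary by reach (for example, by valley confinement) or when scores are rounded to letter grades between steps. The weights, scores, reach names, and the choice of which weight set to apply in the second approach are all hypothetical.

```python
DRIVERS = ["Flow regime", "Riparian Floodplain Condition"]

# Hypothetical scores (0-4 scale) on two reaches of equal length.
scores = {
    "reach_A": {"Flow regime": 4.0, "Riparian Floodplain Condition": 0.0},
    "reach_B": {"Flow regime": 0.0, "Riparian Floodplain Condition": 4.0},
}

# Hypothetical reach-specific weights, expressed as fractions summing to 1.
weights = {
    "reach_A": {"Flow regime": 0.5, "Riparian Floodplain Condition": 0.5},
    "reach_B": {"Flow regime": 0.9, "Riparian Floodplain Condition": 0.1},
}


def weighted(driver_scores, driver_weights):
    """Weighted sum of Driver scores (weights assumed to sum to 1)."""
    return sum(driver_scores[d] * driver_weights[d] for d in DRIVERS)


def average(values):
    values = list(values)
    return sum(values) / len(values)


# Order 1: weighted roll-up on each reach, then average across reaches.
reach_first = average(weighted(scores[r], weights[r]) for r in scores)

# Order 2: average each Driver across reaches, then apply one weight set.
# Assumption: the reach_A weights stand in for the whole segment.
driver_means = {d: average(scores[r][d] for r in scores) for d in DRIVERS}
driver_first = weighted(driver_means, weights["reach_A"])
```

With these values the reach-first result is about 1.2 and the Driver-first result is 2.0, even though the underlying scores are identical. Recording which order was used is therefore part of documenting the roll-up.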

Limiting Factors

A special case of the weighted average roll-up approach involves assigning the lowest score assessed at the Metric or Component level to all levels above it. In essence, this approach assigns 100% of the weight to the lowest score. The limiting factors approach can be applied to the aggregation of scores across space in a similar manner, where the lowest overall river health score computed for a reach is assigned to some larger spatial unit (e.g., drainage or watershed). The limiting factors approach is the most effective at identifying river health issues or concerns, but may give the impression that problematic conditions are more widespread than they are. This strict limiting factor approach can be relaxed by initially setting aside the lowest score, averaging the remaining scores, and then averaging the lowest score with the average of the other scores. The effect is similar to a more typical weighted average but it doesn’t require specification of weights for all reaches, Drivers, Components, and Metrics.
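The strict and relaxed limiting factor calculations described above can be sketched as follows. The function names and the example scores are hypothetical, and returning the single score unchanged when only one score is available is an assumption made for this sketch.

```python
def strict_limiting_factor(scores):
    """Assign the lowest score to the level above (100% weight on the minimum)."""
    return min(scores)


def relaxed_limiting_factor(scores):
    """Set aside the lowest score, average the rest, then average the two.

    The lowest score ends up with half of the total weight, and the
    remaining scores share the other half equally.
    """
    ordered = sorted(scores)
    lowest, rest = ordered[0], ordered[1:]
    if not rest:  # assumption: a single score is returned unchanged
        return lowest
    return (lowest + sum(rest) / len(rest)) / 2


# Hypothetical scores (0-4 scale) for four Components of one Driver.
component_scores = [4.0, 3.0, 2.0, 0.0]

strict = strict_limiting_factor(component_scores)
relaxed = relaxed_limiting_factor(component_scores)
simple = sum(component_scores) / len(component_scores)
```

For these scores the strict result is 0.0, the relaxed result is 1.5, and the simple average is 2.25, so the relaxed method sits between the other two.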

Weight of Evidence

In a very limited number of cases, Technical Teams can contemplate functional conditions assessed at various levels of the Driver → Component → Metric hierarchy or across reaches, apply their domain-specific knowledge and assumptions about stressors to the system, and develop qualitative weight-of-evidence based grades. These grades may be assigned at the Driver level or at the Component level to characterize conditions at a single location. Alternatively, a single functional condition score may be assigned for an entire reach or group of reaches. When and where weight-of-evidence approaches are applied, Technical Teams should carefully document their rationale for assigning any roll-up grades. This roll-up strategy is the most difficult to justify and should always be accompanied by the functional condition scores generated for individual Drivers, Components, and Metrics.

Roll up documentation is critical!

There is no right or wrong way to roll up functional assessment ratings for a given reach or across multiple reaches. Many options are conceivable. It is important to clearly document the rationale and method used.