Salmon in the Loop Complex sociotechnical challenge of counting fish passing through hydroelectric dams, a task the author encountered as a consultant in environmental science. It explains that dam operators must comply with FERC regulations by producing fish count data to prove their operations do not harm endangered fish populations, often using fish ladders and visual counting methods. The author highlights the difficulties of coordinating human-in-the-loop data production and communicating progress to stakeholders in this digitally transforming field. One of the most fascinating problems that a computer scientist may be lucky enough to encounter is a complex sociotechnical problem in a field going through the process of digital transformation. For me, that was fish counting. Recently, I worked as a consultant in a subdomain of environmental science focused on counting fish that pass through large hydroelectric dams. Through this overarching project, I learned about ways to coordinate and manage human-in-the-loop dataset production, as well as the complexities and vagaries of how to think about and share progress with stakeholders. Background Let’s set the stage. Large hydroelectric dams are subject to Environmental Protection Act regulations through the Federal Energy Regulatory Commission FERC . FERC is an independent agency of the United States government that regulates the transmission and wholesale sale of electricity across the United States. The commission has jurisdiction over a wide range of electric power activities and is responsible for issuing licenses and permits for the construction and operation of hydroelectric facilities, including dams. These licenses and permits ensure that hydroelectric facilities are safe and reliable, and that they do not have a negative impact on the environment or other stakeholders. In order to obtain a license or permit from FERC, hydroelectric dam operators must submit detailed plans and studies demonstrating that their facility meets regulations. This process typically involves extensive review and consultation with other agencies and stakeholders. If a hydroelectric facility is found to be in violation of any set standards, FERC is responsible for enforcing compliance with all applicable regulations via sanctions, fines, or lease termination--resulting in a loss of the right to generate power. Hydroelectric dams are essentially giant batteries. They generate power by building up a large reservoir of water on one side and directing that water through turbines in the body of the dam. Typically, a hydroelectric dam requires lots of space to store water on one side of it, which means they tend to be located away from population centers. The conversion process from potential to kinetic energy generates large amounts of electricity, and the amount of pressure and force generated is disruptive to anything that lives in or moves through the waterways—especially fish. It is also worth noting that the waterways were likely disrupted substantially when the dam was built, leading to behavioral or population-level changes in the fish species of the area. This is of great concern to the Pacific Northwest in particular, as hydropower is the predominant power generation means for the region Bonneville Power Administration . Fish populations are constantly moving upstream and downstream and hydropower dams can act as barriers that block their passage, leading to reduced spawning. In light of the risks to fish, hydropower dams are subject to constraints on the amount of power they can generate and must show that they are not killing fish in large numbers or otherwise disrupting the rhythms of their lives, especially because the native salmonid species of the region are already threatened or endangered Salmon Status . To demonstrate compliance with FERC regulations, large hydroelectric dams are required to routinely produce data which shows that their operational activities do not interfere with endangered fish populations in aggregate. Typically, this is done by performing fish passage studies. A fish passage study can be conducted many different ways, but boils down to one primary dataset upon which everything is based: a fish count. Fish are counted as they pass through the hydroelectric dam, using structures like fish ladders to make their way from the reservoir side to the stream side. Fish counts can be conducted visually—-a person trained in fish identification watches the fish pass, incrementing the count as they move upstream. As a fish is counted, observers impart additional classifications outside of species of fish, such as whether there was some kind of obvious illness or injury, if the fish is hatchery-origin or wild, and so on. These differences between fish are subtle and require close monitoring and verification, since the attribute in question a clipped adipose fin, a scratched midsection may only be visible briefly when the fish swims by. As such, fish counting is a specialized job that requires expertise in identifying and classifying different species of fish, as well as knowledge of their life stages and other characteristics. The job is physically demanding, as it typically involves working in remote locations away from city centers, and it can be challenging to perform accurately under the difficult environmental conditions found at hydroelectric dams–poor lighting, unregulated temperatures, and other circumstances inhospitable to humans. These modes of data collection are great, but there are varying degrees of error that could be imparted through their recording. For example, some visual fish counts are documented with pen and paper, leading to incorrect counts through transcription error; or there can be disputes about the classification of a particular species. Different dam operators collect fish counts with varying degrees of granularity some collect hourly, some daily, some monthly and seasonality some collect only during certain migration patterns called “runs” . After collection and validation, organizations correlate this data with operational information produced by the dam in an attempt to see if any activities of the dam have an adverse or beneficial effect on fish populations. Capturing these data piecemeal with different governing standards and levels of detail causes organizations to look for new efficiencies enabled by technology. Enter Computer Vision Some organizations are exploring the use of computer vision and machine learning to significantly automate fish counting. Since dam operators subject to FERC are required to collect fish passage data anyway, and the data were previously produced or encoded in ways that were challenging to work with, an interesting “human-in-the-loop” machine learning system arises. A human-in-the-loop system combines the judgment and expertise of subject-matter expert humans fish biologists with the consistency and reliability of machine learning algorithms, which can help to reduce sources of error and bias in the output dataset used in the machine learning system. For the specific problem of fish counting, this could help to ensure that the system's decisions are informed by the latest scientific understanding of fish taxonomy and conservation goals, and could provide a more balanced and comprehensive approach to species or morphological classification. An algorithmic system could reduce the need for manual data collection and analysis by automating the process of identifying and classifying species, and could provide more timely and accurate information about species' health. Building a computer vision system for a highly-regulated industry, such as hydropower utilities, can be a challenging task due to the need for high accuracy and strict compliance with regulatory standards. The process of building such a system would typically involve several steps: 1. Define the problem space: Before starting to build the system, it is important to clearly define the problem that the system is intended to solve and the goals that it needs to achieve. This initial negotiation process is largely without any defining technical constraints, and is based around the job to that needs to be done by the system: identifying specific tasks that the system needs to perform, such as identification of the species or life stage of a fish. This may be especially challenging in a regulated industry like hydropower, as clients are subject to strict laws and regulations that require them to ensure that any tools or technologies they use are reliable and safe. They may be skeptical of a new machine learning system and may require assurances that it has been thoroughly tested and will not pose any risks to the environment or to through data integrity, algorithmic transparency, and accountability. Once the problem space is defined, more technical decisions can be made about how to implement the solution. For example, if the goal is to estimate population density during high fish passage using behavioral patterns such as schooling, it may make sense to capture and tag live video, to see the ways in which fish move in real time. Alternatively, if the goal is to identify illness or injury in a situation where there are few fish passing, it may make sense to capture still images and tag subsections of them to train a classifier. In a more developed hypothetical example, perhaps dam operators know that the fish ladder only allows fish to pass through it, all other species or natural debris are filtered out, and they want a “best guess” about rare species of fish that pass upstream. It may be sufficient in this case to implement generic video-based object detection to identify that a fish is moving through a scene, take a picture of it at a certain point, and provide that picture to a human to tag with the species. Once tagged, these data can be used to train a classifier which categorizes fish as being the rare species or not. 2. Establish performance goals: The definition of the problem space and the initial suggested process flow should be shared with all stakeholders as an input to the performance goals. This helps ensure all interested parties understand the problem at a high level, and what is possible for a given implementation. Practically, most hydropower utilities are interested in automated fish count solutions that meet an accuracy threshold of 95% as compared to a regular human visual count, but expectations around whether these metrics are achievable and at what part of the production cycle will be a highly negotiated series of points. Establishing these goals is a true sociotechnical problem, as it cannot be done without taking into account both the real-world constraints that limit the data and the system. These constraining factors will be discussed later in the Obstacles section of the paper. 3. Collect and label training data: In order to train a machine learning model to perform the tasks required by the system, it is first necessary to produce a training dataset. Practically, this involves collecting a large number of fish images. The images are annotated with the appropriate species classification labels by a person with expertise in fish classification. The annotated images are then used to train a machine learning model. Through training, the algorithm learns the features characteristic of each subclass of fish and identifies those features to classify fish in new, unseen images. Because the end goal of this system is to minimize the counts that humans have to do, images with a low “confidence score” a metric commonly produced by object-detection models may be flagged for identification and tagging by human reviewers. The more seamless an integration with a production fish counting operation, the better. 4. Select a model: Once the training data has been collected, the next step is to select a suitable machine learning model and train it on the data. This could involve using a supervised learning approach, where the model is trained to recognize the different categories of fish after being shown examples of labeled data. At the time of this writing, deep learning systems based on pretrained models like ImageNet are popular choices. Once trained, the model should be validated against tagged data that it has not seen before and fine-tuned by adjusting the model parameters or refining the training dataset and retraining. 5. Monitor system performance: Once the model has been trained and refined, it can be implemented as p