Research Experience for Undergraduates (REU) Summer 2020: An environmental justice approach to data-intensive lake research

By Jessica Díaz Vázquez

I joined the Data Intensive Landscape Limnology Lab in October 2018 to gain research experience in the general field of ecology. As I learned more about the LAGOS database and the lab's openness to interdisciplinary research, I saw an opportunity to incorporate my interest in environmental justice.

I grew up in Northeast Houston, Texas, in a predominantly Latinx and low-income community that is adjacent to petrochemical plants and oil refineries. Living in a ‘frontline’ or environmental justice community means that the topics of health, racial/ethnic identity, economic status, and natural environment are extremely interconnected. Just like any other community, we love our backyard gardens, neighborhood parks, and local bayous. However, the disproportionate burden of air and water pollution makes outdoor activities much less pleasant and less healthy. Drawing on my lived experiences and as a rising senior in MSU’s Department of Fisheries & Wildlife, I seek to improve wildlife habitat and to expose and correct environmental injustices. I am excited to apply my combined knowledge of fisheries & wildlife and environmental justice through this REU position.

The overall goals of this REU position are to integrate information about lake watersheds and lake water quality with human demographics and to apply an environmental justice lens. I hope to answer the question: Are people and communities within marginalized demographics (e.g., low income, people of color, younger/older people) disproportionately affected by lakes with low water quality and their watersheds?

For my research, I am using lake and watershed data from the LAGOS database that covers the conterminous U.S. Therefore, the human demographic data used must be compatible with this large scale. I am using tract-level data from the 2010 Decennial Census and the American Community Survey (ACS). The main variables that I will focus on for lakes are those that together serve as a measure of water quality: water clarity, phosphorus, and nitrogen. For the human demographic variables, I will choose those of interest in the environmental field, such as median household income, race/ethnicity, population, and sex. Figure 1 is an example of a visual output resulting from linking watersheds and median household income for LAGOS-NE.

Although I expect challenges to arise from working with two unique databases (LAGOS and ACS), I look forward to bringing a new perspective to the research group. Stay tuned for an update at the conclusion of my summer 2020 REU!


Highlight on Research Experiences for Undergraduates (REU): MSU math major applies his skills to data-intensive lake research

By Sam Polus

I was motivated to apply for this particular REU position because, growing up in northern Michigan, I have always been interested in nature and ecology, and I wanted to apply my math degree in areas that would allow me to pursue these interests. It has been an amazing learning opportunity to use what I have been learning in my math, computer science, and statistics classes in areas where I did not expect to apply it. Working in such a diverse research group has helped me greatly in learning how to translate mathematical skills into useful applications.

The project that I mainly focused on over the course of the summer involved classifying the lakes in LAGOS-NE into two categories: natural lakes and reservoirs. Since this involved such a large number of lakes (~50,000), much of my work revolved around training a deep-learning algorithm with the help of a computer science REU student, Laura Danilla. I manually classified a subset of the lakes using GIS layers and satellite imagery. We then used this subset of lakes with confirmed types to train our deep-learning model to identify lake types based solely on the shape of the lake.

Over the course of the summer, I created a training set containing 5,334 lakes, roughly half natural lakes and half reservoirs. Using these lakes of known type, we estimated the performance of our model as we prepared to apply it to all lakes in LAGOS-NE. From this testing, we estimated that our model has around 80%-85% accuracy when determining the type of a given lake in LAGOS-NE. Then, we applied our model to the ~45,000 remaining unclassified lakes in LAGOS-NE and obtained the following results: 63% of these lakes (28,733) are natural lakes, and 37% (16,864) are reservoirs. For reservoirs, the average predictive confidence was 45%, and for natural lakes the average confidence was 61%. This confidence metric is estimated by the model as it determines the type of a lake. The model assigns each lake a probability of belonging to each category (natural lake or reservoir). The confidence metric is the absolute value of the difference between those two probabilities.
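As a small illustration of the confidence metric described above, here is a minimal Python sketch. It is not our actual model code: the function names and the example probabilities are hypothetical, and it assumes the two class probabilities for a lake sum to 1.

```python
def confidence(p_natural: float, p_reservoir: float) -> float:
    """Confidence = absolute difference between the two class probabilities."""
    return abs(p_natural - p_reservoir)


def classify(p_natural: float, p_reservoir: float) -> tuple[str, float]:
    """Pick the more probable category and report the confidence alongside it."""
    label = "natural lake" if p_natural >= p_reservoir else "reservoir"
    return label, confidence(p_natural, p_reservoir)


# Hypothetical model outputs (p_natural, p_reservoir), for illustration only.
examples = [(0.80, 0.20), (0.30, 0.70), (0.55, 0.45)]

for p_nat, p_res in examples:
    label, conf = classify(p_nat, p_res)
    print(f"{label}: confidence {conf:.2f}")
```

Under the sum-to-1 assumption, the confidence equals twice the winning probability minus 1, so a confidence of 0.45 corresponds to a winning probability of about 0.725, and a confidence near 0 means the model considered the two categories almost equally likely.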

Figures 1 and 2 show the distributions of the model’s reservoir predictions and natural lake predictions, respectively.  Note that natural lakes tend to appear in clusters, whereas reservoirs are more evenly distributed.  Also note that regions with many lakes such as Minnesota have high concentrations of both reservoirs and natural lakes.  Both of these results are very promising for our model because they match up well with what we expect.

For instance, we expect natural lakes to be found in clusters created by processes such as glaciation, and we expect reservoirs to be more evenly distributed because they can be built anywhere that water can be pooled. Finally, Figure 3 shows the count of lakes in each category for each state.

Working in this REU position over the summer has been a great experience for me. I’m still working in the lab this semester, hoping to extend my work to the entire conterminous US. After graduating in May 2020, I hope to continue to apply the skills I have developed working in the lab in similar areas. I have become particularly interested in deep-learning algorithms after working so closely with one, and I hope to find a position where I can continue to pursue this interest.