Using machine learning to understand species diversity in tropical forest


Tropical forests have many more species, and many more rare species, than temperate forests. Despite much research, we do not understand the reasons for this latitudinal gradient in species richness, nor how so many rare species survive in the tropical forest community. Understanding the underlying drivers of species diversity and structure in forests is a significant challenge. Overcoming this challenge will help us to predict the impact of climate change, extreme weather events, and pests and disease, and may also inform the potential for forest restoration and sustainable management.
We propose that you investigate the use of different machine learning approaches to help us understand the drivers of the patterns and processes that structure tropical forest, and to improve our understanding of the ecological theories that might explain tropical forest diversity. Machine learning methods have been used to identify areas of disturbance in forests, analyse hyperspectral images taken from aerial surveys of forest canopies and estimate biomass, but as yet these methods have not been applied to study forest community dynamics and forest structure. One successful machine learning method is deep learning. Deep learning models are especially suitable for working with large datasets and are capable of learning visual and temporal hierarchical dependencies and of predicting future patterns. In addition, deep learning provides ways to combine multiple modes of information into a unified model for prediction, optimisation or classification.
At first you will use deep learning techniques to investigate already collected forest census data on tree species identity, location, growth and survival from the Luquillo Forest Dynamics Plot in Puerto Rico. Later there will be an opportunity for field work in Puerto Rico to collect forest data.
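
As a rough illustration of how census records might be prepared for this kind of analysis, the sketch below bins tree records into square quadrats and counts the distinct species in each one. The function name, the 20 m quadrat size and the toy records are assumptions made for illustration only; they are not the Luquillo data format or an agreed project method.

```python
import numpy as np

def species_richness_grid(x, y, species, plot_width, plot_height, quadrat=20.0):
    """Count distinct species in each quadrat of a rectangular plot."""
    n_cols = int(np.ceil(plot_width / quadrat))
    n_rows = int(np.ceil(plot_height / quadrat))
    # Map each tree to its quadrat; clip so trees on the far edge stay inside the grid.
    col = np.clip((np.asarray(x) // quadrat).astype(int), 0, n_cols - 1)
    row = np.clip((np.asarray(y) // quadrat).astype(int), 0, n_rows - 1)
    # A set keeps each (quadrat, species) combination once, however many stems it has.
    seen = set(zip(row.tolist(), col.tolist(), species))
    grid = np.zeros((n_rows, n_cols), dtype=int)
    for r, c, _ in seen:
        grid[r, c] += 1
    return grid

# Hypothetical toy records: coordinates in metres and made-up species codes.
x = [1.0, 5.0, 25.0, 39.9, 3.0]
y = [2.0, 8.0, 10.0, 35.0, 4.0]
species = ["A", "B", "A", "C", "A"]
richness = species_richness_grid(x, y, species, plot_width=40.0, plot_height=40.0)
print(richness)
```

A gridded summary like this is one possible way to turn point records into an image-like array; other representations, such as per-species abundance layers, may suit a given question better.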

The overarching question is: What determines species distribution patterns and dynamics in tropical forests? The focus of the study will be determined by the candidate, through discussions with the supervisors.

As a result of this collaborative project, you will:
1. Understand theories of plant and forest ecology and their application to forest community dynamics.
2. Have practical experience of tropical forest field work and data collection protocols.
3. Develop skills in handling large data sets.
4. Interact with an international community of forest ecologists involved in large forest plots.
5. Understand different machine learning methods and modelling techniques.
6. Undertake research related to the design of advanced deep learning models to discover patterns hidden in the data and automatically search for key drivers of forest dynamics.
7. Apply these models to similar large data repositories of forest census plots.

Image Captions

Tropical forest photograph by Jill Thompson
Convolutions diagram modified from


Initially you will learn about tropical forest ecology and the protocols that have been used to collect the forest tree data. You will also learn the theories that might explain the patterns and processes that drive forest structure, tree species distribution and dynamics.
After training in machine learning techniques and advanced statistics you will apply at least two advanced types of deep learning architecture, such as Convolutional Neural Networks and Generative Adversarial Networks, to multiple datasets of tropical forest plot censuses. The spatial and temporal heterogeneity of tropical forest data makes the mechanisms and drivers of species community dynamics very difficult to understand. In this regard, there is a growing need for advanced analytical and decision-making tools in the forestry domain. After initial trials of deep learning techniques you will identify whether additional forest data are required to improve the forest models, and will have the opportunity to collect the additional data on forest structure in Puerto Rico or another forest site.
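
To give a flavour of the operation at the core of a Convolutional Neural Network, the sketch below slides a small kernel over a gridded map, for example stem counts per quadrat. The grid values and the averaging kernel are made up for illustration. A real network learns its kernels from data and stacks many such layers, so this is only a minimal sketch of a single unpadded convolution step.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` without padding.

    This is cross-correlation (no kernel flip), which is what most
    deep learning libraries compute in their convolution layers.
    """
    image = np.asarray(image, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Weighted sum of the window whose top-left corner is (i, j).
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Hypothetical stem counts in a 3 x 3 block of quadrats.
stem_counts = np.array([[1, 2, 0],
                        [0, 3, 1],
                        [4, 0, 2]])
# A 2 x 2 averaging kernel; a trained network would learn these weights.
mean_kernel = np.full((2, 2), 0.25)
smoothed = conv2d_valid(stem_counts, mean_kernel)
print(smoothed)
```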

The Luquillo Forest Dynamics Plot is part of ForestGeo, a network of large forest plots. Through the network you can obtain data from other forest plots and develop collaborations to compare forest types with different histories and look for common mechanisms.

Project Timeline

Year 1

Literature review to learn the ecological theory involved in tropical forest ecology, and the population models that have been used to understand the species community dynamics. Training in machine learning methods and advanced statistical analytical techniques. Explore deep learning techniques using the census data already available. Visit Puerto Rico to learn tropical forest census protocols and collect data on forest structure.

Year 2

Design multiple deep learning models and train them with the already available forest census data from Puerto Rico. Test the models produced with data from other forest census plots to investigate generality and measure robustness.

Year 3

Repeat forest structure data collection in Puerto Rico to look at temporal relationships and changes in forest structure over time. The forest in Puerto Rico is undergoing rapid structural changes while recovering from Hurricane Maria. Integrate the data collected on forest structural change into the initial models. Start thesis preparation.

Year 3.5

Continue thesis preparation and papers for publication. Present the results at national and international conferences.

Training & Skills

The student will benefit from spending time at each project partner location and from interacting with DTP students at IAPETUS conference and training events. The student will experience field work in Puerto Rico and gain an appreciation of tropical ecology, the methods for collecting tree data, tree species population modelling, data analysis using deep learning and machine learning techniques, and advanced statistical techniques.

In addition, the student will gain transferable skills through IAPETUS2 DTP core training and courses at CEH and Heriot-Watt University. These skills will include scientific writing, presentation skills, scientific methodologies, statistics and research ethics. There will be opportunities for the student to present their work at national and international conferences.

References & further reading

Chen, L. et al. (2017) Forest tree neighborhoods are structured more by negative conspecific density dependence than by interactions among closely related species. Ecography.
Feng, X. et al. (2017) Improving predictions of tropical forest response to climate change through integration of field studies and ecosystem modelling. Global Change Biology. doi:10.1111/gcb.13863
Hu, W., Huang, Y., Wei, L., Zhang, F. and Li, H. (2015) Deep convolutional neural networks for hyperspectral image classification. Journal of Sensors, 258619.
Johnson, D.J. et al. (2018) Climate sensitive size-dependent survival in tropical trees. Nature Ecology & Evolution, 2, 1436-1442.
Le Cun, Y., Bengio, Y. and Hinton, G. (2015) Deep learning. Nature, 521(7553), 436.
See videos at:

Further Information

Dr Jill Thompson: +44 (0)131 445 8518. Jill is a forest ecologist who has worked with the Luquillo Forest Dynamics plot and ForestGeo network since 1995.

Dr Marta Vallejo: +44 (0)131 451 3081. Marta has expertise in machine learning and predictive modelling and has worked on several research projects in the areas of deep learning for image analysis.

Dr Michael Lones: +44 (0)131 451 8434. Michael has worked in predictive modelling and machine learning for over 15 years, and has particular expertise in the area of biologically-motivated computing models.
