Genomics: Insight

The role of machine learning-based gene identification in increasing native plant resistance against invasive alien plants.

Research Question: How can machine learning optimize gene selection to increase native plant metabolite production against invasive species?
Invasive Alien Plants
Invasive alien plants (IAPs) disrupt ecosystems by interrupting natural processes and outcompeting existing plants, ultimately leading to acute species composition imbalances in affected regions.1,2,3 A study conducted in the Czech Republic analyzing the effects of 13 invasive plant species on ecosystem health found that all but two of the selected IAPs had ecologically adverse effects, defined as declines in species richness and diversity and shifts in species composition. In the most extreme cases, IAPs contributed to a nearly 90% reduction in the total number of species in a community.4 Indeed, invasive species at large (as opposed to widely established native species) rank far ahead of hunting, harvesting, and agriculture as primary drivers of plant and animal extinction. Anthropogenic disruptions such as climate variability, land-use changes, and the unintentional or intentional distribution of invasive species will continue to exacerbate this environmental damage.5
IAPs also harm human health, especially in communities with limited access to chemical and non-chemical control methods.6 They create favorable microhabitats for pests and pathogen vectors, increasing vector survival and longevity even in normally unsuitable areas and allowing pathogens to reach a broader range and higher density than would otherwise be possible.7 In one habitat manipulation study in Mali, removing the flowering branches of the invasive tree Neltuma juliflora (formerly Prosopis juliflora) was followed by a 69.4% decline in malaria vector population density, highlighting the role invasive plants can play in increasing disease transmission.8
Moreover, IAPs degrade land and natural resources, damaging farming and fishing industries, destroying livelihoods, and threatening food security. The invasion of water hyacinth (Eichhornia crassipes) in Lake Victoria, for example, rendered fishing in the area impossible, impacting over 3 million jobs and harming Africa’s most valuable inland fishery, worth 1.1 billion USD as of 2021.9,10,11
Allelopathy: A Natural Defense Mechanism
Allelopathy is a well-documented phenomenon in which plants produce biochemicals (allelochemicals) that have positive or negative effects on the growth and survival of neighboring plant species. Stressors such as microbial infection, herbivory, or competition from other plants often trigger their production.12 As a naturally evolved defense mechanism, allelopathy could play a major role in controlling future IAP disruptions.
Allelopathic control mechanisms have been implemented in agricultural contexts for weed suppression with strategies such as intercropping, crop rotation, cover crop integration, mulching, and the use of biochemicals as naturally derived herbicides. Allelopathy is known to be mediated by a number of factors, including specific genomic regions influencing the production of biochemicals.13 However, the potential of genetic engineering to augment allelopathic potential in native plants remains unexplored despite the growing use of CRISPR-Cas9 technologies in gene drive systems and other management strategies.14
Ho et al. (2020) isolated methanol extracts of biochemicals produced by nine common rice crops and assessed their effect on the invasive weed Echinochloa crus-galli, finding that the shoot length of the weed was significantly inhibited (4.63–21.27%) even at the lowest extract concentration of 0.01 g mL−1.15 The study suggests that editing native plant species to increase their resistance to IAPs could be an effective management approach, as even minor differences in a native plant genome conferring a small increase in allelochemical production could have sizeable negative impacts on invasive plant health.
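For readers unfamiliar with how inhibition figures of this kind are typically computed, the short sketch below assumes the common definition of percent inhibition relative to an untreated control. The function name and the shoot lengths are hypothetical illustrations, not methods or values taken from Ho et al.

```python
def percent_inhibition(control_length_cm: float, treated_length_cm: float) -> float:
    """Percent reduction in shoot length relative to an untreated control.

    Assumed definition: (control - treated) / control * 100.
    """
    if control_length_cm <= 0:
        raise ValueError("control length must be positive")
    return (control_length_cm - treated_length_cm) / control_length_cm * 100.0


# Hypothetical shoot lengths, chosen only to illustrate the arithmetic.
control = 10.0  # cm, untreated weed seedlings
treated = 9.0   # cm, seedlings exposed to the extract
print(f"{percent_inhibition(control, treated):.2f}% inhibition")
```

Under this definition, a positive value means the treated shoots were shorter than the control shoots, and zero means no measurable effect.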
Machine Learning and Plant Genomics
Machine learning (ML) technologies are rapidly increasing our ability to extract patterns from large datasets. In studies of plant invasion or allelopathy, where multiple ecological, genomic, proteomic, and metabolomic factors interact, an ML model can integrate these heterogeneous data into a single prediction. Thus, these frameworks have significant potential for advancing our understanding of invasion biology.
Bai et al. (2024) built and trained an ML model to predict the genes responsible for the production of specific biochemicals in the model species Arabidopsis thaliana, concluding that a model based on metabolic gene data, with proteomic and genomic features from three or more species used to calibrate output, provided the most accurate predictions.14 Given the vast quantity of genomic, proteomic, and metabolomic data available, an ML model could be used to efficiently determine which regions of the genome to modify to enhance allelochemical production. Additionally, proteomic and genomic data were the most readily accessible from the literature, further contributing to the model’s scalability across a wide range of biochemicals and plant species.14 These findings suggest that enough relevant data are available to develop an accurate ML model relatively quickly.
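To make the idea of gene prediction concrete, the following is a minimal sketch of a binary classifier that scores candidate genes as likely or unlikely to contribute to the biosynthesis of a metabolite. It is not the model of Bai et al.; it substitutes a plain logistic regression trained by gradient descent, and the two features (a co-expression score and an enzyme-domain flag), the training values, and all function names are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic-regression weights with batch gradient descent."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this gene under the current weights.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict_proba(w, b, x) -> float:
    """Estimated probability that a gene belongs to the pathway."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical training set: [co-expression score, has enzyme domain (0/1)].
X_train = [[0.9, 1], [0.8, 1], [0.7, 0], [0.2, 0], [0.1, 1], [0.3, 0]]
y_train = [1, 1, 1, 0, 0, 0]  # 1 = known pathway gene (invented labels)

w, b = train_logistic(X_train, y_train)
for name, features in {"candidate_1": [0.85, 1], "candidate_2": [0.15, 1]}.items():
    print(name, round(predict_proba(w, b, features), 3))
```

A real pipeline would use far richer multi-omics features and held-out validation; the sketch only shows how labeled genes and numeric features can be turned into a ranked list of candidates.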
Conventional IAP Management Techniques
Common non-chemical strategies, including pulling or cutting invasive plants, are the most frequently used because they are affordable and easy to implement; however, they are time-consuming and labor-intensive, can become costly, and are contested in their effectiveness because resprouting and further invasion due to soil disturbance are common after treatment.16 Chemical control methods are generally more effective than non-chemical approaches in controlling alien plant invasions, but they often contain harmful active ingredients that are associated with health risks and unintended ecological harm.17,18
Emerging Genomics-Based IAP Management Strategies
While genetically editing native plant species to increase their resistance to IAPs has not yet been investigated as a management strategy, there have been significant advances in genetically modifying invasive species to weaken them. The emerging management technique of gene drives uses CRISPR-Cas9 gene editing technology to modify or disable particular genes in a target invasive species in order to suppress its population growth or inhibit its ability to transmit disease (as in the common case of mosquito vectors).19 Although not yet tested on a large scale, gene drives show promise in controlled laboratory settings and represent a crucial step forward in applying genomics to invasive species management.20 ML models have proven effective in predicting and evaluating the effects of these drives on a target population, but they have not been used to identify the specific genomic regions to edit to create the desired effect.21 Further exploration of ML applications to gene drives could continue to improve their effectiveness.
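The reason a gene drive can spread through a population is that heterozygous carriers pass the drive allele to more than half of their offspring. The sketch below illustrates this with a deliberately simplified deterministic model that assumes random mating, no fitness cost, and a single conversion-efficiency parameter; the function and parameter names are hypothetical, and published drive models (including the ML-based framework cited above) are considerably more detailed.

```python
def drive_frequency(p0: float, conversion: float, generations: int) -> list[float]:
    """Drive-allele frequency per generation in a simplified homing-drive model.

    Assumptions (illustrative only): random mating, no fitness cost,
    and heterozygotes transmit the drive with probability (1 + conversion) / 2.
    """
    freqs = [p0]
    p = p0
    for _ in range(generations):
        # Homozygotes (p^2) always transmit the drive; heterozygotes (2p(1-p))
        # transmit it at the biased rate (1 + conversion) / 2.
        p = p * p + 2 * p * (1 - p) * (1 + conversion) / 2
        freqs.append(p)
    return freqs


# Hypothetical release at 1% allele frequency.
mendelian = drive_frequency(0.01, conversion=0.0, generations=20)
drive = drive_frequency(0.01, conversion=0.9, generations=20)
print(f"Mendelian allele after 20 generations: {mendelian[-1]:.3f}")
print(f"Drive allele after 20 generations:     {drive[-1]:.3f}")
```

With a conversion efficiency of zero the allele frequency stays at its starting value, which is ordinary Mendelian inheritance; any positive efficiency makes the frequency climb each generation.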
Proposed Model: A Machine Learning and Allelopathy-Based Approach
We propose a gene modification process to increase native plant allelochemical production and increase resistance to IAP disruption.
The proposed model features a two-layered system architecture. Input variables would be existing data about the native and invasive plant species, in addition to other ecological data such as soil composition, temperature, microbial community makeup, and preexisting level of ecological disturbance.22 In the first layer, the ML program would identify correlations between these input features and metabolomics data (the biochemicals produced and their effects on IAPs) available through publicly accessible databases,22,23 ultimately outputting the degree to which the native plant harms the IAP. In the second layer, the output of the first layer would be paired with proteomic and genomic data available from publicly accessible databases22,24,25 to produce the final output: the most effective genes to target in order to confer increased native plant allelochemical production and resistance to IAPs.
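The data flow of the two layers described above can be sketched as follows. This is a toy illustration under stated assumptions, not an implementation of the proposed model: both layers are hand-written placeholder functions standing in for trained ML components, and every name, weight, compound, gene, and number is hypothetical.

```python
def layer_one(concentrations, potency, disturbance):
    """Layer 1 placeholder: per-compound contribution to IAP suppression.

    concentrations: compound -> amount produced by the native plant
    potency:        compound -> assumed inhibitory effect on the IAP
    disturbance:    0 (intact site) to 1 (heavily disturbed site)
    """
    # Assumption for illustration: disturbance damps the allelopathic effect.
    # A trained model would learn such relationships from data instead.
    damping = 1.0 - 0.5 * disturbance
    return {compound: amount * potency.get(compound, 0.0) * damping
            for compound, amount in concentrations.items()}


def layer_two(contributions, gene_to_compounds):
    """Layer 2 placeholder: rank genes by the suppression linked to their compounds."""
    scores = {gene: sum(contributions.get(c, 0.0) for c in compounds)
              for gene, compounds in gene_to_compounds.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical inputs for one native/invasive pairing at one site.
concentrations = {"compound_X": 2.0, "compound_Y": 1.0}
potency = {"compound_X": 0.8, "compound_Y": 0.2}
gene_to_compounds = {
    "gene_A": ["compound_X"],
    "gene_B": ["compound_Y"],
    "gene_C": ["compound_X", "compound_Y"],
}

contributions = layer_one(concentrations, potency, disturbance=0.2)
for gene, score in layer_two(contributions, gene_to_compounds):
    print(f"{gene}: {score:.2f}")
```

Keeping the layers as separate functions mirrors the proposal: the first layer's harm estimate can be inspected on its own before it is combined with gene-level data to rank editing targets.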
An allelopathic approach holds multiple benefits compared to conventional control methods. It requires little manual labor and is thus cost-effective and efficient. Furthermore, our understanding of the plant genome allows for precise modification of select plant species’ allelopathic output without any further intervention, a scalable design suitable for delicate ecosystems where repeated management visits are not an option. Ecological damage is also minimized in two ways. First, the locus of action in the proposed approach is the genome of the native species rather than that of the invasive species. Gene drive systems act on the genome of the invasive species, and because drive traits are optimized to proliferate throughout a population, detrimental traits could spread to and decimate non-target species. The proposed approach limits this potential environmental damage. Second, the control agents are produced by the plants themselves, so all management comes from within the ecosystem. In contrast, externally derived compounds such as synthetic herbicides can pose a higher risk of ecological damage.
Future Directions
We believe that this approach could be developed into a broadly applicable tool for managing IAPs worldwide.
References
1. Kumar Rai P, Singh JS. Invasive alien plant species: Their impact on environment, ecosystem services and human health. Ecol Indic. 2020 Apr;111:106020. doi: 10.1016/j.ecolind.2019.106020. Epub 2020 Jan 9. PMID: 32372880; PMCID: PMC7194640.
2. Schirmel J, Bundschuh M, Entling MH, Kowarik I, Buchholz S. Impacts of invasive plants on resident animals across ecosystems, taxa, and feeding types: a global assessment. Glob Chang Biol. 2016 Feb;22(2):594-603. doi: 10.1111/gcb.13093. Epub 2015 Dec 11. PMID: 26390918.
3. Gioria M, Hulme PE, Richardson DM, Pyšek P. Why Are Invasive Plants Successful? Annu Rev Plant Biol. 2023 May 22;74:635-670. doi: 10.1146/annurev-arplant-070522-071021. Epub 2023 Feb 7. PMID: 36750415.
4. Hejda, M., Pyšek, P. and Jarošík, V. (2009), Impact of invasive plants on the species richness, diversity and composition of invaded communities. Journal of Ecology, 97: 393-403. https://doi.org/10.1111/j.1365-2745.2009.01480.x
5. Blackburn, T. M., Bellard, C., & Ricciardi, A. (2019). Alien versus native species as drivers of recent extinctions. Frontiers in Ecology and the Environment, 17(4), 203–207. https://doi.org/10.1002/fee.2020
6. Pyšek P, Hulme PE, Simberloff D, Bacher S, Blackburn TM, Carlton JT, Dawson W, Essl F, Foxcroft LC, Genovesi P, Jeschke JM, Kühn I, Liebhold AM, Mandrak NE, Meyerson LA, Pauchard A, Pergl J, Roy HE, Seebens H, van Kleunen M, Vilà M, Wingfield MJ, Richardson DM. Scientists' warning on invasive alien species. Biol Rev Camb Philos Soc. 2020 Dec;95(6):1511-1534. doi: 10.1111/brv.12627. Epub 2020 Jun 25. PMID: 32588508; PMCID: PMC7687187.
7. Agha SB, Alvarez M, Becker M, Fèvre EM, Junglen S, Borgemeister C. Invasive Alien Plants in Africa and the Potential Emergence of Mosquito-Borne Arboviral Diseases-A Review and Research Outlook. Viruses. 2020 Dec 27;13(1):32. doi: 10.3390/v13010032. PMID: 33375455; PMCID: PMC7823977.
8. Muller, G.C., Junnila, A., Traore, M.M. et al. The invasive shrub Prosopis juliflora enhances the malaria parasite transmission capacity of Anopheles mosquitoes: a habitat manipulation experiment. Malar J 16, 237 (2017). https://doi.org/10.1186/s12936-017-1878-9
9. Nyamweya, C., Lawrence, T. J., Ajode, M. Z., Smith, S., Achieng, A. O., Barasa, J. E., Masese, F. O., Taabu-Munyaho, A., Mahongo, S., Kayanda, R., Rukunya, E., Kisaka, L., Manyala, J., Medard, M., Otoung, S., Mrosso, H., Sekadende, B., Walakira, J., Mbabazi, S., Kishe, M., Shoko, A., Dadi, T., Gemmell, A., & Nkalubo, W. (2023). Lake Victoria: Overview of research needs and the way forward. Journal of Great Lakes Research, 49(6), 102211. https://doi.org/10.1016/j.jglr.2023.06.009
10. Cho Mujingni, J. T. (2012). Quantification of the impacts of water hyacinth on riparian communities in Cameroon and assessment of an appropriate method of control: The case of the Wouri River Basin (Master’s thesis). World Maritime University. https://commons.wmu.se/all_dissertations/29/
11. Nakiyende, H., Basooma, A., Upendo, H., Nsinda, P., Bakunda, A., Kibona, O., Mrosso, H., & Kayanda, R. (2022). Regional catch assessment survey synthesis report for Lake Victoria June 2005 to June 2021 [Report]. Lake Victoria Fisheries Organization. https://lvfo.org/sites/default/files/field/JUNE%202021%20REGIONAL%20CAS%20REPORT_FINAL.pdf
12. Khamare Y, Chen J and Marble SC (2022) Allelopathy and its application as a weed management tool: A review. Front. Plant Sci. 13:1034649. doi: 10.3389/fpls.2022.1034649
13. Aci, M.M.; Sidari, R.; Araniti, F.; Lupini, A. Emerging Trends in Allelopathy: A Genetic Perspective for Sustainable Agriculture. Agronomy 2022, 12, 2043. https://doi.org/10.3390/agronomy12092043
14. Bai W, Li C, Li W, Wang H, Han X, Wang P, Wang L. Machine learning assists prediction of genes responsible for plant specialized metabolite biosynthesis by integrating multi-omics data. BMC Genomics. 2024 Apr 29;25(1):418. doi: 10.1186/s12864-024-10258-6. PMID: 38679745; PMCID: PMC11057162.
15. Ho TL, Nguyen TTC, Vu DC, Nguyen NY, Nguyen TTT, Phong TNH, Nguyen CT, Lin CH, Lei Z, Sumner LW, Le VV. Allelopathic Potential of Rice and Identification of Published Allelochemicals by Cloud-Based Metabolomics Platform. Metabolites. 2020 Jun 15;10(6):244. doi: 10.3390/metabo10060244. PMID: 32549240; PMCID: PMC7344986.
16. Manning, S., & Miller, J. (2011). Manual, mechanical, and cultural control methods and tools. In R. Westbrooks et al. (Eds.), Invasive plant management issues and challenges in the United States: 2011 overview (ACS Symposium Series 1037). American Chemical Society. https://www.srs.fs.usda.gov/pubs/ja/2011/ja_2011_manning_002.pdf
17. Mohd Ghazi R, Nik Yusoff NR, Abdul Halim NS, Wahab IRA, Ab Latif N, Hasmoni SH, Ahmad Zaini MA, Zakaria ZA. Health effects of herbicides and its current removal strategies. Bioengineered. 2023 Dec;14(1):2259526. doi: 10.1080/21655979.2023.2259526. Epub 2023 Sep 25. PMID: 37747278; PMCID: PMC10761135.
18. Zaller, J., Heigl, F., Ruess, L. et al. Glyphosate herbicide affects belowground interactions between earthworms and symbiotic mycorrhizal fungi in a model ecosystem. Sci Rep 4, 5634 (2014). https://doi.org/10.1038/srep05634
19. Naidoo, K., Oliver, S.V. Gene drives: an alternative approach to malaria control?. Gene Ther 32, 25–37 (2025). https://doi.org/10.1038/s41434-024-00468-8
20. Bier E. Gene drives gaining speed. Nat Rev Genet. 2022 Jan;23(1):5-22. doi: 10.1038/s41576-021-00386-0. Epub 2021 Aug 6. PMID: 34363067; PMCID: PMC8344398.
21. Champer SE, Oakes N, Sharma R, García-Díaz P, Champer J, et al. (2021) Modeling CRISPR gene drives for suppression of invasive rodents using a supervised machine learning framework. PLOS Computational Biology 17(12): e1009660. https://doi.org/10.1371/journal.pcbi.1009660
22. https://www.knapsackfamily.com/KNApSAcK_Family/ (database)
23. https://metacyc.org/ (database)
24. https://www.ncbi.nlm.nih.gov/gene/ (database)
25. https://plants.ensembl.org/index.html (database)
About the Author

A student at Polytechnic School, Matteo is fascinated by many areas of the life sciences such as microbiology, plant biology, ecology, and biochemistry. Outside of school, he enjoys playing the clarinet, running, and hiking.
Mentor: Dr. Balakrishnan Selvakumar
Affiliation: Polytechnic School