Learning nature’s coding language

Here lies the great tome of life: DNA. A massive book of instructions for everything an organism needs to exist and continue living, be it a plant, animal, fungus or bacterium. Every cell in an organism contains a book of its own. Each of them carries the same set of instructions. Instructions that tell the cell how to make tools (proteins). Tools to build itself, from the walls of the cell to the transport within it. Tools that become the building material itself. Tools to gain the energy required for everything. Tools to fight off things that would destroy the organism and tools to help it adapt to all the changes in the world around it.

We would like to read this book and understand its secrets, so that we may make people’s lives better, for example by contributing to their food security. We want to figure out which set of instructions, which book, gives us strong, healthy plants and a big harvest. We want to know which parts of a book make the plant strong: which parts let it defend itself against things that want to eat it, which parts help it survive drought, and much more. Once we know, we can pick the best plants or combine the books of two plants, for example through breeding.

The challenge is that nature does not speak our language; it has one of its own. Every book is written in code: a combination of four letters, A, T, G and C (or, speaking chemically, the four bases that appear in DNA: adenine, thymine, guanine and cytosine). That means that to understand the instructions, we have to decode nature’s language. Part of that is understanding which instructions make which tool and what the tool is used for. The other part is understanding the instructions that describe which tools to make at what time.

For example: Why, if every cell of an organism has the same set of instructions, can they look so different? Why can they perform different functions in a body?

The answer is that not every part of the book is read in every cell, and not at every time. There is a multitude of marks on the books which tell the cell where to read or where not to read, sometimes even preventing access to multiple pages (the scientific terms here are: transcription factors, histones and their modifications, and DNA methylation). These marks can change depending on the type of cell, its age or what it senses in the environment around it. This way, cells and organisms can develop and mature, and also respond to opportunities and threats.

While different species usually have vastly different books and every individual has their own personal book, the books of individuals of the same species tend to be similar. For my project, we are working with barley. Like many other plants, the species barley is divided into different varieties. These are sub-groups of the species that are very similar within the group, but differ a little between them. We want to compare the books of different barley varieties and see if differences in their texts cause some of the marks to fall off or be put in completely new places. By creating a list of all the differences and the effect they have on the marks in the books, we want to help find those differences that have a strong effect on how well a plant grows. This new information will hopefully make it easier for plant breeders to breed plants that are fit for any challenges the future may bring.
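For readers who like to see an idea written down as code, the comparison described above can be sketched in a few lines of Python. This is only a toy illustration, not the method used in the project: the two short "books", the variety names and the mark (here a made-up six-letter word, `CACGTG`) are all assumptions chosen for the example, and the texts are assumed to be already aligned and of equal length.

```python
# Toy illustration only: sequences, variety names and the motif are made up.
MOTIF = "CACGTG"  # hypothetical "mark": a short DNA word a protein could bind to


def find_sites(seq, motif=MOTIF):
    """Return every start position at which the motif occurs in seq."""
    return [i for i in range(len(seq) - len(motif) + 1)
            if seq[i:i + len(motif)] == motif]


def list_differences(ref, alt):
    """Return (position, ref_letter, alt_letter) for each mismatch.

    Assumes both sequences are already aligned and of equal length.
    """
    if len(ref) != len(alt):
        raise ValueError("sequences must have equal length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(ref, alt)) if a != b]


variety_a = "TTGACACGTGATCC"  # hypothetical text from one barley variety
variety_b = "TTGACACTTGATCC"  # same stretch in another variety, one letter changed

differences = list_differences(variety_a, variety_b)

# Compare where the mark can sit in each book.
sites_a = set(find_sites(variety_a))
sites_b = set(find_sites(variety_b))
lost = sorted(sites_a - sites_b)    # marks present in A but missing in B
gained = sorted(sites_b - sites_a)  # marks that appear only in B

print("differences:", differences)
print("marks lost:", lost, "marks gained:", gained)
```

In this made-up example, a single changed letter is enough to remove the mark from the second variety. Real binding sites are far less clear-cut than an exact word match, which is why measuring the effect of each difference is a research question in its own right.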

Planter’s Punch

Under the heading Planter’s Punch we present each month one special aspect of the CEPLAS research programme. All contributions are prepared by our early career researchers.

About the author

During her Bachelor’s studies in Quantitative Biology at HHU and the University of Cologne, Amelie Kok developed a special interest in bioinformatics, quantitative genetics and plant sciences. She is particularly keen to deepen her knowledge of bioinformatics and apply it to the improvement of crop plants. For her master’s and doctoral theses she is now working on annotating transcription factor binding sites in barley and measuring the influence of genetic variants on transcription factor binding.

Further Reading

Engelhorn J, Snodgrass SJ, Kok A, Seetharam AS, Schneider M, Kiwit T, Singh A, Banf M, Khaipho-Burch M, Runcie DE, Sanchez-Camargo VA, Torres-Rodriguez JV, Sun G, Stam M, Fiorani F, Beier S, Schnable JC, Bass HW, Hufford MB, Stich B, Frommer WB, Ross-Ibarra J, Hartwig T, 2023. Genetic variation at transcription factor binding sites largely explains phenotypic heritability in maize. (accepted) doi: 10.1101/2023.08.08.551183

Savadel SD, Hartwig T, Turpin ZM, Vera DL, Lung PY, Sui X, Blank M, Frommer WB, Dennis JH, Zhang J, Bass HW, 2021. The native cistrome and sequence motif families of the maize ear. PLOS Genetics 17(8): e1009689. doi: 10.1371/journal.pgen.1009689

Liang Z, Myers ZA, Petrella D, Engelhorn J, Hartwig T, Springer NM, 2022. Mapping responsive genomic elements to heat stress in a maize diversity panel. Genome Biology 23(234). doi: 10.1186/s13059-022-02807-7

Ricci WA, Lu Z, Ji L, Marand AP, Ethridge CL, Murphy NG, Noshay JM, Galli M, Mejía-Guerra MK, Colomé-Tatché M, Johannes F, Rowley MJ, Corces VG, Zhai J, Scanlon MJ, Buckler ES, Gallavotti A, Springer NM, Schmitz RJ, Zhang X, 2019. Widespread long-range cis-regulatory elements in the maize genome. Nature Plants 5, 1237-1249. doi: 10.1038/s41477-019-0547-0