A biological sample contains thousands of cells, each of which is unique and can be examined individually. The cells can be classified into clusters based on gene activity. But which genes are distinctive to a given cluster, i.e. what are its “marker genes”? A new statistical method known as Association Plots aids the identification and analysis of these marker genes.

Figure: Genes are represented by dots in this Association Plot of a cell cluster generated with the APL package. The strength of the gene-cluster association is represented by the color scale on the right side. The red genes are the most cluster-specific. © MPIMG / E. Gralinska

Which genes are specific to a cell type and thus “mark” its identity? With the increasing size of today's datasets, this question is often difficult to answer. In practice, marker genes are often simply genes that have been found in specific cell populations. However, many more genes may be specific to a cell type but have yet to be discovered.

“Association Plots (APL),” a new statistical method for visualizing gene activity within a cell cluster, facilitates the identification of its marker genes. The plots compare the activity of genes in a given cluster to the activity of genes in all other clusters in the data set. They also make it simple to see which genes are shared by other clusters.

“Association Plots allow us to discover new marker genes. It also works the other way around: we can match clusters of unknown identity in a dataset to cell types using a list of marker genes provided,” says Elzbieta Gralinska of the Max Planck Institute for Molecular Genetics in Berlin.

The biotechnologist is part of Martin Vingron’s team, which developed the technique, tested it on two publicly available datasets, and published the results. Furthermore, APL has been made available as a free module for the statistical environment R. The APL package allows researchers to visually inspect their single-cell data and use the cursor to select individual genes to learn more in-depth details.

Single-cell analysis and grouping

What is the point of identifying marker genes in the first place? Individual RNA molecules in individual cells can be deciphered using modern sequencing technologies. Each cell in a blood sample, for example, can be separated and a sample of the cell’s RNAs decoded. These data from single cells represent active genes that were transcribed into RNA molecules.

The benefit is that instead of wondering which cell type a specific RNA belongs to, it can be traced back to its cell of origin. The disadvantage is that sequencing thousands of RNAs in each of tens of thousands of cells generates massive amounts of data.

One solution is to sort the cells according to their RNA content. “Single-cell data are made up of a diverse range of cell types. We’re looking for cells of the same type, which should all behave similarly,” Martin Vingron explains. As a result, he believes it makes sense to group similar cells computationally. “Marker genes define a cell type for us.”

Interactively explore cell clusters

The team demonstrated how the new algorithm works using publicly available data from white blood cells. The various types of white blood cells, such as T-cells, B-cells, and monocytes, are organized into distinct clusters. The researchers confirmed known marker genes and demonstrated that close relatives among blood cells have a high degree of gene activity similarity.

Unlike a plain list of marker genes, the new method allows her to visualize these genes, click on each one, and examine its activity in greater detail, Gralinska says. “We’re not just providing lists of marker genes; we’re also allowing users to examine how these genes function,” the researcher explains. “They can dive into their data with Association Plots to learn more about each cell type.” Furthermore, she says that it is very simple to work out the biological role of the most interesting genes in a subsequent step using Gene Ontology term enrichment analysis, which is compatible with the APL software and which she calls a “very useful feature.”
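
To give a feel for what a Gene Ontology enrichment step asks, here is a minimal Python sketch of one common approach, a one-sided hypergeometric over-representation test. The article does not say which test is used alongside APL, so the choice of test, the function name `enrichment_p_value`, and the placeholder gene names are all assumptions made for illustration.

```python
from math import comb

def enrichment_p_value(n_background, n_annotated, n_selected, n_overlap):
    """Probability of seeing at least n_overlap annotated genes among
    n_selected genes drawn at random from n_background genes, of which
    n_annotated carry the term (one-sided hypergeometric test)."""
    total = comb(n_background, n_selected)
    upper = min(n_annotated, n_selected)
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Toy data with placeholder names (hypothetical, not real annotations).
background = {f"gene_{i}" for i in range(20)}           # all measured genes
term_genes = {f"gene_{i}" for i in range(5)}            # genes carrying one GO term
marker_genes = {"gene_0", "gene_1", "gene_2", "gene_10"}  # genes picked from a plot

overlap = marker_genes & term_genes
p_value = enrichment_p_value(
    len(background), len(term_genes), len(marker_genes), len(overlap)
)
print(f"{len(overlap)} of {len(marker_genes)} marker genes carry the term, p = {p_value:.4f}")
```

A small p-value suggests that the term appears among the selected marker genes more often than chance would explain. In practice, many terms are tested at once, so the p-values would also need a multiple-testing correction.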

The mathematical model that underpins everything

High-dimensional data containing information on gene activity cannot be represented visually without information loss. The same holds true for clustered data, further complicating analysis. “Our trick is that we take into account many more dimensions than just two or three dimensions,” Gralinska explains.

The Association Plots are derived from a mathematical technique that embeds both genes and cells in a common, high-dimensional space at the same time. Measuring the distances between genes and a given cell cluster in this space yields pairs of values that reflect a gene’s association with a given cluster while also providing insights into its association with other clusters.
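
The idea can be sketched in a few lines of Python. The article does not name the embedding technique or give the exact formula for the value pairs, so this sketch makes two assumptions: it uses correspondence analysis (computed with an SVD) as one technique that embeds rows and columns jointly, and it takes the pair for each gene to be its projection onto the direction of the cluster centroid together with its distance from that direction. The function names and the toy count matrix are hypothetical, and this is not the APL package's implementation.

```python
import numpy as np

def correspondence_embedding(counts, n_dims):
    """Embed genes (rows) and cells (columns) in one space.
    Assumption: correspondence analysis serves as the joint embedding."""
    P = counts / counts.sum()
    r = P.sum(axis=1)                      # gene masses
    c = P.sum(axis=0)                      # cell masses
    expected = np.outer(r, c)
    S = (P - expected) / np.sqrt(expected)  # standardized residuals
    U, sing, Vt = np.linalg.svd(S, full_matrices=False)
    U, sing, Vt = U[:, :n_dims], sing[:n_dims], Vt[:n_dims]
    genes = (U / np.sqrt(r)[:, None]) * sing   # principal coordinates of genes
    cells = Vt.T / np.sqrt(c)[:, None]         # standard coordinates of cells
    return genes, cells

def association_coords(genes, cells, cluster_idx):
    """Return one (x, y) pair per gene for the given cluster of cells.
    x: projection of the gene onto the cluster-centroid direction.
    y: distance of the gene from that direction."""
    centroid = cells[cluster_idx].mean(axis=0)
    direction = centroid / np.linalg.norm(centroid)
    x = genes @ direction
    y = np.linalg.norm(genes - np.outer(x, direction), axis=1)
    return x, y

# Toy counts (made up): 6 genes x 8 cells; cells 0-3 form cluster A, cells 4-7 cluster B.
counts = np.array([
    [20, 22, 19, 21,  1,  2,  1,  1],   # active mainly in cluster A
    [18, 17, 20, 19,  2,  1,  1,  2],   # active mainly in cluster A
    [ 1,  2,  1,  1, 20, 21, 19, 22],   # active mainly in cluster B
    [ 2,  1,  2,  1, 18, 20, 19, 17],   # active mainly in cluster B
    [10, 10, 10, 10, 10, 10, 10, 10],   # active everywhere
    [ 9, 11, 10, 10, 10,  9, 11, 10],   # active everywhere
], dtype=float)

genes, cells = correspondence_embedding(counts, n_dims=2)
x, y = association_coords(genes, cells, cluster_idx=[0, 1, 2, 3])
for i, (xi, yi) in enumerate(zip(x, y)):
    print(f"gene {i}: x = {xi:+.3f}, y = {yi:.3f}")
```

Under these assumptions, genes that are active mainly in cluster A get the largest x values, genes active everywhere sit near zero, and genes specific to the other cluster fall on the negative side. A large y value would hint that a gene is also associated with cells outside the chosen cluster.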

“One limitation of APL is that we rely on pre-clustered data, which means we have to rely on other clustering techniques,” says Martin Vingron. “Nonetheless, we hope that our new method will attract a large number of new users. We have discovered that a visual and interactive process simply produces a better analysis.”

Source: Materials provided by Max-Planck-Gesellschaft.

Reference:  DOI: 10.1016/j.jmb.2022.167525
