Bee-swarm plot by cancer

On this page, you can create Bee-swarm plots showing the distribution of values for different genes across the samples of a single cancer type.

As mentioned above, there are three ways to reach the 'Bee-swarm plot_By Cancer' page.

1) From the Navigation bar on the Home page, select “Bee-swarm plot_By Cancer” under “Data Analysis”;

2) Go to the “Data Analysis” page, then to the “Data visualization” area, and select “Bee-swarm plot_By Cancer” under the Bee-swarm plot area;

3) Follow the link in the “Link area” on the Home page to the “Data Analysis” page, then go to the “Data visualization” area and select “Bee-swarm plot_By Cancer” under the Bee-swarm plot area.

The 'Bee-swarm plot_By Cancer' page has five areas:

♦ Navigation bar: Users can switch to other pages through this navigation bar.

♦ Setting area: Users can specify genes, cancer types, data types, cutoff values and other function details here.

♦ Plotting area: The Bee-swarm plot is drawn in this area.

♦ Figure Downloading and DIY area: Users can download the Bee-swarm plot in a chosen format and size. Users can also customize the line color, line shape, marker color and marker shape through the option buttons in this area.

♦ Link area: Necessary links are available for users to switch to other pages or websites.

Note: quick help is available by hovering the mouse over the small question marks beside certain options on this page.

1. It reminds you which kind of Bee-swarm plot you are working on.

2. You can select the data type: mRNA expression or copy number variation.

3. You can select the cancer type of interest from the drop-down cancer list here, one cancer type at a time.

In the TCGA/GDC dataset, non-malignant samples and tumor samples are not always both available for every cancer type, and the available sample types can differ between data types even for the same cancer type. For example, for acute myeloid leukemia (LAML), no non-malignant samples are available for mRNA expression, but both non-malignant and tumor samples are available for copy number variation. A legend symbol is shown before each cancer name to tell you which kinds of samples are available for that cancer type.

⚠: without non-malignant, which means only tumor samples of this cancer type are available for the data type specified in (2).

❌: not available, which means neither tumor samples nor non-malignant samples of this cancer type are available for the data type specified in (2).

In the plotting area, *_T labels groups of tumor samples and *_N labels groups of non-malignant samples.

4. You can specify the genes of interest here by entering gene symbols. Only HUGO (Human Genome Organization) symbols are accepted. To plot multiple genes, separate the gene symbols with a comma and a space, for example: 7:EGFR, 3:TP63…

Note: gene symbols are not case-sensitive. For example, kRAS, kras, KRas and KRAS are all treated as the same gene.
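The input rules above (a comma and a space between symbols, and no case sensitivity) can be sketched in Python as follows. This is only an illustration: the function name parse_gene_symbols is hypothetical, and upper-casing the symbols and dropping repeats are assumed ways of treating differently cased spellings as the same gene. The website's own parsing code is not shown in this document.

```python
def parse_gene_symbols(text):
    """Split a 'GENE1, GENE2' string and normalize case (illustrative only)."""
    symbols = []
    for part in text.split(", "):
        symbol = part.strip().upper()  # assumed normalization: kras -> KRAS
        if symbol and symbol not in symbols:  # skip blanks and repeats
            symbols.append(symbol)
    return symbols


print(parse_gene_symbols("kRAS, egfr, KRas"))  # ['KRAS', 'EGFR']
```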

5. You can select the transformation type, the boxplot and other common statistics by checking one or more of the options. Log: when this option is checked, a log transformation is applied to the data before the Bee-swarm plot is drawn (log2 for mRNA expression values; log2(CNV/2) for CNV (copy number variation) values).
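As a rough illustration of the two log transformations, the sketch below applies log2 to an mRNA expression value and log2(CNV/2) to a copy number value. The function name log_transform and the data-type labels "mrna" and "cnv" are hypothetical. How the website treats zero or negative values is not stated here, so this sketch simply lets Python raise an error for them.

```python
import math


def log_transform(value, data_type):
    """Apply the Log option: log2(x) for mRNA, log2(x / 2) for CNV."""
    if data_type == "mrna":
        return math.log2(value)
    if data_type == "cnv":
        return math.log2(value / 2)  # a copy number of 2 maps to 0
    raise ValueError("data_type must be 'mrna' or 'cnv'")


print(log_transform(8, "mrna"))  # 3.0
print(log_transform(4, "cnv"))   # 1.0
```

With the CNV transformation, a copy number of 2 becomes 0, so gains are positive and losses are negative.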

6. When the samples are available, you can plot Bee-swarms for tumor samples only, non-malignant samples only, or both, by checking the corresponding boxes.

7. When Bee-swarms of more than one group of samples are plotted, an unpaired two-tailed Student’s t-test can be calculated between two groups when you enter their group numbers here. Note that group numbers start from 1 and run from left to right.
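To make the group numbering and the test concrete, here is a minimal Python sketch of the t statistic for an unpaired Student's t-test. It assumes the classical pooled-variance (equal variance) form, which this page does not confirm. The function name student_t and the sample values are made up for illustration. The two-tailed p-value would then be read from a t distribution with the returned degrees of freedom, for example with scipy.stats.ttest_ind if SciPy is available.

```python
import math
import statistics


def student_t(group_a, group_b):
    """Return (t, degrees_of_freedom) for an unpaired, pooled-variance t-test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / std_err
    return t, df


# Hypothetical groups, listed left to right as they would appear in the plot.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
first, second = 1, 2  # group numbers as entered by the user (1-based)
t, df = student_t(groups[first - 1], groups[second - 1])
print(t, df)
```

Because group numbers start at 1, the sketch subtracts 1 before indexing the Python list.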

8. You can enter cutoff values here; the percentages of samples relative to the cutoffs will then be shown in the figure.

♦ For one cutoff value, the percentage of samples with values greater than or equal to (≥) the cutoff and the percentage of samples with values smaller than it will be calculated.

♦ For two cutoff values, if cutoff1 > cutoff2, then

percentages of ≥ cutoff1

percentages of ≥ cutoff2 and < cutoff1

percentages of < cutoff2

will be calculated and shown in the plotting area beside the corresponding bee-swarm plots.
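The cutoff rules above can be sketched in Python as follows. The function name cutoff_percentages and the sample values are hypothetical. Sorting the cutoffs so that the larger one is treated as cutoff1 is an assumption, since this page only describes the case cutoff1 > cutoff2.

```python
def cutoff_percentages(values, cutoffs):
    """Return percentages of values in the bands defined by one or two cutoffs.

    Bands are listed from the highest to the lowest, e.g. for two cutoffs:
    [>= high, >= low and < high, < low].
    """
    bounds = sorted(cutoffs, reverse=True)  # assumed: input order is ignored
    counts = [0] * (len(bounds) + 1)
    for v in values:
        for band, bound in enumerate(bounds):
            if v >= bound:
                counts[band] += 1
                break
        else:
            counts[-1] += 1  # below every cutoff
    return [100.0 * c / len(values) for c in counts]


values = [500, 1000, 1500, 2000, 2500]  # made-up numbers, not real data
print(cutoff_percentages(values, [2000, 1000]))  # [40.0, 40.0, 20.0]
print(cutoff_percentages(values, [1000]))        # [80.0, 20.0]
```

A value equal to a cutoff is counted in the upper band, matching the ≥ rule.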

After setting all the necessary parameters, click the “GO” button at the bottom of this area, and the Bee-swarms will be plotted in the plotting area. There is no limit on the number of genes you can select. However, depending on the configuration of your computer, the internet speed and the amount of data that must be transmitted for plotting (several times the sample size), it may take a while to transmit and load the data. The more genes you plot, the longer the response time.

Bee-swarm figures will be shown in this area.

A toolbar will show up at the top right of this plotting area when Bee-swarm plots are created.

1. Zoom in: Rectangular zoom tool. It lets you select a region to display at full size. After you click this button, the mouse pointer turns into a small cross. Click and hold the left mouse button and drag a rectangle around the part of the plot you want to zoom in on.

2. Zoom out: Click it to step back to the previous zoom level.

3. Restore: Show the plots at their original scale.

4. Save as Image: Click it to switch to an image-saving page, then right-click to save the image. You can also specify the image format and size using the options in the Figure Downloading and DIY area.

5. Data table: If you want to download the sample data in a table, you can click this button.

Then a table containing all the data will appear in the plotting area, as in the following figure. You can select the whole table or any part of it and copy it into a Word or Excel file by clicking and holding the right mouse button as you usually do.

You can scroll down to see the information for other samples. You can also click the “close” button at the bottom left of this page to close the table and return to the default page with the plotting area.

For your convenience, the sample ID and other details of an individual sample are shown when you hover the mouse over the corresponding marker. For example, in the following figure, hovering over a marker brings up an information box showing:

→ First row: the group information of this sample;

→ Second row: x-axis, y-axis, sample ID:

x-axis: a value calculated by a program to separate the samples in a bee-swarm-like layout; it has no biological meaning;

y-axis: the mRNA expression or copy number variation value, depending on which data type you are working on;

sample ID: for the data provided by our website, which were downloaded from the TCGA/GDC public portal, it is the ID given by the GDC portal; for your own data, you can name your own sample IDs.

Therefore, in this example, the sample ID is TCGA-53-7624-01, and its mRNA expression value of EGFR in Glioblastoma multiforme is 19967.61133.
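To show why the x-axis value carries no biological meaning, here is one simple way such a layout could be computed. This is an assumption for illustration only: the program actually used by the website is not described here, and the function name beeswarm_x and the marker diameter are hypothetical. Each point keeps its real y value and is pushed sideways only as far as needed to avoid overlapping the points already placed.

```python
import math


def beeswarm_x(values, diameter=1.0):
    """Return an x offset per value so that markers do not overlap."""
    placed = []  # (x, y) positions of markers placed so far
    xs = [0.0] * len(values)
    for i in sorted(range(len(values)), key=lambda i: values[i]):
        y = values[i]
        step = 0
        while True:
            # Candidate offsets in order: 0, +d, -d, +2d, -2d, ...
            k = (step + 1) // 2
            x = k * diameter if step % 2 == 1 else -k * diameter
            if all(math.hypot(x - px, y - py) >= diameter for px, py in placed):
                break
            step += 1
        placed.append((x, y))
        xs[i] = x
    return xs


print(beeswarm_x([1.0, 1.0, 1.0]))  # [0.0, 1.0, -1.0]
```

Three samples with the same value end up side by side, while their y values stay unchanged.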

You can specify the image format (png or jpg) and the size/dimensions of the image to download.

You can modify the colors, shapes and other details of this figure to suit your own style.

Select the group whose style you want to modify, then modify the color and shape of its markers, box and statistic lines here.

Example 1: Plot a Bee-swarm of the mRNA expression values of EGFR in lung adenocarcinoma tumor samples, calculate the percentage of samples between 1000 and 2000, and change the marker to a red diamond. See below:

In this figure, two cutoff values are given: 1000 and 2000. From the numbers shown in the plot, we can see that for EGFR mRNA expression values in lung adenocarcinoma, the percentages are

≥2000, 21.67%

<2000 and ≥1000, 27.85%

<1000, 50.48%

Example 2: Plot Bee-swarms of the mRNA expression values of the genes EGFR, KRAS, SOX2 and MYC in lung adenocarcinoma tumor samples.

The parameters in the Setting area are shown in the following picture. By default, the shapes, colors and sizes of the markers, boxes and mean lines are the same for all groups.

To make it easier to distinguish the groups at a glance, you can use the Figure DIY options to change the color, size and shape of the markers.

Example 3: Plot Bee-swarms of the mRNA expression values of the genes CPSF3L and KRAS in both lung adenocarcinoma tumor and non-malignant samples, and run the t-test between the CPSF3L and KRAS tumor samples.

The CPSF3L and KRAS tumor samples are the second and fourth groups from the left, respectively. Therefore, enter 2 and 4 in the T-Test input boxes, then click 'GO'. The p-value shown in the figure is 0.567, which means that their mRNA expression values in lung adenocarcinoma tumor samples are not significantly different.

  • beeswarm_plot_by_cancer.txt
  • Last modified: 2019/07/06 12:59
  • (external edit)