Bee-swarm plot_By Gene

On this page, you can create bee-swarm plots showing how the values of a specific gene are distributed across samples of different cancer types.

As mentioned earlier, there are three ways to reach the Bee-swarm plot_By Gene page:

1) In the Navigation bar on the Home page, select “Bee-swarm plot_By Gene” under “Data Analysis”;

2) Go to the “Data Analysis” page, then to the “Data visualization” area, and select “Bee-swarm plot _ By Gene” in the Bee-swarm plot area;

3) Follow the link in the “Link area” on the Home page to the “Data Analysis” page, then go to the “Data visualization” area and select “by gene” at the bottom of the Bee-swarm plot area.

The 'Bee-swarm plot _ By Gene' page has five areas:

→ Navigation bar: You can switch to other pages through this navigation bar.

→ Setting area: You can specify genes, cancer types, data types, cutoff values and other parameters here.

→ Plotting area: The bee-swarm plot is shown in this area.

→ Figure Downloading and DIY area: You can download the bee-swarm plot in a chosen format and size. You can also customize the line color, line shape, marker color, marker shape and other details through the option buttons in this area.

→ Link area: Necessary links are available for you to switch to other pages or websites.

Note: quick help is available by hovering your mouse over the small question marks beside certain options on this page.

1. It reminds you which kind of bee-swarm plot you are working on.

2. You can select the data type: mRNA expression or copy number variation.

3. You can specify a gene of interest here by entering its gene symbol, one gene at a time. Only HUGO (Human Genome Organization) symbols are accepted. For example: 7:EGFR,3:TP63

Note: gene symbols are not case-sensitive. For example, kRAS, kras, KRas and KRAS are all treated as the same gene: KRAS.
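
The case-insensitive matching described in the note can be sketched as follows. This is only an illustration: the function name `normalize_gene_symbol` is hypothetical, and the website's actual normalization is not documented here.

```python
def normalize_gene_symbol(symbol: str) -> str:
    """Trim whitespace and upper-case the symbol so lookups ignore case.

    Assumption: plain upper-casing is enough. Official symbols that contain
    lower-case letters (for example the "orf" symbols) would need extra care.
    """
    return symbol.strip().upper()


# All four spellings from the note collapse to the same key.
for raw in ["kRAS", "kras", "KRas", "KRAS"]:
    print(raw, "->", normalize_gene_symbol(raw))
```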

4. You can select the cancer types of interest from the drop-down cancer list here. Multiple cancer types can be selected.

In the TCGA/GDC dataset, non-malignant samples and tumor samples are not always both available for the same cancer type, and the available sample types can differ between data types even for the same cancer type. For example, for acute myeloid leukemia (LAML), no non-malignant samples are available for mRNA expression, but both non-malignant and tumor samples are available for copy number variation. A marker is added before each cancer name to show which kinds of samples are available for that cancer type:

⚠: 'without non-malignant', which means only tumor samples of this cancer type are available for the data type specified in (2).

❌: 'not available', which means neither tumor samples nor non-malignant samples of this cancer type are available for the data type specified in (2).

In the plotting area, *_T labels groups of tumor samples and *_N labels groups of non-malignant samples.
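
As an illustration of the tumor/non-malignant split, the sample-type code in a TCGA sample ID (the first two digits of the fourth field, as in TCGA-06-0187-01) distinguishes tumor samples (01 to 09) from normal samples (10 to 19). The sketch below builds a *_T or *_N label from that code. The function name `sample_group` and the use of the cancer abbreviation as the label prefix are assumptions. How the website actually assigns groups is not stated on this page.

```python
def sample_group(sample_id: str, cancer: str) -> str:
    """Return a label such as 'GBM_T' or 'GBM_N' for a TCGA sample ID.

    Assumption: the group is derived only from the two-digit sample-type
    code (01-09 tumor, 10-19 normal) in the fourth dash-separated field.
    """
    fields = sample_id.split("-")
    if len(fields) < 4:
        raise ValueError(f"no sample-type field in {sample_id!r}")
    code = int(fields[3][:2])
    if 1 <= code <= 9:
        suffix = "T"
    elif 10 <= code <= 19:
        suffix = "N"
    else:
        raise ValueError(f"sample-type code {code:02d} is neither tumor nor normal")
    return f"{cancer}_{suffix}"


print(sample_group("TCGA-06-0187-01", "GBM"))
```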

5. You can choose a transformation type, a boxplot and other common statistics by checking one or more of the boxes here.

→ Log2: when this box is checked, a log2 transformation is applied to the data before the bee-swarm plot is drawn. For mRNA expression values the transformation is log2(value); for CNV (copy number variation) values it is log2(CNV/2).
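
A minimal sketch of the two transformations described for the Log2 option. The function name `log2_transform` and the `"mRNA"` / `"CNV"` type strings are illustrative assumptions. How the website handles zero or negative values is not stated, so this sketch simply lets `math.log2` raise an error for them.

```python
import math


def log2_transform(value: float, data_type: str) -> float:
    """Apply the transformation described for the Log2 checkbox.

    mRNA expression: log2(value)
    CNV:             log2(value / 2), so a copy number of 2 maps to 0
    """
    if data_type == "mRNA":
        return math.log2(value)
    if data_type == "CNV":
        return math.log2(value / 2)
    raise ValueError(f"unknown data type: {data_type!r}")


print(log2_transform(2, "CNV"))   # copy number 2 -> 0.0
print(log2_transform(4, "CNV"))   # copy number 4 -> 1.0
```

Dividing by 2 before taking the logarithm centers the CNV scale on the usual two copies, so gains appear as positive values and losses as negative values.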

6. When samples are available, you can choose to draw bee-swarm plots for tumor samples only, non-malignant samples only, or both, by checking the corresponding boxes.

7. When bee-swarms of more than one group of samples are plotted, an unpaired two-tailed Student's t-test can be calculated between two groups when you enter their group numbers here. Note that group numbers start from 1 and count from the leftmost group to the right.
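
The test in item 7 can be sketched with the standard library as below. This assumes the classic equal-variance (pooled) form of Student's t-test, which this page does not confirm. The function names are hypothetical, and only the t statistic and degrees of freedom are computed. A two-sided p-value could be obtained from a statistics package, for example `scipy.stats.ttest_ind`.

```python
import math
from statistics import mean, variance


def student_t(a, b):
    """Unpaired Student's t statistic with pooled variance (an assumption)."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t, df


def compare_groups(groups, first, second):
    """Compare two plotted groups by their 1-based position from the left."""
    return student_t(groups[first - 1], groups[second - 1])


# Three groups in left-to-right plotting order; compare group 2 with group 3.
groups = [[5.0, 6.0, 7.0], [1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]
t, df = compare_groups(groups, 2, 3)
print(f"t = {t:.4f}, df = {df}")
```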

8. You can enter cutoff values here; the percentages of samples falling above and below the cutoffs are then shown in the figure.

♦ For one cutoff value, the percentage of samples with values greater than or equal to (≥) the cutoff and the percentage of samples with values smaller than it will be calculated.

♦ For two cutoff values, if cutoff1 > cutoff2, then

∇ percentages of ≥ cutoff1

∇ percentages of ≥ cutoff2 and < cutoff1

∇ percentages of < cutoff2
will be calculated and shown in the plotting area beside the corresponding bee-swarm plots.
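
The cutoff rules above can be sketched as follows. The function name `cutoff_percentages` and the label format are illustrative assumptions, and the sample values are made up for the demonstration.

```python
def cutoff_percentages(values, cutoffs):
    """Return (label, percentage) pairs for one or two cutoff values.

    Samples equal to a cutoff are counted in the upper (>=) bin,
    matching the rules described above.
    """
    if not values:
        raise ValueError("no sample values given")
    if len(cutoffs) not in (1, 2):
        raise ValueError("expected one or two cutoff values")
    n = len(values)
    cuts = sorted(cutoffs, reverse=True)  # larger cutoff first
    if len(cuts) == 1:
        c = cuts[0]
        upper = sum(v >= c for v in values)
        return [(f">= {c}", 100 * upper / n), (f"< {c}", 100 * (n - upper) / n)]
    c1, c2 = cuts
    top = sum(v >= c1 for v in values)
    middle = sum(c2 <= v < c1 for v in values)
    bottom = n - top - middle
    return [
        (f">= {c1}", 100 * top / n),
        (f">= {c2} and < {c1}", 100 * middle / n),
        (f"< {c2}", 100 * bottom / n),
    ]


# Made-up values, with cutoffs 1000 and 2000 given in either order.
for label, pct in cutoff_percentages([500, 1000, 1500, 2000, 2500], [1000, 2000]):
    print(f"{label}: {pct:.2f}%")
```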

After setting all the necessary parameters, click the “GO” button at the bottom of this area and the bee-swarms will be drawn in the plotting area. There is no limit on how many genes you can plot. However, depending on your computer's configuration, your internet speed and the amount of data that must be transmitted for plotting (several times the sample size), it may take a while to transfer and load the data. The more genes you plot, the longer the response time will be.

Bee-swarm figures will be plotted in this area as follows.

A toolbar will show up at the top right of this plotting area when Bee-swarm plots are created.

1. Zoom in: Rectangular zoom-in tool. This tool lets you select a region to display at full size. After you select this button, the mouse pointer turns into a small cross. Click and hold the left mouse button and drag a rectangle around the part of the plot you want to zoom in on.

2. Zoom out: Click it to return to the view from one step earlier.

3. Restore: Show the plots at their original scale.

4. Save as Image: Click it to switch to an image-saving page, then right-click to save the image. You can also specify the image format and size by selecting the options in the Figure Downloading and DIY area.

5. Data table: If you want to download the sample data, you can click this button. Then a table containing all data will show up in the plotting area like this:

The first part of this table shows useful statistics of the samples, followed by the value and sample ID of each individual case in each group. You can select and copy the whole table, or any part of it, into a Word or Excel file by selecting it with the mouse as you usually would.

You can scroll down to see the information for other samples. You can also click the “close” button at the bottom left of this page to close the table and return to the default page with the plotting area.

♦ Vertical zoom: There is a zoom bar at the right edge of the plotting area. Click and hold either of the two buttons on it to zoom in or out vertically.

♦ Horizontal zoom: Scroll the mouse wheel up or down (on an Apple Magic Mouse, slide up or down) to zoom in or out horizontally.

For your convenience, the sample ID and other details of each individual sample are shown when you hover your mouse over the corresponding marker. For example, in the following figure, after the mouse is placed on a marker, a tooltip appears with:

♦ First row: the group message of this sample;

♦ Second row: x-axis, y-axis, sample ID:

Δ x-axis: calculated from a program to separate samples in a bee-swarm like plot, no biological meaning;

Δ y-axis: the mRNA expression or copy number variation value, depending on which data type you are working with;

Δ sample ID: for the data provided by our website, which were downloaded from the TCGA/GDC public portal, the ID was assigned by the GDC portal; for your own data, you can name the sample IDs yourself.

Therefore, in this example, the sample ID is TCGA-06-0187-01, and its EGFR mRNA expression value in glioblastoma multiforme is 86384.59375.
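
To illustrate why the x-axis value carries no biological meaning, here is a minimal sketch of one simple way to spread markers sideways so that they do not overlap. It is not the website's actual layout program, which is not described on this page. The function names and the `diameter` parameter are hypothetical.

```python
def candidate_offsets(diameter):
    """Yield x positions 0, +d, -d, +2d, -2d, ... moving away from the center."""
    yield 0.0
    k = 1
    while True:
        yield k * diameter
        yield -k * diameter
        k += 1


def beeswarm_x(ys, diameter=1.0):
    """Return one x offset per value so that no two markers overlap.

    Markers are placed in order of increasing y; each takes the candidate
    position closest to the center that keeps at least one diameter of
    distance from every marker already placed.
    """
    order = sorted(range(len(ys)), key=lambda i: ys[i])
    xs = [0.0] * len(ys)
    placed = []  # (x, y) of markers already positioned
    for i in order:
        y = ys[i]
        for x in candidate_offsets(diameter):
            if all((x - px) ** 2 + (y - py) ** 2 >= diameter ** 2 for px, py in placed):
                xs[i] = x
                placed.append((x, y))
                break
    return xs


# Three samples with identical values are pushed apart horizontally.
print(beeswarm_x([1.0, 1.0, 1.0]))
```

The y value of each marker is left untouched, so only the vertical position carries the measurement. The horizontal offset exists purely to keep the markers visible.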

You can specify image format (png or jpg) and size/dimensions for the image to download.

You can modify the colors, shapes and other details of this figure to suit your own style.

Select the group whose style you want to modify, then change the color and shape of its markers, box and statistic lines here.

Example 1: Plotting a bee-swarm of EGFR mRNA expression values in lung adenocarcinoma tumor samples and calculating the percentage of samples between 1000 and 2000. See below:

In this figure, two cutoff values are given: 1000 and 2000. According to the numbers shown in the plot, for EGFR mRNA expression values in lung adenocarcinoma, the percentages are

≥2000, 21.67%

<2000 and ≥1000, 27.85%

<1000, 50.48%

Example 2: Plotting bee-swarms of KRAS mRNA expression values in adrenocortical carcinoma, bladder urothelial carcinoma, colon adenocarcinoma, lung adenocarcinoma and lung squamous cell carcinoma tumor samples.

The parameters in the Setting area are shown in the following picture. By default, the shapes, colors and sizes of markers, boxes and mean lines are the same for all groups.

Example 3: Plotting bee-swarms of KRAS mRNA expression values in glioblastoma multiforme, esophageal carcinoma and lung adenocarcinoma tumor samples, and running the t-test between the esophageal carcinoma and lung adenocarcinoma tumor samples.

Because the esophageal carcinoma and lung adenocarcinoma tumor samples are the second and third groups from the left, respectively, enter 2 and 3 in the T-Test input boxes, then click 'GO'.

The p value shown in the figure is 2.56e-6, which indicates that KRAS mRNA expression differs significantly between these two cancer types.

  • Last modified: 2019/07/06 09:01
  • by tongyifan