Help
   
 

This page gives detailed instructions on how to use EpiRegNet.

For the “static” page (this is the main platform of this web server):

1. Input a gene list and set the parameters

  1. Assign a title to your project. It should be alphanumeric and shorter than 10 characters. If left empty, the default name "epiregnet" is used.
  2. If you provide an email address, a notification is sent to it when the project finishes.
  4. Choose one format for gene names. Gene names can be RefSeq Accession Numbers, Official Gene Symbols, Affymetrix Probes, Agilent Probes or Illumina Probes.
  5. Input the data. You can either paste the data or upload a plain text file. Each line should contain one gene and its label, separated by spaces. Labels are chosen by the user to categorize the genes, and any string of characters can serve as a label, such as “-1”, “5”, “3333”, “u” and “Up”. The labels may be identical for all genes or omitted altogether. In that case, the input genes are treated as one group, and an equal number of genes picked at random from the rest of our dataset forms the other group. Example input is available by clicking “Example 1” or “Example 2” (the two case studies in the paper).
  6. The "Label Add Tool" helps you format a gene list quickly. Click it, paste gene names into the "Your Gene List" box, one name per line, fill in the "Label to Add" box with the label to apply to all genes in that list, then click "Add Label" and "Append to the Gene List Box". Repeat until all groups of genes are correctly formatted.
  7. Choose a cell line whose histone modification profile will be analyzed, or upload your own genome-wide histone modification data by clicking "Upload your own data". If your data come from a cell line that already exists in our list, select it, and the system adds the curated histone modification marks of that cell line to the candidate marks. Otherwise, fill in the "Cell line name" field; in that case no marks will be handpicked for display. Information on an uploaded cell line is kept and retrieved the next time the same user analyzes it. Upload one file per mark. A mark name is required, and in the results uploaded marks carry a "_" prefix to distinguish them from curated marks. Each file must be plain text with four columns separated by spaces. Each line describes one region or peak, and the four fields are, in order, chromosome, start point, end point and read count (or peak signal). On every line the start point must be smaller than the end point, and the signal must be greater than zero. Suitable files can be generated from BED files that describe histone modification peaks from ChIP-seq, or from WIG files that count reads in sliding windows after ChIP-seq. Our system uses the human GRCh37/hg19 assembly, so uploaded histone modification peaks or read information must also use GRCh37/hg19 coordinates. If they do not, the "liftOver" tool of the UCSC Genome Browser (http://genome.ucsc.edu/cgi-bin/hgLiftOver) can convert them.


    An example file for uploaded histone modification in pancreas is here:

    chr1 564602 564759 1.75
    chr1 565463 565647 4.35
    chr1 566695 566874 0.49
    chr1 567508 567796 2.02
    chr1 755739 758787 16.55
    chr1 774643 774767 3.78
    chr1 787641 787883 3.74
    chr1 834131 834406 3.77
    chr1 835098 835592 3.1
    chr1 840534 840975 3.52
    ......
  8. Choose a type of transcription factor binding sites (TFBS), either “ChIP-seq” or “Motif”. “ChIP-seq” means that the TFBS data were obtained from ChIP-seq experiments, while “Motif” means that the TFBS peaks are computationally predicted.
  9. Choose the promoter range. The two numbers specify how far the promoter extends around the TSS. For example, “-500 ~ +100” means a 600 bp promoter running from 500 bp upstream of the TSS to 100 bp downstream of the TSS.
  10. Set the p value cutoff. A p value is calculated for each statistical test in the subsequent procedures, and this cutoff is used as the alpha level for those tests. In general, a smaller cutoff leads to more stringent filtering when judging whether the enrichment of a histone modification mark differs among groups of genes.
  11. Set the number of genes shown in the "network" for each histone modification mark. By default, the five genes with the highest scores for a mark are shown.
  13. If you are interested in particular histone modification marks and want to see their test results, click “Handpick histone modification”. All histone modification marks of the chosen cell line are listed, and any number of them can be selected. The selected marks are included and highlighted in the final results, including the network and the heatmap, regardless of their statistical test results.
  15. Click “Submit” to start the analysis or “Reset” to reset all parameters. If a window titled “INFORMATION” pops up after you click “Submit”, the input data may contain errors. If fewer than 20% of the genes are unmatched, the analysis can still continue. If more than 20% are unmatched, you must revise the list before the analysis can start.
  17. After submission, the project is placed in the queue. When it finishes, it moves from “Pending/Running” to “Finished”. Click the project name to view the results. All projects run by a user are listed in “Your Projects”, and the results of past projects can be retrieved by clicking their names. The queue lists all projects run on this web server, but only the user’s own projects link to results.
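
The gene list format and the 20% matching rule from the input steps above can be checked locally before submission. The following Python sketch is only an illustration, not the server's own code: the function names, the `DEFAULT_LABEL` value and the reference gene set are hypothetical, and the behavior at exactly 20% unmatched genes is an assumption because the text does not specify it.

```python
from collections import defaultdict

DEFAULT_LABEL = "input"  # hypothetical label for lines that carry no label


def parse_gene_list(text):
    """Parse 'gene label' lines into {label: [genes]}; the label is optional."""
    groups = defaultdict(list)
    for raw in text.splitlines():
        fields = raw.split()
        if not fields:
            continue  # skip blank lines
        gene = fields[0]
        label = fields[1] if len(fields) > 1 else DEFAULT_LABEL
        groups[label].append(gene)
    return dict(groups)


def unmatched_fraction(genes, known_genes):
    """Fraction of input genes that are absent from the reference gene set."""
    if not genes:
        return 0.0
    missing = [g for g in genes if g not in known_genes]
    return len(missing) / len(genes)


def can_continue(genes, known_genes, limit=0.20):
    """Apply the 20% rule; exactly 20% is treated as a failure (assumption)."""
    return unmatched_fraction(genes, known_genes) < limit


if __name__ == "__main__":
    sample = "TP53 Up\nMYC Up\nGAPDH Down\nNOTAGENE Down\n"
    known = {"TP53", "MYC", "GAPDH"}  # hypothetical reference gene set
    groups = parse_gene_list(sample)
    all_genes = [g for members in groups.values() for g in members]
    print(groups)
    print("analysis can continue:", can_continue(all_genes, known))
```

Grouping by label first mirrors how the labels define the gene categories that are later compared in the enrichment tests.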

2. How to interpret the results

  1. Network
  2. This network shows how several histone modifications regulate gene expression in a synergistic way. Blue nodes are the top ten functional histone modification marks (HMMs) with p values less than alpha. Red nodes are marks handpicked by the user. The size of a mark node is inversely proportional to its p value from the enrichment test. Yellow nodes are the genes most enriched with one or more histone modification marks. The size of a gene node is proportional to the number of marks that regulate it. Blue lines between mark nodes represent correlations between marks; a line is shown only if the absolute value of the correlation coefficient is greater than 0.8. A red line between a mark node and a gene node means that the gene is highly regulated by that mark. Darker and thicker lines denote stronger correlations.

     Clicking any histone modification node opens a pop-up window showing which transcription factors (TFs) are functional. The p value at the top of the table comes from the enrichment test on the chosen mark; it is shown as 1.00E+0 if it is greater than alpha. The second column of the table gives the p value of each TF, indicating whether the TF contributes to differential gene expression. The third column gives the p values and coefficients from the correlation tests between these TFs and the chosen HMM (the coefficient is omitted if the p value is greater than alpha). Row colors denote the correlations: a darker color means a stronger correlation (red for positive, blue for negative).

     Clicking any gene opens a pop-up window that displays, as tracks, the five strongest histone modification signals and all detected transcription factor binding sites in the promoter region, with their p values from enrichment tests. Clicking "Download data" at the bottom of the pop-up window gives access to all data files for this gene; the files are described in the "Download" tab of Project Results. One gene name may correspond to several records in the RefSeq database, so all records are listed in separate tabs. "Target" shows the coordinates of the promoter region. It is followed by the five HMMs for which this gene has the strongest signals. The "Detected TFs" track shows all detectable TF binding sites in this promoter with their p values from the chi-square test or Fisher's exact test. A p value less than alpha suggests that TF binding enrichment differs among the gene categories.
  3. Heatmap
  4. This heatmap is based on correlation tests between histone modification marks. Row names are the full names of all marks, each with an ordinal number. Pink marks are functional ones with p values less than alpha, white marks are nonfunctional, and red marks were handpicked by the user. Clicking a mark shows further information, including the p value of the enrichment test on this mark and the results of the correlation tests between this mark and functional transcription factors (the same information as shown when clicking a mark node in “Network”, except that here all histone modification marks are available).

     Column names are the ordinal numbers of the marks. Each square in the matrix describes the correlation between a pair of histone modification marks. When the mouse hovers over a square, the names of the two marks are highlighted in cyan, and the p value and correlation coefficient are displayed. A darker color means a stronger correlation (red for positive, blue for negative; white means that the pair is treated as uncorrelated because the p value is greater than alpha).

     You can re-filter the heatmap by adjusting the p value cutoff, which serves as the alpha level of the correlation test.
  5. Download
  6. All data produced by the analysis can be downloaded. Each file is described in the “Download” tab.
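
The display rules described above for the network lines and the heatmap squares can be summarized in a few lines of code. This is a minimal Python sketch under stated assumptions: the `MarkPair` type, the function names and the example values are hypothetical, and only the 0.8 threshold and the alpha rule come from the text. It does not reproduce how the server computes the correlations themselves.

```python
from typing import NamedTuple


class MarkPair(NamedTuple):
    mark_a: str
    mark_b: str
    coefficient: float  # correlation coefficient between the two marks
    p_value: float      # p value of the correlation test


def network_edges(pairs, min_abs_coefficient=0.8):
    """Keep mark-mark pairs whose |coefficient| exceeds the display threshold."""
    return [p for p in pairs if abs(p.coefficient) > min_abs_coefficient]


def heatmap_cell(pair, alpha):
    """Classify a heatmap square as 'positive', 'negative' or 'none'."""
    if pair.p_value > alpha or pair.coefficient == 0:
        return "none"  # drawn white: no correlation at this cutoff
    return "positive" if pair.coefficient > 0 else "negative"


if __name__ == "__main__":
    # Illustrative values only, not results from the server.
    pairs = [
        MarkPair("H3K4me3", "H3K27ac", 0.91, 0.001),
        MarkPair("H3K4me3", "H3K27me3", -0.85, 0.004),
        MarkPair("H3K27ac", "H3K36me3", 0.30, 0.2),
    ]
    for pair in network_edges(pairs):
        print("edge:", pair.mark_a, pair.mark_b, pair.coefficient)
    for pair in pairs:
        print(pair.mark_a, pair.mark_b, heatmap_cell(pair, alpha=0.05))
```

Lowering `alpha` in `heatmap_cell` corresponds to re-filtering the heatmap with a stricter p value cutoff: more squares fall into the "none" class.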

3. Instant sample

An example named “static_sample” is available for a quick view of the results. The sample project is included in “Your Projects” automatically when you start the server.

For the “dynamic” page (we have not yet analyzed any data on this platform ourselves, but we hope users will test it with their own data and help us improve it):

1. Input a gene list and set the parameters

  1. Assign a title to your project. It should be alphanumeric and shorter than 10 characters. If left empty, the default name "epiregnet" is used.
  2. If you provide an email address, a notification is sent to it when the project finishes.
  3. Choose one format for gene names. Gene names can be RefSeq Accession Numbers, Official Gene Symbols, Affymetrix Probes, Agilent Probes or Illumina Probes.
  4. Input the data. You can either paste the data or upload a plain text file. Each line should contain one gene name.
  5. Upload genome-wide histone modification data for one mark. Upload one file for each state. Each file must be plain text with four columns separated by spaces. Each line describes one region or peak, and the four fields are, in order, chromosome, start point, end point and read count (or peak signal). On every line the start point must be smaller than the end point, and the signal must be greater than zero. Suitable files can be generated from BED files that describe histone modification peaks from ChIP-seq, or from WIG files that count reads in sliding windows after ChIP-seq. Our system uses the human GRCh37/hg19 assembly, so uploaded histone modification peaks or read information must also use GRCh37/hg19 coordinates. If they do not, the "liftOver" tool of the UCSC Genome Browser (http://genome.ucsc.edu/cgi-bin/hgLiftOver) can convert them.

    An example file for uploaded histone modification signals is here:

    chr1 564602 564759 1.75
    chr1 565463 565647 4.35
    chr1 566695 566874 0.49
    chr1 567508 567796 2.02
    chr1 755739 758787 16.55
    chr1 774643 774767 3.78
    chr1 787641 787883 3.74
    chr1 834131 834406 3.77
    chr1 835098 835592 3.1
    chr1 840534 840975 3.52
    ......
  6. Set the number of direct target genes shown in the resulting "network". At most 20 genes are picked as direct targets.
  7. Click “Submit” to start the analysis or “Reset” to reset all parameters. If a window titled “INFORMATION” pops up after you click “Submit”, the input data may contain errors. If fewer than 20% of the genes are unmatched, the analysis can still continue. If more than 20% are unmatched, you must revise the list before the analysis can start. The content and format of the histone modification signal files are checked in the same way.
  8. After submission, the project is placed in the queue. When it finishes, it moves from “Pending/Running” to “Finished”. Click the project name to view the results. All projects run by a user are listed in “Your Projects”, and the results of past projects can be retrieved by clicking their names. The queue lists all projects run on this web server, but only the user’s own projects link to results.
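
The four-column upload format described in the steps above can be checked locally before uploading. The following Python sketch is not the server's validation code; the function names are hypothetical, and it enforces only the rules stated in the text (four space-separated fields, start smaller than end, signal greater than zero). It does not check that the coordinates are GRCh37/hg19.

```python
def validate_mark_line(line):
    """Return (chrom, start, end, signal) or raise ValueError for a bad line."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}")
    chrom, start_text, end_text, signal_text = fields
    # int() and float() raise ValueError themselves on non-numeric text.
    start, end, signal = int(start_text), int(end_text), float(signal_text)
    if start >= end:
        raise ValueError("start point must be smaller than end point")
    if not signal > 0:
        raise ValueError("signal must be greater than zero")
    return chrom, start, end, signal


def validate_mark_file(lines):
    """Validate every non-blank line; return (records, errors)."""
    records, errors = [], []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        try:
            records.append(validate_mark_line(line))
        except ValueError as exc:
            errors.append((number, str(exc)))
    return records, errors


if __name__ == "__main__":
    sample = [
        "chr1 564602 564759 1.75",
        "chr1 565463 565647 4.35",
        "chr1 566874 566695 0.49",  # deliberately broken: start > end
    ]
    records, errors = validate_mark_file(sample)
    print(len(records), "valid records")
    for number, message in errors:
        print(f"line {number}: {message}")
```

Collecting all errors with their line numbers, rather than stopping at the first one, makes it easier to fix a converted BED or WIG file in a single pass.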

2. How to interpret the results

  1. The result is displayed as a network with the histone modification mark at its hub. Clicking the mark opens a pop-up window showing the numbers of direct and indirect target genes. Direct target genes are the genes whose mark signal in the promoter region changes most dramatically between the two states. They are connected to the mark by lines whose thickness denotes how strongly the promoter mark signal differs between the two states. A direct target gene is colored blue if it encodes a TF, and red otherwise. For each such TF, we search all genes and define those harboring its TFBSs as TF target genes, which are also indirect targets of the histone modification mark. These target genes are likewise connected to the TF. A TF may regulate itself.

     Clicking a direct target gene opens a pop-up window with two histograms of the mark patterns in the gene's promoter region under the two states. One gene name may correspond to several records in the RefSeq database, so all records are listed in separate tabs.

     Clicking a TF-coding gene (blue node) opens a pop-up window with a table of the genes that harbor TFBSs for this TF.
  2. Download
  3. All data produced by the analysis can be downloaded. Each file is described in the “Download” tab.
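
The selection of direct target genes described above can be illustrated with a short Python sketch. It rests on assumptions: the text does not say how the change in promoter signal is measured, so the absolute difference between the two states is used here as a stand-in, and the function name, its default `n` and the example values are hypothetical. Only the cap of 20 genes and the blue/red coloring rule come from the text.

```python
MAX_DIRECT_TARGETS = 20  # upper bound stated in the help text


def pick_direct_targets(signal_a, signal_b, tf_genes, n=5):
    """Rank genes by |signal_a - signal_b| and return the top n with a node color.

    signal_a and signal_b map gene name -> promoter mark signal in each state.
    Using the absolute difference as the measure of change is an assumption.
    """
    n = max(0, min(n, MAX_DIRECT_TARGETS))
    shared = set(signal_a) & set(signal_b)
    # Sort by decreasing change; ties are broken by gene name for stable output.
    ranked = sorted(shared, key=lambda g: (-abs(signal_a[g] - signal_b[g]), g))
    return [
        (gene, abs(signal_a[gene] - signal_b[gene]), "blue" if gene in tf_genes else "red")
        for gene in ranked[:n]
    ]


if __name__ == "__main__":
    # Illustrative values only, not results from the server.
    state_1 = {"MYC": 10.0, "GAPDH": 3.0, "TP53": 1.0, "ACTB": 2.0}
    state_2 = {"MYC": 2.0, "GAPDH": 3.5, "TP53": 6.0, "ACTB": 2.0}
    for gene, change, color in pick_direct_targets(state_1, state_2, {"MYC", "TP53"}, n=3):
        print(gene, change, color)
```

The thickness of the line drawn between the mark and each direct target would then scale with the second element of each tuple.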

3. Instant sample

  1. An example named “dynamic_sample” is available for a quick view of the results. The sample project is included in “Your Projects” automatically when you start the server.