At present, 160 kinds of plants, 195 corresponding genomes, and 255 herbs (different parts from the same plant), together with the related information on plant species, genomes, and herbal medicines were included in TCMPG.
Users can quickly view the list of all plants, genomes and herbs in TCMPG through the "Browse" button on the toolbar.
The detailed information about each plant, genome, and herb was displayed in three types of report pages. Users can visit relevant pages through internal links in the full list page or search results page.
Users can input TCMPG ID, Latin name, herb name, gene ID, GO, KEGG, Pfam, SMART, PANTHER, or functional annotation keywords in the search bar for quick search.
The search page provides two drop-down options and an input box. The first drop-down option contains the Latin names of all species in the database. Users can directly select the species they want to view and click the search button to search. The second drop-down option is the search field, users can select a field, and then enter a keyword in the third input box to perform the search.
Some search examples are provided on the right side of the page, and the word cloud on the page can also be clicked to view.
The search results are displayed in a table, where users can visit the detailed page by the internal links.
TCMPG embedded five analysis tools, including BLAST, JBrowse, SSR Finder, Synteny Viewer, and HmmSearch.
The BLAST was based on the SequenceServer application, where users can search the list of genes of interest.
Learn more about BLAST
NCBI BLAST Home page
3-1-1 Perform
The following shows the interface to run BLAST, users only need four steps to perform a BLAST search.
Step 1. Upload the query sequences
To perform a sequence search, users can paste their sequences in the query region or drag a sequence file (<10Mb)
to the query region. The sequence type (protein/nucleotides) can be automatically detected.
Note: Original sequences or multiple fasta formats are supported when pasting from the clipboard,
but sequences uploaded from a file can only be in fasta format. Learn more about fasta format
here.
Step 2. Select databases
After uploading query sequences, users can select one or more databases to search.
Note: Only one type of database (Nucleotide or Protein) can be selected.
Step 3. Select parameters
The "Advanced parameters" input box allows users to run BLAST with their custom parameters. Users could
click on the “?” button to view the available parameters. If users leave the box empty, the default parameters will be used.
Learn more about BLAST parameters here.
Step 4. Perform BLAST
After completing the previous steps, the BLAST button will automatically change to the corresponding BLAST subroutine
(BLASTn, BLASTx, BLASTp, etc.) depending on the query sequence and the type of database. Click on the BLAST button to perform the
analysis, and users will see a status page like this:
After the BLAST is done, it will automatically lead users to the result page.
3-1-2 Result
Below is an example of the BLAST result page, the result page can be generally divided into 5 sections:
Section A: Category of query sequences
This section indicates the result of each sequence of the query sequence.
Clicking on the query sequence ID will lead users to the details of the query sequence.
Section B: General information
This section contains the general information of the mission, including the version of the program, the
submitting time, the database information, and the parameters used.
Section C: Queries and their top hits: chord diagram
This section is a chord diagram that represents the mapping information between the query sequence and
similar sequences in the database. Hovering over the ribbon will display the identity and E-value of this alignment.
Section D: Download Category
This section allows users to download all or the selected results in different formats to their local machine.
Section E: BLAST hits information
1. Graphical overview of hits
This part shows the BLAST hits for each query sequence. Each bar represents a hit in the database,
and the color of the bar deepens when the hit is stronger. Hovering over the bar displays the sequence
ID and E-value for the hit, and clicking on the bar displays detailed alignment information for the hit.
2. Length distribution of hits
This is a histogram of the length of similar sequences in the database. Hovering over the histogram will
display the ID, E-value, and length of the sequence.
3. Summary table of hits
This is a list view of BLAST hit results, including sequence name, query coverage, total score, E-value,
and identity. Clicking on a sequence name will lead you to a detailed alignment of that hit.
4. Details of each hit
Check the "Select" box so that users can download only the results of the selected records in Section D.
Clicking on "Sequence" will display the detailed sequence. Clicking on "FASTA" and "Alignment" will download
the results for fasta format sequence and the alignment result, respectively. At the bottom is an
overview of the graphs and alignments for this hit.
TCMPG provides JBrowse to visualize genomic data, and there are many tutorials online on how to use JBrowse, such as JBrowse Documentation. We have provided two tracks (reference sequence and genome annotation) in JBrowse.
The SSR Finder page has two parts, the first part is the Web SSR Finder, which identifies SSRs based on the sequences provided by the users, and the second part is all SSRs identified in the CDS sequences of all genomes in this database.
3-3-1 Perform
The following shows the interface to run SSR Finder, users only need three steps to perform an SSR search.
Step 1. Upload the query sequences
To perform the SSR Finder, users need to click on the "Choose File" button and then upload a sequence file (<10Mb).
Note:Sequences file can only be in fasta format. Learn more about fasta format
here.
Step 2. Change parameters
After uploading sequences, users can change one or more parameters to search.
Step 3. Perform SSR Finder
When the above steps are completed, users can click the "start" button to start an analysis.
After the analysis is done, it will lead users to the result page automatically.
3-3-2 Result
Below is an example of the SSR Finder result page. The results include the number, type, sequences, size, and start and end positions of the SSRs in the query sequence.
Displays all the syntenic blocks between two selected genomes in a dotplot and karyotype figure.
3-4-1 Perform
To get the synteny block information, users need to select a query genome (① in the figure below) and a target genome (② in the figure below). Then click on the "Submit" button (③ in the figure below) to get the results.
3-4-2 Result
Below is an example of the Synteny Viewer result page.
The first graph is a dotplot of the collinear genes between the two selected genomes.
The second picture is the karyotype of the two genomes, and the collinear genes are connected by Bezier curves.
TCMPG provides HmmSearch to identify protein domain.
3-5-1 Perform
The following shows the interface to run HmmSearch, users only need four steps to perform an HmmSearch.
Step 1. Select or upload a profile-HMM
To perform the HmmSearch, users need to select a profile-HMM name from the drop down box or click on the "Choose File" button to upload a profile-HMM file (<10Mb).
Note:The profile-HMM file can only be in HMMER3 format. Learn more about HMMER3 format
here.
Step 2. Select or upload a sequence database
After selecting or uploading the profile-HMM, users need to select or upload a sequence database.
Note:The sequences file can only be protein sequence in fasta format. Learn more about fasta format
here.
Step 3. Set cut-offs
Users can change the reporting and inclusion thresholds to control which hits are reported in the results.
Step 4. Perform HmmSearch
When the above steps are completed, users can click the "Submit" button to start an analysis.
After the analysis is done, it will lead users to the result page automatically.
3-5-2 Result
In the results, we show the domain hits table. There is one line for each domain. There may be more than one domain per sequence. The domain table has 18 fields followed by a text target sequence description. Users can mouse over the information icon on the right side of the field to view the full field description.
To better show what is inside TCMPG, we have created some data visualizations. Users can view them through the visualization buttons on the toolbar. It will serve as a starting point for users to explore the database. Above each chart is a short description of how to use the visualization tools.