At present, TCMPG includes 274 plant species, 324 corresponding genomes, 414 herbs (different parts of the same plant count as separate herbs), and 13,809 ingredients, together with related information on the plant species, genomes, herbal medicines, and ingredients.
Users can quickly view the list of all plants, genomes, herbs, and ingredients in TCMPG through the "Browse" button on the toolbar.
Detailed information about each plant, genome, herb, and ingredient is displayed in three types of report pages. Users can reach the relevant pages through internal links on the full list page or the search results page.
Users can enter a TCMPG ID, Latin name, herb name, ingredient name, gene ID, GO, KEGG, Pfam, SMART, or PANTHER term, or a functional annotation keyword in the search bar for a quick search.
The search page provides two drop-down lists and an input box. The first drop-down list contains the Latin names of all species in the database; users can select the species they want to view and click the search button. The second drop-down list sets the search field; users select a field and then enter a keyword in the input box to perform the search.
Some search examples are provided on the right side of the page, and the terms in the word cloud on the page can also be clicked to view the matching entries.
Each search result links to the detail page of the corresponding entry.
TCMPG embeds nine analysis tools: BLAST, JBrowse, SSR Finder, Synteny Viewer, HmmSearch, Heatmap, Primer3, PlantiSMASH, and CRISPRCasFinder.
The BLAST tool is based on the SequenceServer application and lets users search the databases for genes of interest.
Learn more about BLAST: NCBI BLAST, Home page
3-1-1 Perform
The interface for running BLAST is shown below. A BLAST search takes only four steps.
Step 1. Upload the query sequences
To perform a sequence search, users can paste their sequences into the query region or drag a sequence file (<10 MB) onto it. The sequence type (protein or nucleotide) is detected automatically.
Note: Pasted input may be raw sequence or FASTA with one or more records, but sequences uploaded from a file must be in FASTA format. Learn more about FASTA format here.
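The input rules in Step 1 can be checked locally before uploading. The following is a minimal sketch, not the detection logic used by SequenceServer: the function names, the 90% threshold, and the nucleotide alphabet are assumptions chosen for illustration.

```python
from typing import Dict

# Letters treated as nucleotide codes in this sketch (assumption).
NUCLEOTIDE_CHARS = set("ACGTUN")


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {record_id: sequence}; raise ValueError if malformed."""
    records: Dict[str, str] = {}
    current_id = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            fields = line[1:].split()
            if not fields:
                raise ValueError("header line has no identifier")
            current_id = fields[0]  # the ID is the first word after '>'
            if current_id in records:
                raise ValueError(f"duplicate record ID: {current_id}")
            records[current_id] = ""
        elif current_id is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            records[current_id] += line.upper()
    return records


def guess_sequence_type(sequence: str, threshold: float = 0.9) -> str:
    """Return 'nucleotide' if at least `threshold` of the letters are nucleotide codes."""
    letters = [c for c in sequence.upper() if c.isalpha()]
    if not letters:
        raise ValueError("empty sequence")
    share = sum(c in NUCLEOTIDE_CHARS for c in letters) / len(letters)
    return "nucleotide" if share >= threshold else "protein"
```

For example, `parse_fasta(">s1\nACGT\n")` yields one record, and `guess_sequence_type` can then be applied to each sequence to confirm that a file does not mix protein and nucleotide records.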
Step 2. Select databases
After uploading query sequences, users can select one or more databases to
search.
Note: Only one type of database (Nucleotide or Protein) can be selected.
Step 3. Select parameters
The "Advanced parameters" input box allows users to run BLAST with custom parameters. Users can click the "?" button to view the available parameters. If the box is left empty, the default parameters are used.
Learn more about BLAST parameters here.
Step 4. Perform BLAST
Once the previous steps are complete, the BLAST button automatically changes to the corresponding BLAST program (BLASTn, BLASTx, BLASTp, etc.), depending on the type of the query sequence and the type of the database. Click the BLAST button to run the analysis; users will see a status page like this:
When the BLAST search finishes, users are taken to the result page automatically.
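The way the button label follows from the query and database types can be written as a small lookup. This is a sketch of the standard BLAST program choices, not code taken from TCMPG or SequenceServer; the function name and the type labels are assumptions.

```python
def choose_blast_program(query_type: str, database_type: str) -> str:
    """Map (query type, database type) to the usual BLAST program name.

    Both arguments are 'nucleotide' or 'protein'. For a nucleotide query
    against a nucleotide database, tblastx is also possible; this sketch
    returns the more common blastn.
    """
    programs = {
        ("nucleotide", "nucleotide"): "blastn",
        ("protein", "protein"): "blastp",
        ("nucleotide", "protein"): "blastx",   # query is translated
        ("protein", "nucleotide"): "tblastn",  # database is translated
    }
    try:
        return programs[(query_type, database_type)]
    except KeyError:
        raise ValueError(
            f"unsupported combination: {query_type!r} vs {database_type!r}"
        ) from None
```

Because only one database type can be selected per search (see Step 2), a single lookup per job is sufficient.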
3-1-2 Result
Below is an example of the BLAST result page. The page is divided into five sections:
Section A: Category of query sequences
This section lists the query sequences and indicates the result for each one. Clicking on a query sequence ID leads to the details of that query sequence.
Section B: General information
This section contains general information about the job, including the program version, the submission time, the database information, and the parameters used.
Section C: Queries and their top hits: chord diagram
This section is a chord diagram that represents the mapping between the query sequences and similar sequences in the database. Hovering over a ribbon displays the identity and E-value of that alignment.
Section D: Download Category
This section allows users to download all or the selected results in different
formats to their local machine.
Section E: BLAST hits information
1. Graphical overview of hits
This part shows the BLAST hits for each query sequence. Each bar represents a hit in the database, and darker bars indicate stronger hits. Hovering over a bar displays the sequence ID and E-value of the hit, and clicking on a bar displays its detailed alignment information.
2. Length distribution of hits
This is a histogram of the lengths of the similar sequences found in the database. Hovering over a bar of the histogram displays the ID, E-value, and length of the sequence.
3. Summary table of hits
This is a list view of the BLAST hits, including sequence name, query coverage, total score, E-value, and identity. Clicking on a sequence name leads users to the detailed alignment of that hit.
4. Details of each hit
Checking the "Select" box marks a record so that only the selected records are downloaded in Section D. Clicking on "Sequence" displays the full sequence. Clicking on "FASTA" and "Alignment" downloads the sequence in FASTA format and the alignment result, respectively. At the bottom is an overview of the graphs and alignments for this hit.
TCMPG provides JBrowse to visualize genomic data. Many tutorials on how to use JBrowse are available online, such as the JBrowse Documentation. Two tracks are provided in JBrowse: the reference sequence and the genome annotation.
The SSR Finder page has two parts. The first part is the Web SSR Finder, which identifies SSRs in sequences provided by the user. The second part lists all SSRs identified in the CDS and whole-genome sequences of every genome in the database.
3-3-1 Perform
The interface for running SSR Finder is shown below. An SSR search takes only three steps.
Step 1. Upload the query sequences
To run SSR Finder, users need to click the "Choose File" button and upload a sequence file (<10 MB).
Note: The sequence file must be in FASTA format. Learn more about FASTA format here.
Step 2. Change parameters
After uploading sequences, users can change one or more parameters to search.
Step 3. Perform SSR Finder
When the above steps are completed, users can click the "start" button to start the analysis.
When the analysis finishes, users are taken to the result page automatically.
3-3-2 Result
Below is an example of the SSR Finder result page. The results include the number, type, sequences, size, and start and end positions of the SSRs in the query sequence.
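To illustrate what such a result table contains, the sketch below finds perfect SSRs with regular expressions and reports motif, repeat count, and 1-based start and end positions. It is not the algorithm behind the TCMPG SSR Finder; the function name, the `SSR` record, and the minimum repeat thresholds are illustrative assumptions.

```python
import re
from typing import Dict, List, NamedTuple, Optional


class SSR(NamedTuple):
    motif: str    # repeated unit, e.g. "AT"
    repeats: int  # number of consecutive copies
    start: int    # 1-based start position in the sequence
    end: int      # 1-based end position (inclusive)


# Minimum repeat count per motif length (illustrative values, an assumption).
DEFAULT_MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(sequence: str, min_repeats: Optional[Dict[int, int]] = None) -> List[SSR]:
    """Find perfect simple sequence repeats in a DNA sequence."""
    seq = sequence.upper()
    thresholds = min_repeats or DEFAULT_MIN_REPEATS
    hits: List[SSR] = []
    for unit_len, min_rep in sorted(thresholds.items()):
        # A unit of `unit_len` bases followed by at least min_rep - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "ATAT"); the shorter motif length reports those.
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            repeats = len(match.group(0)) // unit_len
            hits.append(SSR(motif, repeats, match.start() + 1, match.end()))
    return sorted(hits, key=lambda s: s.start)
```

Calling `find_ssrs` on each record of a FASTA file gives one list per sequence; compound or imperfect SSRs are outside the scope of this sketch.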
The Synteny Viewer displays all the syntenic blocks between two selected genomes as a dotplot and a karyotype figure.
3-4-1 Perform
To get the synteny block information, users need to select a query genome (① in the figure below) and a target genome (② in the figure below). Then click on the "Submit" button (③ in the figure below) to get the results.
3-4-2 Result
Below is an example of the Synteny Viewer result page.
The first graph is a dotplot of the collinear genes between the two selected genomes.
The second picture is the karyotype of the two genomes, and the collinear genes are connected by Bezier curves.
TCMPG provides HmmSearch to identify protein domains.
3-5-1 Perform
The interface for running HmmSearch is shown below. A search takes only four steps.
Step 1. Select or upload a profile-HMM
To run HmmSearch, users need to select a profile-HMM name from the drop-down box or click the "Choose File" button to upload a profile-HMM file (<10 MB).
Note: The profile-HMM file must be in HMMER3 format. Learn more about HMMER3 format here.
Step 2. Select or upload a sequence database
After selecting or uploading the profile-HMM, users need to select or upload a sequence database.
Note: The sequence file must contain protein sequences in FASTA format. Learn more about FASTA format here.
Step 3. Set cut-offs
Users can change the reporting and inclusion thresholds to control which hits are
reported in the results.
Step 4. Perform HmmSearch
When the above steps are completed, users can click the "Submit" button to start the analysis.
When the analysis finishes, users are taken to the result page automatically.
3-5-2 Result
The results are shown as a domain hits table with one line per domain; a sequence may have more than one domain. The table has 18 fields followed by a free-text description of the target sequence. Users can hover over the information icon on the right side of a field name to view the full field description.
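If the table is saved as text, the layout described above (a fixed number of fields followed by a free-text description) can be split as sketched below. The sketch assumes whitespace-delimited columns and "#" comment lines, as in HMMER tabular output; the function name is hypothetical, and the field count is a parameter taken from the description above rather than from the tool's source.

```python
from typing import List, Tuple


def parse_hit_table(text: str, n_fields: int = 18) -> List[Tuple[List[str], str]]:
    """Split each data line into `n_fields` columns plus a trailing description."""
    rows: List[Tuple[List[str], str]] = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        # Split on whitespace at most n_fields times, so the description
        # (which may itself contain spaces) stays in one piece.
        parts = line.split(None, n_fields)
        if len(parts) < n_fields:
            raise ValueError(f"expected at least {n_fields} fields: {line!r}")
        fields = parts[:n_fields]
        description = parts[n_fields].strip() if len(parts) > n_fields else ""
        rows.append((fields, description))
    return rows
```

Limiting the number of splits is the key design choice here: a plain `line.split()` would break a description such as "protein kinase domain" into separate columns.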
TCMPG provides the Heatmap tool to create heatmaps.
3-6-1 Perform
The interface for running Heatmap is shown below. Plotting a heatmap takes only three steps.
Step 1. Upload an expression matrix file
To run Heatmap, users need to upload an expression matrix file (<10 MB). Grouping information files for the X-axis and Y-axis are optional.
Note: Each file must be a tab-delimited txt file.
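A tab-delimited expression matrix can be checked before upload with a short script. The layout assumed here (a header row of sample names and a first column of gene IDs) is an illustration only; the exact layout expected by the Heatmap tool is not specified above, and the function name is hypothetical.

```python
import csv
import io
from typing import List, Tuple


def read_expression_matrix(text: str) -> Tuple[List[str], List[str], List[List[float]]]:
    """Read a tab-delimited matrix and return (gene_ids, sample_names, values).

    Assumed layout: first row = header with sample names,
    first column = gene IDs, remaining cells = numeric expression values.
    """
    rows = [row for row in csv.reader(io.StringIO(text), delimiter="\t") if row]
    if len(rows) < 2:
        raise ValueError("matrix needs a header row and at least one gene row")
    samples = rows[0][1:]
    genes: List[str] = []
    values: List[List[float]] = []
    for row in rows[1:]:
        if len(row) != len(samples) + 1:
            raise ValueError(f"row {row[0]!r} has {len(row) - 1} values, "
                             f"expected {len(samples)}")
        genes.append(row[0])
        values.append([float(cell) for cell in row[1:]])  # ValueError if not numeric
    return genes, samples, values
```

Running this on the file contents catches ragged rows and non-numeric cells, which are common causes of a failed plot.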
Step 2. Set the plotting parameters
After uploading the expression matrix file, users need to set the plotting parameters.
Step 3. Perform Heatmap
When the above steps are completed, users can click the "Submit" button to plot the heatmap.
When the heatmap is finished, it is displayed in the top right corner of the page.
3-6-2 Result
Users can download the obtained heatmap by clicking on the 'Download' button below the image.
TCMPG provides Primer3 as a simple and reliable way for users to automate primer design.
3-7-1 Perform
The interface for running Primer3 is shown below. Designing primers takes only three steps.
Step 1. Input the query sequence
To run Primer3, users need to input the query sequence.
Step 2. Set the parameters
After entering the query sequence, users need to set the general primer picking conditions.
Step 3. Perform Primer3
When the above steps are completed, users can click the "Submit" button to design PCR primers from the input sequence.
When the analysis finishes, the results are displayed in a table at the bottom of the page.
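Primers taken from the result table can be sanity-checked with two simple quantities, GC content and a rough melting temperature. The sketch below uses the Wallace rule, Tm = 2(A+T) + 4(G+C), which is only a coarse approximation for short oligonucleotides; Primer3 itself uses a more detailed thermodynamic model, so its reported Tm values will differ. The function names are hypothetical.

```python
def _validated(primer: str) -> str:
    """Upper-case the primer and reject anything other than A, C, G, T."""
    seq = primer.strip().upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError(f"primer must be a non-empty A/C/G/T string: {primer!r}")
    return seq


def gc_content(primer: str) -> float:
    """Return the GC content of the primer as a percentage."""
    seq = _validated(primer)
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees Celsius by the Wallace rule."""
    seq = _validated(primer)
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc
```

A forward and reverse primer whose rough Tm values differ widely are worth re-examining in the full Primer3 output before ordering.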
3-7-2 Result
Users can export the results of interest.
TCMPG identified known clusters of secondary metabolic genes in all available chromosome-level genomes using plantiSMASH. Users can click "View details" to view all biosynthetic gene clusters in the genome of interest.
TCMPG uses CRISPRCasFinder to identify CRISPR arrays and Cas proteins in user-submitted sequences.
3-9-1 Perform
The interface for running CRISPRCasFinder is shown below. An analysis takes only three steps.
Step 1. Input the query sequence
To run CRISPRCasFinder, users need to input the query sequence.
Step 2. Set the parameters
After entering the query sequence, users need to set the advanced parameters.
Step 3. Perform CRISPRCasFinder
When the above steps are completed, users can click the "Submit" button to identify CRISPR arrays and Cas proteins in the input sequence.
When the analysis finishes, the results are displayed at the bottom of the page.
3-9-2 Result
Users can view all CRISPR sequences and Cas proteins identified.
To give a better overview of the contents of TCMPG, we have created several data visualizations, which users can open through the visualization buttons on the toolbar. They serve as a starting point for exploring the database. Above each chart is a short description of how to use the visualization tool.