STRING
Content | |
---|---|
Description | Search Tool for the Retrieval of Interacting Genes/Proteins |
Contact | |
Research center | Academic Consortium |
Primary citation | PMID 25352553 |
Access | |
Website | STRING website |
Download URL | url |
Web service URL | rest |
Miscellaneous | |
Version | 10 (January 2015) |
In molecular biology, STRING (Search Tool for the Retrieval of Interacting Genes/Proteins) is a biological database and web resource of known and predicted protein–protein interactions. [1] [2] [3] [4] [5]
The STRING database contains information from numerous sources, including experimental data, computational prediction methods and public text collections. It is freely accessible and regularly updated. The latest version, 10.0, contains information on about 9.6 million proteins from more than 2000 organisms. STRING has been developed by a consortium of academic institutions including CPR, EMBL, KU, SIB, TUD and UZH.
Usage
Protein–protein interaction networks are an important ingredient for the system-level understanding of cellular processes. Such networks can be used for filtering and assessing functional genomics data and for providing an intuitive platform for annotating structural, functional and evolutionary properties of proteins. Exploring the predicted interaction networks can suggest new directions for future experimental research and provide cross-species predictions for efficient interaction mapping. [6]
Features
The data are weighted and integrated, and a confidence score is calculated for every protein interaction. Results of the various computational predictions can be inspected from different designated views. STRING has two modes: protein mode and COG mode. Predicted interactions are propagated, by inference of orthology, to proteins in organisms other than those for which the interaction has been described. A web interface gives access to the data and a fast overview of the proteins and their interactions. A plug-in for Cytoscape that uses STRING data is available. STRING data can also be accessed through the application programming interface (API) by constructing a URL that contains the request.
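As an illustration of constructing such a request URL, the following minimal sketch assembles one in Python without sending it. The base address, the path layout (output format followed by method name), the parameter names `identifiers` and `species`, and the carriage-return separator between identifiers are assumptions based on the commonly documented form of the STRING API; they may differ between versions and should be checked against the current API documentation. The helper name `build_string_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumed base address and path layout; verify against the STRING API docs.
STRING_API_BASE = "https://string-db.org/api"


def build_string_url(method, identifiers, species, output_format="tsv"):
    """Return a request URL for the given method; no network call is made."""
    params = {
        # Assumption: several identifiers are separated by a carriage
        # return, which urlencode percent-encodes as %0D.
        "identifiers": "\r".join(identifiers),
        # Assumption: species is given as an NCBI taxonomy identifier.
        "species": species,
    }
    return f"{STRING_API_BASE}/{output_format}/{method}?{urlencode(params)}"


url = build_string_url("network", ["TP53", "MDM2"], 9606)
print(url)
```

The resulting string can be passed to any HTTP client. Building the query with `urlencode` rather than by string concatenation ensures that separators and special characters in identifiers are escaped correctly.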
Data sources
Like many other databases that store protein association knowledge, STRING imports data from experimentally derived protein–protein interactions through literature curation. STRING also stores computationally predicted interactions from: (i) text mining of scientific texts, (ii) interactions computed from genomic features, and (iii) interactions transferred from model organisms based on orthology. [7]
All predicted or imported interactions are benchmarked against a common reference of functional partnership as annotated by KEGG (Kyoto Encyclopedia of Genes and Genomes).
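Once each evidence type has been benchmarked on a common scale, the per-source scores can be integrated into a single confidence score. The sketch below shows one simple way this might be done, assuming each score is a probability and that the evidence types are independent; this is an illustrative assumption, not necessarily the exact scheme used by STRING, which may apply further corrections. The function name `combine_scores` is hypothetical.

```python
from math import prod


def combine_scores(channel_scores):
    """Combine per-source probabilities, assuming independent evidence.

    The combined score is the probability that at least one source is
    correct: 1 - product(1 - s_i).
    """
    for score in channel_scores:
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score out of range [0, 1]: {score}")
    return 1.0 - prod(1.0 - score for score in channel_scores)


# Two moderately confident sources reinforce each other.
combined = combine_scores([0.5, 0.5])
print(combined)
```

Under this scheme the combined score is never lower than the strongest individual score, so additional evidence can only raise confidence in an interaction.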
Imported data
STRING imports protein association knowledge from databases of physical interaction and databases of curated biological pathway knowledge (MINT, HPRD, BIND, DIP, BioGRID, KEGG, Reactome, IntAct, EcoCyc, NCI-Nature Pathway Interaction Database, GO). Links are supplied to the originating data of the respective experimental repositories and database resources.
Text mining
A large body of scientific texts (SGD, OMIM, FlyBase, PubMed) is parsed to search for statistically relevant co-occurrences of gene names.
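The counting step behind such a co-occurrence search can be sketched as follows. This is a simplified illustration, not the STRING text-mining pipeline: it assumes the text is already split into sentences, matches gene names by exact token comparison, and only produces raw counts. The function name `cooccurrence_counts` and the example sentences are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences, gene_names):
    """Count how often each pair of gene names appears in the same sentence."""
    names = {name.upper() for name in gene_names}
    counts = Counter()
    for sentence in sentences:
        # Tokenize on alphanumeric runs and compare case-insensitively.
        tokens = {tok.upper() for tok in re.findall(r"[A-Za-z0-9]+", sentence)}
        found = sorted(tokens & names)
        # Each unordered pair is stored once, in alphabetical order.
        for pair in combinations(found, 2):
            counts[pair] += 1
    return counts


sentences = [
    "TP53 is negatively regulated by MDM2.",
    "MDM2 binds TP53 and promotes its degradation.",
    "BRCA1 participates in DNA repair.",
]
counts = cooccurrence_counts(sentences, ["TP53", "MDM2", "BRCA1"])
print(counts)
```

To decide whether a pair is statistically relevant, the raw counts would still have to be compared with the counts expected if the two names occurred independently of each other.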
Predicted data
- Neighborhood: Similar genomic context in different species suggests a similar function of the proteins.
- Fusion-fission events: Proteins that are fused in some genomes are very likely to be functionally linked (as in other genomes where the genes are not fused).
- Occurrence: Proteins that have a similar function, or that occur in the same metabolic pathway, must be expressed together and have similar phylogenetic profiles.
- Coexpression: Predicted association between genes based on observed patterns of simultaneous expression of genes.
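Two of the predictors above lend themselves to a short illustration. The sketch below compares phylogenetic profiles with a Jaccard index and measures coexpression with a Pearson correlation coefficient. Both measures are assumptions chosen for illustration; the source does not state which similarity measures STRING uses, and the function names and example data are hypothetical.

```python
from math import sqrt


def profile_similarity(profile_a, profile_b):
    """Jaccard index of two phylogenetic profiles.

    Each profile is the set of species in which the gene is present.
    """
    union = profile_a | profile_b
    if not union:
        return 0.0
    return len(profile_a & profile_b) / len(union)


def pearson(xs, ys):
    """Pearson correlation of two expression vectors of equal length."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


# Genes present in largely the same species have similar profiles.
print(profile_similarity({"E. coli", "B. subtilis", "S. cerevisiae"},
                         {"B. subtilis", "S. cerevisiae", "H. sapiens"}))
# Expression levels that rise and fall together correlate strongly.
print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

In both cases a high value is only a hint of functional association; such raw measures would need to be calibrated against a reference set before being used as confidence scores.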
References
- Jensen LJ, Kuhn M, Stark M, Chaffron S, Creevey C, Muller J, Doerks T, Julien P, Roth A, Simonovic M, Bork P, von Mering C (2009). "STRING 8—a global view on proteins and their functional interactions in 630 organisms". Nucleic Acids Res 37 (Database issue): D412–6. doi:10.1093/nar/gkn760. PMC 2686466. PMID 18940858.
- von Mering C, Jensen LJ, Kuhn M, Chaffron S, Doerks T, Kruger B, Snel B, Bork P (2007). "STRING 7—recent developments in the integration and prediction of protein interactions". Nucleic Acids Res 35 (Database issue): D358–62. doi:10.1093/nar/gkl825. PMC 1669762. PMID 17098935.
- von Mering C, Jensen LJ, Snel B, Hooper SD, Krupp M, Foglierini M, Jouffre N, Huynen MA, Bork P (2005). "STRING: known and predicted protein–protein associations, integrated and transferred across organisms". Nucleic Acids Res 33 (Database issue): D433–7. doi:10.1093/nar/gki005. PMC 539959. PMID 15608232.
- von Mering C, Huynen M, Jaeggi D, Schmidt S, Bork P, Snel B (2003). "STRING: a database of predicted functional associations between proteins". Nucleic Acids Res 31 (1): 258–261. doi:10.1093/nar/gkg034. PMC 165481. PMID 12519996.
- Snel B, Lehmann G, Bork P, Huynen MA (2000). "STRING: a web-server to retrieve and display the repeatedly occurring neighbourhood of a gene". Nucleic Acids Res 28 (18): 3442–4. doi:10.1093/nar/28.18.3442. PMC 110752. PMID 10982861.
- Schwartz AS, Yu J, Gardenour KR, Finley RL Jr, Ideker T (2008). "Cost-effective strategies for completing the interactome". Nature Methods 6 (1): 55–61. doi:10.1038/nmeth.1283. PMC 2613168. PMID 19079254.
- Wodak SJ, Pu S, Vlasblom J, Séraphin B (2009). "Challenges and rewards of interaction proteomics". Mol Cell Proteomics 8 (1): 3–18. doi:10.1074/mcp.R800014-MCP200. PMID 18799807.
External links
- STRING site
- STITCH website, related database on interactions of proteins with small molecules