The TAB-delimited files in the directory TSV are species-specific networks of functional couplings. In other words, they are collections of predicted gene-gene links, where each line corresponds to an edge in the network. The files in the XML directory have identical content, but are XML-formatted.

This file describes the TSV format:
> File name format
> Files' content
> Input datafiles employed for integration
> What is FC class
> What is Cutoff
> What is Confidence score
> Cytoscape users

#################################################
> The names of files with processed output (with Bayesian predictions made) have the format .tsv

#################################################
> The files contain the following information:

Each protein pair was evaluated for functional coupling (FC) in 2 to 4 different FC classes (see below). If at least one class gave a final Bayesian score (FBS) above CUTOFF (see below), the pair was saved.

Column #1, "FBS_max", holds the largest of the class-specific FBS scores; the LAST column names the class that produced it. The class-specific FBS scores follow in the next 2 to 4 columns. Then the protein/gene names are given in two columns, Protein1 and Protein2.

The next 15 columns (header format LLR_*) are the FBS components that might have contributed to FBS_max: the fractions of FBS due to a particular species OR a particular type of evidence. For example, everything from mouse, everything from mRNA co-expression, etc.

The species components are:
-hsa
-mmu
-rno
-dme
-cel
-sce
-ath
These must sum up (to within rounding error) to (FBS_max - phylo), because phylo is not species-specific evidence.

The data type components are:
-ppi
-pearson (mRNA co-expression)
-coloc (co-localization in the cell)
-phylo (phylogenetic profiling)
-hpa (protein co-expression)
-domains likely to interact
-tf (transcription factor-DNA binding)
-miRNA (miRNA-mRNA binding)
These must sum up to FBS_max.

I.e. in total ALL the LLR_* columns must sum up to 2*FBS_max - phylo.
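
The two sum rules above can be checked mechanically. The following Python sketch is only an illustration: it assumes the header names are exactly "FBS_max" and "LLR_" followed by the component name (e.g. "LLR_hsa", "LLR_miRNA"), which this README does not guarantee, and the function name, tolerance, and one-row table are invented for the example. Adjust the names to the header of the real file.

```python
import csv
import io

# Component names as listed in this README; the "LLR_" + name header
# spelling is an ASSUMPTION, check it against the real file header.
SPECIES = ["hsa", "mmu", "rno", "dme", "cel", "sce", "ath"]
EVIDENCE = ["ppi", "pearson", "coloc", "phylo", "hpa", "domains", "tf", "miRNA"]

def check_llr_sums(row, tol=0.05):
    """Return (species_ok, evidence_ok) for one parsed row (dict of strings).

    species_ok:  species components sum to FBS_max - phylo
    evidence_ok: data type components sum to FBS_max
    tol is a hypothetical allowance for rounding error.
    """
    fbs = float(row["FBS_max"])
    phylo = float(row["LLR_phylo"])
    species_sum = sum(float(row["LLR_" + s]) for s in SPECIES)
    evidence_sum = sum(float(row["LLR_" + e]) for e in EVIDENCE)
    species_ok = abs(species_sum - (fbs - phylo)) <= tol
    evidence_ok = abs(evidence_sum - fbs) <= tol
    return species_ok, evidence_ok

# Synthetic one-row table, invented purely for illustration (not real data).
header = ["FBS_max", "Protein1", "Protein2"] + ["LLR_" + n for n in SPECIES + EVIDENCE]
values = ["6.0", "GENE_A", "GENE_B",
          "2.0", "1.5", "0", "0.5", "0", "1.0", "0",       # species, sum 5.0
          "3.0", "1.5", "0", "1.0", "0.5", "0", "0", "0"]  # data types, sum 6.0
demo_tsv = "\t".join(header) + "\n" + "\t".join(values) + "\n"

rows = list(csv.DictReader(io.StringIO(demo_tsv), delimiter="\t"))
result = check_llr_sums(rows[0])
print(result)
```

Reading rows by header name rather than by position keeps the check independent of how many class-specific FBS columns (2 to 4) a given species file contains.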
The detailed FC output is followed by raw data in the remaining columns. Particular patterns in the column titles mean:
hsa____ : a column with human gene/protein IDs for which the prediction was made.
sce_a2a___ : a column with yeast orthologs of the human gene; _a2a_ means that ALLxALL inparalog pairs between the two genes were processed, and the BEST value was recorded.
_pearson : correlation of mRNA expression in a dataset (e.g. names of the form GSE### refer to GEO database entries)
_ppi_ : a protein-protein interaction score
_colocalization : a mutual information measure of sub-cellular co-localization profiles in an organism
_phylosign : a phylogenetic profile over 6-10 eukaryotes
_score : BLAST bit score of sequence similarity (between the human genes)
_chrodistbp : distance in b.p. between the two genes, if on the same chromosome (not used for prediction).

For example, the column 41:mmu_a2a_pearson_funcoup_resources/cns_u74v2.gse3594.mouse.txt_ with raw value 0.541 contains the Pearson linear correlation over mRNA profiles of the mouse orthologs in the dataset cns_u74v2.gse3594.

###############################################
Learn more about input datasets in http://funcoup.sbc.su.se/inputdata.html
Metrics that apply to each set are described in http://funcoup.sbc.su.se/funcoupmetrics.pdf

#################################################
> "FC class" means a class of functional coupling used for specific training, e.g.
- PPI_MT (protein-protein interactions);
- MET_MT (metabolic links);
- SIG_MT (signaling links);
- UP_COMPLEX (protein complex members);
The 1st and the 4th classes (PPI_MT and UP_COMPLEX) are not available for every species, whereas MET_MT and SIG_MT are.

#################################################
> CUTOFF: the FBS score value used to filter the output. It usually equals 3 (which corresponds to confidence 0.02).

##################################################
> The confidence score:
It is impossible to estimate exactly the probability that a link holds in reality.
However, there is a FunCoup confidence score (Pfc) ranging from 0 to 1, where 1 means "~100% sure". The score is used in the dynamically generated sub-network web pages at http://FunCoup.sbc.su.se, and the functional dependence of Pfc on FBS is shown in the graphic file Pfc.GIF in this same directory (we use the red line in the middle: prior probability of functional coupling P(FC)=0.001).
Some examples:
FBS = 5.9 corresponds to Pfc = 0.25
FBS = 7 corresponds to Pfc = 0.50
FBS = 8 corresponds to Pfc = 0.75
and so on. All of this follows from the plot Pfc.GIF.

#################################################
> Cytoscape users:
Although the text files are not in the genuine Cytoscape format, they are compatible with Cytoscape (at least since version 2.5) when loaded with the "Import" option (menu "File" -> "Import" -> "Network from Table (Text / MS Excel)").
1. Check the boxes "Show Text File Import Options" and "Transfer 1st line as attribute names".
2. Select the columns Protein1 and Protein2 as "Source interaction" and "Target interaction" (the order does not matter). "Interaction type" can be left undefined ("Default").
We recommend using our format via "Import" because it does not slow Cytoscape down, yet provides additional information. For example, you can use "hsa", "ppi", or any other column as an edge attribute. It is also easy to filter out weaker interactions (in terms of confidence, using FBS) from our file, in accordance with your needs.
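
Filtering out weaker interactions before import can also be done outside Cytoscape. The sketch below is one possible way to do it in Python, not part of FunCoup itself: the function name filter_edges and the three-row table are invented for illustration, and the header names "FBS_max", "Protein1" and "Protein2" are taken from the description above. The threshold 7.0 is used because, per the examples above, FBS = 7 corresponds to Pfc = 0.50.

```python
import csv
import io

def filter_edges(lines, min_fbs):
    """Yield (Protein1, Protein2, FBS_max) for rows with FBS_max >= min_fbs.

    `lines` is any iterable of text lines, e.g. an open TSV file.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    for row in reader:
        fbs = float(row["FBS_max"])
        if fbs >= min_fbs:
            yield row["Protein1"], row["Protein2"], fbs

# Synthetic table, invented purely for illustration (not real data).
# A real file has many more columns; they are simply ignored here.
demo = io.StringIO(
    "FBS_max\tProtein1\tProtein2\n"
    "8.2\tGENE_A\tGENE_B\n"
    "5.1\tGENE_A\tGENE_C\n"
    "7.0\tGENE_B\tGENE_C\n"
)

strong = list(filter_edges(demo, min_fbs=7.0))
print(strong)
```

For a real file, the same function can be applied to an open file handle, e.g. `with open(path, newline="") as fh: ...`, where `path` is whichever species file is being processed.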