RegulonDB

Evidence Classification in RegulonDB


Scientific knowledge advances incrementally. At any point, we base broad conclusions on assertions that carry varying degrees of confidence. RegulonDB classifies the evidence supporting each assertion essentially by the method used to generate it. We do so to make explicit the complex mixture of more or less well supported specific claims that underlie broader conclusions (Weiss et al., 2013). In RegulonDB version 11.1 we made several changes to the collection of evidence types. We revised the evidence codes so that they are more informative and more precise, since each code now indicates the corresponding method. We increased the number of high-throughput evidence types. Finally, we updated the collection of combinations of independent methods that increase the confidence level. The details can be found in the sections on additive evidence below.

We classify the evidence supporting knowledge as ‘Weak’, ‘Strong’ or ‘Confirmed’.

Weak evidence: Single evidence with more ambiguous conclusions, where alternative explanations, indirect effects, or potential false positives are prevalent. Computational predictions also fall into this class. Examples are gel mobility shift assays with cell extracts and gene expression analysis.
Strong evidence: Single evidence of direct physical interaction, or solid genetic evidence, with a low probability of alternative explanations. Examples are footprinting with purified protein and site mutation.
Confirmed: Assigned when an object is supported by at least two independent types of strong evidence whose false positives are mutually exclusive. This approach is based essentially on the methods used in scientific research to validate results and exclude alternative explanations.

Confidence is assigned in two stages:

In stage I, we classify each single evidence type as weak or strong.
In stage II, we define combinations of independent evidence in a process termed “additive evidence” (previously described as cross-validation in Weiss et al., 2013). These combinations enable multiple weak evidence types to support an object at the strong (S) confidence level, and multiple strong evidence types to support an object at the higher confirmed (C) level.
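The two stages can be illustrated with a minimal Python sketch. It is not RegulonDB's implementation: the names `Confidence`, `Evidence` and `additive_confidence`, the evidence codes, and the group labels are all hypothetical. It also assumes that two evidence items count as independent whenever they belong to different groups, which is a simplification, since RegulonDB relies on a curated collection of valid combinations.

```python
from dataclasses import dataclass
from enum import IntEnum


class Confidence(IntEnum):
    """Confidence levels named in the text: weak, strong (S), confirmed (C)."""
    WEAK = 1
    STRONG = 2
    CONFIRMED = 3


@dataclass(frozen=True)
class Evidence:
    code: str          # hypothetical evidence code
    group: str         # hypothetical group label, used here as a proxy for independence
    level: Confidence  # stage I result: WEAK or STRONG


def additive_confidence(evidences):
    """Stage II sketch: combine the stage I levels of independent evidence.

    Assumption: items in different groups are independent. The real
    classification uses an explicit list of accepted combinations.
    """
    if not evidences:
        raise ValueError("at least one evidence item is required")

    strong_groups = {e.group for e in evidences if e.level == Confidence.STRONG}
    weak_groups = {e.group for e in evidences if e.level == Confidence.WEAK}

    # Two or more independent strong evidence types: confirmed.
    if len(strong_groups) >= 2:
        return Confidence.CONFIRMED
    # One strong type, or two or more independent weak types: strong.
    if strong_groups or len(weak_groups) >= 2:
        return Confidence.STRONG
    return Confidence.WEAK


# Stage I levels follow the examples given in the text; groups are invented.
gel_shift = Evidence("gel-shift-cell-extract", "binding", Confidence.WEAK)
expression = Evidence("gene-expression-analysis", "expression", Confidence.WEAK)
footprint = Evidence("footprinting-purified-protein", "binding", Confidence.STRONG)
mutation = Evidence("site-mutation", "genetic", Confidence.STRONG)

print(additive_confidence([gel_shift, expression]).name)
print(additive_confidence([footprint, mutation]).name)
```

Repeating the same evidence type does not raise the level in this sketch, because only distinct groups are counted. That mirrors the requirement that combined evidence be independent.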

Stage I. Classification of Individual Evidence Types

As mentioned above, in RegulonDB version 11.1 we changed the collection of evidence types. The evidence codes now indicate the corresponding method. For instance, “Transcription initiation mapping” is now encoded as “primer extension assay for transcription start site determination evidence” or “S1 nuclease protection assay evidence for transcription start site determination evidence”.

Description   
Single evidence is classified into weak or strong evidence (see above), depending on the confidence level of the associated methodologies.

1. Promoters and transcription start sites (TSSs)   
In bacteria, a promoter is defined as the DNA region specifically bound by RNA polymerase to initiate transcription.
A TSS is the precise first nucleotide that is transcribed. Different methods identify either promoters or TSSs, so the two are classified jointly here.
Evidence Code Evidence Category Evidence Group
2. Regulatory interactions   
A regulatory interaction is defined, depending on the type of evidence, as the interaction between a transcription factor (TF) and a regulated gene (TF-gene), or more specifically as the interaction between a TF and its DNA binding site (TF-DNA binding site).
Evidence Code Evidence Category Evidence Group
3. Transcription factor functional conformation    
Most dedicated TFs have two conformations, although there are exceptions: the holo conformation, in which the TF carries a non-covalently bound allosteric metabolite or a covalent phosphorylation, and the apo conformation, in which it is a free protein or multimer. We call the functional conformation the one that is capable of binding to its specific binding sites and performing its activating or repressing activity. To count as evidence for the functional conformation, the experiments below have to be performed both with and without the effector.
Evidence Code Evidence Category Evidence Group
4. Transcription units
Evidence Code Evidence Category Evidence Group


Stage II. Assignment of confidence level based on additive evidence types
Following the same logic described in Weiss et al. (2013), we integrate multiple evidence types by combining independent types of evidence, with the aim of confirming individual objects and mutually excluding false positives. This follows the same scientific principles applied by wet-lab scientists, where data are confirmed by repetitions on the one hand, and by additional experimental strategies to exc