RegulonDB Releases
Release 11.2
May 22nd, 2023.
RegulonDB version 11.2 contains a specific update of the evidence and confidence level of the regulatory interactions (RIs) set.
The dataset "Complete RIs set" was updated. It contains exactly the same set of RIs as the previous version of RegulonDB (v11.1), but their evidence was enriched by mapping all current peaks contained in the files of the HT-TFBSs collections available in RegulonDB. Whenever an existing regulatory site of the RI collection matched any peak of the HT-TFBSs, the new HT evidence was added.
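The matching of sites to peaks can be pictured as an interval-overlap test. The following Python sketch is only an illustration: the record layout, the coordinates, the TF names, the evidence labels (`CLASSICAL`, `HT-CHIP-SEQ`), and the overlap criterion (same TF, at least one shared base pair) are all assumptions, not the criteria actually applied in RegulonDB.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """A curated regulatory site (hypothetical record layout)."""
    tf: str
    start: int
    end: int
    evidence: set = field(default_factory=set)


@dataclass
class Peak:
    """An HT-TFBS peak (hypothetical record layout)."""
    tf: str
    start: int
    end: int
    method: str  # hypothetical evidence label for the HT method


def overlaps(site: Site, peak: Peak) -> bool:
    # Assumed criterion: same TF and closed intervals sharing at least one position.
    return site.tf == peak.tf and site.start <= peak.end and peak.start <= site.end


def add_ht_evidence(sites, peaks):
    """Add the peak's method label to every site that the peak overlaps."""
    for site in sites:
        for peak in peaks:
            if overlaps(site, peak):
                site.evidence.add(peak.method)
    return sites


# Made-up example data: only the first site is covered by a peak of the same TF.
sites = [Site("TF-A", 100, 120, {"CLASSICAL"}), Site("TF-A", 500, 520, {"CLASSICAL"})]
peaks = [Peak("TF-A", 90, 110, "HT-CHIP-SEQ"), Peak("TF-B", 495, 530, "HT-CHIP-SEQ")]
add_ht_evidence(sites, peaks)
print([sorted(s.evidence) for s in sites])
```

The nested loop is quadratic and is kept for readability; a genome-scale comparison would more likely sort the intervals or use an interval tree.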
The datasets "Confirmed RIs", "Strong RIs" and "Weak RIs" are new; they were created by filtering the "Complete RIs set" by confidence level. The dataset "RIs set without ChIP-seq evidence" is also new; it was created by removing the ChIP-seq evidence and recalculating the confidence level. This file can be used for comparison with new ChIP-seq experiments while avoiding any circularity. We will generate the corresponding sets excluding each of the other HT-binding methods.
Release 11.1
December 12th, 2022.
In RegulonDB version 11.1, we made important evidence-related changes, including:
- Evidence code update. The evidence codes were made more informative; e.g., BPP was changed to EXP-IDA-BINDING-OF-PURIFIED-PROTEINS.
- Evidence type update. All evidence codes were classified as "weak" or "strong" according to the confidence level.
- Additive evidence update. As described by Weiss et al. (2013), the confidence level of a biological entity depends on the combination of independent evidence derived from different methods. An object with several weak evidence codes can reach a strong confidence level, and an object with at least two strong evidence codes can reach a confirmed confidence level. In this version, all evidence codes were classified by method type (additive evidence class). We then analyzed and updated the list of combinations of independent methods that increase the confidence level of an object, with the intention of confirming individual objects and mutually excluding false positives (see Stage II evidence classification).
- Object confidence level update. The confidence level of each regulatory interaction, promoter, and transcription unit was calculated and updated using its linked evidence and its additive evidence. These changes are reflected in the RegulonDB interface as well as in the downloadable datasets: RISet.txt, Network-tf-tu.txt, network-tf-tf.txt and network-tf-gene.txt.
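The additive evidence idea can be sketched in a few lines of Python. This is a simplified illustration, not the RegulonDB procedure: the real rules rely on a curated list of allowed method combinations, whereas the sketch merely counts independent method classes. The class labels and the threshold `WEAK_CLASSES_FOR_STRONG = 2` (our reading of "several") are assumptions.

```python
# Assumed threshold: how many independent weak method classes yield "strong".
WEAK_CLASSES_FOR_STRONG = 2


def confidence_level(evidences):
    """Return "weak", "strong" or "confirmed" for one object.

    evidences: iterable of (strength, method_class) pairs, where strength is
    "weak" or "strong" and method_class is a hypothetical label for the additive
    evidence class. Evidence from the same class is counted only once, because
    only independent methods are meant to add up.
    """
    evidences = list(evidences)
    strong_classes = {cls for strength, cls in evidences if strength == "strong"}
    weak_classes = {cls for strength, cls in evidences if strength == "weak"}
    weak_classes -= strong_classes  # a class already counted as strong adds nothing

    if len(strong_classes) >= 2:
        return "confirmed"
    if strong_classes or len(weak_classes) >= WEAK_CLASSES_FOR_STRONG:
        return "strong"
    return "weak"


print(confidence_level([("weak", "expression")]))
print(confidence_level([("weak", "expression"), ("weak", "binding")]))
print(confidence_level([("strong", "binding"), ("strong", "site-mutation")]))
```

Under these assumptions the three calls print `weak`, `strong` and `confirmed`. Two strong codes from the same method class stay at `strong`, which mirrors the requirement that the methods be independent.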
Release 11
August 22nd, 2022. This release corresponds to Releases 25.5 and 26.0 of EcoCyc.
In this version of RegulonDB, the collection of high-throughput datasets has been radically upgraded as reported by Tierrafría et al. (2022). It is now composed of 502 datasets of TF-binding sites, 5 datasets of transcription units (TUs), 16 datasets of transcription start sites (TSSs), 5 datasets of transcription termination sites (TTSs), and 1,864 datasets of gene expression profiles; all of these can be explored and downloaded via the menu "Integrated Views & Tools" in the "Browse RegulonDB" section, under the option "RegulonDB-HT datasets". The TF-binding sites collection includes datasets generated by ChIP-seq, ChIP-exo, gSELEX-ChIP, and DAP-seq methodologies. For ChIP-seq experiments, both the data as published by the authors and the data uniformly processed in-house are available, to enhance their comparability. The collection of gene expression profile datasets was generated in-house by uniformly processing raw data gathered from the indicated databases. For details see Tierrafría et al. (2022).
In addition, we have added the first regulatory interactions for the transcription factors (TFs) YfeC, YidZ, YciT, YgbI, AaeR, and PunR. The TF RcdA was annotated as a dimer, and two new inactive conformations of the TF bound to metabolites were also annotated: RcdA-Tris and RcdA-trimethylamine N-oxide. The acetylated conformation of RcsB was annotated. The sigma factor RpoN was annotated as a transcriptional repressor of sigma 70 promoters in the absence of enhancers. We have annotated the full names of metabolites for all the conformation names of TFs bound to metabolites.
This new RegulonDB version contains the following new objects: 4 promoters, 2 transcription units, 58 regulatory interactions, and 4 TF conformations. New data, including evidence, references, comments, and new names, among others, were added to some objects. For that, 22 edits were made for promoters, 155 for RIs, 181 for TFs, 75 for TUs, and 2 for sigma factors.
Some summaries were updated with the following information:
New cellular processes regulated by the TFs CadC, PunR, FrmR, GadE, AaeR, CytR, AlpA, PhoB, YidZ, YciT, YbcM, and YnfL.
Family name of the TF PunR.
Information on DNA motifs that are recognized by the TFs PunR, YfeC, YciT, and YcjW.
Location in the genome of the genes encoding the TF PunR.
Cellular effects of mutants of the TFs PunR, CRP, HNS, BasR, RpoB, PdhR, IclR, DnaA, YciT, YbcM, and YgbI.
Growth conditions in which the TF PunR functions.
Additional roles, other than transcription, of HU, IHF, and CRP.
Data related to oligomerization of HU, RcdA, MarR, YidZ, UxuR, and YfeC.
Information about the mechanism of DNA binding of IHF, Rob, MarA, Lrp, NorR, HrpR, CsqR, ZraR, CusR, AaeR, MarR, UxuR, and HU.
Updates of the members of the regulons of BasR, NorR, HrpR, CsqR, ZraR, PutA, YqhC, PepA, RspR, UvrY, PdeL, YidZ, YfeC, YciT, YbcM, and YgbI.
Regulation of the activities of CRP, H-NS, LexA, TyrR, Fis, and Cra.
Regulation of expression of genes encoding TFs, such as cdaR, soxS, rob, lexA, lrhA, fliA, rcsAB, flhDC, gadX, and phoP.
Data related to the protein structures of RcdA, KdpE, UxuR, and FeaR.
Information related to ligands of RcdA, CRP, TyrR, UxuR, and FeaR.
Evolutionary data for RcdA.
Use of the regulatory mechanism of AtoC in industrial, medical, and environmental applications.
Information related to regulatory mechanisms of CpxR, AgaR, and TyrR.
Tierrafría VH, Rioualen C, Salgado H, Lara P, Gama-Castro S, Lally P, Gómez-Romero L, Peña-Loredo P, López-Almazo AG, Alarcón-Carranza G, Betancourt-Figueroa F, Alquicira-Hernández S, Polanco-Morelos JE, García-Sotelo J, Gaytan-Nuñez E, Méndez-Cruz CF, Muñiz LJ, Bonavides-Martínez C, Moreno-Hagelsieb G, Galagan JE, Wade JT, Collado-Vides J. RegulonDB 11.0: Comprehensive high-throughput datasets on transcriptional regulation in Escherichia coli K-12. Microb Genom. 2022 May;8(5). doi: 10.1099/mgen.0.000833. PMID: 35584008.
Release 10.10
February 28th, 2022. This release corresponds to Release 25.1 of EcoCyc.
Dimers of the NanR and ChbR transcription factors were included in this database version Christopher R Evans et al. and Jacqueline Plumbridge et al.
Some summaries were updated with the following information:
Growth conditions that affect the genes that encode the sigma factor Sigma32 and the transcription factor MarA Michael Machas et al.
Growth conditions that affect the activity of Sigma32, OxyR Nicolas Barraud et al., IscR Heather S Deter et al., Sigma70, Sigma38, Sigma24, CRP, FNR, IHF, FIS, ArcA, NarL and Lrp, Fur, DksA, OxyR and IscR Heather S Deter et al.
Cellular effects of mutants of the transcription factors OmpR Michael Machas et al., OxyR Nicolas Barraud et al., ArcA, FNR, IHF Mahesh S Iyer et al., FhlA Heghine Gevorgyan et al. and FlhDC Jae-Ho Han et al.
New cellular processes that XylR regulates Manon Barthe et al.
Description of the mechanism of NarR regulation Christopher R Horne et al.
The protein mobility on DNA for some proteins like RpoC, LacI, HNS and HU Mathew Stracy et al.
Covalent modification of ArcA Nicolas Barraud et al.
Negative effect of the CRP regulation process on non-target genes, mainly genes of anabolism Karl Kochanowski et al.
Release 10.9
April 2nd, 2021. This release corresponds to Release 24.5 of EcoCyc.
A new conformation of the transcription factor BolA was included in this version. The new phosphorylated conformation was identified by Galego et al. as essential for the activity of the transcription factor. All the regulatory interactions of the protein that were previously contained in the database are now linked to this BolA-phosphate conformation.
A transcription factor binding site (TFBS) located 50 bp upstream of the transcriptional start site (TSS) of lexAp, previously identified as a putative TFBS that binds LexA and regulates lexA-dinF expression, was removed from the database because Kozuch BC et al. demonstrated that it is nonfunctional; there are now only two functional TFBSs that regulate lexA-dinF expression, and they are located at -9 and +13 bp from the TSS.
Release 10.8
October 5th, 2020. This release corresponds to Release 24.0 of EcoCyc.
Comments obtained from the results of high-throughput analyses related to RNA G-quadruplex structures [31964733] and RNA polymerase-binding RNA aptamers (RAPs) [31535128] affecting gene expression, data about the kinetics of expression of many genes after σE induction Caroline Lacoux et al. (2020), and data about the location of mRNAs inside the cell Shanmugapriya Kannaiah et al. (2019) have been added to many promoters, transcription units, sRNAs and proteins.
MqsA responds to degradation upon bile acid stress and increased production upon heat stress; its degradation is followed by regeneration upon amino acid starvation [31670944]. Inhibition of SgrR and activation of DhaR in vivo by glutamate and DHAP (dihydroxyacetone phosphate), respectively, have been shown Martin Lempp et al. (2019). BtsR inhibits the formation of curli and biofilm through direct repression of the csgBAC operon and indirect repression via the csgD gene Hiroshi Ogasawara et al. (2019). The crystal structures of the isolated HicB antitoxin and the full-length HicAB complex were determined Melek Cemre Manav et al. (2019).
Release 10.7
April 5th, 2020.
The HicB protein, known as the antitoxin of the HicA-HicB toxin-antitoxin system, was recently described as a transcription factor. It regulates the transcription of the operon that encodes the toxin-antitoxin system to which it pertains Kathryn J Turnbull and Kenn Gerdes (2017).
The new complex CRP-Sxy, which regulates some genes related to DNA uptake (natural competence), was added to the database. The CRP-Sxy complex recognizes a DNA-binding site that has an asymmetric organization: it has two distinct half-sites, one of which is similar to the canonical CRP DNA-binding site, whereas the other half-site is less conserved Emilie Søndberg et al. (2019).
H-NS changes DNA rigidity and forms bridges between DNA molecules in the presence of Mg2+ Yan Liang et al. (2017). There are two DNA-bending steps during site recognition by IHF. The fast-phase step entails nonspecific DNA bending, while the slow phase involves specific DNA kinking during site recognition Yogambigai Velmurugu et al. (2018). A novel regulatory role of RpoS in protecting cells against heat stress was suggested Christopher R Evans et al. (2019). OxyR also senses sulfane sulfur (H2Sn) under both aerobic and anaerobic conditions Ningke Hou et al. (2019).
Notes or summaries for 117 objects were updated, of which 11 are associated with Transcription Factors.
Release 10.6
July 26th, 2019. This release corresponds to Release 23.0 of EcoCyc.
IRs (inverted repeats) and amino acid residues were identified as important for DgoR to bind its cis-acting element Singh B et al. (2019). D-galactonate induced a conformational change in DgoR to derepress the dgoRKADT operon Singh B S et al. (2019), and glutarate selectively relieved repression of the glaH-lhgO-gabDTP operon by GlaR Knorr S et al. (2018).
DksA was found to be critical for aerobic nitric oxide (NO) detoxification Chou WK et al. (2019), and OmpR was identified as a regulator of bacterial virulence, growth, and metabolism, in addition to its role in regulating outer membrane proteins Chakraborty S et al. (2018).
Finally, 129 regulatory interactions from high-throughput analysis for the Nac transcriptional dual regulator were added to the Nac regulon Aquino P et al. (2017).
Summaries for 48 objects were updated. We have curated the published literature through the end of December 2018.
Release 10.5
September 13th, 2018. This release corresponds to Release 22.0 of EcoCyc.
We have kept manual curation up to date and we have also curated HT data from 51 articles on ChIP and gSELEX experiments in conjunction with the corresponding expression profile experiments. A proof-of-concept pipeline to merge binding and expression evidence to identify regulatory interactions was developed, and the datasets can be visualized in the RegulonDB JBrowse. We implemented the Microbial Conditions Ontology (MCO), with a controlled vocabulary for the minimal properties needed to reproduce an experiment, which contributes to integrating data from high-throughput and classic literature. We continue with the integration of the GENSOR Units for 200 transcription factors, including their regulation at the metabolic level, and include summaries for close to 70 of them. Finally, we offer a link within RegulonDB to a summary of our research with natural language strategies to enhance our biocuration work.
We have curated the published literature through the end of July 2018.
Release 10.0
June 18th, 2018. This release corresponds to Release 22.0 of EcoCyc.
Classical Annotation.
Two new transcriptional regulators were identified: YhaJ, a newly identified transcriptional regulator controlling genes involved in different processes, and XynR, a regulator of xylonate catabolism in Escherichia coli K-12 W3110. These two identifications were based on analysis using a SELEX screening system. X-ray structures were determined for FrmR, a formaldehyde sensor, at 2.7 Å resolution, and for RcdA at 2.55 Å resolution. The amino acids in the CadC DNA-binding domain (DBD) that are required for DNA recognition and function were identified, as was the crystal structure of the CadC DBD. YedVW was renamed HprSR. The OmpR and CutR regulons were determined based on ChIP-exo and transcript profiling, respectively. Alteration of the Gly184 residue of CRP affects its DNA binding and probably its RNA polymerase interaction, and acetylation of the lysine (K100) residue is a mechanism by which the cell downregulates CRP-dependent class II promoter activity while elevating CRP steady-state levels, thus indirectly increasing class I promoter activity.
Summaries for more than 180 objects were updated.
The molecular biology and physiological-level descriptions were included for GENSOR Units. Also, the signal names for the two-component system GENSOR Units were annotated.
High-throughput experiment annotations. The RegulonDB database contains the rich legacy of decades of classic molecular biology experiments supporting what we know about gene regulation and operon organization in E. coli K-12. We now include high-throughput dataset collections from 32 ChIP and 19 gSELEX publications. There are three essential features for the integration of this information coming from different methodological approaches: first, a controlled vocabulary within an ontology for precisely defining growth conditions; second, the criteria to integrate separate elements with enough evidence to consider them involved in gene regulation and part of our gold standard elements; third, an expanded computational model supporting this knowledge. Altogether, this constitutes the basis for adequately gathering and enabling the comparisons and integration strongly needed to manage and access such a wealth of knowledge and to allow advances into the postgenomic era.
These curated HT-supported regulatory interactions are now present within RegulonDB and can be found on the regulon page of the corresponding TF. The most direct way to access them is to type the TF name followed by "regulon," go to the link of the regulon, and display the TF regulon page.
Furthermore, via the "Downloads" main page menu, HT datasets and any of the TF-specific HT binding datasets can be selected. For more details of this work see (Santos-Zavaleta et al., submitted).
Microbial Condition Ontology (MCO). We curated terms related to experimental conditions that affect gene expression in Escherichia coli K-12. Since this is the best-studied microorganism, the collected terms are the seed for the MCO, a controlled and structured vocabulary that can be expanded to annotate microbial conditions in general. Moreover, we developed an annotation framework to describe experimental conditions. Furthermore, we will disseminate MCO throughout the Open Biological and Biomedical Ontology (OBO) Foundry in order to set a standard for the annotation of gene expression data.
The ontology can be accessed in the "Integrated Views & Tools" menu. Alternatively, the user can do a search using the search text box for a very specific term. In this case, the number of terms matching the query term can be seen in the summary section, in the "Growth Conditions" link. For further details see (Tierrafría et al., submitted).
Additional features.
The Web services to get different views of the genetic network were updated to include the REST (JSON) architectural style.
Release 9.4
May 8th, 2017. This release corresponds to Release 20.5 of EcoCyc.
Several promoters and transcription factors were identified that control rpoE-rseABC operon expression under different growth conditions Klein G et al. (2016). PdeL is a bifunctional protein: in addition to its enzymatic activity as a phosphodiesterase, it was identified as a transcriptional regulator due to its capability to bind to its own promoter region and stimulate its expression in response to c-di-GMP Reinders A et al. (2015). The HigBA toxin-antitoxin (TA) complex and the HigA antitoxin were also identified as transcriptional repressors of their own expression. The crystal structure of the HigBA TA complex was solved and displays a heterotetramer, (HigBA)2, composed of two HigB and two HigA subunits Yang J et al. (2016).
The summaries for HigAB, HigA, GlrR and PdeL transcriptional regulators were updated.
We have curated the published literature through the end of September 2016.
Release 9.3
February 14th, 2016. This release corresponds to Release 20.1 of EcoCyc.
The affinities and interaction types for two transcription factors in two different promoters were described. The yhjX promoter has two YpdB-binding sites, and the site with higher affinity is important for stability and for binding of a second YpdB molecule to the lower-affinity site; together they enhance protein-DNA interactions Behr S et al. (2016). On the other hand, the copA promoter can interact with RNA polymerase independently of the holo- or apo-conformation of CueR, due to two types of interactions between RNA polymerase and the copA promoter: one is favored by the holo-CueR conformation to activate transcription, and the other one is favored by apo-CueR to repress transcription Martell DJ et al. (2015).
CecR (Cefoperazone and chloramphenicol Regulator of sensitivity), which belongs to the TetR family, was identified as a new transcriptional regulator involved in the control of sensitivity to cefoperazone and chloramphenicol Yamanaka Y et al. (2016). DecR was also identified as a new transcriptional regulator involved in cysteine detoxification Shimada T et al. (2016).
The summaries for PhoB, RcsB, LacI, AraC, NarL, RpoH, CRP, MqsR, RcsA, CpxR, MarA, YpdB, CueR, MarRAB, RstA, BasR (PmrA), TyrR, ExuR, UxuR, CspA, NrdR, and DecR (YbaO) transcriptional regulators were updated.
We have curated the published literature through the end of June 2016.
Release 9.2
September 9th, 2016. This release corresponds to Release 20.0 of EcoCyc.
New structural properties of two TFs were identified: the NarL receptor domain is able to stimulate gene transcription in a nitrate-responsive manner Katsir G et al. (2015), and the Thr127 and Ser128 residues of CRP provide high cAMP affinity and play a key role in stabilization of the CRP inactive form Gunasekara SM et al. (2015). On the other hand, the response regulators KdpE and RcsB are capable of driving gene expression Narayanan A et al. (2014) and form complexes with other proteins in an unphosphorylated manner Pannen D et al. (2016). Also, under anaerobic and iron-dependent conditions, Fur binds to more sites across the genome, increasing the number of target genes Beauchene NA et al. (2015).
The notes for rrsE, rrsH, rrsD, and rrsB rRNAs Maeda M et al. (2015), ExuR Tutukina MN et al. (2016), UxuR Tutukina MN et al. (2016), BaeR Yao Y et al. (2015), GlpR Vimala A et al. (2016), RpoS Guo M et al. (2015), CsrB Zere TR et al. (2015), CspA, CsgD Soo VW et al. (2013), Dps Lee SY et al. (2015), IHF Lee SY et al. (2015), CueR Szunyogh D et al. (2015), Mlc Bréchemier-Baey D et al. (2015), UvrY Zere TR et al. (2015), and NarL Katsir G et al. (2015) transcriptional regulators, and CsrB small regulatory RNA Zere TR et al. (2015) were updated.
We have now curated the published literature through the end of December 2015.
Release 9.1
April 7th, 2016. This release corresponds to Release 19.1 and 19.5 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
NimR (formerly YeaM) confers resistance to 2-nitroimidazole, an antibacterial and antifungal agent, and plays a regulatory role in divergent transcription of the nimT and nimR genes Ogasawara H et al. (2014). Based on Genomic SELEX screening, the two-component system (TCS) YedVW was characterized Urano H et al. (2015). The YedVW and CusSR TCSs form a unique regulation system, where both TCSs recognize the same DNA sequence for binding in hiuH, with YedVW sensing H2O2 and CusSR sensing Cu(II) Urano H et al. (2015). The YdeO regulon plays an important role in survival under both acidic and anaerobic conditions Durban J et al. (2013).
Two transcriptional regulators were identified: YjjQ, a transcriptional repressor of genes required for flagellar synthesis, capsule formation, and other genes related to virulence Wiebe H et al. (2015), and YebK, a transcriptional regulator implicated in the adaptation to the transition from rich medium to cellobiose minimal medium, reducing the length of the lag phase Parisutham V et al. (2015). In addition, it was determined that both YbiB and Dam bind to DNA and could play a role in transcriptional regulation Schneider D et al. (2015), Horton JR et al. (2015).
Summaries for the MazE, McbR, CRP, RutR, BolA, FadR, Zur, H-NS, OmpR, LacI, NemR, DksA, SdiA, PhoB, HU, CadC, and SoxR transcriptional regulators were updated.
We have curated the published literature through the end of July 2015.
Release 9.0
September 15, 2015. This version of RegulonDB (9.0) uses the same release data from EcoCyc (19.0) as the previous version (8.8).
Updated TF families, position-weight matrices and their grouping in clusters
We have updated the computationally predicted transcription factors, which total 304 (184 with experimental evidence and 120 from computational predictions); we updated our position-weight matrices and have included tools for clustering them in evolutionary families.
Comprehensive semiautomatic curated elementary Gensor Units
We redesigned the Web page for GENSOR units, and this page now contains three sections: the graphical map of the elementary GENSOR unit, its general properties, including the written summary, and a section for the properties of each reaction.
Coexpression distance around the Regulatory Network
We have implemented tools for a full comparison of expression of groups of genes across all conditions. The 'Coexpression' page can be reached directly from the search option. A single query gene or a group of genes are added either manually, based on the set of interest to the user, or are automatically uploaded as a collection of genes defining operons or regulons. In addition, we offer a coexpression overview for two groups of input genes: operons and regulons.
Release 8.8
May 5th, 2015. This release corresponds to Release 19.0 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
SutR (formerly YdcN) was identified as a transcriptional dual regulator of genes involved in the utilization of sulfur metabolism Yamamoto K et al. (2014).
The crystal structure of the DinJ-YafQ complex was resolved at 1.8 Å Ruangprasert A et al. (2014). Notes for BaeR, AcrR, Fur, SoxS, PspF, and CpxR were updated Srivastava SK et al. (2014), Lee JO et al. (2014), Méhi O et al. (2014), Molina-Quiroz RC et al. (2014), Darbari VC et al. (2014), Vogt SL et al. (2014).
Searching
A new view was added to the display of search results by regulon. When the user selects "regulon search" without giving a term, all the regulons are displayed. The table shows the regulon name, the total of regulated genes, the total of regulated operons, the total of binding sites, and the total of regulatory interactions.
Downloads
A link to download the weight matrices in consensus format was added to the downloads page of the website.
Release 8.7
February 2nd, 2015. This release corresponds to Release 18.1 and 18.5 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
MraZ was identified as a transcriptional repressor involved in the control of cell division and cell wall genes Eraso JM et al. (2014). It binds to a region of DNA containing three successive TGGGN direct repeats that are separated by two consecutive 5-nt spacers close to the mraZp promoter Eraso JM et al. (2014). Also, the summaries for the MraZ, CRP, RcsB-BglJ, HipB, LacI, PhoB, YehT, YpdB, HypT, and MarR transcriptional regulators were updated.
On the other hand, YdcI was also identified as a transcriptional repressor involved in the survival, stress response, and cell interactions in Salmonella enterica serovar Typhimurium Solomon L et al. (2014). Based on N- and C-terminal exchange between S. Typhimurium and Escherichia coli, it was also possible to determine that YdcI is a transcriptional repressor in E. coli Solomon L et al. (2014).
The crystal structures of LsrR (with its native signal, phospho-AI-2), SdiA, and DhaR have been determined Ha JH et al. (2013), Kim T et al. (2014), Shi R et al. (2014). Summaries for the OmpR, H-NS, CRP, IHF, and PhoB transcriptional regulators and for σE, the specialized sigma factor that responds to heat shock and misfolded proteins, were updated.
We have curated the published literature through the end of September 2014.
Release 8.6
April 11th, 2014. This release corresponds to Release 18.0 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
RclR (formerly YkgD) has been experimentally determined to be a redox-sensitive transcriptional activator of essential genes for survival under reactive chlorine stress Parker BW et al. (2013). In addition, ArcA was shown to utilize its diverse binding site architecture for global control of carbon oxidation pathways Park DM et al. (2013). Also, the summaries for MlrA, ArcA, FeaR, MalT, OmpR, CspA, BaeR, Crp, AraC, H-NS, and FadR transcriptional regulators were updated.
In addition, we have added data from high-throughput experiments to RegulonDB as a dataset, with data for LeuO, H-NS, and CRP transcriptional regulators from genomic SELEX analysis and for transcriptional start site mapping based on dRNA-seq, see High-throughput Datasets.
We have curated the published literature through the end of December 2013.
Release 8.5
November 28th, 2013. This release corresponds to Release 17.5 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
We added the tetrameric conformation for the transcriptional regulator LsrR. In the presence of phosphorylated autoinducer 2 (AI-2), the tetramer dissociates into dimers, and the interaction of LsrR with DNA is greatly reduced Wu M et al. (2013). We also added its inactive conformation, LsrR-AI-2. Two new conformations, MetJ-MTA and MetJ-adenine, for the MetJ transcriptional regulator were also added. The metabolites 5´-deoxy-5´-(methylthio) adenosine (MTA) and adenine (Ade) bind with high affinity to MetJ, but their biological effects are not known Martí-Arbona R et al. (2012).
The summaries for the NemR, RbsR, MarR, NrdR, ArcA, LsrR, PspF, and YpdB transcriptional regulators were updated.
In version 8.5, we made a major change to the main pages of RegulonDB. The pages were reorganized to provide a more structured access to the data, based on the two dominant types of users: those conducting individual search queries and those accessing the data collections.
We also added the option "Gensor Unit Groups" within the Integrated views & Tools menu, which enables display of all Gensor Units so far reviewed in RegulonDB. Currently, we have 53 GUs, and they are grouped into 5 categories.
Release 8.3
July 29th, 2013. This release corresponds to Release 17.1 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
As part of our curation on transcriptional regulation, we have finished linking references to their corresponding evidence for 225 promoters in which this relationship did not exist.
We have corrected and relocated the transcription factor binding sites of PuuR. The binding sites of PuuR identified by Nemoto et al. consist of 15 nucleotides, with the following recognition sequence: AAAATATAATGAACA Nemoto et al. (2012). The curator's analysis of the experimental assays and of the sequences identified by Nemoto et al. showed that the binding sites of PuuR may have a length of 20 nucleotides with an inverted repeated symmetry (ATGGACAATATATTGACCAT). The consensus sequence identified by Nemoto et al. in 2012 is included in the consensus sequence proposed by the curator and the nucleotides conserved between the two sequences are underlined.
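The inverted-repeat symmetry of the proposed 20-nucleotide sequence can be checked by comparing it with its own reverse complement. The short Python sketch below does only that; the position-by-position scoring is our own illustration and not necessarily the analysis performed by the curator.

```python
# Map each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def symmetric_positions(seq: str):
    """Return the 0-based positions where seq equals its reverse complement."""
    rc = reverse_complement(seq)
    return [i for i, (a, b) in enumerate(zip(seq, rc)) if a == b]


site = "ATGGACAATATATTGACCAT"  # 20-nt sequence proposed by the curator
rc = reverse_complement(site)
matches = symmetric_positions(site)

print(rc)                                   # ATGGTCAATATATTGTCCAT
print(f"{len(matches)} of {len(site)} positions are symmetric")  # 18 of 20
```

A perfect inverted repeat would match at all 20 positions; here 18 match, so the sequence is an imperfect inverted repeat with one mismatched pair (0-based positions 4 and 15).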
Release 8.2
April 22nd, 2013. This release corresponds to Release 17.0 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
Four new transcription factors have been identified, PgrR, RcdA, YdfH, and YpdB; the functional conformation for IscR has been included and we enriched summaries for nine TFs as detailed below.
PgrR is a repressor of the expression of genes related to peptidoglycan degradation Shimada et al. (2013). RcdA is involved in the regulation of a number of stress response genes, biofilm formation, and transcriptional regulator genes Shimada et al. (2012). YdfH, which belongs to the GntR transcription factor family, is a repressor of the rspAB operon [22972332], and YpdB is an activator that participates in the carbon control network and may participate in nutrient scavenging before entry into stationary phase Fried et al. (2013).
The new conformation IscR-2Fe-2S for the transcription factor IscR was included in this release. IscR-2Fe-2S represses the transcription of the operon iscRSUA, which encodes genes for the Fe-S cluster biogenesis pathways Giel et al. (2013).
Summaries for FadR, NikR, BluR, LeuO, HNS, MarA, SoxS, Rob and PspF were enriched. In addition, it was determined that the MqsRA complex does not bind to DNA; instead, it functions to destabilize the MqsA-DNA complex Brown et al. (2013).
We have reclassified the evidence supporting the knowledge in the database as weak, strong, or confirmed Weiss et al. (2013). The level of confidence is assigned in two stages; in stage I we classify single evidence into weak and strong, and in stage II we validate data by integrating multiple evidence items in a process termed "analytical cross-validation," where the result is the confidence of the knowledge (strong, weak, or confirmed), see the page regarding evidence. This process has been automated to report relevant changes in each release.
On the gene page, we have created a new section named "Elements in the selected gene context region unrelated to any object in RegulonDB." This section includes the biological objects that are not associated with a transcription unit.
In addition, on the same page, the operon section, called operon arrangement, now links to the operon page: each promoter field is linked to the corresponding operon page. In the data sets submenu, included in downloads, we have integrated new information on the transcription start sites (TSSs) experimentally determined in the laboratory of Dr. Morett. The TSSs are included in the file named "High-throughput transcription initiation mapping. Illumina directional RNA-seq experiments where total RNA received different treatments to enrich for 5'-monophosphate or 5'-triphosphate ends." These objects are included in the new section described above, "Elements in the selected gene context region unrelated to any object in RegulonDB."
In addition, HTTIM evidence has been removed, and promoters associated with this evidence have been reclassified as follows: 267 as TIM, 42 as ROMA, and 39 as RS (classification based on Weiss HTP evidence; Weiss et al. (2013)).
Release 8.1
December 17th, 2012. This release corresponds to Release 16.5 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
We have annotated 35 predictions for TFBSs of 13 regulators. The matrices used for these predictions were constructed by the RegulonDB database Medina-Rivera et al. (2010). Four TFs (ArgR, AscG, Cra and Rob) have regulatory interactions with weak evidence, and nine regulators have interactions with strong evidence: CRP, EvgA, ExuR, FIS, LexA, NtrC, PhoP, TorR, and UxuR.
A new response regulator of a two-component system, YehT, was curated Kraxenberger et al. (2012).
We have identified that mntS is included within the coding region of the rybA gene. rybA encodes two different functional products, a small RNA (rybA) and a small protein (MntS); both are transcribed from the rybA promoter Waters et al. (2011). RybA is rapidly processed at the 5' end as well as at multiple sites at the 3' end.
We have completed adding references and evidence codes to 85 transcriptional Regulatory_Interactions. Now every manually curated Regulatory_Interaction has a reference and an evidence code associated with it. We continued to enrich the summaries of ten TFs: AraC, ChbR, IscR, LacI, MarA, NorR, PhoB, RcnR, Rob and SdiA.
Release 8.0
October 2nd, 2012.
High-level curation
Next, we describe two elements of our efforts toward obtaining higher integration levels: (i) GUs and (ii) the organization of multiple TFBSs into regulatory phrases.
Fur, a complex gensor unit
In 2011, we described the new concept of genetic sensory-response units, or "gensor units" (GUs), which are composed of four components: (i) the signal, (ii) the signal-to-effector reactions that end with activation or inactivation of the TF, (iii) the regulatory switch (resulting in activation or repression of transcription of target genes), and (iv) the consequence, i.e., the effects and roles of the regulated genes. RegulonDB contains 25 completed GUs for local TFs and small regulons. We curated a much larger GU as a first step toward eventually compiling information on GUs of global regulators. Fur regulates transcription initiation of 66 TUs, including 9 TFs, a regulatory small RNA (sRNA), and two sigma factors (σ19 and σ38). Its diagram has more than 200 reactions and close to 300 nodes. To facilitate interpretation of this GU, we included a high-level illustration that provides an overview of all classes of genes and functions subject to Fur regulation. To view it, search for gensor unit in the main menu in RegulonDB and select Fur overview.
Regulatory phrases
For years we have displayed the collection of sites in upstream regions affecting each promoter, leaving it to the user to decipher how these multiple sites, which bind the same or different TFs, work in a coordinated fashion, or not, to regulate transcription. We have implemented the first version of regulatory phrases, grouping transcription factor binding sites (TFBSs) that work together in a single promoter, as well as grouping all arrangements of the same TF with the same effect in different promoters.
Enriched classifications based on classic and HT evidence
We expanded the assignment of quality to various sources of evidence, particularly for knowledge generated via high-throughput (HT) technology. Based on our analysis of the most relevant methods, we defined rules for determining the quality of evidence when multiple independent sources support an entry. See the new page of evidence in "About RegulonDB".
Tracks display of HT data sets and submission forms for HT data sets
We implemented a new tool in the main menu for use of a browser with the option of several tracks, based on GBrowser v.248. The menu page where users choose which sets to display now contains a variety of data sets, including manually curated RegulonDB collections of objects. We have also included a mechanism that enables the display of "Data Sets" in the GBrowser. On the GBrowser page, a user can proceed to "Select tracks" to see the full set of options currently available, classified by type of object, including operons, TFs, ChIP-seq TFBSs, promoters, HT-mapped TSSs, sRNAs, and TF conformations, among others. An additional category called "Genome regions", for genes as well as the untranslated regions at the 5' and 3' ends of TUs, is also included.
Submission forms for HT-datasets
Every single data set can be documented as requested when authors submit their experimental data, with specific formats for each type of source (e.g., TSS, ChIP-seq). We implemented a Web form for those interested in submitting their data sets directly online.
Evolutionary conservation of promoters and regulatory interactions
For the first time, we have added the evolutionary evidence for promoters and TFBSs within gammaproteobacteria. These are available from the gene and regulon pages, with graphics showing a summary of the number of genomes where conservation is found and the alignment and conserved sequences available as multiple alignments.
A new Regulon page: addressing user needs and suggestions
Based on comments and suggestions by RegulonDB users, we modified the page displaying information about regulons and simplified the search for all TFBSs of a single TF. The new page includes an icon linking a regulon to the GU, the summary for the TF, followed by a section displaying the functional and nonfunctional conformation(s), a classification of the effector based on its source as internal, external, or dual; a category for the TF based on its connectivity, the target regulated genes, and the operon where the TF gene belongs. Subsequent sections describe functional properties of the regulon, the set of TFBSs and their organization patterns and phrases, logos, PWMs, and additional properties.
Release 7.5
August 29th, 2012. This release corresponds to Release 16.0 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
We have updated the lengths and included the consensus sequences of TFBSs for 17 regulators: AgaR, AraC, ArcA, AscG, CaiF, DnaA, FlhDC, IclR, KdpE, LeuO, MalT, MelR, NanR, PrpR, PutA, RhaS, and XylR.
Three new transcription factors have been included: FliZ, MatA, and YjiE.
FliZ is a repressor that contains an α-helix that is similar to helix 3.0 of σS and that represses genes involved in the regulation of the motility system and curli expression. Pesavento et al. in 2012 determined that this regulator binds to regions of σS-dependent promoters, can recognize alternative σS promoter-like sequences, and can also discriminate vegetative promoters Pesavento et al. 2012.
MatA is a transcriptional dual regulator in meningitis isolate E. coli strain IHE 3034, and it interferes with bacterial motility and flagellar synthesis in E. coli K-12 Lehti et al. 2012. Given the high similarity between the two strains, we have added this regulator to the information for E. coli K-12.
QseD, a putative transcriptional LysR-type regulator, was renamed YjiE and is now considered a DNA-binding transcriptional dual regulator. It regulates genes involved in cysteine and methionine biosynthesis, sulfur metabolism, iron acquisition, and homeostasis Gebendorfer et al. 2012. A new function was identified for OxyR, in controlling genes under nitrosative stress during anaerobic respiration (Seth et al. 2012).
Release 7.4
March 29th, 2012. This release corresponds to Release 16.0 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
In this release we added the consensus sequences, lengths, and symmetries corresponding to 10 TFs. We updated the binding sites for 4 TFs that belong to the LysR family (ArgP, IlvY, MetR, and NhaR) and 3 response regulators that correspond to two-component systems (BaeR, CitB, and CpxR); DinJ is included in the toxin/antitoxin system, and PurR regulates genes involved in purine/pyrimidine biosynthesis. Finally, PdhR is involved in central metabolic fluxes and, more recently, has been found to be involved in the utilization of glycolate and cell division.
In these cases we used different strategies to identify the characteristics of the TFBSs. We performed alignments of the sequences upstream of genes regulated by these proteins and compared orthologous intergenic regions, and we also used other databases, such as RegPrecise Novichkov et al. 2010. In addition, the binding sites of the regulator MetR were corrected based on comparisons with homologous sequences reported for Salmonella typhimurium. In all cases we also analyzed the available experimental evidence that corresponded to each regulatory interaction.
We are also continuing with the annotation of allosteric regulation of the RNAP by ppGpp and DksA, and we have expanded the notes for GreB, GreA and DksA. In addition, we have enriched the notes for different transcription factors, such as AidB, ArgP, AtoC, DcuS, DpiB, Fur, HNS, LacI, MalT, MntR, PaaX, PhoB, PutA and SoxS.
Release 7.3
Nov 1st, 2011. This release corresponds to Releases 15.1 and 15.5 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
We continue with the effort to update and assign the correct lengths and central positions of the binding sites of TFs. In this release we have analyzed consensus binding sequences for 20 transcription factors (TFs), and as a result we have corrected and generated new regulatory interactions and updated the consensus sequences, lengths, and symmetries of the transcription factor binding sites (TFBSs). In these cases we used different strategies to identify the characteristics of the TFBSs. We performed alignments of the sequences upstream of genes regulated by these proteins and compared orthologous intergenic regions. In all cases we also analyzed the experimental evidence that corresponded to each regulatory interaction.
We corrected and relocated the TFBSs of 7 response regulators of the two-component systems: DcuR, EvgA, NtrC, OmpR, PhoB, PhoP, and RstA. We updated the sites of 5 TFs involved in the acid resistance system: BglJ, GadE, GadX, GadW, and RcsB. We added new consensus sequences for 4 local TFs: SoxR, YqhC, YqjI, and CspA.
The experimentally characterized TFBSs for the transcriptional regulatory components of the HipBA, MqsAR, RelBE, and YefM-YoeB toxin/antitoxin systems have been updated.
We also continue with the annotation of other mechanisms of regulation. We have curated mechanisms of regulation that allosterically affect RNA polymerase at transcription initiation. ppGpp is a nucleotide that binds RNA polymerase alone or forms a complex with DksA and affects transcription in either a positive or negative manner. Genes involved in responding to nutrient limitation as well as amino acid biosynthesis were positively affected by ppGpp and DksA. The genes related to rRNA promoters and to the stringent response were negatively controlled by both regulators. Currently, 67 promoter interactions regulated by ppGpp, as well as some regulated by DksA, have been curated.
Release 7.2
May 6th, 2011. This release corresponds to Releases 14.6 and 15.0 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
We are continuing the analysis of binding sites of different transcription factors (TFs). In this release we have included new consensus sequences for 45 local TFs that have three or fewer binding sites in the database. Most local TFs bind to small sequence motifs (11 to 24 nucleotides) with different symmetries, and these are arranged as inverted repeats (39), direct repeats (2), or asymmetrical (4) sequences with a variable space sequence between them. In these cases we performed alignments of the sequences upstream of the genes regulated by these proteins and evaluated the lengths and symmetries of the consensus sequences. In general, the sequences of unique binding sites are highly conserved, and the length and symmetry are evident.
TFs with inverted repeat symmetry include the following: AcrR, AllR, ArsR, AtoC, BaeR, BirA, BetI, CueR, CusR, EnvR, FabR, GlrR, HcaR, HyfR, KdgR, IdnR, IlvY, LacI, LldR, MalI, MarR, MhpR, MntR, MurR, NadR, NemR, NikR, NorR, PrpR, RbsR, RcnR, TdcA, TreR, UhpA, UidR, YiaJ, YoeB-YefM, ZntR, and Zur.
TFs with direct repeat symmetry: CreB and MngR.
TFs with asymmetric sequences: ChbR, RhaR, XapR, and ZraR.
Curation of transcription factors (TFs) for this release included updates to the summaries for MalT, UlaR, ArgR, MlrA, McbR, TreR, and YqhC. In addition, GO terms were updated for different TFs. The names of TFs were revised, and the category "DNA-binding" has been added.
A new TF, YqjI, has also been added. The local regulator YqjI was reported to act as a repressor of the synthesis of an NADPH-dependent ferric reductase and of its own expression. Recently, Wang et al. described experimental evidence showing that this regulator maintains iron homeostasis in the presence of high levels of nickel Wang et al. 2011.
In this period we have completed the curation of the new Sensory Response Unit TyrR-L-tyrosine, L-phenylalanine involved in the synthesis and transport of aromatic amino acids.
Our publication concerning the Gensor Units Gama-Castro et al. (2011), corresponding to release 7.0, was chosen by the editors of Nucleic Acids Research to appear on their Featured Articles page: http://www.oxfordjournals.org/our_journals/nar/featured_articles.html. Featured Articles in Nucleic Acids Research represent the top 5% of papers in terms of significance, originality, and scientific excellence.
Our paper contains information for the releases corresponding to 2008, 2009, and 2010.
Release 7.0
January 26th, 2011. This release corresponds to Release 14.5 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
In this database release version, our main goal was to model the regulatory pathways, including integration of the metabolic pathways with the different objects represented in the database. For this reason, RegulonDB has expanded the biological context, and we now refer to this integration in terms of genetic sensory response units, or Gensor Units (GUs) Gama-Castro et al. (2011).
The inclusion of Gensor Units brings a dramatic change and expansion of RegulonDB, due to the fact that we are adding several new types of interactions, reactions and superreactions that summarize concatenated sets of reactions, linked to the other databases that contain such information.
Gensor Units: An elementary genetic sensory response unit, or Gensor Unit, is formed by four components, all of them concatenated in an information-processing loop that starts with a signal or stimulus (i), which can be of external or internal origin. The second component is the signal transduction pathway (ii), a concatenated set of reactions that affect gene expression. The third component is the core of regulation or genetic switching (iii), which contains all regulatory elements necessary for modifying gene expression, inducing and/or repressing a collection of regulated genes. The loop ends with a response (iv) that corresponds to the collection of biological capabilities derived from the affected gene products Gama-Castro et al. (2011).
We have now initiated the curation of five GUs related to the signal transduction of the sigma factors and 21 related to the two-component systems. We have also completed the curation of 15 GUs involved in carbon source utilization and five involved in the metabolism of amino acids.
GUs related to the two-component systems.
ArcA AtoC BaeR DcuR DpiA EvgA KdpE NarL NarP OmpR PhoB PhoP QseB RcsB RstA TorR UhpA ZraR
GUs related to carbon source utilization.
AlsR AraC ChbR FucR GatR GntR GutR-SrlR IdnR LacI MelR RbsR RhaS TreR UidR XylR
Previously, the structure of this database was accessible via the internet through four major navigation paths, by Genes, Operon, Regulon, and Growth Condition, combining graphics and literature information. Here, we provide three new types of searches: by Gensor Unit, Sigmulon, and small RNA (sRNA).
On the other hand, we have also corrected and relocated the DNA binding sites for FhlA, Ada, CaiF, NhaR, and YiaJ. Initially, Leonhartsberger et al. in 2000 showed that FhlA binds to inverted repeat sequences of 16 bp (CATTTCGTACGAAATG) Leonhartsberger et al. (2000). However, our alignment results for all the regions that FhlA binds showed that this sequence is not conserved. This result also showed that the motif TGTCGnnnnTGACA is conserved in the sequences examined, and for this reason we have relocated, reassigned, and corrected the binding sites of the FhlA regulon in the database. The FhlA-binding sites are represented in the database by an inverted repeat motif of 14 bp.
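A consensus such as the 14 bp FhlA motif can be turned into a simple pattern to locate candidate sites in an upstream region. The sketch below is a minimal illustration, not the procedure used by the curators: the helper names and the example sequence are hypothetical, and "n" is assumed to stand for any of the four bases.

```python
import re

def consensus_to_regex(consensus: str) -> re.Pattern:
    """Translate a consensus in which 'n' means any base into a compiled regex."""
    parts = ["[ACGT]" if base in "nN" else re.escape(base.upper()) for base in consensus]
    # Wrapping the motif in a lookahead lets finditer report overlapping matches too.
    return re.compile("(?=(" + "".join(parts) + "))")

def find_sites(sequence: str, consensus: str):
    """Return (start, matched_text) for every position that matches the consensus."""
    pattern = consensus_to_regex(consensus)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]

FHLA_MOTIF = "TGTCGnnnnTGACA"  # motif quoted in the text above

# Hypothetical upstream fragment, invented for illustration only.
example = "aaTGTCGacgtTGACAtt"
print(len(FHLA_MOTIF))                  # length of the motif in bp
print(find_sites(example, FHLA_MOTIF))  # 0-based start and matched site
```

An exact-match scan like this only finds sites that agree with every fixed position; degenerate sites would need a scoring approach such as a position weight matrix.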
In the cases of Ada and CaiF, we performed alignments of the sequences upstream of the genes regulated by these proteins and evaluated the previous consensus sequences of the binding sites Teo et al. (1986), Nakamura et al. (1988), Buchet et al. (1999). In addition, the lengths of the degenerate binding sites of NhaR were defined according to the matrix shown for this regulator in the database RegPrecise. That database contains matrices generated from alignments of orthologous regions Novichkov et al. (2010).
In 2000, Ibañez et al. showed that transcription of the divergent operon yiaJ-yiaKLMNO-lyxK-sgbHUE depends on the YiaJ repressor. However, those authors suggested that this regulator binds a long region of 35 bp Ibañez et al. (2000). The alignment of this region with the orthologous sequence of Klebsiella pneumoniae showed a conserved palindrome of 21 bp Campos et al. (2008). The position and length of the YiaJ-binding site reported in our database have been changed to reflect this.
A new portable drawing tool for genomic features is available, as well as several new forms for downloading the data, including web services, files for several relational database manager systems, and files in BIOPAX format.
Release 6.8
August 18, 2010. This release corresponds to Release 14.1 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
The motif obtained from aligning OxyR binding sites is highly variable due to the length of the sequences, although, by manipulating the alignment, it is possible to detect four conserved regions. For this reason we have relocated, reassigned, and corrected the binding sites of the OxyR regulon, corresponding to 19 transcription units. Toledano et al. (1994) showed that OxyR binds in tandem to four ATAG elements and defined a consensus motif, ATAGntnnnanCTATnnnnnnnATAGntnnnanCTAT, covering around 40 bp.
We now propose a new consensus sequence, GATAGGTTnAACCTATCnnnnnGATAGGTTnAACCTATC, which contains two inverted repeat motifs, GATAGGTTnAACCTATC, of 17 bp separated by 5 bp. This consensus sequence is based on alignments of these upstream regions performed by the curator and on the corresponding evidence obtained from the literature for every operon, including the similarity to the consensus sequence, data from footprinting assays, computational analysis of these sequences, and profiling of OxyR-dependent gene expression. In the database the OxyR-binding sites are represented by an inverted repeat motif of 17 bp.
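The structure of the proposed consensus can be checked directly: a motif is an inverted repeat when it equals its own reverse complement. The following is a small sketch of that check; the function names are hypothetical, and "n" is assumed to pair with "n".

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}

def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA string; 'n' (any base) is kept as 'N'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def is_inverted_repeat(seq: str) -> bool:
    """True when the sequence reads the same on both strands."""
    return seq.upper() == reverse_complement(seq)

HALF_SITE = "GATAGGTTnAACCTATC"
SPACER = "nnnnn"
FULL_SITE = "GATAGGTTnAACCTATCnnnnnGATAGGTTnAACCTATC"

print(len(HALF_SITE), len(SPACER), len(FULL_SITE))
print(FULL_SITE == HALF_SITE + SPACER + HALF_SITE)
print(is_inverted_repeat(HALF_SITE))
```

The same check applied to a direct-repeat motif would return False, which makes it a quick way to tell the two symmetry classes apart.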
During this last period, we have updated curation on transcription initiation, including publications up to the end of April 2010.
Release 6.7
March 24, 2010. This release corresponds to Release 14.0 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
We have corrected and relocated the binding sites of the CytR transcription factor. This regulator negatively controls the expression of genes that encode the proteins required for transport and utilization of ribonucleosides and deoxyribonucleosides. The CytR binding sites were previously represented as long regions, which were determined by footprinting of several promoter sequences.
Computational analysis of these sequences showed that the optimal CytR binding site consists of two octamer repeats, GTTGCATT, in direct or inverted orientation and preferably separated by 2 bp. Experimental support for this consensus sequence was obtained from footprinting, site-directed mutagenesis experiments, and gene expression (Pedersen et al. (1997), Jorgensen et al. (1998)).
We have updated curation on transcription initiation, including publications up to the end of December 2009.
Genetic NetWorks
Different genetic networks are available through pre-computed datasets, web services, dump files, and a direct connection to a MySQL database.
1.- Datasets
New dataset files have been created in order to have a complete repertoire of genetic networks. They are available at the Downloads/Data Sets option.
The new files are:
- The network between TFs and operons.
- The network between TFs and their regulated TFs.
- The network between sigma factors and their regulated operons.
- The network between sigma factors and their regulated sigma factors.
2.- Images
A set of pre-computed images of different genetic networks is available in the Tool menu, under the Transcriptional Regulatory Network option.
3.- Web services
The description of the NetWork Web service is also available. Perl and Java clients were developed for this service.
4.- Connection to the database
We created an additional public repository in MySQL for those users who want to connect directly to the database to have access to these genetic networks. The configuration is the following:
Server: yumkax.ccg.unam.mx
Port: 3306
Database: regulondb
User: network_guest
Password: *** (request it by sending an e-mail to [email protected])
Use the MySQL database driver.
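As an illustration, the settings above could be collected in Python as follows. This is a sketch under several assumptions: the third-party `mysql-connector-python` package is used as the MySQL driver, the environment variable name `REGULONDB_PASSWORD` is hypothetical, the password must be requested as described above, and the server listed for this release may no longer be reachable.

```python
import os

def connection_settings(password=None):
    """Collect the connection parameters listed in the release notes."""
    if password is None:
        # Hypothetical variable name; the password itself is not public.
        password = os.environ.get("REGULONDB_PASSWORD")
    if not password:
        raise ValueError("no password supplied; request it by e-mail first")
    return {
        "host": "yumkax.ccg.unam.mx",
        "port": 3306,
        "database": "regulondb",
        "user": "network_guest",
        "password": password,
    }

def open_connection():
    """Open a connection; requires the third-party mysql-connector-python package."""
    import mysql.connector  # imported here so the settings helper works without it
    return mysql.connector.connect(**connection_settings())
```

Keeping the password out of the source and reading it at run time avoids committing it to scripts that are shared with others.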
Release 6.4
Aug 10, 2009. This release corresponds to Release 13.1 of EcoCyc. All data on transcriptional regulation curated in our lab is the same in both databases.
Our curation update project is progressing; we have substantially curated information on regulation of transcription initiation up to the end of June 2009.
We have now completed summaries for all 170 Transcription Factors (TFs) that have at least one experimentally characterized binding site or interaction. These regulators represent 33 families of TFs, and the summaries describe relevant characteristics of each regulatory protein. A summary of the functions of these 170 TFs is the following:
Seven TFs are considered to be global regulators and are involved in regulating multiple operons and genes of different functional classes or gene ontologies, such as anaerobiosis (ArcA and FNR), carbon source utilization (CRP), inversion stimulation (FIS), and DNA architecture, including the organization and maintenance of the nucleoid, as well as other cellular processes (HNS, Lrp, and IHF).
Additionally, 21 response regulators belong to two-component systems, 42 TFs are included in the carbon sources system, 17 TFs are related to processes such as transport, biosynthesis and catabolism of the amino acids, 13 TFs are involved in the transport and metabolism of different nitrogen sources, and 8 TFs are classified as metallo-regulators. Note that the TFs can be involved in more than one function.
The rest of the TFs are considered to be local regulators that control the genetic transcription of different cellular processes and functional classes, for instance, flagellar and chemotaxis systems, metabolism of nucleosides, transport and synthesis of fatty acids, DNA replication, quorum sensing, toxin-antitoxin systems, adaptation and resistance to different conditions of stress, among others.
We have completed adding references and evidence codes to 210 promoters. Now every manually curated promoter has a reference and an evidence code associated with it.
On the web page that displays gene information, users can now get FASTA files of the gene's nucleotide sequence and the amino acid sequence of its products. In addition, on the gene, operon, and regulon web pages we are including links to the M3D database.
We implemented an object-relational mapping technology based on the iBATIS framework to reduce the time needed to retrieve query results. We are also replacing the network tool application with a new version that implements different graphical interfaces for the regulatory networks of genes, operons, and transcription factors.
Finally, we are releasing web services to enable developers to access RegulonDB's data via SOAP.
Release 6.3
Feb 10, 2009. This release corresponds to Release 12.5 of EcoCyc. Data on transcriptional regulation curated in our lab is the same in both databases.
Our general curation update project is progressing; this version contains curated information on regulation of transcription initiation up to the end of May 2008.
Release 6.2
July 10, 2008. This release corresponds to Release 12.1 of EcoCyc. Data on transcriptional regulation curated in our lab is the same in both databases.
In order to expand the information on transcription regulation of E. coli, the laboratory of Dr. Julio Collado has been making an effort to generate and analyze data coming from high-throughput experimental mapping of promoters. The initial results of this approach are available in this release, which includes a collection of 259 new transcription start sites (TSSs) that have been experimentally determined using a modified high-throughput RACE approach (with the corresponding new evidence code: EV-EXP-IDA-HPT-TRANSCR-INIT-M-RACE-MAP). Of those 259 sites, 110 are from TUs with hypothetical genes for which no function has been inferred.
These promoters were linked to new transcription units with exactly the same number of genes as previous existing ones.
To validate the accuracy of this strategy, we used it to identify the previously published TSSs for 50 TUs, 92% of which showed a perfect match (with a discrepancy of up to one nucleotide with respect to the published TSS). The rest showed slight ambiguity, inherent to the RACE protocol, of up to six nucleotides. We detected more than one TSS in 14 of these TUs. Interestingly, additional TSSs had previously been reported for only two of them. Thus, our results are highly accurate and identify additional promoters for more than 25% of the previously characterized TUs.
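The figures in this validation can be reproduced with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the variable names are ours, and reading the "more than 25%" figure as 14 of 50 TUs is an assumption about how that percentage was derived.

```python
validated_tus = 50            # TUs with a previously published TSS
perfect_fraction = 0.92       # fraction matching within one nucleotide
tus_with_multiple_tss = 14    # TUs in which more than one TSS was detected
previously_reported = 2       # of those, TUs with additional TSSs already reported

perfect_matches = round(validated_tus * perfect_fraction)
ambiguous = validated_tus - perfect_matches
share_multiple = tus_with_multiple_tss / validated_tus
newly_multiple = tus_with_multiple_tss - previously_reported

print(perfect_matches)                    # TUs with a match within one nucleotide
print(ambiguous)                          # TUs with up to six nucleotides of ambiguity
print(f"{share_multiple:.0%}")            # share of validated TUs with more than one TSS
print(newly_multiple)                     # TUs whose additional TSSs were not reported before
```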
The experiments were performed in the laboratory of Dr. Enrique Morett, Institute of Biotechnology, in collaboration with the laboratory of Dr. Julio Collado-Vides, both at UNAM. This mapping has been supported by NIGMS grant RO1-GM71962.
This version contains curated information on regulation of transcription initiation up to March 2008.
Release 6.1
April 15, 2008. This release corresponds to Release 12.0 of EcoCyc. Data on transcriptional regulation curated in our lab is the same in both databases.
We are currently updating our curation related to transcriptional regulation in E. coli, including the recent literature. In this release we have initiated the annotation of some promoters and DNA binding sites from computational predictions and from high-throughput experiments such as microarrays and ChIP-chip experiments. Only promoters or DNA binding sites that have evidence from at least two of these three types of experiments have been added to EcoCyc. Some examples are: Fur DNA binding sites identified by computational prediction and binding of purified protein in Chen et al. (2007), Sigma32 promoters identified by ChIP-chip, microarray analysis and in vitro transcription assays in Wade et al. (2006), and Sigma32 promoters identified by microarray analysis, transcription initiation mapping and in vitro transcription assays in Nonaka et al. (2006). Promoters identified by libraries of fluorescent transcriptional fusions (Zaslaver et al. (2006)) are also included in this release.
Release 6.0
January 15, 2008. This release corresponds to Release 11.6 of EcoCyc. Data on transcriptional regulation curated in our lab is the same in both databases.
The current release highlights the modifications and improvements that are transforming RegulonDB into a more comprehensive model of gene expression regulation.
In order to expand our knowledge of the regulatory universe of E. coli beyond literature searches, we started a genome-wide project to experimentally map as many promoters as possible in this organism. For this purpose, we made use of a modified 5'RACE protocol with gene-specific oligonucleotides. A total of 317 TSSs for 269 TUs (38 have more than one TSS) have been mapped with the 5'RACE methodology, 110 of which correspond to TUs with hypothetical genes for which no function has been inferred. The newly mapped TSSs have been included in RegulonDB. A detailed compendium of these findings will be published elsewhere. In addition to the already existing data from σ70 promoters, we have generated computational promoter predictions for four other sigma factors of the σ70 family: σ24, σ28, σ32, and σ38. Promoter predictions have also been generated for the σ54 factor, which defines a different sigma factor family than σ70. The putative +1 of transcription initiation, along with the -35 and -10 boxes, can be downloaded from RegulonDB.
In order to provide a more comprehensive annotation of gene expression regulation, RegulonDB is now modeling not only transcriptional regulation data, but also other kinds of regulatory elements, such as small-RNAs. This inclusion consists of a graphic representation and textual information about their sequences, location, evidence, and references, which are shown on the Operon page.
RegulonDB literature can now be searched using the Textpresso text mining engine, customized for E. coli. Textpresso allows direct exploration of curated literature, both at the level of highly-specific keywords and with entire categories or ontology classes (derived from GO concepts or customized word lists). The user can, for example, search for papers that feature a type of regulation in which a gene or operon and a specific TF are mentioned in the same sentence. Currently, the tool can search through 2472 full-text papers, 3125 paper abstracts, and over 4200 curator notes. The addition of this text mining tool to RegulonDB will expand the possibilities for the end user to traverse the knowledge space of E. coli metabolism and gene regulation, and it will allow our curators to refine and confirm their annotations.
To facilitate the implementation of more regulatory objects, an additional classification of transcription factors has also been included. TFs have been labeled as "global" or "local" regulators based on the number of genes they directly regulate, the number of co-regulators they work with, the number of TFs they regulate, the diversity of types of promoters they regulate, and the number of different functional classes of their regulated genes. From the set of 160 TFs currently annotated in RegulonDB with experimental data, seven TFs are identified as global regulators: CRP, IHF, FNR, FIS, ArcA, Lrp, and HNS, while the rest are termed local regulators. In addition, TFs are classified in RegulonDB according to their "Sensing class" as: internal, external, hybrid or unknown, depending on the origin of their effectors. All genes that code for known and predicted TFs have been annotated with their corresponding Gene Ontology class, and we uploaded those for the rest from EcoCyc.
Furthermore, the objects presented in RegulonDB (promoters, sigma factors, TUs, and regulatory interactions) now feature tables and graphs for different relationships among the database objects.
Evidence associated with RegulonDB objects was classified as strong or weak according to the reliability of the experiment that supports the object properties and the relationships between them. The more reliable evidence was called "strong" and the less reliable evidence "weak". Evidence strength can be distinguished graphically by a solid (strong) or dashed (weak) line. If the same object has evidence of both types, it is displayed according to the strongest one.
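The display rule above (solid for strong, dashed for weak, strongest wins) can be sketched as follows. This is only an illustration: the function name `display_style` and the string labels are hypothetical and are not taken from the RegulonDB code base.

```python
# Hypothetical labels: the strength names and line styles follow the text above,
# but this mapping is an assumption made for illustration.
STYLE_BY_STRENGTH = {"strong": "solid", "weak": "dashed"}

def display_style(evidence_strengths):
    """Return the line style for an object given the strengths of all its evidence."""
    strengths = set(evidence_strengths)
    if not strengths:
        raise ValueError("object has no evidence")
    unknown = strengths - set(STYLE_BY_STRENGTH)
    if unknown:
        raise ValueError(f"unknown evidence strength(s): {sorted(unknown)}")
    # An object with evidence of both types is drawn according to the strongest one.
    strongest = "strong" if "strong" in strengths else "weak"
    return STYLE_BY_STRENGTH[strongest]
```

For example, an object supported by one weak and one strong evidence would be drawn with a solid line under this rule.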
Finally, upgrading the RegulonDB web application server from version 8 to 9.0 brings a major improvement in performance and stability, which allows faster responses to the user. In addition, RegulonDB users can now download data and schema in dump files for the most popular database management systems, such as MySQL, Postgres, Oracle, and Apache Derby. The data are also available in XML and flat file formats.
Release 5.8
September 17, 2007.
This release corresponds to Release 11.5 of EcoCyc. Data on transcriptional regulation curated in our lab is the same in both databases.
We have corrected and relocated the binding sites of the ArcA response regulator. ArcA is considered to be a global regulator; it is involved in respiratory metabolism and controls the expression of about 60 transcription units. The ArcA binding sites were previously represented as long regions of 60 bp, which were determined by footprinting of several promoter sequences. Computational analysis of these sequences revealed a shorter 15 bp site, GTTAnnnnnnnGTTA, consisting of two direct repeats of 4 bp separated by 7 bp. Experimental support for this consensus sequence was obtained from footprinting, site-directed mutagenesis experiments, and profiling of ArcA-P dependent gene expression.
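The 15 bp pattern described above can be scanned for with a regular expression. The following is a minimal sketch, not the method used by the RegulonDB curators: the function name `find_arca_like_sites` and the example sequence are hypothetical, only the given strand is searched, and an exact-match pattern is much stricter than the weight-matrix searches usually applied to real binding sites.

```python
import re

# GTTA, any 7 nucleotides, GTTA: two 4 bp direct repeats separated by 7 bp (15 bp in total).
# The pattern is wrapped in a lookahead so that overlapping matches are also reported.
ARCA_PATTERN = re.compile(r"(?=(GTTA[ACGT]{7}GTTA))")

def find_arca_like_sites(sequence):
    """Return (start, site) tuples (0-based start) for each exact match on the given strand."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in ARCA_PATTERN.finditer(seq)]

# Hypothetical sequence, made up for illustration only.
example = "ccatGTTAattcgatGTTAacgt"
sites = find_arca_like_sites(example)
for start, site in sites:
    print(start, site)
```

A fuller search would also scan the reverse complement and tolerate mismatches in the repeats.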
Release 5.7
June 1, 2007.
This release corresponds to Release 11.0 of EcoCyc. Data on transcriptional regulation curated in our lab is the same in both databases.
We finished curation of information about transcriptional regulation of genes involved in the transport and metabolism of different nitrogen sources (including the preferred source, ammonia). This curation included the annotation of sigma54 promoters and of nine transcription factors involved in nitrogen metabolism, together with their regulons: NtrC, FlhA, NorR, PspF, PspR, HyfR, NacC, ZraR and RtcR.
• The annotation of transcriptional promoters regulated by the sigma factors sigma19 (FecI), sigma28 (FliA), and sigma54 (RpoN) has been reviewed and updated. These factors are required for the transcription of specific sets of genes involved in the iron stress response, the flagellar system, and in nitrogen metabolism, respectively. Where experimental data was available, appropriate literature citations and notes were added.
• The TyrR, TrpR and Lrp regulons have been updated. These regulons are related to processes such as transport, biosynthesis and catabolism of the aromatic amino acids tyrosine, phenylalanine, and tryptophan, and also of serine, glycine, glutamate, leucine, isoleucine, valine and threonine. The Lrp regulon is also important for the assimilation of ammonia under nitrogen-poor conditions.
• All transcription factors have now been assigned Gene Ontology and MultiFun terms.
Release 5.6
January 15, 2007.
This release corresponds to Release 10.6 of EcoCyc. Data on transcriptional regulation curated in our lab is the same in both databases.
Release 5.6 of RegulonDB includes an update of the genomic sequence of E. coli K-12. GenBank entry U00096.2 now replaces the original U00096.1 deposited by the Blattner laboratory in 1997. The major new feature of this release is the addition of computationally predicted riboswitches and attenuators to RegulonDB, which are now properly curated, with their associated evidence, and displayed in our graphics.
This latest release includes several improvements to the web site and the underlying database:
- A new search mode supports common names, sentences and even incomplete words. Users can now search for terms such as “Lac Z” or “protein source”.
- We have enhanced the graphic display of objects, adding tooltips to genes, promoters, DNA binding sites, terminators, attenuators and riboswitches. For instance, binding site tooltips show their central position when moused-over.
- The names of objects are now visible within diagrams, simplifying their identification by the user.
- We have implemented supervised automatic consistency checks that improve our data integrity.
Curation highlights:
As always, the RegulonDB team ([email protected]) is continuously curating relevant literature to keep the database up to date.
- We have been expanding notes for 30 regulatory proteins and now include short notes about the evolutionary family to which they belong, their domain composition and the cellular processes in which the regulated genes are involved. When available, an indication of the active conformation of a complex (dimer, tetramer...) is given. Relevant physiological data about the effectors of transcription factors is also covered, with the aim of helping the understanding of regulation physiology. These summaries also have descriptive information about binding site features (size, consensus sequence, relative position to the transcription start, spatial arrangement of the site sequences). Appropriate literature citations were added for these 30 regulatory proteins: AraC, AscG, BglJ, BetI, BolA, CdaR, CueR, DicA, FabR, FeaR, GadX, GcvA, HcaR, HdfR, HipB, IdnR, MalI, Nac, NanR, PdhR, PerR, PhnF, PrpR, SlyA, TreR, UidR, YdeO, YeiL, ZntR and Zur.
- The evidence codes attached to 132 transcription factors have been updated, including experimental and computational evidence. Appropriate literature citations were added for 67 of these regulators.
October 30, 2006.
This release corresponds to Release 10.5 of EcoCyc. Data on transcriptional regulation curated in our lab is the same in both databases.
• We finished curating information on the transcriptional regulation of genes involved in both flagellar and chemotaxis systems; eight new promoters and five new transcription units as well as 21 new DNA-binding sites for the transcriptional regulator FlhDC were added.
• We have completed the curation of 364 transcription units based on single-gene directons. A directon is one gene or a set of genes transcribed in the same direction, organized into one or several transcription units and operons (Salgado et al., 2000). In other words, each of these 364 genes is flanked by genes transcribed in the opposite direction, and therefore must be transcribed in isolation.
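The directon definition above can be sketched in a few lines. This is an illustration under stated assumptions, not the curation procedure itself: genes are assumed to be given in chromosomal order as (name, strand) pairs, the chromosome is treated as linear (circularity is ignored, so the first and last genes are only checked on one side), and the function names and gene names are hypothetical.

```python
from itertools import groupby

def directons(genes):
    """Group (name, strand) pairs, listed in chromosomal order, into runs of
    consecutive genes that share the same strand."""
    return [[name for name, _ in run]
            for _, run in groupby(genes, key=lambda gene: gene[1])]

def single_gene_directons(genes):
    """Return the genes that form a directon on their own."""
    return [d[0] for d in directons(genes) if len(d) == 1]

# Hypothetical gene order, for illustration only.
genes = [("a", "+"), ("b", "+"), ("c", "-"), ("d", "+"), ("e", "-"), ("f", "-")]
print(directons(genes))
print(single_gene_directons(genes))
```

Here genes `c` and `d` are each flanked by genes on the opposite strand, so each forms a single-gene directon.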
Release 5.1
May 12, 2006.
This release corresponds to Release 10.0 of EcoCyc. Data on transcriptional regulation curated in our lab is the same in both databases.
EcoCyc and RegulonDB have recently been updated with additional regulatory information and represent the largest comprehensive and constantly curated regulatory network of E. coli K-12. A report on our progress has been published in Salgado et al. (2006).
Regulation of degradation pathways: We expanded a project to curate within EcoCyc information about transcriptional regulation of gene expression for genes involved in the degradation of carbon sources, including the catabolism of sugars, polysaccharides and sugar derivatives. Pathways whose gene regulation has been curated are:
METABOLISM OF SUGAR DERIVATIVES: SUGAR CARBOXYLATES
Methylcitrate cycle
Methylmalonyl pathway
Conversion of succinate to propionate
Acetate utilization
Glycolate degradation
Glyoxylate degradation
L-ascorbate degradation
CATABOLISM OF SUGAR DERIVATIVES: SUGAR ALCOHOLS
Glycerol degradation I
Glycerol degradation II
Superpathway of glycol metabolism and degradation
CATABOLISM OF AROMATIC COMPOUNDS
3-phenylpropionate and 3-(3-hydroxyphenyl)propionate degradation
Regulation of expression of enzymes involved in the degradation or utilization of melibiose, maltose, fructose, chitobiose, N-acetylgalactosamine, and beta-glucosides was curated.
Regulation of the following additional pathways was curated:
Gluconeogenesis
Superpathway of gluconate degradation
Glycogen biosynthesis
Release 5.0
March 16, 2006.
This release corresponds to Release 9.6 of EcoCyc. Data on transcriptional regulation curated in our lab is the same in both databases.
o We have curated within EcoCyc gene regulatory interactions identified in datasets from Ma et al. (2004) and Shen-Orr et al. (2002) that were not present in EcoCyc.
o Regulation of respiration pathways: We completed a project to curate within EcoCyc information about transcriptional regulation of gene expression for genes involved in respiration pathways in E. coli, which include aerobic and anaerobic phases, as well as those for electron transfer, electron donors and electron acceptors. In particular, we modified all NarL binding sites, assigning new central positions that result from a consensus sequence of the site now defined by 7 nucleotides. Pathways whose gene regulation has been curated are:
Aerobic electron transfer
Aerobic respiration (electron donors reaction list)
Electron transfer (anaerobic)
Respiration (anaerobic)
Respiration (anaerobic)-electron acceptors reaction list
Respiration (anaerobic)-electron donors reaction list
Regulation of degradation pathways: We completed a project to curate within EcoCyc information about transcriptional regulation of gene expression for genes involved in the degradation of carbon sources, including the catabolism of sugars, polysaccharides and sugar derivatives. Regulation of operons encoding enzymes of glycolysis, the pentose phosphate pathway, the TCA cycle, and the Entner-Doudoroff pathway was also curated in this phase. Pathways whose gene regulation has been curated are:
CATABOLISM OF SUGAR AND POLYSACCHARIDES
Lactose degradation III
D-allose degradation
D-arabinose degradation
Fucose degradation
Galactose degradation I
Glucose and glucose-1-phosphate degradation
Glycogen degradation
L-arabinose degradation
L-xylose degradation
Mannose degradation
Rhamnose degradation
Ribose degradation
Trehalose biosynthesis and degradation-low osmolarity
Xylose degradation
CATABOLISM OF SUGAR DERIVATIVES: SUGAR ACIDS
beta-D-glucuronide degradation
D-galactarate degradation
D-galacturonate degradation
D-glucarate degradation
Galactonate degradation
Ketogluconate metabolism
L-idonate degradation
CATABOLISM OF SUGAR DERIVATIVES: SUGAR ALCOHOLS
Galactitol degradation
Mannitol degradation
Sorbitol degradation
Superpathway of hexitol degradation
Fructoselysine degradation
Glucosamine degradation
CATABOLISM OF SUGAR DERIVATIVES: AMINO SUGARS
Glucosamine degradation
N-acetylglucosamine
N-acetylmannosamine
N-acetylneuraminic acid dissimilation
Regulation of the following additional pathways was curated:
Glycolysis I
Methylglyoxal pathway
Non-oxidative branch of the pentose phosphate pathway
Oxidative branch of the pentose phosphate pathway
TCA cycle
Glyoxylate cycle
Pyruvate dehydrogenase
Pyruvate oxidation pathway
Entner-Doudoroff pathway I
Our general curation update project is progressing. Of the 4471 polypeptides within EcoCyc, 3758 now have comments or citations or are components of a complex that has a comment or citations. The database now contains 12026 citations.
Please send any comments by email to: [email protected]